Constant Tensor Cache
The oneDNN Graph component supports a constant tensor cache, which stores processed constant tensors, such as reordered constant weights and folded constant scales, to avoid redundant computation and improve performance. The feature is disabled by default. Use the graph API or an environment variable to set the cache capacity for each engine kind (CPU and GPU); the graph API can also query the current capacity.
Build-Time Controls
Build-time controls to enable or disable the constant tensor cache feature are not supported. Only run-time controls through the graph API or environment variables are supported. Refer to the following section.
Run-Time Controls
Constant Tensor Cache Capacity Control API
oneDNN Graph provides a pair of APIs to control the constant tensor cache. To enable the cache and set its capacity for a specific engine kind, call the setter API. The setter takes the capacity in megabytes (MB). Once the capacity is reached, new tensors are not cached. To query the current capacity for a specific engine kind, call the getter API.
// setter API
dnnl_graph_set_constant_tensor_cache_capacity
// getter API
dnnl_graph_get_constant_tensor_cache_capacity
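The capacity semantics described above can be illustrated with a toy model. This is only a sketch and not oneDNN's implementation: the class `ToyConstantCache` and its methods are hypothetical names, and it assumes that 1 MB is 2^20 bytes and that a capacity of zero behaves as "disabled". It shows a cache that starts disabled, is enabled by setting a capacity in MB, and declines new entries once that capacity would be exceeded.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Toy model only (hypothetical, not part of oneDNN): one instance stands in
// for the constant tensor cache of a single engine kind.
class ToyConstantCache {
public:
    // Capacity is given in megabytes, mirroring the unit of the setter API.
    // Assumption: 1 MB = 1024 * 1024 bytes.
    void set_capacity_mb(std::size_t mb) { capacity_bytes_ = mb * 1024 * 1024; }

    // Returns the current capacity in megabytes, like the getter API.
    std::size_t capacity_mb() const { return capacity_bytes_ / (1024 * 1024); }

    // Tries to cache a tensor of `bytes` bytes under `key`.
    // Returns false (and caches nothing) if it would exceed the capacity.
    bool try_insert(const std::string &key, std::size_t bytes) {
        if (entries_.count(key) != 0) return true; // already cached
        if (used_bytes_ + bytes > capacity_bytes_) return false;
        entries_[key] = bytes;
        used_bytes_ += bytes;
        return true;
    }

    bool contains(const std::string &key) const {
        return entries_.count(key) != 0;
    }

private:
    std::size_t capacity_bytes_ = 0; // zero capacity: disabled by default
    std::size_t used_bytes_ = 0;
    std::map<std::string, std::size_t> entries_; // key -> size in bytes
};
```

With a capacity of 1 MB, two 512 KB tensors fit in this model and a third tensor of any size is declined. How oneDNN itself accounts for tensor sizes at the boundary is not specified here.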
Environment Variable
In addition to the programmable API, oneDNN Graph provides an environment variable named ONEDNN_GRAPH_CONSTANT_TENSOR_CACHE_CAPACITY to control the capacity. It accepts values in the form engine_kind:size or engine_kind1:size1;engine_kind2:size2. In the examples below, the first sets the capacity for a single engine kind (cpu), and the second sets the cpu and gpu capacities to 1024 MB and 2048 MB, respectively.
Environment variable | Value (string) | Description
---|---|---
ONEDNN_GRAPH_CONSTANT_TENSOR_CACHE_CAPACITY | "cpu:size1;gpu:size2" | Set cpu constant cache capacity size to size1 and gpu to size2
export ONEDNN_GRAPH_CONSTANT_TENSOR_CACHE_CAPACITY="cpu:1024"
export ONEDNN_GRAPH_CONSTANT_TENSOR_CACHE_CAPACITY="cpu:1024;gpu:2048"
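To make the value format concrete, here is a minimal sketch of how a string such as `cpu:1024;gpu:2048` could be split into per-engine capacities. The function `parse_cache_capacity` is a hypothetical helper written for illustration; it is not part of oneDNN, and the library's own parsing and error handling may differ. It assumes that only the engine kinds `cpu` and `gpu` are valid and that sizes are non-negative decimal integers in MB.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper, not part of oneDNN: parses a value of the form
// "engine_kind:size" or "engine_kind1:size1;engine_kind2:size2" into a map
// from engine kind to capacity in MB. Returns false on malformed input and
// leaves `out` unchanged in that case.
bool parse_cache_capacity(const std::string &value,
        std::map<std::string, std::size_t> &out) {
    std::map<std::string, std::size_t> parsed;
    std::istringstream stream(value);
    std::string entry;
    while (std::getline(stream, entry, ';')) {
        if (entry.empty()) continue; // tolerate empty segments, e.g. a trailing ';'
        const std::string::size_type colon = entry.find(':');
        if (colon == std::string::npos) return false;
        const std::string kind = entry.substr(0, colon);
        const std::string size = entry.substr(colon + 1);
        // Assumption: only these two engine kinds are accepted.
        if (kind != "cpu" && kind != "gpu") return false;
        // Require a plain decimal number with no sign or suffix.
        if (size.empty()
                || size.find_first_not_of("0123456789") != std::string::npos)
            return false;
        try {
            parsed[kind] = static_cast<std::size_t>(std::stoull(size));
        } catch (const std::out_of_range &) {
            return false; // value does not fit in unsigned long long
        }
    }
    out = parsed;
    return true;
}
```

An application could pass the result of `std::getenv("ONEDNN_GRAPH_CONSTANT_TENSOR_CACHE_CAPACITY")` to this helper, after checking for a null pointer, to log or validate the configured capacities before running a workload.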