OpenCL interoperability API
Overview
API extensions to interact with the underlying OpenCL run-time.
// namespaces
namespace dnnl::graph::ocl_interop;
// typedefs
typedef void* (*dnnl_graph_ocl_allocate_f)(
size_t size,
size_t alignment,
cl_device_id device,
cl_context context
);
typedef void (*dnnl_graph_ocl_deallocate_f)(
void *buf,
cl_device_id device,
cl_context context,
cl_event event
);
// global functions
dnnl_status_t DNNL_API dnnl_graph_ocl_interop_allocator_create(
dnnl_graph_allocator_t* allocator,
dnnl_graph_ocl_allocate_f ocl_malloc,
dnnl_graph_ocl_deallocate_f ocl_free
);
dnnl_status_t DNNL_API dnnl_graph_ocl_interop_make_engine_with_allocator(
dnnl_engine_t* engine,
cl_device_id device,
cl_context context,
const_dnnl_graph_allocator_t alloc
);
dnnl_status_t DNNL_API dnnl_graph_ocl_interop_make_engine_from_cache_blob_with_allocator(
dnnl_engine_t* engine,
cl_device_id device,
cl_context context,
const_dnnl_graph_allocator_t alloc,
size_t size,
const uint8_t* cache_blob
);
dnnl_status_t DNNL_API dnnl_graph_ocl_interop_compiled_partition_execute(
const_dnnl_graph_compiled_partition_t compiled_partition,
dnnl_stream_t stream,
size_t num_inputs,
const_dnnl_graph_tensor_t* inputs,
size_t num_outputs,
const_dnnl_graph_tensor_t* outputs,
const cl_event* deps,
int ndeps,
cl_event* return_event
);
Detailed Documentation
API extensions to interact with the underlying OpenCL run-time.
Typedefs
typedef void* (*dnnl_graph_ocl_allocate_f)(
size_t size,
size_t alignment,
cl_device_id device,
cl_context context
)
Allocation call-back function interface for OpenCL.
Use an OpenCL allocator with the OpenCL GPU runtime. The call-back should return a pointer to USM device memory.
Parameters:
size |
Memory size in bytes for requested allocation |
alignment |
The minimum alignment in bytes for the requested allocation |
device |
A valid OpenCL device used to allocate |
context |
A valid OpenCL context used to allocate |
Returns:
The memory address of the requested USM allocation.
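The alignment contract of this call-back can be sketched as follows. This is a minimal sketch under stated assumptions, not the library's implementation: the names sketch_ocl_malloc and sketch_ocl_release are hypothetical, the handle types are declared locally in the same form as in CL/cl.h so that the block compiles without OpenCL headers, and host memory from malloc stands in for USM device memory. A real call-back would more likely obtain device memory through the Intel USM extension (for example clDeviceMemAllocINTEL) using the device and context it receives.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Opaque handle types, declared as in <CL/cl.h> so the sketch compiles alone. */
typedef struct _cl_device_id *cl_device_id;
typedef struct _cl_context *cl_context;

/* Hypothetical call-back with the dnnl_graph_ocl_allocate_f signature.
 * ASSUMPTION: host memory stands in for USM device memory. */
void *sketch_ocl_malloc(size_t size, size_t alignment,
        cl_device_id device, cl_context context) {
    (void)device;  /* a real call-back would pass these to the USM allocator */
    (void)context;
    if (size == 0) return NULL;
    if (alignment < sizeof(void *)) alignment = sizeof(void *);
    /* The mask arithmetic below requires a power-of-two alignment. */
    assert((alignment & (alignment - 1)) == 0);

    /* Over-allocate: alignment slack plus one slot for the raw pointer. */
    void *raw = malloc(size + alignment + sizeof(void *));
    if (raw == NULL) return NULL;
    uintptr_t start = (uintptr_t)raw + sizeof(void *);
    uintptr_t aligned = (start + alignment - 1) & ~(uintptr_t)(alignment - 1);
    ((void **)aligned)[-1] = raw;  /* remembered for the later free() */
    return (void *)aligned;
}

/* Releases memory obtained from sketch_ocl_malloc. */
void sketch_ocl_release(void *buf) {
    if (buf != NULL) free(((void **)buf)[-1]);
}
```

The returned address is a multiple of the requested alignment, which is the property the alignment parameter asks for. The raw pointer is stored just before the aligned block so that the matching release can hand the original address back to free.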
typedef void (*dnnl_graph_ocl_deallocate_f)(
void *buf,
cl_device_id device,
cl_context context,
cl_event event
)
Deallocation call-back function interface for OpenCL.
Use an OpenCL allocator with the OpenCL runtime. The call-back should deallocate USM device memory returned by dnnl_graph_ocl_allocate_f. The event must complete before the USM allocation is released.
Parameters:
buf |
The USM allocation to be released |
device |
A valid OpenCL device that the USM allocation is associated with |
context |
A valid OpenCL context used to free the USM allocation |
event |
An event that the USM deallocation depends on |
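The ordering requirement (wait for the event, then release the buffer) can be sketched as follows. This is an illustration under stated assumptions rather than the library's behavior: sketch_ocl_free, sketch_wait, and sketch_frees are hypothetical names, the cl_event structure is a mock with a single flag (the real type is opaque), and host free stands in for a USM release. A real call-back would more likely block with clWaitForEvents and release the memory through the Intel USM extension (for example clMemBlockingFreeINTEL). Tolerating a null event is a defensive assumption, not something this page specifies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Opaque handle types, declared as in <CL/cl.h> so the sketch compiles alone. */
typedef struct _cl_device_id *cl_device_id;
typedef struct _cl_context *cl_context;

/* MOCK: the real cl_event is opaque; this layout exists only for the sketch. */
struct _cl_event { int complete; };
typedef struct _cl_event *cl_event;

/* Hypothetical stand-in for a blocking wait such as clWaitForEvents. */
void sketch_wait(cl_event event) { event->complete = 1; }

/* Counts releases so the behavior can be observed in a test. */
int sketch_frees = 0;

/* Hypothetical call-back with the dnnl_graph_ocl_deallocate_f signature.
 * ASSUMPTION: buf came from malloc, standing in for a USM allocation. */
void sketch_ocl_free(void *buf, cl_device_id device, cl_context context,
        cl_event event) {
    (void)device;  /* a real call-back would pass these to the USM release */
    (void)context;
    if (buf == NULL) return;
    /* Work that uses buf may still be in flight: wait before releasing. */
    if (event != NULL) sketch_wait(event);
    free(buf);
    sketch_frees++;
}
```

Waiting inside the call-back keeps the sketch simple. An implementation that must not block could instead defer the release until the event completes, at the cost of tracking pending buffers.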
Global Functions
dnnl_status_t DNNL_API dnnl_graph_ocl_interop_allocator_create(
dnnl_graph_allocator_t* allocator,
dnnl_graph_ocl_allocate_f ocl_malloc,
dnnl_graph_ocl_deallocate_f ocl_free
)
Creates an allocator with the given allocation and deallocation call-back function pointers.
Parameters:
allocator |
Output allocator |
ocl_malloc |
Pointer to the OpenCL allocation call-back (dnnl_graph_ocl_allocate_f) |
ocl_free |
Pointer to the OpenCL deallocation call-back (dnnl_graph_ocl_deallocate_f) |
Returns:
dnnl_success on success and a status describing the error otherwise.
dnnl_status_t DNNL_API dnnl_graph_ocl_interop_make_engine_with_allocator(
dnnl_engine_t* engine,
cl_device_id device,
cl_context context,
const_dnnl_graph_allocator_t alloc
)
Creates an engine for the given OpenCL device and context that uses the given allocator. This API supplements the existing oneDNN engine API: dnnl_status_t DNNL_API dnnl_ocl_interop_engine_create(dnnl_engine_t *engine, cl_device_id device, cl_context context);
Parameters:
engine |
Output engine. |
device |
Underlying OpenCL device to use for the engine. |
context |
Underlying OpenCL context to use for the engine. |
alloc |
Underlying allocator to use for the engine. |
Returns:
dnnl_success on success and a status describing the error otherwise.
dnnl_status_t DNNL_API dnnl_graph_ocl_interop_make_engine_from_cache_blob_with_allocator(
dnnl_engine_t* engine,
cl_device_id device,
cl_context context,
const_dnnl_graph_allocator_t alloc,
size_t size,
const uint8_t* cache_blob
)
Creates an engine from a cache blob for the given OpenCL device and context that uses the given allocator. This API supplements the existing oneDNN engine API: dnnl_status_t DNNL_API dnnl_ocl_interop_engine_create_from_cache_blob(dnnl_engine_t *engine, cl_device_id device, cl_context context, size_t size, const uint8_t *cache_blob);
Parameters:
engine |
Output engine. |
device |
The OpenCL device that this engine will encapsulate. |
context |
The OpenCL context (containing the device) that this engine will use for all operations. |
alloc |
Underlying allocator to use for the engine. |
size |
Size of the cache blob in bytes. |
cache_blob |
Cache blob of size size. |
Returns:
dnnl_success on success and a status describing the error otherwise.
dnnl_status_t DNNL_API dnnl_graph_ocl_interop_compiled_partition_execute(
const_dnnl_graph_compiled_partition_t compiled_partition,
dnnl_stream_t stream,
size_t num_inputs,
const_dnnl_graph_tensor_t* inputs,
size_t num_outputs,
const_dnnl_graph_tensor_t* outputs,
const cl_event* deps,
int ndeps,
cl_event* return_event
)
Executes a compiled partition with the OpenCL runtime.
Parameters:
compiled_partition |
Handle of the target compiled partition. |
stream |
The stream used for execution |
num_inputs |
The number of input tensors |
inputs |
A list of input tensors |
num_outputs |
The number of output tensors |
outputs |
A non-empty list of output tensors |
deps |
Optional pointer to a list of cl_event dependencies. |
ndeps |
Number of dependencies. |
return_event |
Output handle of the cl_event for this execution. |
Returns:
dnnl_success on success and a status describing the error otherwise.