Graph Operations
The graph API provides optimized kernels for the following computationally intense routines:
Routine | Description |
---|---|
mkl_graph_mxv | Compute a (masked) matrix-vector product |
mkl_graph_vxm | Compute a (masked) vector-matrix product |
mkl_graph_mxm | Compute a (masked) matrix-matrix product |
mkl_graph_transpose | Compute a (masked) transpose of a matrix |
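
To make the term "masked" concrete, here is a minimal plain-C sketch of a masked matrix-vector product on a CSR matrix. It does not call the graph API. The function name `csr_masked_mxv`, the dense byte mask, and the use of ordinary plus-times arithmetic are all assumptions made for illustration; the exact mask semantics of the mkl_graph routines are described in their own reference pages, not here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of the graph API.
 * Computes y[i] = sum_j A(i,j) * x[j] for every row i that the mask allows.
 * A is given in zero-based CSR form. A NULL mask means "compute every row".
 * Rows excluded by the mask keep whatever value y[i] already held. */
void csr_masked_mxv(int64_t nrows,
                    const int64_t *rows_start,
                    const int64_t *col_indx,
                    const double *values,
                    const double *x,
                    const unsigned char *mask,
                    double *y)
{
    assert(rows_start != NULL && x != NULL && y != NULL);

    for (int64_t i = 0; i < nrows; ++i) {
        if (mask != NULL && mask[i] == 0) {
            continue; /* masked out: skip the work and leave y[i] as is */
        }
        double sum = 0.0;
        for (int64_t p = rows_start[i]; p < rows_start[i + 1]; ++p) {
            sum += values[p] * x[col_indx[p]];
        }
        y[i] = sum;
    }
}
```

In this sketch the mask lets the caller skip the work for rows that are not needed, which is the usual motivation for masked operations in graph algorithms.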
Graph operations (except mkl_graph_transpose) support the following modes:
Single-stage mode. Single-stage execution computes the output object in a single call to a graph operation with an appropriate value for the parameter of type mkl_graph_request_t. See Graph API Glossary for a list of all possible options.
If the output object is sparse and the size of the corresponding arrays is likely not known in advance, the memory for the output object will be allocated inside the graph operation and can be deallocated only by calling an appropriate mkl_graph_<object>_destroy routine. To allocate all memory for the output on the user’s side, use multistage execution instead.
Multistage mode. Multistage execution constructs the output object over several calls to a graph operation, with each call requesting a specific stage. Unlike the single-stage mode, multistage execution allows you to allocate all memory for the output object. Only temporary memory will be allocated internally inside the graph routine. You must pass pointers to the allocations by calling an mkl_graph_<object>_set_<format> routine before each stage. These calls also specify the format of the final output object. The stage is specified through the parameter of type mkl_graph_request_t. See Graph API Glossary for a list of all possible options.
To get the best-performing format for the output, you can specify the method used for the computations through the parameter of type mkl_graph_method_t. For each graph operation that supports it, the preferred output format is described for a given configuration of input arguments. If you specify a format that the graph operation does not consider the best, your specified format will still be used internally.
As an example, consider computing a non-masked matrix-matrix product using mkl_graph_mxm in multistage mode, with the output in CSR format (the preferred choice when both input matrices are also in CSR and the Gustavson algorithm is set as the method). The workflow, shown in pseudo-code, is as follows:

    // Prepare the input matrices A and B.

    // Create an empty matrix object for the output.
    mkl_graph_matrix_create(&C)

    // Allocate a rows_start buffer of chosen type for the output.
    // Set the user-allocated rows_start in the output matrix object.
    mkl_graph_matrix_set_csr(C, nrows, ncols, rows_start, rows_start_type, NULL, …)

    // Fill rows_start for the output.
    mkl_graph_mxm(C,…, A, B, …, MKL_GRAPH_REQUEST_FILL_NNZ, …)

    // Use rows_start to deduce the number of nonzero entries nnz.
    // Allocate buffers for the column indices and values to hold nnz entries
    // of the desired types.
    // Set the allocated buffers for column indices and values in the output matrix object.
    mkl_graph_matrix_set_csr(C, …, col_indx, col_indx_type, values, values_type)

    // Fill buffers col_indx and values with calculated column indices and values.
    mkl_graph_mxm(C, …, A, B, …, MKL_GRAPH_REQUEST_FILL_ENTRIES, …)

For full working code using multistage mode, refer to graphc_mxm_multistage.c in the examples for graph functionality.
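
To show what the two stages compute, the following plain-C sketch splits a Gustavson-style CSR matrix-matrix product into a counting pass and a filling pass. It is an illustration under stated assumptions, not the library's implementation and not a use of the graph API: the type `csr_t` and the functions `csr_mxm_fill_nnz` and `csr_mxm_fill_entries` are hypothetical names, indexing is zero-based, values are `double`, and ordinary plus-times arithmetic is used.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical zero-based CSR container (not a graph API type). */
typedef struct {
    int64_t nrows, ncols;
    int64_t *rows_start; /* nrows + 1 offsets */
    int64_t *col_indx;   /* nnz column indices */
    double *values;      /* nnz values */
} csr_t;

/* Stage 1: fill the caller-allocated C->rows_start for C = A * B.
 * Afterwards C->rows_start[C->nrows] holds the number of entries of C. */
int csr_mxm_fill_nnz(const csr_t *A, const csr_t *B, csr_t *C)
{
    assert(A->ncols == B->nrows);
    size_t n = (size_t)(B->ncols > 0 ? B->ncols : 1);
    /* marker[j] == i means column j was already counted for row i. */
    int64_t *marker = malloc(n * sizeof *marker);
    if (marker == NULL) return -1;
    for (size_t j = 0; j < n; ++j) marker[j] = -1;

    C->rows_start[0] = 0;
    for (int64_t i = 0; i < A->nrows; ++i) {
        int64_t count = 0;
        for (int64_t p = A->rows_start[i]; p < A->rows_start[i + 1]; ++p) {
            int64_t k = A->col_indx[p];
            for (int64_t q = B->rows_start[k]; q < B->rows_start[k + 1]; ++q) {
                int64_t j = B->col_indx[q];
                if (marker[j] != i) { marker[j] = i; ++count; }
            }
        }
        C->rows_start[i + 1] = C->rows_start[i] + count;
    }
    free(marker);
    return 0;
}

/* Stage 2: fill the caller-allocated C->col_indx and C->values.
 * Requires C->rows_start from stage 1. Column indices within a row
 * appear in order of discovery, so they are not sorted. */
int csr_mxm_fill_entries(const csr_t *A, const csr_t *B, csr_t *C)
{
    assert(A->ncols == B->nrows);
    size_t n = (size_t)(B->ncols > 0 ? B->ncols : 1);
    /* pos[j] is the slot of column j in C; it belongs to the current
     * row only if it is not below that row's first slot. */
    int64_t *pos = malloc(n * sizeof *pos);
    if (pos == NULL) return -1;
    for (size_t j = 0; j < n; ++j) pos[j] = -1;

    for (int64_t i = 0; i < A->nrows; ++i) {
        int64_t row_begin = C->rows_start[i];
        int64_t next = row_begin;
        for (int64_t p = A->rows_start[i]; p < A->rows_start[i + 1]; ++p) {
            int64_t k = A->col_indx[p];
            double a = A->values[p];
            for (int64_t q = B->rows_start[k]; q < B->rows_start[k + 1]; ++q) {
                int64_t j = B->col_indx[q];
                if (pos[j] < row_begin) {          /* first hit in this row */
                    pos[j] = next;
                    C->col_indx[next] = j;
                    C->values[next] = a * B->values[q];
                    ++next;
                } else {                           /* accumulate */
                    C->values[pos[j]] += a * B->values[q];
                }
            }
        }
    }
    free(pos);
    return 0;
}
```

A caller would allocate `rows_start` with `nrows + 1` entries, run the first function, read the entry count from `rows_start[nrows]`, allocate `col_indx` and `values` of that size, and then run the second function. This mirrors the order of the workflow above, where all output buffers are owned by the caller and only scratch memory is allocated inside the routines.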
- mkl_graph_mxv
  Computes a (masked) graph matrix-vector product.
- mkl_graph_vxm
  Computes a (masked) graph vector-matrix product.
- mkl_graph_mxm
  Computes a (masked) graph matrix-matrix product.
- mkl_graph_transpose
  Computes a (masked) transpose of a graph matrix.