Configuring Data Layouts
The DPC++ interface provides the configuration parameter oneapi::mkl::dft::config_param::FWD_STRIDES (resp. oneapi::mkl::dft::config_param::BWD_STRIDES) to define the data layout locating entries (or parts thereof) of relevant data sequences in the forward (resp. backward) domain. In case of batched transforms, i.e., if the configuration value for oneapi::mkl::dft::config_param::NUMBER_OF_TRANSFORMS is set to an integer larger than 1, the value set for configuration parameter oneapi::mkl::dft::config_param::FWD_DISTANCE (resp. oneapi::mkl::dft::config_param::BWD_DISTANCE) completes the description of the data layout by specifying the distances between successive data sequences in the forward (resp. backward) domain.
This topic leverages the general notations from the introduction, and uses the superscript $\text{fwd}$ (resp. $\text{bwd}$) for data sequences in forward (resp. backward) domain. A placeholder label $\text{y}$ is also used to capture a possible distinction between the real (if $\text{y}$ is $\text{re}$) and imaginary (if $\text{y}$ is $\text{im}$) parts of a complex data entry; naturally, that placeholder label is relevant only for data layouts that distinguish real and imaginary parts of complex data entries.
A non-redundant entry $(\cdot)^{m}_{k_1, k_2, \ldots, k_d}$ (or its real or imaginary part, if relevant) is stored at index $I^{\text{xwd},\text{y}}_{m, k_1, k_2, \ldots, k_d}$ of the appropriate data container (sycl::buffer object or device-accessible USM allocation) provided to the compute function, the base data type of which is (possibly implicitly re-interpreted as) documented in the table below. That index value is defined as
$$I^{\text{xwd},\text{y}}_{m, k_1, k_2, \ldots, k_d} = s^{\text{xwd}}_0 + \sum_{j=1}^{d-1} k_j\, s^{\text{xwd}}_j + \varphi^{\text{y}}(k_d)\, s^{\text{xwd}}_d + m\, l^{\text{xwd}},$$
wherein
$s^{\text{xwd}}_j$, $\forall j \in \{0, \ldots, d\}$, represents the offset and generalized strides defining the locations of relevant values within each $d$-dimensional data sequence in the forward (resp. backward) domain if $\text{x} = \text{f}$ (resp. if $\text{x} = \text{b}$), counted in number of elements of the relevant implicitly-assumed elementary data type;
$l^{\text{xwd}}$ represents the distance between successive $d$-dimensional data sequences in the forward (resp. backward) domain if $\text{x} = \text{f}$ (resp. if $\text{x} = \text{b}$), counted in number of elements of the relevant implicitly-assumed elementary data type;
the relation $\varphi^{\text{y}}$ simplifies into the identity in all recommended use cases or if $d > 1$, i.e., $\text{y}$ is either irrelevant or unused in such cases. However, for some one-dimensional real descriptors using unrecommended configurations, the real and imaginary parts of entry $(\cdot)^{m}_{k_1}$ in backward domain are to be considered separately from one another and the corresponding indices are denoted by $I^{\text{bwd},\text{re}}_{m, k_1}$ and $I^{\text{bwd},\text{im}}_{m, k_1}$, respectively.
In this page, it is assumed that only non-redundant data sequence entries are of interest, i.e., that $0 \leq m < M$ and $0 \leq k_j < n_j$ for $j \in \{1, \ldots, d-1\}$, and that $0 \leq k_d \leq \lfloor n_d/2 \rfloor$ (resp. $0 \leq k_d < n_d$) for entries that do (resp. do not) belong to the backward domain of a real DFT.
Note that all elements accessed as a value stored at index $I^{\text{xwd},\text{y}}_{m, k_1, k_2, \ldots, k_d}$ of a given user-provided data container must belong to the same block allocation.
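As a minimal sketch of the index definition in the common case where real and imaginary parts are not indexed separately, the index is the offset, plus the stride-weighted sum of the indices, plus the batch index times the distance. The helper name `layout_index` and its argument conventions are illustrative assumptions and are not part of the oneMKL API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a oneMKL function): index of entry (m, k_1, ..., k_d)
// for a layout described by strides = {s_0, s_1, ..., s_d} and a batch distance,
// assuming real and imaginary parts are not indexed separately.
inline std::int64_t layout_index(const std::vector<std::int64_t>& strides,
                                 std::int64_t distance,
                                 std::int64_t m,
                                 const std::vector<std::int64_t>& k) {
    assert(strides.size() == k.size() + 1);  // d + 1 stride values for d indices
    std::int64_t index = strides[0] + m * distance;  // offset s_0 plus batch shift
    for (std::size_t j = 0; j < k.size(); ++j) {
        index += k[j] * strides[j + 1];  // contribution k_{j+1} * s_{j+1}
    }
    return index;
}
```

For instance, with strides {0, 5, 1}, a distance of 20, m = 2 and (k_1, k_2) = (3, 4), the helper evaluates 0 + 3*5 + 4*1 + 2*20 = 59.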
Implicitly-assumed elementary data type
When reading or writing an element at a given index of any user-provided data container used at compute time, a descriptor object may first re-interpret the base data type of that data container into an implicitly-assumed elementary data type. That implicitly-assumed data type depends on the object type (that is, on the specialization values used for the template parameters when instantiating the descriptor class) and on other configuration value(s). The table below lists the implicitly-assumed data type in either domain (last 2 columns) based on the object type and its configuration value(s).
Type of descriptor and relevant configuration values | Implicitly-assumed elementary data type in forward domain | Implicitly-assumed elementary data type in backward domain |
---|---|---|
Complex descriptor with DFTI_COMPLEX_COMPLEX set for config_param::COMPLEX_STORAGE | std::complex<fp_type> | std::complex<fp_type> |
Complex descriptor with DFTI_REAL_REAL set for config_param::COMPLEX_STORAGE | fp_type | fp_type |
Real descriptor with DFTI_COMPLEX_COMPLEX set for config_param::CONJUGATE_EVEN_STORAGE | fp_type | std::complex<fp_type> |
Real descriptor with DFTI_COMPLEX_REAL set for config_param::CONJUGATE_EVEN_STORAGE | fp_type | fp_type |
Descriptors that implicitly assume an elementary data type of float or double (resp. std::complex<float> or std::complex<double>) in a domain are referred to as “descriptors expecting real (resp. complex) data” in that domain.
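Because strides and distances are counted in elements of the implicitly-assumed data type, the same memory can be indexed in complex elements or in real elements. The sketch below only illustrates that element counting with standard C++; it does not describe how a descriptor performs its re-interpretation internally. The helper name `real_element` is a hypothetical choice.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a oneMKL function): read complex storage as real elements.
// The C++ standard guarantees that an array of std::complex<T> can be accessed as an
// array of T, with the real part of entry i at position 2*i and the imaginary part
// at position 2*i + 1.
inline float real_element(const std::vector<std::complex<float>>& data,
                          std::size_t real_index) {
    assert(real_index < 2 * data.size());  // two real elements per complex element
    const float* as_real = reinterpret_cast<const float*>(data.data());
    return as_real[real_index];
}
```

An entry located at index 1 when counting in std::complex<float> elements therefore starts at index 2 when counting in float elements.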
Configuring strides in forward and backward domains
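The values $s^{\text{xwd}}_j$, $j \in \{0, \ldots, d\}$, defining the layout within a data sequence are to be communicated as $d+1$ contiguous std::int64_t elements of an array, passed as the configuration value for oneapi::mkl::dft::config_param::FWD_STRIDES if $\text{x} = \text{f}$ (resp. oneapi::mkl::dft::config_param::BWD_STRIDES if $\text{x} = \text{b}$). The element $s^{\text{xwd}}_0$ represents an absolute offset (or “displacement”) in the data sets while the subsequent elements $s^{\text{xwd}}_j$ ($j > 0$) are generalized strides to be considered along dimensions $j \in \{1, \ldots, d\}$.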
When created, the descriptors are default-configured for unbatched, in-place transforms using a unit stride along the last dimension, no offset and the default configuration settings documented in the above table. For real descriptors, minimal padding is used in forward domain, aligning with the data layout requirements for in-place transforms.
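In other words, with $s_0$ denoting the offset, $s_j$ the stride along dimension $j$ and $n_j$ the length along dimension $j$, the default stride values are $s^{\text{fwd}}_0 = s^{\text{bwd}}_0 = 0$, $s^{\text{fwd}}_d = s^{\text{bwd}}_d = 1$ and, for $d$-dimensional transforms with $d > 1$,
$s^{\text{fwd}}_{d-1} = s^{\text{bwd}}_{d-1} = n_d$ for complex descriptors;
$s^{\text{bwd}}_{d-1} = \lfloor n_d/2 \rfloor + 1$, and $s^{\text{fwd}}_{d-1} = 2 s^{\text{bwd}}_{d-1}$ for real descriptors;
if $d > 2$, $s^{\text{xwd}}_{k} = n_{k+1}\, s^{\text{xwd}}_{k+1}$ for $k \in \{1, \ldots, d-2\}$ (for $\text{x} = \text{f}$ and $\text{x} = \text{b}$).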
The usage of these default strides for unbatched, in-place transforms is illustrated in the usage examples.
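The default stride values described above can be sketched as a small standalone computation: zero offset, unit stride along the last dimension, a next-to-last stride that depends on the descriptor type, and each remaining stride obtained by multiplying the next stride by the next length. The names `DefaultStrides` and `default_strides` are hypothetical and only serve this illustration; a descriptor sets these defaults itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container for the two stride arrays {s_0, s_1, ..., s_d}.
struct DefaultStrides {
    std::vector<std::int64_t> fwd;
    std::vector<std::int64_t> bwd;
};

// Hypothetical helper (not a oneMKL function): default strides for lengths {n_1, ..., n_d}.
inline DefaultStrides default_strides(const std::vector<std::int64_t>& lengths,
                                      bool real_descriptor) {
    const std::size_t d = lengths.size();
    assert(d >= 1);
    DefaultStrides s{std::vector<std::int64_t>(d + 1, 0),
                     std::vector<std::int64_t>(d + 1, 0)};
    s.fwd[d] = 1;  // unit stride along the last dimension; the offset s_0 stays 0
    s.bwd[d] = 1;
    if (d > 1) {
        if (real_descriptor) {
            s.bwd[d - 1] = lengths[d - 1] / 2 + 1;  // floor(n_d / 2) + 1
            s.fwd[d - 1] = 2 * s.bwd[d - 1];        // minimal padding for in-place use
        } else {
            s.fwd[d - 1] = lengths[d - 1];          // n_d
            s.bwd[d - 1] = lengths[d - 1];
        }
    }
    if (d > 2) {
        // s_k = n_{k+1} * s_{k+1} for k = d-2 down to 1 (n_{k+1} is lengths[k]).
        for (std::size_t k = d - 2; k >= 1; --k) {
            s.fwd[k] = lengths[k] * s.fwd[k + 1];
            s.bwd[k] = lengths[k] * s.bwd[k + 1];
        }
    }
    return s;
}
```

For a real descriptor of lengths 4 x 6 x 10, this gives backward strides {0, 36, 6, 1} and forward strides {0, 72, 12, 1}; for a complex descriptor of the same lengths, both domains use {0, 60, 10, 1}.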
Configuring batched transforms
The value $l^{\text{xwd}}$ completing the definition of the data layout is to be set as an std::int64_t configuration value for oneapi::mkl::dft::config_param::FWD_DISTANCE if $\text{x} = \text{f}$ (resp. oneapi::mkl::dft::config_param::BWD_DISTANCE if $\text{x} = \text{b}$). This value is irrelevant for unbatched transforms, i.e., for descriptors set to handle a number of transforms $M$ equal to $1$ (default behavior).
In case of batched transforms, the desired number of DFTs $M > 1$ must be set explicitly as an std::int64_t configuration value for oneapi::mkl::dft::config_param::NUMBER_OF_TRANSFORMS. In that case, oneapi::mkl::dft::config_param::FWD_DISTANCE and oneapi::mkl::dft::config_param::BWD_DISTANCE must also be set explicitly since their default configuration values of $0$ would break the data layout requirements for any $M > 1$.
The configuration of batched transforms is illustrated in the usage examples.
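As one possible choice of distances, the sketch below considers batched one-dimensional transforms with unit strides and data sequences stored one right after the other. For real descriptors, the forward distance is taken as twice the backward distance, which suits in-place transforms; that choice, like the names `BatchDistances` and `packed_1d_distances`, is an assumption of this example and not a oneMKL requirement for out-of-place use.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical pair of distances, each counted in elements of the
// implicitly-assumed data type of its domain.
struct BatchDistances {
    std::int64_t fwd;
    std::int64_t bwd;
};

// Hypothetical helper (not a oneMKL function): distances for back-to-back
// one-dimensional sequences of length n stored with unit strides.
inline BatchDistances packed_1d_distances(std::int64_t n, bool real_descriptor) {
    assert(n > 0);
    if (!real_descriptor) {
        return {n, n};  // n complex elements per sequence in both domains
    }
    const std::int64_t bwd = n / 2 + 1;  // non-redundant complex entries per sequence
    return {2 * bwd, bwd};               // forward distance in real elements, padded
}
```

For n = 10, a real descriptor would use a backward distance of 6 complex elements and a forward distance of 12 real elements, while a complex descriptor would use 10 in both domains.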
About $\varphi^{\text{y}}$
All complex descriptors and all real descriptors expecting complex data in backward domain use the straightforward identity relation $\varphi^{\text{y}}(k_d) = k_d$, i.e., the label $\text{y}$ distinguishing real and imaginary parts is irrelevant in that case. Every default behavior and recommended usage falls into this category; the reader is advised to consult the usage examples for more details and illustrations about the resulting layouts and default (or otherwise recommended) strides and distances.
For real descriptors expecting real data in backward domain (not recommended usage, supported for 1D real DFTs on CPU only), the relation $\varphi^{\text{y}}$ takes a more intricate form. In backward domain, such descriptors expect real data in the sense that the real and imaginary parts of the data sequence entries are not necessarily stored contiguously in memory (or not even stored at all). The specific form of $\varphi^{\text{y}}$ depends on the value set for oneapi::mkl::dft::config_param::PACKED_FORMAT. For real descriptors expecting real data in backward domain, three different values (documented below) are possible for that configuration parameter: DFTI_CCS_FORMAT, DFTI_PACK_FORMAT and DFTI_PERM_FORMAT. Given the limited support for 1D transforms on CPUs, $d = 1$ is used in the rest of this section to simplify the presentation. Illustrations are also given for unbatched cases; that is, $M = 1$, so the then-superfluous batch index $m$ is omitted in this section’s illustrative tables, too.
DFTI_CCS_FORMAT value set for oneapi::mkl::dft::config_param::PACKED_FORMAT
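If the DFTI_CCS_FORMAT configuration value is used, then the positions $\varphi^{\text{re}}(k_1)$ and $\varphi^{\text{im}}(k_1)$ of the real and imaginary parts of entry $k_1$ are
$\varphi^{\text{re}}(k_1) = 2 k_1$;
$\varphi^{\text{im}}(k_1) = 2 k_1 + 1$.
Given that all non-redundant entries in backward domain are captured by $0 \leq k_1 \leq \lfloor n_1/2 \rfloor$, the range of relevant values for $\varphi^{\text{y}}(k_1)$ is $[0, n_1 + 1]$ (resp. $[0, n_1]$) if $n_1$ is even (resp. odd), in this case.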
[Table: value stored at each index of the backward-domain data container with DFTI_CCS_FORMAT; parameters and contents not recoverable.]
DFTI_PACK_FORMAT value set for oneapi::mkl::dft::config_param::PACKED_FORMAT
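If the DFTI_PACK_FORMAT configuration value is used, then the positions $\varphi^{\text{re}}(k_1)$ and $\varphi^{\text{im}}(k_1)$ of the real and imaginary parts of entry $k_1$ are such that
$\varphi^{\text{re}}(0) = 0$;
$\varphi^{\text{im}}(0)$ does not exist ($0$-valued imaginary parts are not stored explicitly);
$\varphi^{\text{re}}(k_1) = 2 k_1 - 1$ for any $k_1 \in \{1, \ldots, \lfloor n_1/2 \rfloor\}$;
$\varphi^{\text{im}}(k_1) = 2 k_1$ for any $0 < k_1 < \lfloor n_1/2 \rfloor$. $\varphi^{\text{im}}(k_1) = 2 k_1$ also holds for $k_1 = \lfloor n_1/2 \rfloor$ if $n_1$ is odd; $\varphi^{\text{im}}(n_1/2)$ does not exist if $n_1$ is even ($0$-valued imaginary parts are not stored explicitly).
Given that all non-redundant entries in backward domain are captured by $0 \leq k_1 \leq \lfloor n_1/2 \rfloor$, the range of relevant values for $\varphi^{\text{y}}(k_1)$ is $[0, n_1 - 1]$ in this case (regardless of whether $n_1$ is even or odd).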
[Tables: value stored at each index of the backward-domain data container with DFTI_PACK_FORMAT; parameters and contents not recoverable.]
DFTI_PERM_FORMAT value set for oneapi::mkl::dft::config_param::PACKED_FORMAT
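If the DFTI_PERM_FORMAT configuration value is used, the relation giving the positions $\varphi^{\text{re}}(k_1)$ and $\varphi^{\text{im}}(k_1)$ of the real and imaginary parts of entry $k_1$ differs depending on whether $n_1$ is even or odd.
If $n_1$ is even, then
$\varphi^{\text{re}}(0) = 0$ and $\varphi^{\text{im}}(0)$ does not exist ($0$-valued imaginary parts are not stored explicitly);
$\varphi^{\text{re}}(n_1/2) = 1$ and $\varphi^{\text{im}}(n_1/2)$ does not exist ($0$-valued imaginary parts are not stored explicitly);
$\varphi^{\text{re}}(k_1) = 2 k_1$ for any $0 < k_1 < n_1/2$;
$\varphi^{\text{im}}(k_1) = 2 k_1 + 1$ for any $0 < k_1 < n_1/2$.
If $n_1$ is odd, then (this format is equivalent to DFTI_PACK_FORMAT if $n_1$ is odd)
$\varphi^{\text{re}}(0) = 0$ and $\varphi^{\text{im}}(0)$ does not exist ($0$-valued imaginary parts are not stored explicitly);
$\varphi^{\text{re}}(k_1) = 2 k_1 - 1$ for any $0 < k_1 \leq \lfloor n_1/2 \rfloor$;
$\varphi^{\text{im}}(k_1) = 2 k_1$ for any $0 < k_1 \leq \lfloor n_1/2 \rfloor$.
Given that all non-redundant entries in backward domain are captured by $0 \leq k_1 \leq \lfloor n_1/2 \rfloor$, the range of relevant values for $\varphi^{\text{y}}(k_1)$ is $[0, n_1 - 1]$ in this case (regardless of whether $n_1$ is even or odd).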
[Tables: value stored at each index of the backward-domain data container with DFTI_PERM_FORMAT; parameters and contents not recoverable.]
The configuration value for oneapi::mkl::dft::config_param::PACKED_FORMAT must be set explicitly (to either DFTI_CCS_FORMAT, DFTI_PACK_FORMAT or DFTI_PERM_FORMAT) for real descriptors expecting real data in backward domain, as it further specifies the descriptor’s behavior in that case (see explanations above). Real descriptors expecting real data in backward domain are supported for 1D real DFTs on CPU only.
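The three packed formats can be compared with a short sketch that returns, for a one-dimensional real transform of length n, the position of the real or imaginary part of backward-domain entry k, counted in real elements for a zero offset and a unit stride. The positions follow the conventional CCS, PACK and PERM packed layouts; the enumerations and the function `packed_position` are hypothetical stand-ins and are not oneMKL identifiers.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical stand-ins for DFTI_CCS_FORMAT, DFTI_PACK_FORMAT and DFTI_PERM_FORMAT.
enum class PackedFormat { ccs, pack, perm };
enum class Part { re, im };

// Hypothetical helper (not a oneMKL function): position of the requested part of
// entry k (0 <= k <= n/2), or std::nullopt if that part is not stored explicitly.
inline std::optional<std::int64_t> packed_position(PackedFormat format, std::int64_t n,
                                                   std::int64_t k, Part part) {
    assert(n > 0 && k >= 0 && k <= n / 2);
    const std::int64_t im = (part == Part::im) ? 1 : 0;
    if (format == PackedFormat::ccs) {
        return 2 * k + im;  // every real and imaginary part is stored
    }
    const bool last_of_even = (n % 2 == 0) && (k == n / 2);
    if (im == 1 && (k == 0 || last_of_even)) {
        return std::nullopt;  // zero-valued imaginary part, not stored
    }
    if (k == 0) {
        return 0;  // real part of the first entry
    }
    if (format == PackedFormat::perm && n % 2 == 0) {
        if (last_of_even) {
            return 1;  // real part of entry n/2 is moved to position 1
        }
        return 2 * k + im;
    }
    return 2 * k - 1 + im;  // PACK, and PERM for odd n
}
```

With n = 6, the real part of entry 3 is found at position 6 with CCS, 5 with PACK and 1 with PERM, and its imaginary part is stored only with CCS (at position 7).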
Data layout requirements
In general, the distances and strides must be set so that every index value corresponds to a unique tuple $(m, k_1, k_2, \ldots, k_d)$ relevant to the data sequences under consideration. In other words, there must not be one index value corresponding to two different $(d+1)$-tuples $(m, k_1, k_2, \ldots, k_d)$ that would both be within relevant ranges.
Additionally, for in-place transforms (configuration value DFTI_INPLACE set for oneapi::mkl::dft::config_param::PLACEMENT), the following “consistency requirements” apply:
descriptors expecting the same data type in either domain (e.g., complex descriptors) must use the same offset, stride(s), and distance values in forward and backward domains;
for real descriptors expecting complex data in backward domain (default behavior for real descriptors), the memory address(es) of leading entry(ies) along the last dimension must be identical in forward and backward domains. Specifically, with $s_j$ denoting the offset ($j = 0$) or strides ($j > 0$) and $l$ the distance, that requirement translates into the conditions $s^{\text{fwd}}_j = 2 s^{\text{bwd}}_j, \forall j \in \{0, \ldots, d-1\}$ as well as, if $M > 1$, $l^{\text{fwd}} = 2 l^{\text{bwd}}$. Note that this requirement leads to some data padding to be used in forward domain if unit strides are used along dimension $d$ in forward and backward domains (recommended usage, as set by default).
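The uniqueness requirement can be checked by brute force on small cases: enumerate every tuple within range, compute its index, and report a collision if an index appears twice. The sketch below assumes real and imaginary parts are not indexed separately; `layout_is_injective` is a hypothetical name, and this check is only a debugging aid, not something oneMKL exposes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical helper (not a oneMKL function): true if every tuple (m, k_1, ..., k_d)
// with 0 <= m < batch and 0 <= k_j < extents[j-1] maps to a distinct index.
inline bool layout_is_injective(const std::vector<std::int64_t>& extents,
                                const std::vector<std::int64_t>& strides,  // {s_0, ..., s_d}
                                std::int64_t distance, std::int64_t batch) {
    assert(strides.size() == extents.size() + 1);
    std::set<std::int64_t> seen;
    std::vector<std::int64_t> k(extents.size(), 0);
    for (std::int64_t m = 0; m < batch; ++m) {
        std::fill(k.begin(), k.end(), 0);
        bool done = false;
        while (!done) {
            std::int64_t index = strides[0] + m * distance;
            for (std::size_t j = 0; j < k.size(); ++j) {
                index += k[j] * strides[j + 1];
            }
            if (!seen.insert(index).second) {
                return false;  // two different tuples share this index
            }
            // Advance the multi-index like an odometer, last dimension fastest.
            done = true;
            std::size_t j = k.size();
            while (j-- > 0) {
                if (++k[j] < extents[j]) {
                    done = false;
                    break;
                }
                k[j] = 0;
            }
        }
    }
    return true;
}
```

For two batched 3 x 4 sequences with strides {0, 4, 1}, a distance of 12 passes the check, whereas a distance of 0 makes both sequences overlap and fails it.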
Configuring strides for input and output data [deprecated]
Instead of specifying strides by domain, one may choose to specify the strides for input and output data sequences. Let $s^{\text{x}}_j$, $j \in \{0, \ldots, d\}$, be the stride values for input (resp. output) data sequences if $\text{x} = \text{i}$ (resp. $\text{x} = \text{o}$). Such values may be communicated as $d+1$ contiguous std::int64_t elements of an array, passed as the configuration value for oneapi::mkl::dft::config_param::INPUT_STRIDES if $\text{x} = \text{i}$ (resp. oneapi::mkl::dft::config_param::OUTPUT_STRIDES if $\text{x} = \text{o}$).
The values of $s^{\text{i}}_j$ and $s^{\text{o}}_j$ are to be used and considered by oneMKL if and only if $s^{\text{fwd}}_j = s^{\text{bwd}}_j = 0, \forall j \in \{0, \ldots, d\}$. This will happen automatically if oneapi::mkl::dft::config_param::INPUT_STRIDES and oneapi::mkl::dft::config_param::OUTPUT_STRIDES are set and oneapi::mkl::dft::config_param::FWD_STRIDES and oneapi::mkl::dft::config_param::BWD_STRIDES are not (see the note below). In such a case, descriptor objects must consider the data layouts corresponding to the two compute directions separately. As detailed above, relevant data sequence entries are accessed as elements of data containers (sycl::buffer objects or device-accessible USM allocations) provided to the compute function, the base data type of which is (possibly implicitly re-interpreted as) documented in the table above. If using input and output strides, the index to be used when accessing a data sequence entry (or part thereof) in forward domain is
$$s^{\text{x}}_0 + k_1 s^{\text{x}}_1 + k_2 s^{\text{x}}_2 + \cdots + k_d s^{\text{x}}_d + m\, l^{\text{fwd}},$$
where $\text{x} = \text{i}$ (resp. $\text{x} = \text{o}$) for forward (resp. backward) DFTs. Similarly, the index to be used when accessing a data sequence entry (or part thereof) in backward domain is
$$s^{\text{x}}_0 + k_1 s^{\text{x}}_1 + k_2 s^{\text{x}}_2 + \cdots + k_d s^{\text{x}}_d + m\, l^{\text{bwd}},$$
where $\text{x} = \text{o}$ (resp. $\text{x} = \text{i}$) for forward (resp. backward) DFTs.
As a consequence, configuring descriptor objects using these deprecated configuration parameters makes their configuration direction-dependent when different stride values are used in forward and backward domains. Since the intended compute direction is unknown to the descriptor object when committing it, every direction that results in a legitimate data layout in forward and backward domains must be supported by successfully committed descriptor objects.
Setting either of oneapi::mkl::dft::config_param::INPUT_STRIDES or oneapi::mkl::dft::config_param::OUTPUT_STRIDES triggers any (default or previously-set) values for oneapi::mkl::dft::config_param::FWD_STRIDES and oneapi::mkl::dft::config_param::BWD_STRIDES to reset to $0$ values, and vice versa. This default behavior prevents mixing either of oneapi::mkl::dft::config_param::INPUT_STRIDES or oneapi::mkl::dft::config_param::OUTPUT_STRIDES with either of oneapi::mkl::dft::config_param::FWD_STRIDES or oneapi::mkl::dft::config_param::BWD_STRIDES, which is not supported by oneMKL. If such a configuration is attempted, an exception is thrown at commit time due to invalid configuration, as the stride values that were implicitly reset violate the data layout requirements for any non-trivial DFT.
If specifying the data layout strides using these deprecated configuration parameters and if the strides differ in forward and backward domain, the descriptor must be re-configured and re-committed for computing the DFT in the reverse direction as shown below.
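```cpp
// ...
// Forward DFT: input data lies in the forward domain, output data in the backward domain.
desc.set_value(config_param::INPUT_STRIDES, fwd_domain_strides);
desc.set_value(config_param::OUTPUT_STRIDES, bwd_domain_strides);
desc.commit(queue);
compute_forward(desc, ...);
// ...
// Backward DFT: the roles are swapped, so the descriptor is re-configured and re-committed.
desc.set_value(config_param::INPUT_STRIDES, bwd_domain_strides);
desc.set_value(config_param::OUTPUT_STRIDES, fwd_domain_strides);
desc.commit(queue);
compute_backward(desc, ...);
```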
The oneapi::mkl::dft::config_param::INPUT_STRIDES and oneapi::mkl::dft::config_param::OUTPUT_STRIDES parameters have been deprecated since oneMKL 2024.1. A warning message “INPUT_STRIDES and OUTPUT_STRIDES are deprecated: please use FWD_STRIDES and BWD_STRIDES, instead.” is reported to applications using these configuration parameters.
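The direction dependence of the deprecated parameters can be summarized by a small mapping: for a forward DFT the input strides are those of the forward domain and the output strides those of the backward domain, and the roles swap for a backward DFT. The sketch below is a standalone illustration; `Direction`, `Strides` and `io_strides` are hypothetical names and not oneMKL identifiers.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical types for this illustration only.
enum class Direction { forward, backward };
using Strides = std::vector<std::int64_t>;

// Hypothetical helper (not a oneMKL function): {input strides, output strides}
// to use for a given compute direction, from per-domain strides.
inline std::pair<Strides, Strides> io_strides(Direction direction,
                                              const Strides& fwd_domain,
                                              const Strides& bwd_domain) {
    assert(fwd_domain.size() == bwd_domain.size());  // both hold d + 1 values
    if (direction == Direction::forward) {
        return {fwd_domain, bwd_domain};  // read forward domain, write backward domain
    }
    return {bwd_domain, fwd_domain};      // read backward domain, write forward domain
}
```

When the two domains use different strides, the pair returned for the forward direction differs from the one returned for the backward direction, which is why a descriptor configured this way has to be re-configured and re-committed before computing in the reverse direction.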
Summary table
The configuration parameters pertaining to the definition of data layouts are listed in the following table along with the expected type of their corresponding configuration values or the possible valid named constants to use as such. Default values are also documented therein.
Configuration parameter | Description | Data type, or valid named constant(s) for the associated configuration value |
---|---|---|
COMPLEX_STORAGE | Determines the elementary data type to be considered in either domain by complex descriptors (irrelevant parameter for real descriptors). | DFTI_COMPLEX_COMPLEX or DFTI_REAL_REAL |
REAL_STORAGE | Determines the elementary data type to be considered in forward domain by real descriptors (irrelevant parameter for complex descriptors). | DFTI_REAL_REAL |
CONJUGATE_EVEN_STORAGE | Determines the elementary data type to be considered in backward domain by real descriptors (irrelevant parameter for complex descriptors). | DFTI_COMPLEX_COMPLEX or DFTI_COMPLEX_REAL |
PACKED_FORMAT | Determines how the backward domain’s conjugate-even data sequences are to be stored by real descriptors (irrelevant parameter for complex descriptors). The only valid value is DFTI_CCE_FORMAT if DFTI_COMPLEX_COMPLEX is set for config_param::CONJUGATE_EVEN_STORAGE. The named constants DFTI_CCS_FORMAT, DFTI_PACK_FORMAT and DFTI_PERM_FORMAT are valid values if DFTI_COMPLEX_REAL is set for config_param::CONJUGATE_EVEN_STORAGE (detailed above). | DFTI_CCE_FORMAT, DFTI_CCS_FORMAT, DFTI_PACK_FORMAT or DFTI_PERM_FORMAT |
FWD_STRIDES | Offset and generalized strides defining the layout within a given data sequence in the forward domain. | std::int64_t* |
BWD_STRIDES | Offset and generalized strides defining the layout within a given data sequence in the backward domain. | std::int64_t* |
INPUT_STRIDES | Offset and generalized strides defining the layout within a given input data sequence. | std::int64_t* |
OUTPUT_STRIDES | Offset and generalized strides defining the layout within a given output data sequence. | std::int64_t* |
NUMBER_OF_TRANSFORMS | Specifies the number M of d-dimensional data sequences for batched DFTs. | std::int64_t |
FWD_DISTANCE | Distance in number of elements of implicitly-assumed data type separating entries (or parts thereof) of identical k_1, k_2, ..., k_d indices belonging to successive data sequences in forward domain. This is relevant (and must be set) for batched DFT(s), i.e., if M > 1. | std::int64_t |
BWD_DISTANCE | Distance in number of elements of implicitly-assumed data type separating entries (or parts thereof) of identical k_1, k_2, ..., k_d indices belonging to successive data sequences in backward domain. This is relevant (and must be set) for batched DFT(s), i.e., if M > 1. | std::int64_t |
Supported layouts on GPU devices
On GPU devices, oneMKL requires
the rank $d$ of the transform to be no greater than $3$;
the offset values $s^{\text{fwd}}_0$ and $s^{\text{bwd}}_0$ to be $0$;
either or for batched, two-dimensional real transforms (for and);
either (along with if ) or (along with if ) for three-dimensional real transforms (for and);
real descriptors to use DFTI_COMPLEX_COMPLEX for oneapi::mkl::dft::config_param::CONJUGATE_EVEN_STORAGE and DFTI_CCE_FORMAT for oneapi::mkl::dft::config_param::PACKED_FORMAT.