Configuring data layouts
This page describes how to configure a descriptor object for a specific data layout. Unless they are native C++ or standard-library types, all the relevant types and enumerations mentioned below belong to the oneapi::mkl::dft namespace and are declared in oneapi/mkl/dft.hpp (the header file to include). The oneapi::mkl::dft namespace specifier is omitted below for conciseness.
The DPC++ interface provides the configuration parameter config_param::FWD_STRIDES (resp. config_param::BWD_STRIDES) to define the data layout locating entries (or parts thereof) of relevant data sequences in the forward (resp. backward) domain. In case of batched transforms, i.e., if the configuration value for config_param::NUMBER_OF_TRANSFORMS is set to an integer larger than 1, the value set for configuration parameter config_param::FWD_DISTANCE (resp. config_param::BWD_DISTANCE) completes the description of the data layout by specifying the distances between successive data sequences in the forward (resp. backward) domain.
This topic leverages the general notations from the introduction, and uses the superscript $\text{fwd}$ (resp. $\text{bwd}$) for data sequences in forward (resp. backward) domain; the generic superscript $\text{xwd}$ stands for either of them. A placeholder label $\text{p}$ is also used to capture a possible distinction between the real (if $\text{p}$ is $\text{re}$) and imaginary (if $\text{p}$ is $\text{im}$) parts of a complex data entry; naturally, that placeholder label is relevant only for data layouts that distinguish real and imaginary parts of complex data entries.
A non-redundant entry $(\cdot)^{m}_{k_1, \ldots, k_d}$ (or its real or imaginary part, if relevant) is stored at index $I^{\text{xwd}}_{\text{p}}(m, k_1, \ldots, k_d)$ of the appropriate data container (sycl::buffer object or device-accessible USM allocation) provided to a compute function, the base data type of which is (possibly implicitly re-interpreted as) documented in the table below. That index value is defined as
$$I^{\text{xwd}}_{\text{p}}(m, k_1, \ldots, k_d) = s^{\text{xwd}}_0 + \sum_{j=1}^{d-1} k_j\, s^{\text{xwd}}_j + \kappa_{\text{p}}(k_d)\, s^{\text{xwd}}_d + m\, l^{\text{xwd}},$$
wherein
$s^{\text{xwd}}_j$, $j \in \{0, 1, \ldots, d\}$, represents the offset and generalized strides defining the locations of relevant values within each $d$-dimensional data sequence in the forward (resp. backward) domain if $\text{xwd} = \text{fwd}$ (resp. if $\text{xwd} = \text{bwd}$), counted in number of elements of the relevant implicitly-assumed elementary data type;
$l^{\text{xwd}}$ represents the distance between successive $d$-dimensional data sequences in the forward (resp. backward) domain if $\text{xwd} = \text{fwd}$ (resp. if $\text{xwd} = \text{bwd}$), counted in number of elements of the relevant implicitly-assumed elementary data type;
the relation $\kappa_{\text{p}}(k_d)$ simplifies into the identity $\kappa_{\text{p}}(k_d) = k_d$ in all recommended use cases or if $d > 1$, i.e., $\text{p}$ is either irrelevant or unused in such cases. However, for some one-dimensional real descriptors using deprecated configurations, the real and imaginary parts of entry $k_1$ in backward domain are to be considered separately from one another and the corresponding indices are denoted by $I^{\text{bwd}}_{\text{re}}$ and $I^{\text{bwd}}_{\text{im}}$, respectively.
In this page, it is assumed that only non-redundant data sequence entries are of interest, i.e., that $0 \leq m < M$ and $0 \leq k_j < n_j$ for all $j \in \{1, \ldots, d-1\}$, and that $0 \leq k_d \leq \lfloor n_d/2 \rfloor$ (resp. $0 \leq k_d < n_d$) for entries that do (resp. do not) belong to the backward domain of a real DFT.
Note that all elements accessed at such indices of a given user-provided data container must belong to the same block allocation.
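As an illustration of the index computation described above, the following minimal sketch evaluates the location of an entry from an offset, generalized strides and a batch distance. It assumes the identity relation of the recommended layouts (no separate handling of real and imaginary parts); the helper name `entry_index` is hypothetical and not part of oneMKL.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a oneMKL function).
// strides  = {s_0, s_1, ..., s_d}: offset followed by the d generalized strides
// k        = {k_1, ..., k_d}: multi-index of the entry within one data sequence
// m        = index of the data sequence within the batch
// distance = spacing between successive data sequences
// All quantities are counted in elements of the implicitly-assumed data type.
std::int64_t entry_index(const std::vector<std::int64_t>& strides,
                         const std::vector<std::int64_t>& k,
                         std::int64_t m, std::int64_t distance) {
    assert(strides.size() == k.size() + 1);
    std::int64_t index = strides[0] + m * distance;  // offset and batch shift
    for (std::size_t j = 0; j < k.size(); ++j)
        index += k[j] * strides[j + 1];              // k_j * s_j
    return index;
}
```

For instance, with strides {0, 6, 1}, the entry of multi-index (2, 3) in the first data sequence of the batch is located at index 2 * 6 + 3 = 15.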
Implicitly-assumed elementary data type
When reading or writing an element of any user-provided data container used at compute time, a descriptor object may first re-interpret the base data type of that data container into an implicitly-assumed elementary data type. That implicitly-assumed data type depends on the object type (that is, on the specialization values used for the template parameters when instantiating the descriptor class) and on other configuration values. The table below lists the implicitly-assumed data type in either domain (last two columns) based on the object type and its configuration values.
Type of descriptor and relevant configuration values | Implicitly-assumed elementary data type in forward domain | Implicitly-assumed elementary data type in backward domain |
---|---|---|
Complex descriptor with config_value::COMPLEX_COMPLEX set for config_param::COMPLEX_STORAGE | std::complex<fp_type> | std::complex<fp_type> |
Complex descriptor with config_value::REAL_REAL set for config_param::COMPLEX_STORAGE | fp_type | fp_type |
Real descriptor with config_value::COMPLEX_COMPLEX set for config_param::CONJUGATE_EVEN_STORAGE | fp_type | std::complex<fp_type> |
Real descriptor with config_value::COMPLEX_REAL set for config_param::CONJUGATE_EVEN_STORAGE | fp_type | fp_type |
Descriptors that implicitly assume an elementary data type of float or double (resp. std::complex<float> or std::complex<double>) in a domain are referred to as “descriptors expecting real (resp. complex) data” in that domain.
Configuring strides in forward and backward domains
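The offset and stride values $s^{\text{xwd}}_0, s^{\text{xwd}}_1, \ldots, s^{\text{xwd}}_d$ (where $\text{xwd}$ stands for $\text{fwd}$ or $\text{bwd}$) are to be communicated as elements (in that order) of a std::vector<std::int64_t> object of size $d+1$, passed as the configuration value for config_param::FWD_STRIDES if $\text{xwd} = \text{fwd}$ (resp. config_param::BWD_STRIDES if $\text{xwd} = \text{bwd}$) using the relevant configuration-setting member function. The first element $s^{\text{xwd}}_0$ represents an absolute offset (or “displacement”) in the data sets while the subsequent elements are generalized strides to be considered along dimensions $1, \ldots, d$.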
When created, the descriptors are default-configured for unbatched, in-place transforms using a unit stride along the last dimension, no offset and the default configuration settings documented in the above table. For real descriptors, minimal padding is used in forward domain, aligning with the data layout requirements for in-place transforms.
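In other words, the default stride values are $s^{\text{fwd}}_0 = s^{\text{bwd}}_0 = 0$, $s^{\text{fwd}}_d = s^{\text{bwd}}_d = 1$ and, for $d$-dimensional transforms with $d > 1$,
$s^{\text{fwd}}_{d-1} = s^{\text{bwd}}_{d-1} = n_d$ for complex descriptors;
$s^{\text{bwd}}_{d-1} = \lfloor n_d/2 \rfloor + 1$, and $s^{\text{fwd}}_{d-1} = 2 s^{\text{bwd}}_{d-1}$ for real descriptors;
if $d > 2$, $s^{\text{xwd}}_{d-k} = n_{d-k+1}\, s^{\text{xwd}}_{d-k+1}$ for $k \in \{2, \ldots, d-1\}$ (for $\text{xwd} = \text{fwd}$ and $\text{xwd} = \text{bwd}$).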
The usage of these default strides for unbatched, in-place transforms is illustrated in the usage examples.
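The default strides described above can be sketched as follows. This is a minimal illustration under assumptions: row-major defaults with unit stride along the last dimension, zero offset and, for real descriptors, a backward-domain stride of floor(n_d / 2) + 1 with twice that value in forward domain (minimal padding for in-place transforms). The struct `default_strides` and the function `make_default_strides` are hypothetical names, not oneMKL entities.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container for the two stride vectors {s_0, s_1, ..., s_d}.
struct default_strides {
    std::vector<std::int64_t> fwd;
    std::vector<std::int64_t> bwd;
};

// lengths = {n_1, ..., n_d}; real_descriptor selects the real-descriptor rules.
default_strides make_default_strides(const std::vector<std::int64_t>& lengths,
                                     bool real_descriptor) {
    const std::size_t d = lengths.size();
    assert(d >= 1);
    default_strides s{std::vector<std::int64_t>(d + 1, 0),
                      std::vector<std::int64_t>(d + 1, 0)};
    s.fwd[d] = s.bwd[d] = 1;  // unit stride along the last dimension, zero offset
    if (d > 1) {
        const std::int64_t n_d = lengths[d - 1];
        // Real descriptors store only floor(n_d / 2) + 1 complex entries in
        // backward domain; forward domain is padded to twice that many reals.
        s.bwd[d - 1] = real_descriptor ? n_d / 2 + 1 : n_d;
        s.fwd[d - 1] = real_descriptor ? 2 * s.bwd[d - 1] : n_d;
        // Remaining strides: s_j = n_{j+1} * s_{j+1} for j = d-2, ..., 1.
        for (std::size_t j = d - 2; j >= 1; --j) {
            s.fwd[j] = lengths[j] * s.fwd[j + 1];
            s.bwd[j] = lengths[j] * s.bwd[j + 1];
        }
    }
    return s;
}
```

Under these assumptions, a real descriptor of lengths {4, 5, 6} would use {0, 40, 8, 1} in forward domain and {0, 20, 4, 1} in backward domain.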
Configuring batched transforms
The distance value $l^{\text{xwd}}$ completing the definition of the data layout is to be set as a std::int64_t configuration value for config_param::FWD_DISTANCE if $\text{xwd} = \text{fwd}$ (resp. config_param::BWD_DISTANCE if $\text{xwd} = \text{bwd}$) using the relevant configuration-setting member function. This value is irrelevant for unbatched transforms, i.e., for descriptors set to handle a number of transforms $M$ equal to 1 (default behavior).
In case of batched transforms, the desired number of DFTs $M$ must be set explicitly as a std::int64_t configuration value for config_param::NUMBER_OF_TRANSFORMS using the relevant configuration-setting member function. In that case, config_param::FWD_DISTANCE and config_param::BWD_DISTANCE must also be set explicitly since their default configuration values of 0 would break the data layout requirements for any $M > 1$.
The configuration of batched transforms is illustrated in the usage examples.
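As a rough sketch of how distances might be chosen for a batch of contiguously stored data sequences, the helper below computes the smallest distances compatible with default-like strides. It is an assumption-laden illustration: the names `batch_distances` and `contiguous_batch_distances` are hypothetical, and the real-descriptor case assumes the in-place-compatible layout in which the forward-domain distance is twice the backward-domain one.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical pair of distances, counted in elements of the
// implicitly-assumed elementary data type of each domain.
struct batch_distances {
    std::int64_t fwd;
    std::int64_t bwd;
};

// lengths = {n_1, ..., n_d}. Data sequences are assumed to be stored
// back-to-back, each with unit stride along the last dimension.
batch_distances contiguous_batch_distances(const std::vector<std::int64_t>& lengths,
                                           bool real_descriptor) {
    const std::size_t d = lengths.size();
    assert(d >= 1);
    std::int64_t leading = 1;  // n_1 * ... * n_{d-1}
    for (std::size_t j = 0; j + 1 < d; ++j)
        leading *= lengths[j];
    const std::int64_t n_d = lengths[d - 1];
    if (!real_descriptor)
        return {leading * n_d, leading * n_d};
    // Real descriptor: floor(n_d / 2) + 1 complex entries per row in backward
    // domain, and twice as many (padded) real entries per row in forward domain.
    const std::int64_t bwd = leading * (n_d / 2 + 1);
    return {2 * bwd, bwd};
}
```

The resulting values would then be set for config_param::FWD_DISTANCE and config_param::BWD_DISTANCE, together with the batch size for config_param::NUMBER_OF_TRANSFORMS, before committing the descriptor.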
Deprecated layouts in backward domain of one-dimensional real transforms
All complex descriptors and all real descriptors expecting complex data in backward domain use the straightforward identity relation $\kappa_{\text{p}}(k_d) = k_d$, i.e., the label $\text{p}$ is irrelevant in that case. Every default behavior and recommended usage falls into this category; the reader is referred to the usage examples for more details and illustrations about the resulting layouts and default (or otherwise recommended) strides and distances.
For real descriptors expecting real data in backward domain (deprecated usage, supported for 1D real DFTs on CPU only), the relation takes a more intricate form. In backward domain, such descriptors expect real data in the sense that the real and imaginary parts of the data sequence entries are not necessarily stored contiguously in memory (or not even stored at all). The specific form of the relation depends on the value set for config_param::PACKED_FORMAT. For real descriptors expecting real data in backward domain, three different values (documented below) are possible for that configuration parameter: config_value::CCS_FORMAT, config_value::PACK_FORMAT and config_value::PERM_FORMAT. Given that support is limited to 1D transforms on CPUs, $d = 1$ is used in the rest of this section to simplify the presentation. Illustrations are also given for unbatched cases only; that is, $M = 1$, so the then-superfluous batch index $m$ is omitted in this section’s illustrative tables, too.
config_value::CCS_FORMAT value set for config_param::PACKED_FORMAT
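If the configuration value config_value::CCS_FORMAT is used, then
$\kappa_{\text{re}}(k_1) = 2k_1$;
$\kappa_{\text{im}}(k_1) = 2k_1 + 1$.
Given that all non-redundant entries in backward domain are captured by $0 \leq k_1 \leq \lfloor n_1/2 \rfloor$, the range of relevant values for $\kappa_{\text{p}}(k_1)$ is $[0, n_1 + 1]$ (resp. $[0, n_1]$) if $n_1$ is even (resp. odd), in this case.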
This format is illustrated in the table below for , and .
Stored value |
config_value::PACK_FORMAT value set for config_param::PACKED_FORMAT
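If the configuration value config_value::PACK_FORMAT is used, then
$\kappa_{\text{re}}(0) = 0$;
$\kappa_{\text{im}}(0)$ does not exist (0-valued imaginary parts are not stored explicitly);
$\kappa_{\text{re}}(k_1) = 2k_1 - 1$ for any $0 < k_1 \leq \lfloor n_1/2 \rfloor$;
$\kappa_{\text{im}}(k_1) = 2k_1$ for any $0 < k_1 < \lfloor n_1/2 \rfloor$. $\kappa_{\text{im}}(k_1) = 2k_1$ also holds for $k_1 = \lfloor n_1/2 \rfloor$ if $n_1$ is odd; $\kappa_{\text{im}}(n_1/2)$ does not exist if $n_1$ is even (0-valued imaginary parts are not stored explicitly).
Given that all non-redundant entries in backward domain are captured by $0 \leq k_1 \leq \lfloor n_1/2 \rfloor$, the range of relevant values for $\kappa_{\text{p}}(k_1)$ is $[0, n_1 - 1]$ in this case (regardless of whether $n_1$ is even or odd).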
This format is illustrated in the tables below for , and .
Stored value |
Stored value |
config_value::PERM_FORMAT value set for config_param::PACKED_FORMAT
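If the configuration value config_value::PERM_FORMAT is used, the relation differs according to whether $n_1$ is even or odd.
If $n_1$ is even, then
$\kappa_{\text{re}}(0) = 0$ and $\kappa_{\text{im}}(0)$ does not exist (0-valued imaginary parts are not stored explicitly);
$\kappa_{\text{re}}(n_1/2) = 1$ and $\kappa_{\text{im}}(n_1/2)$ does not exist (0-valued imaginary parts are not stored explicitly);
$\kappa_{\text{re}}(k_1) = 2k_1$ for any $0 < k_1 < n_1/2$;
$\kappa_{\text{im}}(k_1) = 2k_1 + 1$ for any $0 < k_1 < n_1/2$.
If $n_1$ is odd, then (this format is equivalent to config_value::PACK_FORMAT if $n_1$ is odd)
$\kappa_{\text{re}}(0) = 0$ and $\kappa_{\text{im}}(0)$ does not exist (0-valued imaginary parts are not stored explicitly);
$\kappa_{\text{re}}(k_1) = 2k_1 - 1$ for any $0 < k_1 \leq \lfloor n_1/2 \rfloor$;
$\kappa_{\text{im}}(k_1) = 2k_1$ for any $0 < k_1 \leq \lfloor n_1/2 \rfloor$.
Given that all non-redundant entries in backward domain are captured by $0 \leq k_1 \leq \lfloor n_1/2 \rfloor$, the range of relevant values for $\kappa_{\text{p}}(k_1)$ is $[0, n_1 - 1]$ in this case (regardless of whether $n_1$ is even or odd).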
This format is illustrated in the tables below for , and .
Stored value |
Stored value |
The configuration parameter config_param::PACKED_FORMAT must be set explicitly (to either config_value::CCS_FORMAT, config_value::PACK_FORMAT or config_value::PERM_FORMAT) for real descriptors expecting real data in backward domain, as it further specifies the descriptor’s behavior in that case (see explanations above). Real descriptors expecting real data in backward domain are supported for 1D real DFTs on CPU only. Their support is deprecated.
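The three deprecated packed formats can be compared with a small sketch. The mapping below follows the conventional CCS, PACK and PERM packed layouts of one-dimensional conjugate-even sequences, which is an assumption here. Positions are expressed relative to the offset and in units of the stride, and every name (`packed_format`, `part`, `packed_position`) is hypothetical rather than taken from oneMKL.

```cpp
#include <cassert>
#include <cstdint>

enum class packed_format { ccs, pack, perm };  // hypothetical labels
enum class part { re, im };

// Position of the real or imaginary part of backward-domain entry k of a
// length-n one-dimensional real DFT, for 0 <= k <= floor(n / 2).
// Returns -1 when the value is not stored (0-valued imaginary part).
std::int64_t packed_position(packed_format f, part p, std::int64_t n, std::int64_t k) {
    assert(n > 0 && k >= 0 && k <= n / 2);
    const std::int64_t im = (p == part::im) ? 1 : 0;
    if (f == packed_format::ccs)
        return 2 * k + im;                       // Re and Im always stored as pairs
    const bool nyquist = (n % 2 == 0) && (k == n / 2);
    if (im == 1 && (k == 0 || nyquist))
        return -1;                               // 0-valued imaginary part, not stored
    if (k == 0)
        return 0;                                // real part of the first entry
    if (f == packed_format::perm && n % 2 == 0)  // PERM with even n
        return nyquist ? 1 : 2 * k + im;
    return 2 * k - 1 + im;                       // PACK, or PERM with odd n
}
```

With n = 6, for example, this sketch places the real part of entry 3 at position 5 in the PACK layout and at position 1 in the PERM layout, while the CCS layout stores its (0-valued) imaginary part explicitly at position 7.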
Data layout requirements
In general, the distances and strides must be set so that
the index values are non-negative for all tuples $(m, k_1, \ldots, k_d)$ within relevant ranges;
every index value corresponds to a unique entry relevant to the data sequences under consideration. In other words, no index value may correspond to two different tuples $(m, k_1, \ldots, k_d)$ that would both be within relevant ranges.
Additionally, for in-place transforms (configuration value config_value::INPLACE set for config_param::PLACEMENT), the following “consistency requirements” apply:
descriptors expecting the same data type in either domain (e.g., complex descriptors) must use the same offset, stride(s), and distance values in forward and backward domains;
for real descriptors expecting complex data in backward domain (default behavior for real descriptors), the memory addresses of leading entries along the last dimension must be identical in forward and backward domains. Specifically, that requirement translates into the conditions $s^{\text{fwd}}_j = 2 s^{\text{bwd}}_j$ for all $j \in \{0, \ldots, d-1\}$ as well as, if $M > 1$, $l^{\text{fwd}} = 2 l^{\text{bwd}}$. Note that this requirement leads to some data padding to be used in forward domain if unit strides are used along dimension $d$ in forward and backward domains (recommended usage, as set by default).
Support for negative strides, with a sufficiently large (positive) offset guaranteeing non-negativeness of all index values, is not enabled yet (unimplemented);
One-dimensional real descriptors expecting real data in backward domain and using configuration value config_value::CCS_FORMAT for config_param::PACKED_FORMAT also require .
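The two general requirements (non-negative indices, no two entries sharing an index) can be checked by brute force for small problems. The sketch below is only an illustration, assuming the identity relation of the recommended layouts; `layout_is_valid` is a hypothetical helper and not a oneMKL function.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// strides  = {s_0, s_1, ..., s_d}; ranges = number of relevant values of each
// k_j (all >= 1), e.g. floor(n_d / 2) + 1 along the last dimension in the
// backward domain of a real DFT; batch = number of transforms; distance = l.
bool layout_is_valid(const std::vector<std::int64_t>& strides,
                     const std::vector<std::int64_t>& ranges,
                     std::int64_t batch, std::int64_t distance) {
    const std::size_t d = ranges.size();
    assert(strides.size() == d + 1);
    std::set<std::int64_t> seen;
    std::vector<std::int64_t> k(d);
    for (std::int64_t m = 0; m < batch; ++m) {
        std::fill(k.begin(), k.end(), 0);
        bool done = false;
        while (!done) {
            std::int64_t index = strides[0] + m * distance;
            for (std::size_t j = 0; j < d; ++j)
                index += k[j] * strides[j + 1];
            // Reject negative indices and indices already used by another tuple.
            if (index < 0 || !seen.insert(index).second)
                return false;
            // Advance the multi-index like an odometer, last dimension fastest.
            std::size_t j = d;
            while (j > 0) {
                if (++k[j - 1] < ranges[j - 1]) break;
                k[j - 1] = 0;
                --j;
            }
            done = (j == 0);
        }
    }
    return true;
}
```

For example, a batch of two 5-by-6 complex data sequences with strides {0, 6, 1} passes this check with a distance of 30, but not with a distance of 0, since both data sequences would then overlap.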
Configuring strides for input and output data [deprecated]
Instead of specifying strides by domain, one may choose to specify the strides for input and output data sequences. Let $s^{\text{x}}_j$, $j \in \{0, 1, \ldots, d\}$, be the stride values for input (resp. output) data sequences if $\text{x} = \text{i}$ (resp. $\text{x} = \text{o}$). Such values may be communicated as elements (in that order) of a std::vector<std::int64_t> object of size $d+1$, passed as the configuration value for config_param::INPUT_STRIDES if $\text{x} = \text{i}$ (resp. config_param::OUTPUT_STRIDES if $\text{x} = \text{o}$) using the relevant configuration-setting member function.
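The values of $s^{\text{i}}_j$ and $s^{\text{o}}_j$ are to be used and considered by oneMKL if and only if $s^{\text{fwd}}_j = s^{\text{bwd}}_j = 0$ for all $j \in \{0, 1, \ldots, d\}$. This will happen automatically if config_param::INPUT_STRIDES and config_param::OUTPUT_STRIDES are set and config_param::FWD_STRIDES and config_param::BWD_STRIDES are not (see the note below). In such a case, descriptor objects must consider the data layouts corresponding to the two compute directions separately. As detailed above, relevant data sequence entries are accessed as elements of data containers (sycl::buffer objects or device-accessible USM allocations) provided to the compute function, the base data type of which is (possibly implicitly re-interpreted) as documented in the above table. If using input and output strides, the index to be used when accessing a data sequence entry (or part thereof) in forward domain is
$$s^{\text{x}}_0 + \sum_{j=1}^{d-1} k_j\, s^{\text{x}}_j + \kappa_{\text{p}}(k_d)\, s^{\text{x}}_d + m\, l^{\text{fwd}},$$
where $\text{x} = \text{i}$ (resp. $\text{x} = \text{o}$) for forward (resp. backward) DFTs and $\kappa_{\text{p}}$ is the same relation as in the by-domain index definition. Similarly, the index to be used when accessing a data sequence entry (or part thereof) in backward domain is
$$s^{\text{x}}_0 + \sum_{j=1}^{d-1} k_j\, s^{\text{x}}_j + \kappa_{\text{p}}(k_d)\, s^{\text{x}}_d + m\, l^{\text{bwd}},$$
where $\text{x} = \text{o}$ (resp. $\text{x} = \text{i}$) for forward (resp. backward) DFTs.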
As a consequence, configuring descriptor objects using these deprecated configuration parameters makes their configuration direction-dependent when different stride values are used in forward and backward domains. Since the intended compute direction is unknown to the descriptor object when committing it, every direction that results in a legitimate data layout in forward and backward domains must be supported by successfully committed descriptor objects.
Setting either of config_param::INPUT_STRIDES or config_param::OUTPUT_STRIDES triggers any (default or previously-set) values for config_param::FWD_STRIDES and config_param::BWD_STRIDES to reset to 0-valued vectors, and vice versa. This implicit behavior prevents mixing and matching either of config_param::INPUT_STRIDES or config_param::OUTPUT_STRIDES with either of config_param::FWD_STRIDES or config_param::BWD_STRIDES, which is not supported by oneMKL. If such a configuration is attempted, an exception is thrown at commit time due to invalid configuration, as the stride values that were implicitly reset invalidate the data layout requirements for any non-trivial DFT.
If specifying the data layout strides using these deprecated configuration parameters and if the strides differ in forward and backward domain, the descriptor must be re-configured and re-committed for computing the DFT in the reverse direction as shown below.
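// ...
// Forward DFT: input data lies in forward domain, output data in backward domain.
desc.set_value(config_param::INPUT_STRIDES, fwd_domain_strides);
desc.set_value(config_param::OUTPUT_STRIDES, bwd_domain_strides);
desc.commit(queue);
compute_forward(desc, ...);
// ...
// Backward DFT: the roles are swapped, so the descriptor must be
// re-configured and re-committed before computing.
desc.set_value(config_param::INPUT_STRIDES, bwd_domain_strides);
desc.set_value(config_param::OUTPUT_STRIDES, fwd_domain_strides);
desc.commit(queue);
compute_backward(desc, ...);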
The config_param::INPUT_STRIDES and config_param::OUTPUT_STRIDES parameters have been deprecated since oneMKL 2024.1. A compile-time deprecation warning advising users to update their usage to config_param::FWD_STRIDES and config_param::BWD_STRIDES is emitted for any application using these configuration parameters.
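When migrating away from the deprecated parameters, the by-domain strides can be derived from the input and output strides once the intended compute direction is known. The following minimal sketch illustrates that correspondence; the names `direction`, `domain_strides` and `to_domain_strides` are hypothetical and not part of oneMKL.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

enum class direction { forward, backward };  // hypothetical label for the DFT direction

// Hypothetical pair of stride vectors, one per domain.
struct domain_strides {
    std::vector<std::int64_t> fwd;
    std::vector<std::int64_t> bwd;
};

// For a forward DFT, input data lies in forward domain and output data in
// backward domain; for a backward DFT, the roles are swapped.
domain_strides to_domain_strides(direction dir,
                                 const std::vector<std::int64_t>& input_strides,
                                 const std::vector<std::int64_t>& output_strides) {
    if (dir == direction::forward)
        return {input_strides, output_strides};
    return {output_strides, input_strides};
}
```

The two resulting vectors are the values one would set for config_param::FWD_STRIDES and config_param::BWD_STRIDES, after which a single commit serves both compute directions.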
Supported layouts on GPU devices
On GPU devices, oneMKL requires
the rank of the transform to be no greater than 3;
the offset values $s^{\text{fwd}}_0$ and $s^{\text{bwd}}_0$ to be 0;
either or for batched, two-dimensional real transforms (for and);
either (along with if ) or (along with if ) for three-dimensional real transforms (for and);
real descriptors to use config_value::COMPLEX_COMPLEX for config_param::CONJUGATE_EVEN_STORAGE and config_value::CCE_FORMAT for config_param::PACKED_FORMAT.