Pass Data to Algorithms

Intel® oneAPI DPC++ Library Developer Guide and Reference

ID 768913

Date 12/16/2022

You can pass data to an algorithm executed with a device policy in one of the following ways:

oneapi::dpl::begin and oneapi::dpl::end functions
Unified shared memory (USM) pointers and std::vector with USM allocators
Iterators of host-side std::vector

Use oneapi::dpl::begin and oneapi::dpl::end Functions

oneapi::dpl::begin and oneapi::dpl::end are special helper functions that allow you to pass SYCL buffers to parallel algorithms. These functions accept a SYCL buffer and return an object of an unspecified type that provides the following API:

It satisfies the CopyConstructible and CopyAssignable C++ named requirements and is comparable with operator== and operator!=.
It supports the following valid expressions: a + n, a - n, and a - b, where a and b are objects of the type, and n is an integer value. The effect of those operations is the same as for a type that satisfies the LegacyRandomAccessIterator C++ named requirement.
It provides the get_buffer method, which returns the buffer passed to the begin and end functions.
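
The interface listed above can be illustrated with a small standard C++ model. This is only a sketch and is not oneDPL code: toy_buffer, buffer_position, toy_begin, and toy_end are hypothetical names invented here, and a std::vector stands in for the SYCL buffer. The real return type of oneapi::dpl::begin and oneapi::dpl::end is unspecified and may be implemented differently.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a SYCL buffer. NOT a SYCL or oneDPL type.
struct toy_buffer {
  std::vector<int> storage;
};

// Hypothetical model of the object described above: a copyable position
// inside a buffer that supports offset arithmetic and comparison.
class buffer_position {
 public:
  buffer_position(toy_buffer& buf, std::ptrdiff_t offset)
      : buf_(&buf), offset_(offset) {}

  // a + n and a - n: move the position by n elements.
  friend buffer_position operator+(buffer_position a, std::ptrdiff_t n) {
    return buffer_position(*a.buf_, a.offset_ + n);
  }
  friend buffer_position operator-(buffer_position a, std::ptrdiff_t n) {
    return buffer_position(*a.buf_, a.offset_ - n);
  }

  // a - b: distance in elements between two positions.
  friend std::ptrdiff_t operator-(buffer_position a, buffer_position b) {
    return a.offset_ - b.offset_;
  }

  // Equal when both refer to the same buffer and the same offset.
  friend bool operator==(buffer_position a, buffer_position b) {
    return a.buf_ == b.buf_ && a.offset_ == b.offset_;
  }
  friend bool operator!=(buffer_position a, buffer_position b) {
    return !(a == b);
  }

  // Returns the buffer this position was created from.
  toy_buffer& get_buffer() const { return *buf_; }

 private:
  toy_buffer* buf_;
  std::ptrdiff_t offset_;
};

// Hypothetical counterparts of the begin and end helper functions.
buffer_position toy_begin(toy_buffer& buf) { return buffer_position(buf, 0); }
buffer_position toy_end(toy_buffer& buf) {
  return buffer_position(
      buf, static_cast<std::ptrdiff_t>(buf.storage.size()));
}
```

With this model, toy_end(buf) - toy_begin(buf) gives the element count, and toy_begin(buf) + k denotes the position k elements into the buffer, which mirrors how a subrange of a buffer can be described for an algorithm.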

The begin and end functions can take SYCL 2020 deduction tags and sycl::no_init as arguments to specify which access mode should be applied to the buffer accessor when submitting a SYCL kernel to a device. For example:

auto first1 = begin(buf, sycl::read_only);
auto first2 = begin(buf, sycl::write_only, sycl::no_init);
auto first3 = begin(buf, sycl::no_init);

These arguments let you control the access mode for a particular buffer passed to a parallel algorithm.

To use the functions, add #include <oneapi/dpl/iterator> to your code. For example:

#include <oneapi/dpl/execution>
#include <oneapi/dpl/algorithm>
#include <oneapi/dpl/iterator>
#include <sycl/sycl.hpp>
int main(){
  sycl::buffer<int> buf { 1000 };
  auto buf_begin = oneapi::dpl::begin(buf);
  auto buf_end   = oneapi::dpl::end(buf);
  std::fill(oneapi::dpl::execution::dpcpp_default, buf_begin, buf_end, 42);
  return 0;
}

Use Unified Shared Memory

The following examples demonstrate two ways to use the parallel algorithms with USM:

USM pointers
USM allocators

If you have a USM-allocated buffer, pass the pointers to the start and past the end of the buffer to a parallel algorithm. Make sure that the execution policy and the buffer were created for the same queue. For example:

#include <oneapi/dpl/execution>
#include <oneapi/dpl/algorithm>
#include <sycl/sycl.hpp>
int main(){
  sycl::queue q;
  const int n = 1000;
  int* d_head = sycl::malloc_shared<int>(n, q);

  std::fill(oneapi::dpl::execution::make_device_policy(q), d_head, d_head + n, 42);

  sycl::free(d_head, q);
  return 0;
}

Alternatively, use std::vector with a USM allocator. For example:

#include <oneapi/dpl/execution>
#include <oneapi/dpl/algorithm>
#include <sycl/sycl.hpp>
#include <vector>
int main(){
  const int n = 1000;
  auto policy = oneapi::dpl::execution::dpcpp_default;
  sycl::usm_allocator<int, sycl::usm::alloc::shared> alloc(policy.queue());
  std::vector<int, decltype(alloc)> vec(n, alloc);

  std::fill(policy, vec.begin(), vec.end(), 42);

  return 0;
}

When using device USM, such as memory allocated by sycl::malloc_device, manually copy data to this memory before calling oneDPL algorithms, and copy it back once the algorithms have finished execution.

Use Host-Side std::vector

Intel® oneAPI DPC++ Library parallel algorithms can be called with ordinary (host-side) iterators, as seen in the example below. In this case, a temporary SYCL buffer is created, and the data is copied to this buffer. After processing on a device is complete, the modified data is copied from the temporary buffer back to the host container. For example:

#include <oneapi/dpl/execution>
#include <oneapi/dpl/algorithm>
#include <vector>
int main(){
  std::vector<int> vec( 1000 );
  std::fill(oneapi::dpl::execution::dpcpp_default, vec.begin(), vec.end(), 42);
  // each element of vec now equals 42
  return 0;
}

Working with SYCL buffers is recommended to reduce data copying between the host and device.
