Host Communication
Communication operations between processes are provided by the Communicator.
The example below demonstrates the main concepts of communication on host memory buffers.
Example
Consider a simple oneCCL allreduce example for CPU.
Create a communicator object with user-supplied size, rank, and key-value store:
    auto ccl_context = ccl::create_context();
    auto ccl_device = ccl::create_device();
    auto comms = ccl::create_communicators(
        size,
        ccl::vector_class<ccl::pair_class<size_t, ccl::device>>{ { rank, ccl_device } },
        ccl_context,
        kvs);
Alternatively, for convenience, use the non-vector form, which omits the device and context parameters:
auto comm = ccl::create_communicator(size, rank, kvs);
Initialize send_buf (in a real scenario, it is supplied by the user):
    const size_t elem_count = <N>;
    /* initialize send_buf: every element holds this process's rank plus one */
    for (size_t idx = 0; idx < elem_count; idx++) {
        send_buf[idx] = rank + 1;
    }
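The snippet above leaves the buffer declarations to the caller. As a minimal sketch, assuming `int` elements and `std::vector` storage (both are illustrative choices, and the helper names `make_send_buf` and `make_recv_buf` are hypothetical, not part of oneCCL), the buffers could be prepared like this:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of oneCCL): builds the send buffer used in
// this example, where every element holds the caller's rank plus one.
std::vector<int> make_send_buf(std::size_t elem_count, int rank) {
    assert(rank >= 0); // ranks are numbered from 0 to size - 1
    return std::vector<int>(elem_count, rank + 1);
}

// Hypothetical helper: a zero-initialized receive buffer of the same length.
std::vector<int> make_recv_buf(std::size_t elem_count) {
    return std::vector<int>(elem_count, 0);
}
```

Under these assumptions, the `data()` member of each vector should give the raw pointer to pass where the example uses `send_buf` and `recv_buf`.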
The allreduce call reduces the values from all processes and then distributes the result to every process. Because each process contributes rank + 1, the result is an array of elem_count elements, each equal to the sum of the arithmetic progression 1 + 2 + ... + size:
    ccl::allreduce(send_buf, recv_buf, elem_count, ccl::reduction::sum, comm).wait();
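To make the semantics concrete without a oneCCL installation, the following sketch emulates a sum allreduce inside a single process. The function `simulate_allreduce_sum` is hypothetical and is not a oneCCL API; it models each rank's send buffer as filled with `rank + 1`, as in the example, and returns the buffer that every rank would receive.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical single-process model of a sum allreduce (not a oneCCL API).
// Rank r contributes a buffer whose elements all equal r + 1; the returned
// vector is the element-wise sum that every rank would receive.
std::vector<long long> simulate_allreduce_sum(int comm_size, std::size_t elem_count) {
    assert(comm_size > 0);
    std::vector<long long> recv_buf(elem_count, 0);
    for (int rank = 0; rank < comm_size; ++rank) {
        const std::vector<long long> send_buf(elem_count, rank + 1);
        for (std::size_t idx = 0; idx < elem_count; ++idx) {
            recv_buf[idx] += send_buf[idx]; // reduction step: sum
        }
    }
    return recv_buf;
}
```

For example, with four ranks each element of the result is 1 + 2 + 3 + 4 = 10.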
Check the correctness of the allreduce operation:
    auto comm_size = comm.size();
    /* sum of 1 + 2 + ... + comm_size */
    auto expected = comm_size * (comm_size + 1) / 2;
    for (size_t idx = 0; idx < elem_count; idx++) {
        if (recv_buf[idx] != expected) {
            std::cout << "unexpected value at index " << idx << std::endl;
            break;
        }
    }
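The same check can be wrapped in a function that reports where the first mismatch occurs, which is easier to reuse in tests. This is only a sketch: `first_mismatch` is a hypothetical helper, not part of oneCCL, and it assumes an `int` receive buffer held in a `std::vector`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of oneCCL): returns the index of the first
// element that differs from the expected sum 1 + 2 + ... + comm_size, or
// recv_buf.size() when every element matches.
std::size_t first_mismatch(const std::vector<int>& recv_buf, int comm_size) {
    assert(comm_size > 0);
    const int expected = comm_size * (comm_size + 1) / 2;
    for (std::size_t idx = 0; idx < recv_buf.size(); ++idx) {
        if (recv_buf[idx] != expected) {
            return idx;
        }
    }
    return recv_buf.size();
}
```

A return value equal to the buffer length means the result passed the check.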