Explore Data Parallel C++ with Samples from Intel | ISO2DFD
This short video shows an Intel® oneAPI DPC++ Compiler implementation of a two-dimensional finite-difference stencil that solves the 2D acoustic isotropic wave equation. It illustrates the basics of the DPC++ programming language using direct programming. A more detailed code walkthrough is available in the link section.
Hello, my name is Alberto Villarreal.
In this video, I show the basics of the DPC++ programming language with a code sample walkthrough. This code sample implements a stencil computation to illustrate some basic elements of the DPC++ programming language, specifically queues, buffers and accessors, and kernels, where now we have added the ability to call a user-defined function inside the kernel function to better encapsulate the details of the computation. The code is available to download as indicated in the “Explore” document and a detailed code walkthrough is available too.
This animation illustrates what this code sample does. It simulates how an acoustic wave, like a sound wave, that is generated in the air in the upper left, propagates through the air and through the water. Here is a brief overview of the algorithm that is implemented in the code sample. We want to simulate acoustic waves propagating through a 2-dimensional medium. ISO2DFD stands for “Isotropic 2-dimensional Finite Difference”, which refers to the algorithm we are using for our simulation.
This is the formula for the discretized 2D acoustic wave equation and the graphical representation of the 2D stencil. The variable P is the pressure of the medium; sound waves propagate by changing the pressure of the medium through which they travel. This stencil formula updates the pressure at the next time step, shown here in orange, using the values of the pressure at previous times and at neighboring grid points, shown here in green.
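For readers without the slide, a standard second-order form of such an update can be written as below. This is a sketch that assumes second-order central differences in both time and space; the formula shown in the video and used in the sample may differ in order or notation. The symbols v (wave speed at a grid point), Δt (time step), and Δx, Δy (grid spacings) are introduced here for illustration only.

```latex
P^{n+1}_{i,j} = 2P^{n}_{i,j} - P^{n-1}_{i,j}
  + v_{i,j}^{2}\,\Delta t^{2}\left[
      \frac{P^{n}_{i+1,j} - 2P^{n}_{i,j} + P^{n}_{i-1,j}}{\Delta x^{2}}
    + \frac{P^{n}_{i,j+1} - 2P^{n}_{i,j} + P^{n}_{i,j-1}}{\Delta y^{2}}
    \right]
```

Here n indexes the time step and (i, j) the grid point, so the left-hand side is the orange point and the terms on the right are the green ones.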
How do we map this computation to a GPU? As shown in this figure, the computation of every point in the grid can be mapped to a single processing element in the GPU, because every point in the grid can be updated independently at every time iteration. This is why stencil computations are good candidates for GPU offload. Please see the code walkthrough for more details about the derivation of the stencil formula.
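The independence of the per-point updates can be illustrated on the host in plain C++. The sketch below is not the sample's code: the names update_point and step, the single coefficient coeff (standing for Δt²/Δx² with equal grid spacing), and the second-order stencil are all assumptions made for illustration. Each update writes only its own entry of next and reads only prev, so visiting the points in forward or reverse order gives the same result, which is what allows one processing element per grid point.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: update one interior grid point.
// On entry next[gid] holds the pressure at step n-1; on exit, at step n+1.
void update_point(float* next, const float* prev, const float* vel,
                  float coeff, std::size_t n_cols,
                  std::size_t row, std::size_t col) {
  const std::size_t gid = row * n_cols + col;
  // Second-order Laplacian built from the four nearest neighbors.
  const float lap = prev[gid + 1] + prev[gid - 1] +
                    prev[gid + n_cols] + prev[gid - n_cols] -
                    4.0f * prev[gid];
  next[gid] = 2.0f * prev[gid] - next[gid] +
              vel[gid] * vel[gid] * coeff * lap;
}

// One time step over all interior points. The loop index plays the role of
// a work-item id; 'reverse' visits the same points in the opposite order.
void step(std::vector<float>& next, const std::vector<float>& prev,
          const std::vector<float>& vel, float coeff,
          std::size_t n_rows, std::size_t n_cols, bool reverse) {
  assert(n_rows >= 3 && n_cols >= 3);
  assert(next.size() == n_rows * n_cols);
  assert(prev.size() == n_rows * n_cols);
  assert(vel.size() == n_rows * n_cols);
  const std::size_t inner_cols = n_cols - 2;
  const std::size_t n_interior = (n_rows - 2) * inner_cols;
  for (std::size_t k = 0; k < n_interior; ++k) {
    const std::size_t id = reverse ? n_interior - 1 - k : k;
    const std::size_t row = 1 + id / inner_cols;  // skip the border rows
    const std::size_t col = 1 + id % inner_cols;  // skip the border columns
    update_point(next.data(), prev.data(), vel.data(), coeff, n_cols, row, col);
  }
}
```

Calling step twice on identical inputs, once with reverse set to false and once with it set to true, should leave the two next arrays equal element by element.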
Now let’s take a look at the sections of the code sample where some important aspects of DPC++ are shown. First, we find a DPC++ queue defined in the main function. Notice that the queue definition takes two arguments, the device selector and the exception handler, which is defined separately in the code. The device selector can be defined in different ways, depending on which hardware is available. Here we are using a default selector, which selects a device at run-time.
Next, we find buffers and accessors defined in the code, which is one way to share information between the CPU and the GPU. Notice that buffers are created on the host to hold data that we want to share between the host and the device. The accessors, which let us access data from the buffers on the device, are defined once we submit work to the queue for execution on the device. You can find detailed information about buffers, accessors and queues in the code walkthrough.
And the third learning objective in this code sample is how to put the computation code in a function and call that function from each kernel invocation. This will let us simplify and add modularity and flexibility to the kernel when the code inside the kernel becomes more complex. Notice that if we have a function that takes pointers as arguments, which is the case here in the iso_2dfd_iteration_global function, we can pass the accessors as pointers when we call the function, using the “get_pointer” member function in the accessors.
This was just a short overview of the code sample. You can find a detailed code walkthrough in the links provided, which you can access from the Explore document. And you can download the complete code sample by clicking the link at the top of the code walkthrough document. We hope this code sample helps you improve your learning of DPC++ and you can use it as a starting point to write other applications in DPC++.