Intel® oneAPI DPC++/C++ Compiler Developer Guide and Reference

ID 767253
Date 11/07/2023
Public

Example for Using async_class Template Class

The following example illustrates how Intel's C++ asynchronous I/O template class can be used. Consider the following code that writes arrays of floats to an external file.

// Data is an array of floats
std::vector<float>  v(10000);

// User defines a new operator << for the std::vector<float> type
std::ofstream& operator << (std::ofstream & str, std::vector<float> & vec)
{
// User output actions
...
}
...
// Output file declaration: an object of the standard ofstream STL class
std::ofstream external_file("output.txt");
...
// Output operations
external_file << v;
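
The body of the user-defined operator is elided in the listing above. As a minimal sketch only, it could look like the following, assuming a plain-text format with one value per line. The format, the helper name `write_vector`, and the const reference on the vector are illustrative assumptions, not part of the original example.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Assumed output format: one float per line (not specified by the original).
std::ofstream& operator<<(std::ofstream& str, const std::vector<float>& vec)
{
    for (float x : vec) {
        // Uses the standard float and char inserters inherited from std::ostream.
        str << x << '\n';
    }
    return str;
}

// Hypothetical helper: writes the vector to `path` synchronously and
// reports whether the stream is still in a good state afterwards.
bool write_vector(const std::string& path, const std::vector<float>& v)
{
    std::ofstream external_file(path);
    external_file << v;   // blocks until all values are handed to the stream
    return external_file.good();
}
```

The call `external_file << v;` does not return until every element has been passed to the stream, which is the cost the asynchronous class is meant to hide.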

The following code illustrates the changes to be made to the above code to execute the output operation asynchronously.

// Add the new header that supports STL asynchronous I/O operations
#include <aiostream.h>
...

std::vector<float>  v(10000);

std::ofstream& operator << (std::ofstream & str, std::vector<float> & vec)
{... }
...
// Declare the output file as an instance of the new async::async_class
// template class.
// A new type inherited from the STL ofstream type is declared
async::async_class<std::ofstream> external_file("output.txt");
...
external_file << v;
...
// Add a stop operation to wait for the completion of all asynchronous
// I/O operations
external_file.wait();
...
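
The pattern in this listing is: start the output, continue with other work, then wait. For readers without access to `aiostream.h`, the same pattern can be sketched with the standard `std::async` facility. This is a stand-in for illustration, not how `async::async_class` is implemented, and the names `write_floats`, `write_floats_async`, and `sum_while_writing` are hypothetical.

```cpp
#include <fstream>
#include <functional>
#include <future>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: writes one value per line, returns the stream state.
bool write_floats(const std::string& path, const std::vector<float>& v)
{
    std::ofstream out(path);
    for (float x : v) out << x << '\n';
    return out.good();
}

// Starts the write on another thread. The caller must keep `v` alive and
// unmodified until the returned future is ready.
std::future<bool> write_floats_async(const std::string& path,
                                     const std::vector<float>& v)
{
    return std::async(std::launch::async, write_floats, path, std::cref(v));
}

// Overlaps the write with an unrelated computation, then waits for it.
double sum_while_writing(const std::string& path, const std::vector<float>& v)
{
    std::future<bool> pending = write_floats_async(path, v);

    double sum = 0.0;
    for (float x : v) sum += x;   // concurrent reads of v are safe

    // Plays the role of external_file.wait() in the listing above.
    if (!pending.get()) throw std::runtime_error("write failed");
    return sum;
}
```

As with the `wait()` call in the Intel example, the result of the write must be collected before the data is modified or destroyed.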

Performance Recommendations

Asynchronous mode is not recommended for small objects. For example, do not use asynchronous mode to output a value of a standard type in a loop where the other loop operations take less time than writing that value to the STL stream.

However, if you can balance the output of small data against the calculation that precedes it inside the loop, you can still get a stable performance improvement.

For example, in the following code, the program reads two matrices from external files, calculates the elements of a third matrix, and prints out the elements inside the loop.

#define ARR_LEN 900
{
  // Read the two input matrices
  std::ifstream fA("A.txt");
  fA >> A;
  std::ifstream fB("B.txt");
  fB >> B;
  std::ofstream fC(f);

  for (int i = 0; i < ARR_LEN; i++)
  {
    for (int j = 0; j < ARR_LEN; j++)
    {
      // Calculate one element of the third matrix
      C[i][j] = 0;
      for (int k = 0; k < ARR_LEN; k++)
        C[i][j] += A[i][k] * B[k][j] * sin((float)(k)) * cos((float)(-k))
                   * sin((float)(k+1)) * cos((float)(-k-1));
      // Print the element inside the loop
      fC << C[i][j] << std::endl;
    }
  }
}
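
One way to check whether a loop like this is balanced is to time the calculation and the output separately. The following is a minimal sketch that assumes nothing beyond the standard library. The helper `elapsed_ms` is hypothetical and is not part of the Intel header.

```cpp
#include <chrono>

// Returns the wall-clock time, in milliseconds, taken by one call to `work`.
template <typename Work>
double elapsed_ms(Work&& work)
{
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Timing the innermost `k` loop and the `fC << C[i][j]` statement separately over many iterations gives two numbers to compare. If the calculation takes less time than the output, the recommendation above suggests that asynchronous mode is unlikely to help.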

With larger matrices, you can also gain performance by reading the data from the two files in parallel.
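
The idea of reading two input files in parallel can be sketched with standard C++ as follows. This uses `std::async` as a stand-in rather than the Intel asynchronous class, and it assumes a hypothetical file format of n*n whitespace-separated floats in row-major order. The names `Matrix`, `read_matrix`, and `read_both` are illustrative.

```cpp
#include <cstddef>
#include <fstream>
#include <future>
#include <string>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<float>>;

// Hypothetical reader: expects n*n whitespace-separated floats, row by row.
Matrix read_matrix(const std::string& path, std::size_t n)
{
    Matrix m(n, std::vector<float>(n, 0.0f));
    std::ifstream in(path);
    for (auto& row : m)
        for (float& x : row)
            in >> x;
    return m;
}

// Reads the first file on a worker thread while the calling thread
// reads the second, then waits for the worker to finish.
std::pair<Matrix, Matrix> read_both(const std::string& pathA,
                                    const std::string& pathB,
                                    std::size_t n)
{
    std::future<Matrix> a = std::async(std::launch::async, read_matrix, pathA, n);
    Matrix b = read_matrix(pathB, n);
    return {a.get(), std::move(b)};
}
```

Whether the two reads actually overlap depends on the storage device and the operating system, so any gain should be measured rather than assumed.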