Overview of oneVPL Examples and Tools

The Intel® oneAPI Video Processing Library (oneVPL) provides a single set of video-focused APIs for encoding, decoding, and video processing that supports current and future Intel integrated and discrete GPUs. The oneVPL APIs provide fast, high-quality video solutions by leveraging the latest hardware features of Intel GPUs. 

Building oneVPL for Intel® GPUs

To build oneVPL for Intel® GPUs, first build the oneVPL Dispatcher from the oneVPL base repository. To use oneVPL for video processing, you also need to install at least one implementation. Here is a list of current implementations.
 

oneVPL Dispatcher

As shown in this diagram, the oneVPL Dispatcher directs the application to either the oneVPL GPU runtime or the Media SDK GPU runtime. More GPU implementations will be supported in the future.


New Features in oneVPL API Functional Scope

The Intel oneVPL API (2.x) has some key new features compared to the Media SDK API (1.35).

Improved session initialization with MFXLoad

MFXLoad replaces MFXInit/MFXInitEx in oneVPL as the entry point function for the oneVPL dispatcher. It enumerates and initializes all available GPU runtimes (libmfxhw64.so or libmfx-gen.so). With this improved initialization process, runtimes can report their capabilities, so the best implementation can be chosen based on the required codec and hardware.


Internal Memory Management

The oneVPL API (>= API 2.0) introduces interface functions for allocating frames internally. The key advantage of internal over external memory management is that the programmer does not have to allocate the required number of working frame surfaces, because oneVPL performs the allocation. The surface returned by these oneVPL functions is a reference-counted object, and it is the programmer’s responsibility to call mfxFrameSurfaceInterface::Release after finishing all operations with the surface.

 
Note: Only newer Intel GPUs (Gen12 or later) support the internal memory management feature. Using internal memory is not required to build a oneVPL application; developers can choose between internal and external memory management based on their requirements. If portability of the application is the priority, external memory management is recommended. If the goal is a more performant application on newer Intel hardware, internal memory management is the most convenient option.
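
The release responsibility described above can be illustrated with a small model. The following is only a sketch in Python, not the oneVPL API: the Surface and SurfacePool classes and their methods are hypothetical stand-ins for a library-owned allocator that hands out reference-counted surfaces. It shows why a surface only becomes reusable once its last reference has been released.

```python
class Surface:
    """Toy stand-in for a reference-counted frame surface (not the oneVPL API)."""

    def __init__(self, pool, index):
        self._pool = pool
        self.index = index
        self.ref_count = 0

    def add_ref(self):
        self.ref_count += 1

    def release(self):
        if self.ref_count == 0:
            raise RuntimeError("release() called more often than add_ref()")
        self.ref_count -= 1
        if self.ref_count == 0:
            # Last reference dropped: the surface goes back to the pool.
            self._pool._free.append(self)


class SurfacePool:
    """Hypothetical pool playing the role of the library-side allocator."""

    def __init__(self, size):
        self._free = [Surface(self, i) for i in range(size)]

    def get_surface(self):
        if not self._free:
            raise RuntimeError("no free surface: was one never released?")
        surface = self._free.pop()
        surface.add_ref()  # the caller now owns one reference
        return surface

    def free_count(self):
        return len(self._free)


pool = SurfacePool(size=2)
a = pool.get_surface()
b = pool.get_surface()
a.release()             # a returns to the pool
c = pool.get_surface()  # the released surface is handed out again
print("free surfaces:", pool.free_count())
```

In this model, forgetting a release() call eventually exhausts the pool, which is the kind of failure the Release requirement is meant to prevent.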

 

Building oneVPL Examples and Tools

Include Files

The oneVPL include files are located in the following folder on your development system.
 

$ONEAPI_ROOT/vpl/latest/include

Setting up the Environment 

Run the following command to set up the necessary environment for oneVPL.

source <oneapi_install_dir>/setvars.sh

Prerequisite Software


Build Program

mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
cmake --build .

Run Program

The following is the common command-line format for oneVPL applications.

vpl_sample -i <InputFile> -o <OutputFile> -<other_parameters>

To see all the available options of a particular oneVPL sample application, run the following command.

vpl_sample -?


 

oneVPL Examples

The oneVPL examples provide code samples that show how to use the oneVPL API. The code samples are included in the examples directory of the oneVPL base repository on GitHub*. Many oneVPL examples are available in this repository; however, only the ‘hello-*’ and ‘legacy-*’ examples are discussed here.

Dataset

Several sample video files are located in the oneVPL/examples/content/ directory.

  • cars_320x240.i420 
    • Video file format: YUV420
    • Resolution: 320x240
  • cars_320x240.nv12
    • Video file format: NV12
    • Resolution: 320x240
  • cars_320x240.h265
    • Video file format: hevc (Main), yuv420p(tv)
    • Resolution: 320x240
  • cars_320x240.mjpeg
    • Video file format: mjpeg (Baseline), yuvj420p
    • Resolution: 320x240
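
The raw files in this list (I420 and NV12) carry no header, so the frame size has to be supplied on the command line, and the frame count follows from the file size. The sketch below is illustrative only: the helper names are hypothetical, and it assumes 8-bit 4:2:0 data written without row padding.

```python
def yuv420_frame_bytes(width, height):
    """Bytes per 8-bit 4:2:0 frame (I420 and NV12 have the same size)."""
    luma = width * height
    # Two chroma planes (I420) or one interleaved plane (NV12),
    # each sample covering a 2x2 block of luma samples.
    chroma = 2 * ((width + 1) // 2) * ((height + 1) // 2)
    return luma + chroma


def frame_count(file_size, width, height):
    """Number of whole frames in a raw file of the given size."""
    per_frame = yuv420_frame_bytes(width, height)
    if file_size % per_frame != 0:
        raise ValueError("file size is not a multiple of the frame size")
    return file_size // per_frame


per_frame = yuv420_frame_bytes(320, 240)
print(per_frame)                         # 115200
print(frame_count(3456000, 320, 240))    # 30
```

Under these assumptions, a 30-frame 320x240 clip occupies 3,456,000 bytes; a file size that is not a multiple of 115,200 bytes suggests a wrong width, height, or pixel format.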

‘hello-*’ Examples

‘hello-*’ examples are the most basic examples in the oneVPL repository. They are located in the hello directory of the oneVPL base repository. Their features and options are very limited, which makes them a good starting point for new oneVPL users. These examples use the oneVPL 2.x API, which only works on newer GPUs (Xe or newer). Those who are interested in more advanced features can switch to the sample_* tools. Here are brief descriptions of the hello-* examples.

hello-createsession

This sample is a command line application that initializes a session. It is a "hello world" case that demonstrates the oneVPL architecture.
Sample Command Line:

hello-createsession -hw


Sample Output:

Session loaded: ApiVersion = 2.7        impl= Hardware:VAAPI

hello-decode

hello-decode is a minimal oneVPL decode application that uses the 2.2 or newer API with internal memory management. This sample decodes an H.265 elementary stream file to an output file (out.raw). The output is written in raw native format (NV12 for GPU).
Sample Command Line:

./hello-decode -hw -i ../../../content/cars_320x240.h265

Sample Output:

Implementation details:
  ApiVersion:           2.7
  Implementation type:  HW
  AccelerationMode via: VAAPI
  Path: /usr/lib/x86_64-linux-gnu/libmfx-gen.so.1.2.7
Decoding ../../../content/cars_320x240.h265 -> out.raw
Output colorspace: NV12
Decoded 30 frames

hello-decvpp

The hello-decvpp example is an extension of hello-decode. Apart from decoding the H.265 elementary stream to a raw format (dec_out.raw), this example also runs video processing (VPP) on the decoded output with oneVPL and writes the two VPP outputs to "vpp_640x480_out.raw" in raw native format and "vpp_128x96_out.raw" in raw BGRA format.

Sample Command Line:

./hello-decvpp -hw -i ../../../content/cars_320x240.h265

Sample Output:

Implementation details:
  ApiVersion:           2.7
  Implementation type:  HW
  AccelerationMode via: VAAPI
  Path: /usr/lib/x86_64-linux-gnu/libmfx-gen.so.1.2.7
Output colorspace: NV12
Decoding and VPP ../../../content/cars_320x240.h265 -> dec_out.raw and vpp_640x480_out.raw, vpp_128x96_out.raw
Decode and VPP processed 30 frames
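
A quick way to sanity-check the three output files is to compare their sizes with the expected values. The sketch below is a rough estimate, not part of the sample: it assumes that the raw native format is NV12 (as reported in the output above), that frames are written without padding, and that all 30 processed frames appear in every output file.

```python
# Bytes per frame for the two 8-bit formats mentioned in this example.
BYTES_PER_FRAME = {
    "NV12": lambda w, h: w * h * 3 // 2,  # luma plane + interleaved 4:2:0 chroma
    "BGRA": lambda w, h: w * h * 4,       # four bytes per pixel
}

# (file name, width, height, format), taken from the example description.
OUTPUTS = [
    ("dec_out.raw", 320, 240, "NV12"),
    ("vpp_640x480_out.raw", 640, 480, "NV12"),
    ("vpp_128x96_out.raw", 128, 96, "BGRA"),
]

FRAMES = 30  # frame count reported by the sample output


def expected_sizes(outputs, frames):
    """Map each output file name to its expected size in bytes."""
    return {
        name: BYTES_PER_FRAME[fmt](width, height) * frames
        for name, width, height, fmt in outputs
    }


for name, size in expected_sizes(OUTPUTS, FRAMES).items():
    print(f"{name}: {size} bytes")
```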


hello-encode

hello-encode is a minimal oneVPL encode application that uses the 2.x API with internal memory management. This example takes a raw format (NV12 for GPU) video elementary stream as input, encodes it, and writes the output to out.h265 in H.265 format.
Sample Command Line:

./hello-encode -hw -i ../../../content/cars_320x240.i420 -w 320 -h 240


Sample Output:

Implementation details:
  ApiVersion:           2.7
  Implementation type:  HW
  AccelerationMode via: VAAPI
  Path: /usr/lib/x86_64-linux-gnu/libmfx-gen.so.1.2.7
Encoding ../../../content/cars_320x240.i420 -> out.h265
Input colorspace: NV12
Encoded 30 frames

hello-transcode

hello-transcode is a minimal oneVPL transcode application that uses oneVPL 2.2 API features, including internal memory. This application takes a file containing a JPEG video elementary stream as an argument, decodes it, encodes the result with oneVPL, and writes the encoded output to the file out.h265 in H.265 format.

Sample Command Line:

./hello-transcode -sw -i ../../../content/cars_320x240.mjpeg


Sample Output:

Implementation details:
  ApiVersion:           2.7
  Implementation type:  SW
  AccelerationMode via: NA
  Path: /root/samples_test/oneVPL/lib/libvplswref64.so.1
Transcoding ../../../content/cars_320x240.mjpeg -> out.h265
Transcoded 30 frames

Note: Only the software (CPU) implementation is available for hello-transcode.


hello-vpp

hello-vpp is a minimal oneVPL VPP application that uses the 2.x API with internal memory management. This sample is a command line application that takes a file containing a raw native format video elementary stream as an argument, processes it, and writes the resized output to out.raw in BGRA raw video format.
Sample Command Line:

./hello-vpp -hw -i ../../../content/cars_320x240.i420 -w 320 -h 240


This command will write to the output file after resizing the frames. 
Sample Output:

Implementation details:
  ApiVersion:           2.7
  Implementation type:  HW
  AccelerationMode via: VAAPI
  Path: /usr/lib/x86_64-linux-gnu/libmfx-gen.so.1.2.7
Processing ../../../content/cars_320x240.i420 -> out.raw
Processed 30 frames

‘legacy-*’ Examples

‘legacy-*’ examples are similar to the ‘hello-*’ examples, but they use the core API subset. They are located in the coreAPI directory of the oneVPL base repository. These examples are universal, which means they work on both older and newer Intel hardware and runtimes. The core API contains only the functions and options that are common to the Media SDK runtime and the oneVPL runtime implementation. The advantages of programming with the core API are:

  • Increased portability of the application from Media SDK to oneVPL
  • Can be compiled with both Media SDK and oneVPL headers
  • Can be used on any hardware supported by either Media SDK or oneVPL

legacy-decode

legacy-decode is a minimal oneVPL decode application using the core API subset. This sample is a command line application that takes a file containing an H.265 video elementary stream as an argument, decodes it with oneVPL, and writes the decoded output to the file out.raw in raw native format (NV12 for GPU). legacy-decode is functionally similar to hello-decode, but it uses the common 1.x APIs.
Sample Command Line:

./legacy-decode -hw -i ../../../content/cars_320x240.h265


Sample Output:

Implementation details:
  ApiVersion:           2.7
  Implementation type:  HW
  AccelerationMode via: VAAPI
  Path: /usr/lib/x86_64-linux-gnu/libmfx-gen.so.1.2.7
Decoding ../../../content/cars_320x240.h265 -> out.raw
Decoded 30 frames

legacy-encode

legacy-encode is a minimal oneVPL encode application that uses the core API subset. This sample is a command line application that takes a file containing a raw native format (NV12 for GPU) video elementary stream as an argument, encodes it with oneVPL, and writes the encoded output to out.h265 in H.265 format. legacy-encode is functionally similar to hello-encode, but it uses the common 1.x APIs.
Sample Command Line:

./legacy-encode -hw -i  ../../../content/cars_320x240.i420 -w 320 -h 240


Sample Output:

Implementation details:
  ApiVersion:           2.7
  Implementation type:  HW
  AccelerationMode via: VAAPI
  Path: /usr/lib/x86_64-linux-gnu/libmfx-gen.so.1.2.7
Encoding ../../../content/cars_320x240.i420 -> out.h265
Encoded 30 frames


legacy-vpp

legacy-vpp is a minimal oneVPL VPP application using the core API subset. This sample is a command line application that takes a file containing a raw native format video elementary stream as an argument, processes it with oneVPL, and writes the resized output to out.raw in BGRA raw video format. legacy-vpp is functionally similar to hello-vpp, but it uses the common 1.x APIs.
Sample Command Line:

./legacy-vpp  -hw  -i  ../../../content/cars_320x240.i420 -w 320 -h 240


Sample Output:

Implementation details:
  ApiVersion:           2.7
  Implementation type:  HW
  AccelerationMode via: VAAPI
  Path: /usr/lib/x86_64-linux-gnu/libmfx-gen.so.1.2.7
Processing ../../../content/cars_320x240.i420 -> out.raw
Processed 30 frames


 

oneVPL Tools

The oneVPL base repository also contains tools that are a good starting point for getting familiar with oneVPL functionality. The command line tools for checking the implementations supported by the current system are in the cli directory, and the ‘sample_*’ tools are in the legacy directory.

Tools for Checking Implementation Capabilities

To check the implementation capabilities of oneVPL, the following approaches can be used:

First Approach:

vpl-inspect|grep Implementation


Sample output:

Implementation #0: mfx-gen
Implementation #1: oneAPI VPL CPU Implementation
  ImplName: oneAPI VPL CPU Implementation


Second Approach (improved visibility in Linux environment):

system_analyzer


Sample output:

Implementation #0: mfx-gen
  Library path: /usr/lib/x86_64-linux-gnu/libmfx-gen.so.1.2.7
  AccelerationMode: MFX_ACCEL_MODE_VIA_VAAPI
  ApiVersion: 2.7
  Impl: MFX_IMPL_TYPE_HARDWARE
  ImplName: mfx-gen
  MediaAdapterType: MFX_MEDIA_INTEGRATED
  VendorID: 0x8086
  DeviceID: 0x9A49
  GPU name: Intel® Iris® Xe Graphics GT2 (arch=Xe codename=Tiger Lake)
  PCI BDF: 0000:00:02.00
  DRMRenderNodeNum: 128
DeviceName: mfx-gen
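
When this information is needed in a script, the text output can be turned into a list of dictionaries. The following is a minimal sketch that assumes the output keeps the "Implementation #N: name" header and "key: value" layout shown above; the parse_implementations helper is hypothetical and not part of oneVPL.

```python
import re

# Shortened copy of the sample output shown above.
SAMPLE = """\
Implementation #0: mfx-gen
  Library path: /usr/lib/x86_64-linux-gnu/libmfx-gen.so.1.2.7
  AccelerationMode: MFX_ACCEL_MODE_VIA_VAAPI
  ApiVersion: 2.7
  Impl: MFX_IMPL_TYPE_HARDWARE
  PCI BDF: 0000:00:02.00
"""


def parse_implementations(text):
    """Group 'key: value' lines under the preceding 'Implementation #N' header."""
    impls = []
    for line in text.splitlines():
        header = re.match(r"Implementation #(\d+): (.+)", line)
        if header:
            impls.append({"index": int(header.group(1)),
                          "name": header.group(2).strip()})
        elif impls and ":" in line:
            # Split on the first colon only, so values may contain colons.
            key, _, value = line.partition(":")
            impls[-1][key.strip()] = value.strip()
    return impls


for impl in parse_implementations(SAMPLE):
    print(impl["index"], impl["name"], impl["ApiVersion"], impl["Impl"])
```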


Here, it is the oneVPL dispatcher’s responsibility to load a GPU runtime (libmfxhw64.so or libmfx-gen.so) depending on the system’s compatibility and the selected codec. There are no plans for libmfxhw64.so.1 to support newer (Gen12 or later) platforms. The Media SDK GPU runtime (libmfxhw64.so) runs on Gen9 and Gen11, while the oneVPL GPU runtime (libmfx-gen.so) supports Gen12, Xe, and newer hardware. The Media SDK runtime supports API 1.35 and the oneVPL runtime supports API 2.x (currently 2.8).

Loading GPU Runtimes
 

oneVPL Dispatcher Details
GPU implementation selection by oneVPL Dispatcher

Figure 1 shows the steps of the GPU implementation selection process. In step 1, the dispatcher creates the loader that loads a oneVPL implementation. In step 2, the dispatcher filters the available implementations based on the selected hardware and codec. In step 3, the oneVPL dispatcher walks through all the available runtime libraries (libmfx-gen, libvplswref, and future libraries) to gather more details and returns a sorted list of oneVPL implementations, each associated with specific hardware. For session creation (step 4), the best implementation for the current system, located at index 0 of the sorted list, is loaded and initialized. After session creation is complete, an essential system handle that the loaded oneVPL library might use is assigned to the session.
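
The filter, sort, and pick-index-0 flow described above can be modelled in a few lines. This is a simplified sketch rather than the dispatcher's actual code: the Impl record, the codec sets, and the sort key (hardware before software, then newer API version) are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Impl:
    """Hypothetical description of one enumerated implementation."""
    name: str
    impl_type: str        # "HW" or "SW"
    api_version: tuple    # (major, minor)
    codecs: frozenset     # illustrative codec list, not real capability data


def select(candidates, impl_type=None, codec=None):
    """Filter candidates by the requested properties, then sort by priority."""
    matches = [
        c for c in candidates
        if (impl_type is None or c.impl_type == impl_type)
        and (codec is None or codec in c.codecs)
    ]
    # Assumed priority: hardware first, then the highest API version.
    return sorted(
        matches,
        key=lambda c: (c.impl_type != "HW", tuple(-v for v in c.api_version)),
    )


candidates = [
    Impl("oneAPI VPL CPU Implementation", "SW", (2, 7),
         frozenset({"h264", "h265", "jpeg"})),
    Impl("mfx-gen", "HW", (2, 7),
         frozenset({"h264", "h265", "mpeg2", "jpeg"})),
]

ranked = select(candidates, codec="h265")
print("session would use:", ranked[0].name)
```

Requesting impl_type="SW" in this model leaves only the CPU implementation in the list, which mirrors how a filter narrows the candidates before the implementation at index 0 is loaded.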

‘sample_*’ Tools

The sample_* tools are the most feature-rich oneVPL tools. The main advantages of using sample_encode over other encoders are:

  • It supports almost all modern codecs.
  • It supports a wide range of features, from bitrate control to enabling different fixed functions, which gives users more control over video decoding, encoding, and processing.

sample_encode

sample_encode is a oneVPL application that performs preprocessing and encoding of an uncompressed video stream of raw native format (NV12 for GPU) according to a specific video compression standard and can write the encoded video stream to a specified file. 
Command Line format:

sample_encode h264|h265|mpeg2|mvc|jpeg -i InputYUVFile -o OutputEncodedFile -w width -h height -angle 180 -opencl


Sample Command Line:

sample_encode h265 -hw -i ../../../content/cars_320x240.nv12 -w 320 -h 240 -o output.h265


Sample Output:

Loaded Library configuration:
    Version: 2.7
    ImplName: mfx-gen
    Adapter number : 0
    Adapter type: integrated
    DRMRenderNodeNum: 128
Used implementation number: 0
Processing started
Frame number: 30
Encoding fps: 994
Processing finished

sample_decode

sample_decode is a oneVPL decode application that takes a file containing an encoded video elementary stream as an argument, decodes it (various video compression formats are supported), and writes the decoded output to a file in raw native format (NV12 for GPU).
Command Line format:

sample_decode h264|h265|mpeg2|mvc|jpeg -i in.bit -o out.yuv


Sample Command Line:

sample_decode h265 -i ../../../content/cars_320x240.h265 -o out.yuv

Sample Output: 

Loaded Library configuration:
    Version: 2.7
    ImplName: mfx-gen
    Adapter number : 0
    Adapter type: integrated
    DRMRenderNodeNum: 128
Used implementation number: 0
Decoding started
Frame number:   30, fps: 742.868, fread_fps: 0.000, fwrite_fps: 0.000358
Decoding finished

sample_vpp

sample_vpp is a oneVPL application that performs video processing of raw video sequences.
Command Line Format:

sample_vpp [Options] -i InputFile -o OutputFile


Sample Command Line:

sample_vpp -sw 320 -sh 240  -dw 320 -dh 240 -i out.raw -o crop.raw 


Sample Output:

Loaded Library configuration:
    Version: 2.7
    ImplName: mfx-gen
    Adapter number : 0
    Adapter type: integrated
    DRMRenderNodeNum: 128
Used implementation number: 0
VPP started
Frame number: 30
VPP finished
Total frames 30
Total time 0.01 sec
Frames per second 2151.000 fps

sample_multi_transcode

sample_multi_transcode performs the transcoding (decoding and encoding) of a video stream from one compressed video format to another, with optional video processing (resizing) of uncompressed video prior to encoding. The application supports multiple input and output streams meaning it can execute multiple transcoding sessions concurrently.
Command Line Format:

Format-1: sample_multi_transcode [options] [--] pipeline-description 
Format-2: sample_multi_transcode [options] -par ParFile

A ParFile extends what can be achieved by describing the pipeline on the command line.


Sample Command Line:

sample_multi_transcode -hw -i::h265 ../../../content/cars_320x240.h265  -o::mpeg2 out.mpeg2
This command line transcodes an H.265 video file to MPEG-2 format and writes it to out.mpeg2.
Sample Output:

Session 0:
Loaded Library configuration:
    Version: 2.7
    ImplName: mfx-gen
    Adapter number : 0
    Adapter type: integrated
    DRMRenderNodeNum: 128
Used implementation number: 0
Input  video: HEVC
Output video: MPG2
Session 0 was NOT joined with other sessions
Transcoding started
Transcoding finished
Common transcoding time is 0.0288 sec
-------------------------------------------------------------------------------
*** session 0 [0x295d730] PASSED (MFX_ERR_NONE) 0.0287292 sec, 30 frames, 1044.235 fps
-hw -i::h265 ../../../content/cars_320x240.h265 -o::mpeg2 out.mpeg2
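
For automated runs, the per-session summary line can be parsed to check the status and throughput. The sketch below assumes the summary keeps the exact layout shown in the sample output above; the parse_summary helper and its regular expression are illustrative and not part of the tool.

```python
import re

# Summary line copied from the sample output above.
SUMMARY = ("*** session 0 [0x295d730] PASSED (MFX_ERR_NONE) "
           "0.0287292 sec, 30 frames, 1044.235 fps")

PATTERN = re.compile(
    r"\*\*\* session (\d+) \[0x[0-9a-fA-F]+\] (\w+) \((\w+)\) "
    r"([\d.]+) sec, (\d+) frames, ([\d.]+) fps"
)


def parse_summary(line):
    """Return the fields of a session summary line, or None if it does not match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    return {
        "session": int(match.group(1)),
        "status": match.group(2),
        "code": match.group(3),
        "seconds": float(match.group(4)),
        "frames": int(match.group(5)),
        "fps": float(match.group(6)),
    }


result = parse_summary(SUMMARY)
print(result["status"], result["code"])
# The reported fps should roughly equal frames / seconds.
print(round(result["frames"] / result["seconds"], 1), result["fps"])
```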