Intel® Graphics Performance Analyzers User Guide

ID 767266
Date 9/28/2023
Public

Graphics API Metrics

This section describes all Graphics API metrics available in Intel® GPA.

Main Metrics

Metric Name

Description

Draw Calls

Represents the number of draw calls issued from your application per frame. When the Draw Calls metric is high, see if you can optimize your application's geometry to improve vertex batching into fewer draw calls.

(Windows* only) Represents the number of draws submitted to the D3D Runtime.
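As a rough illustration of the batching advice above, the sketch below merges meshes that share a material into one vertex list per material, so that each material could be submitted with a single draw call. The names `Mesh`, `material_id`, and `batch_by_material` are hypothetical and are not part of Intel® GPA or any graphics API. The sketch also assumes static geometry that is already expressed in a common coordinate space.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical mesh type for illustration: a material ID plus raw vertex data.
struct Mesh {
    int material_id;
    std::vector<float> vertices;  // packed x, y, z triples
};

// Merge all meshes that share a material into one vertex list per material.
// Each resulting batch can then be submitted with a single draw call.
std::map<int, std::vector<float>> batch_by_material(const std::vector<Mesh>& meshes) {
    std::map<int, std::vector<float>> batches;
    for (const Mesh& mesh : meshes) {
        assert(mesh.vertices.size() % 3 == 0 && "expected x, y, z triples");
        std::vector<float>& batch = batches[mesh.material_id];
        batch.insert(batch.end(), mesh.vertices.begin(), mesh.vertices.end());
    }
    return batches;
}
```

With four meshes spread over two materials, this yields two batches, so the Draw Calls metric for that geometry would be expected to drop from four to two.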

Frames per Second

Represents the instantaneous frame rate normalized to seconds (inverted frame time).

Frame Number

Represents the number of the current CPU frame.

Frame Time

Represents the instantaneous frame time in microseconds.
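The relationship between the Frame Time and Frames per Second metrics can be sketched as follows. The function name `fps_from_frame_time_us` is hypothetical; the only assumption is the unit stated above (microseconds).

```cpp
#include <cassert>

// Convert an instantaneous frame time in microseconds to frames per second.
// One second is 1,000,000 microseconds, so FPS = 1e6 / frame_time_us.
double fps_from_frame_time_us(double frame_time_us) {
    assert(frame_time_us > 0.0 && "frame time must be positive");
    return 1.0e6 / frame_time_us;
}
```

For example, a frame time of 20,000 microseconds corresponds to 50 frames per second.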

DirectX* Metrics

Metric Name

Description

RT Changes

Represents the number of calls per frame:

  • Microsoft DirectX 9: The aggregated number of IDirect3DDevice9::SetRenderTarget calls per frame.
  • Microsoft DirectX 9Ex: The aggregated number of IDirect3DDevice9Ex::SetRenderTarget calls per frame.

Shader Creations

Represents the number of calls per frame:

  • Microsoft DirectX 9: The aggregated number of IDirect3DDevice9::CreateVertexShader/CreatePixelShader calls per frame.
  • Microsoft DirectX 9Ex: The aggregated number of IDirect3DDevice9Ex::CreateVertexShader/CreatePixelShader calls per frame.
  • Microsoft DirectX 10: The aggregated number of ID3D10Device::CreateVertexShader/CreateGeometryShader/CreateGeometryShaderWithStreamOutput/CreatePixelShader calls per frame.
  • Microsoft DirectX 11: The aggregated number of ID3D11Device::CreateVertexShader/CreateGeometryShader/CreateGeometryShaderWithStreamOutput/CreatePixelShader/CreateHullShader/CreateDomainShader calls per frame.

State Block Applies

Represents the number of IDirect3DStateBlock9::Apply calls per frame.

State Block Captures

Represents the number of IDirect3DStateBlock9::Capture calls per frame.

State Changes

Represents the number of state changes sent to the D3D Runtime.

Texture Creations

Represents the number of calls per frame:

  • Microsoft DirectX 9: The aggregated number of IDirect3DDevice9::CreateTexture/CreateCubeTexture/CreateVolumeTexture calls per frame.
  • Microsoft DirectX 9Ex: The aggregated number of IDirect3DDevice9Ex::CreateTexture/CreateCubeTexture/CreateVolumeTexture calls per frame.
  • Microsoft DirectX 10: The aggregated number of ID3D10Device::CreateTexture1D/2D/3D calls per frame.
  • Microsoft DirectX 11: The aggregated number of ID3D11Device::CreateTexture1D/2D/3D calls per frame.

Buffer Creations

Represents the following:

  • Microsoft DirectX 10: The number of ID3D10Device::CreateBuffer calls per frame.
  • Microsoft DirectX 11: The number of ID3D11Device::CreateBuffer calls per frame.

Buffer Maps

Represents the following:

  • Microsoft DirectX 10: The number of ID3D10Buffer::Map calls per frame.
  • Microsoft DirectX 11: The number of ID3D11DeviceContext::Map calls that receive ID3D11Buffer as a resource parameter per frame.

Clears

Represents the number of calls per frame:

  • Microsoft DirectX 9: The aggregated number of IDirect3DDevice9::Clear calls per frame.
  • Microsoft DirectX 9Ex: The aggregated number of IDirect3DDevice9Ex::Clear calls per frame.

Color Fills

Represents the number of calls per frame:

  • Microsoft DirectX 9: The aggregated number of IDirect3DDevice9::ColorFill calls per frame.
  • Microsoft DirectX 9Ex: The aggregated number of IDirect3DDevice9Ex::ColorFill calls per frame.

IB Creations

Represents the number of calls per frame:

  • Microsoft DirectX 9: The aggregated number of IDirect3DDevice9::CreateIndexBuffer calls per frame.
  • Microsoft DirectX 9Ex: The aggregated number of IDirect3DDevice9Ex::CreateIndexBuffer calls per frame.

IB Lock Time

Represents the time, in microseconds, spent in index buffer (IB) locks. Only the time spent inside the lock routines themselves is counted.

IB Locks

Represents the number of index buffer (IB) locks called.

Locks

Represents the number of vertex buffer, index buffer, surface, and volume locks called per frame.

Lock Time

Represents the total time, in microseconds, spent in vertex buffer, index buffer, surface, and volume locks. Only the time spent inside the lock routines themselves is counted.
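To make the distinction between time inside a lock routine and time spent working on the locked data concrete, here is a minimal sketch that accumulates only the former. It is an illustration, not a description of how Intel® GPA collects the metric. `LockStats` and `timed_lock` are hypothetical names, and `lock_fn` stands in for a real lock call that returns a value, such as a result code.

```cpp
#include <chrono>
#include <cstdint>
#include <utility>

// Hypothetical per-frame counters mirroring the Locks and Lock Time metrics.
struct LockStats {
    std::uint64_t lock_count = 0;
    std::chrono::microseconds lock_time{0};
};

// Invoke a lock routine and accumulate only the time spent inside it.
// Work done on the locked data after the routine returns is not measured.
// lock_fn is assumed to return a value (for example a result code).
template <typename LockFn>
auto timed_lock(LockStats& stats, LockFn&& lock_fn) -> decltype(lock_fn()) {
    const auto start = std::chrono::steady_clock::now();
    auto result = std::forward<LockFn>(lock_fn)();
    const auto end = std::chrono::steady_clock::now();
    stats.lock_count += 1;
    stats.lock_time += std::chrono::duration_cast<std::chrono::microseconds>(end - start);
    return result;
}
```

Resetting a `LockStats` instance at the start of each frame would give per-frame values comparable in spirit to the Locks and Lock Time metrics.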

Maps

Represents the number of texture and buffer maps.

Resource Copies

Represents the number of calls per frame:

  • Microsoft DirectX 10: The number of ID3D10Device::CopyResource calls per frame.
  • Microsoft DirectX 11: The number of ID3D11DeviceContext::CopyResource calls per frame.

Resource Creations

Represents the number of calls per frame:

  • Microsoft DirectX 9: The aggregated number of IDirect3DDevice9::CreateXXX calls per frame.
  • Microsoft DirectX 9Ex: The aggregated number of IDirect3DDevice9Ex::CreateXXX calls per frame.
  • Microsoft DirectX 10: The aggregated number of ID3D10Device::CreateXXX calls per frame.
  • Microsoft DirectX 11: The aggregated number of ID3D11Device::CreateXXX calls per frame.

CreateXXX means all Create calls; that is, the aggregation for DX Create Texture per Frame, DX Create Shader per Frame, DX Create Surface per Frame, etc.

RT Clears

Represents the number of calls per frame:

  • Microsoft DirectX 10: The number of ID3D10RenderTargetView::Clear calls per frame.
  • Microsoft DirectX 11: The number of ID3D11DeviceContext::ClearRenderTargetView calls per frame.

RT Data Gets

Represents the number of calls per frame:

  • Microsoft DirectX 9: The aggregated number of IDirect3DDevice9::GetRenderTargetData calls per frame.
  • Microsoft DirectX 9Ex: The aggregated number of IDirect3DDevice9Ex::GetRenderTargetData calls per frame.

Stretch Rects

Represents the number of calls per frame:

  • Microsoft DirectX 9: The aggregated number of IDirect3DDevice9::StretchRect calls per frame.
  • Microsoft DirectX 9Ex: The aggregated number of IDirect3DDevice9Ex::StretchRect calls per frame.

Surface Creations

Represents the number of calls per frame:

  • Microsoft DirectX 9: The aggregated number of IDirect3DDevice9::CreateOffscreenPlainSurface/CreateDepthStencilSurface/CreateRenderTarget calls per frame.
  • Microsoft DirectX 9Ex: The aggregated number of IDirect3DDevice9Ex::CreateOffscreenPlainSurface/CreateDepthStencilSurface/CreateRenderTarget calls per frame.

Surface Lock Time

Represents the total amount of time in microseconds spent in DX surface locks per frame.

Surface Locks

Represents the number of DirectX surface resource locks per frame.

Surface Updates

Represents the number of calls per frame:

  • Microsoft DirectX 9: The aggregated number of IDirect3DDevice9::UpdateSurface calls per frame.
  • Microsoft DirectX 9Ex: The aggregated number of IDirect3DDevice9Ex::UpdateSurface calls per frame.

Subresource Copies

Represents the number of calls per frame:

  • Microsoft DirectX 10: The number of ID3D10Device::CopySubresourceRegion calls per frame.
  • Microsoft DirectX 11: The number of ID3D11DeviceContext::CopySubresourceRegion calls per frame.

Subresource Update

Represents the number of calls per frame:

  • Microsoft DirectX 10: The number of ID3D10Device::UpdateSubresource calls per frame.
  • Microsoft DirectX 11: The number of ID3D11DeviceContext::UpdateSubresource calls per frame.

Texture1D Maps

Represents the number of calls per frame:

  • Microsoft DirectX 10: The number of ID3D10Texture1D::Map calls per frame.
  • Microsoft DirectX 11: The number of ID3D11DeviceContext::Map calls that receive ID3D11Texture1D as a resource parameter per frame.

Texture2D Maps

Represents the number of calls per frame:

  • Microsoft DirectX 10: The number of ID3D10Texture2D::Map calls per frame.
  • Microsoft DirectX 11: The number of ID3D11DeviceContext::Map calls that receive ID3D11Texture2D as a resource parameter per frame.

Texture3D Maps

Represents the number of calls per frame:

  • Microsoft DirectX 10: The number of ID3D10Texture3D::Map calls per frame.
  • Microsoft DirectX 11: The number of ID3D11DeviceContext::Map calls that receive ID3D11Texture3D as a resource parameter per frame.

VB Creations

Represents the number of calls per frame:

  • Microsoft DirectX 9: The aggregated number of IDirect3DDevice9::CreateVertexBuffer calls per frame.
  • Microsoft DirectX 9Ex: The aggregated number of IDirect3DDevice9Ex::CreateVertexBuffer calls per frame.

VB Lock Time

Represents the time, in microseconds, spent in all vertex buffer (VB) lock APIs. Only the time spent inside the lock routines themselves is counted.

VB Locks

Represents the number of vertex buffer (VB) locks called.

Volume Lock Time

Represents the total amount of time in microseconds spent in DX volume locks per frame.

Volume Locks

Represents the number of DirectX volume resource locks per frame.

Z/Stencil Clears

Represents the number of calls per frame:

  • Microsoft DirectX 10: The number of ID3D10DepthStencilView::Clear calls per frame.
  • Microsoft DirectX 11: The number of ID3D11DeviceContext::ClearDepthStencilView calls per frame.

OpenGL* Metrics

Metric Name

Description

Buffer Creations

Represents the number of OpenGL ES buffers that are allocated from your application per frame.

Improving Performance:

When the Buffer Creations metric is high, see if you can change your application to allocate required buffers once at application startup.
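One way to follow this advice is to create a fixed set of buffers at startup and reuse them every frame. The sketch below assumes a hypothetical `BufferPool` class. The `create_buffer` callback stands in for the real allocation (for example, a wrapper around glGenBuffers and glBufferData) so that the example compiles without an OpenGL context.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical pool that creates every buffer once, up front, and then hands
// the same handles out again each frame instead of creating new ones.
class BufferPool {
public:
    // create_buffer stands in for the real allocation and is called only here.
    BufferPool(std::size_t count, const std::function<unsigned()>& create_buffer) {
        handles_.reserve(count);
        for (std::size_t i = 0; i < count; ++i) {
            handles_.push_back(create_buffer());
        }
    }

    // Call at the start of each frame to make all buffers available again.
    void begin_frame() { next_ = 0; }

    // Return the next preallocated handle; no allocation happens here.
    unsigned acquire() {
        assert(next_ < handles_.size() && "pool sized too small for this frame");
        return handles_[next_++];
    }

private:
    std::vector<unsigned> handles_;
    std::size_t next_ = 0;
};
```

With this pattern, all buffer creations happen while the pool is constructed, so the per-frame Buffer Creations count would be expected to stay at zero during rendering.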

Indexed Draw Calls

Represents the number of indexed draw calls being issued from your application per frame.

Improving Performance:

  • When the Indexed Draw Calls metric is high, see if you can optimize your application's geometry to better batch vertices into fewer draw calls, specifically those that use glDrawElements or its modifications.
  • When the Indexed Draw Calls metric is significantly lower than the Draw Calls metric, see if you can optimize your application to use indexed geometry passed to the glDrawElements API or its modifications, rather than glDrawArrays and its modifications. Additionally, check the Vertex Count and Indexed Vertex Count metrics to see how much data is being drawn without indexing to estimate the impact of such optimization.
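Converting non-indexed geometry to indexed geometry amounts to storing each distinct vertex once and referring to it by index. The following is a minimal sketch of that idea. `Vertex`, `IndexedMesh`, and `build_indexed_mesh` are hypothetical names, and the sketch assumes that vertices count as identical only when their coordinates compare exactly equal.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <tuple>
#include <vector>

struct Vertex {
    float x, y, z;
    bool operator<(const Vertex& o) const {
        return std::tie(x, y, z) < std::tie(o.x, o.y, o.z);
    }
};

struct IndexedMesh {
    std::vector<Vertex> vertices;        // unique vertices
    std::vector<std::uint32_t> indices;  // one index per original vertex
};

// Build an indexed mesh from a non-indexed triangle list by storing each
// distinct vertex once and referring to it by index.
IndexedMesh build_indexed_mesh(const std::vector<Vertex>& triangle_list) {
    assert(triangle_list.size() % 3 == 0 && "expected whole triangles");
    IndexedMesh mesh;
    std::map<Vertex, std::uint32_t> seen;
    for (const Vertex& v : triangle_list) {
        auto it = seen.find(v);
        if (it == seen.end()) {
            it = seen.emplace(v, static_cast<std::uint32_t>(mesh.vertices.size())).first;
            mesh.vertices.push_back(v);
        }
        mesh.indices.push_back(it->second);
    }
    return mesh;
}
```

For a quad made of two triangles, the non-indexed form passes six vertices to glDrawArrays, while the indexed form stores four vertices and passes six indices to glDrawElements.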

Vertex Count

Represents the number of vertices passed by your application to OpenGL draw calls per frame. This totals the counts passed to the draw calls.

Improving Performance:

  • When the Vertex Count metric is higher than expected and fragment processing is not the bottleneck, check whether you can improve culling, or whether you can use LOD or other techniques to reduce geometric complexity.
  • When the Vertex Count metric is low, try to increase geometric complexity as an alternative to expensive operations such as Alpha Test that might slow down the pipeline.
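As one possible way to apply LOD, the sketch below picks a mesh detail level from the camera distance. The function name `select_lod` and the idea of a sorted list of switch distances are assumptions made for illustration; a real application might instead select by projected screen size.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Pick a level of detail from the camera distance. lod_distances holds the
// ascending distances at which the next, coarser level takes over, so level 0
// is the most detailed mesh and level lod_distances.size() is the coarsest.
std::size_t select_lod(float distance, const std::vector<float>& lod_distances) {
    assert(distance >= 0.0f && "distance must be non-negative");
    std::size_t level = 0;
    while (level < lod_distances.size() && distance >= lod_distances[level]) {
        ++level;
    }
    return level;
}
```

With switch distances of 10 and 50 units, an object 5 units away uses level 0, an object 49 units away uses level 1, and an object 100 units away uses level 2.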

Indexed Vertex Count

Represents the number of indexed vertices passed by your application to indexed draw calls per frame.

Improving Performance:

When the Indexed Vertex Count metric is significantly lower than the Vertex Count metric, see if you can optimize your application to use indexed geometry passed to the glDrawElements API or its modifications, rather than glDrawArrays and its modifications.

RT Clears

Represents the number of Render Target clears per frame.

UseProgram Calls

Represents the number of calls to use OpenGL ES shader programs from your application per frame.

Improving Performance:

When the UseProgram Calls metric is high, see if you can improve batching to draw all the objects that use a particular program before binding the next one. Combining several shaders into an uber-shader with static flow control can also reduce this metric and its associated cost, but does not always result in a performance improvement.
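The batching suggestion can be sketched as sorting the draw list by program before submission. `DrawItem`, `count_program_binds`, and `sort_by_program` are hypothetical names. The sketch assumes that draw order does not affect the result, which typically holds for opaque geometry but not for blended geometry.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct DrawItem {
    unsigned program;  // shader program the object needs
    int object_id;     // placeholder for the geometry to draw
};

// Count how many times the bound program would change while walking the list
// in order, which corresponds to the number of glUseProgram calls needed.
std::size_t count_program_binds(const std::vector<DrawItem>& items) {
    std::size_t binds = 0;
    bool has_bound = false;
    unsigned bound = 0;
    for (const DrawItem& item : items) {
        if (!has_bound || item.program != bound) {
            bound = item.program;
            has_bound = true;
            ++binds;
        }
    }
    return binds;
}

// Reorder the draw list so objects that share a program are adjacent.
// stable_sort keeps the original relative order within each program.
void sort_by_program(std::vector<DrawItem>& items) {
    std::stable_sort(items.begin(), items.end(),
                     [](const DrawItem& a, const DrawItem& b) { return a.program < b.program; });
}
```

For a list that alternates between two programs across five objects, the unsorted order needs five binds, while the sorted order needs two.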

Error Gets

Represents the number of times that the glGetError function is called from your application per frame.

Improving Performance:

When the Error Gets metric is high, see if you can optimize your application to avoid checking for error conditions during its render path.
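A common way to keep error checks out of the render path is to compile them in only for debug builds. The sketch below shows the idea with a hypothetical `gl_call` helper. The `get_error` callback stands in for glGetError so that the example compiles without an OpenGL context, and the `CheckErrors` template flag is an assumption that a real project would likely tie to a build setting.

```cpp
#include <cstdio>

// Run one GL call and, only when CheckErrors is true, query the error state.
// get_error stands in for glGetError and must return 0 (GL_NO_ERROR) when
// nothing went wrong. With CheckErrors set to false, get_error is never called.
template <bool CheckErrors, typename Call, typename GetError>
void gl_call(Call&& call, GetError&& get_error, const char* label) {
    call();
    if (CheckErrors) {
        const unsigned error = get_error();
        if (error != 0) {
            std::fprintf(stderr, "GL error 0x%x after %s\n", error, label);
        }
    }
}
```

Wrapping calls as `gl_call<false>(...)` in release builds issues no error queries at all, while `gl_call<true>(...)` in debug builds issues one query per wrapped call.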

BindBuffer Calls

Represents the number of calls to bind OpenGL* ES buffers from your application per frame.

Improving Performance:

When the BindBuffer Calls metric is high, see if you can improve batching to draw all instances of the same geometry without switching the bound buffer. Additional performance gains might come from using one buffer for several pieces of geometry and just drawing the appropriate piece where needed.
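Using one buffer for several pieces of geometry comes down to concatenating the vertex data and remembering where each piece starts. The following sketch illustrates this with hypothetical `DrawRange`, `PackedGeometry`, and `pack_meshes` names, assuming non-indexed geometry stored as packed x, y, z triples.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Where one piece of geometry lives inside the shared vertex buffer.
struct DrawRange {
    std::size_t first_vertex;
    std::size_t vertex_count;
};

struct PackedGeometry {
    std::vector<float> vertices;    // all meshes back to back, x, y, z triples
    std::vector<DrawRange> ranges;  // ranges[i] describes meshes[i]
};

// Concatenate several meshes into one vertex array and record each range.
PackedGeometry pack_meshes(const std::vector<std::vector<float>>& meshes) {
    PackedGeometry packed;
    for (const std::vector<float>& mesh : meshes) {
        assert(mesh.size() % 3 == 0 && "expected x, y, z triples");
        packed.ranges.push_back({packed.vertices.size() / 3, mesh.size() / 3});
        packed.vertices.insert(packed.vertices.end(), mesh.begin(), mesh.end());
    }
    return packed;
}
```

After uploading `vertices` to a single buffer and binding it once, each piece can be drawn by passing its `first_vertex` and `vertex_count` as the first and count arguments of glDrawArrays.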

BindTexture Calls

Represents the number of calls to bind OpenGL ES textures from your application per frame.

Improving Performance:

When the BindTexture Calls metric is high, see if you can improve batching to draw all the objects that use a particular texture before binding the next one. If possible, a further step to improve performance would be to use texture atlasing rather than multiple smaller textures.
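With texture atlasing, each object's texture coordinates must be remapped from the original texture's [0, 1] range into the sub-rectangle that the texture occupies in the atlas. A minimal sketch of that remapping follows. `AtlasRegion`, `UV`, and `to_atlas_uv` are hypothetical names, and the sketch ignores padding between regions and repeat wrap modes, both of which a real atlas usually has to handle.

```cpp
#include <cassert>

// Region of the atlas occupied by one sub-texture, in normalized [0, 1] units.
struct AtlasRegion {
    float u0, v0;  // lower-left corner in the atlas
    float u1, v1;  // upper-right corner in the atlas
};

struct UV {
    float u, v;
};

// Remap a coordinate from the original texture's [0, 1] space into the atlas.
UV to_atlas_uv(const AtlasRegion& region, UV local) {
    assert(region.u1 > region.u0 && region.v1 > region.v0);
    return {region.u0 + local.u * (region.u1 - region.u0),
            region.v0 + local.v * (region.v1 - region.v0)};
}
```

For a texture placed in the lower-right quarter of the atlas (u from 0.5 to 1, v from 0 to 0.5), the center of the original texture, (0.5, 0.5), maps to (0.75, 0.25) in the atlas.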

RT Changes

Represents the number of Render Target changes per frame.