Introduction to Intel® Board Support Package and Intel® TCC Mode in UEFI BIOS

Overview

Real-time systems are systems that must meet a specific time requirement. They are common in industrial control and automation, and in other use cases that demand bounded, low-latency processing both within networked devices and between them:

  • Computer numerical control (CNC) machines
  • Programmable logic controllers (PLC)
  • Motion control
  • Robotics

Real-time systems need time-bounded, low-latency responses to support time-sensitive operations such as assembly line coordination, or predictable robotic movements that support safety within a shared work environment. Intel® Board Support Package and Intel® TCC Mode in UEFI BIOS are suitable for real-time applications that require worst-case execution time to be within hundreds of microseconds. This guide provides an overview of the Intel® Board Support Package and Intel® TCC Mode in UEFI BIOS features that you can use in architecting and tuning your systems.

Requirements in Designing a Real-time Application

Real-time applications require the following characteristics from their host systems:

  • Appropriate resource partitioning allows real-time workloads to access compute resources and shared resources when workloads need them within a multiprocessor system. Cache and memory are some examples of shared resources within a multiprocessor system. Resource partitioning helps mitigate the effects of resource contention. Resource contention happens when processor cores designated for real-time workloads and processor cores for best-effort work access shared cache simultaneously, causing conflicts within the system if resources are not properly regulated or partitioned. Such a scenario will delay task completion by the processor for real-time workloads.
  • Task prioritization is the ability to designate real-time workloads as high priority, so the system will attempt to execute these workloads before non-prioritized tasks.
  • Temporal requirements and synchronization dictate that tasks must be completed within a bounded latency, and that data transfers between subsystems must be completed by a certain deadline.

Real-time systems require hardware and software that support the above characteristics. This guide dives into boot firmware, OS, and driver features for real-time systems and explains how to activate them.

Intel® Solutions

Intel®-based systems provide capabilities to access and set your platform to meet the requirements mentioned above, starting from the UEFI BIOS, Yocto Project*-based Intel® Board Support Package with PREEMPT_RT patch kernel, and Intel® TCC Tools.

  • In the boot firmware such as UEFI BIOS, you can enable Intel® TCC Mode with a single BIOS setting that configures a combination of settings in the processor and other subsystems to support real-time task execution.
  • The Yocto Project*-based Intel® Board Support Package helps address operating system latency and provides drivers that support software coordination between integrated and PCIe-connected devices with dedicated time synchronization capabilities such as Precision Time Measurement (PTM), time-aware general-purpose I/O (TGPIO), and time-sensitive networking (TSN) technology.
  • Intel uses the PREEMPT_RT patch with selected kernel parameters to create a real-time operating system that enables high-resolution timers and CPU isolation for real-time workloads.

 

Intel® TCC Mode

A Single Setting to Enable All Real-time Features

Intel® TCC Mode is a single bit setting in the system BIOS that sets a combination of register settings in the hardware. By enabling Intel® TCC Mode in the system BIOS:

  • Processor and Platform Controller Hub (PCH) are enabled and configured with the best known configurations.
  • Developers do not need to understand register details and how they work unless a specific component requires customization.

Configuration Details

Intel® TCC Mode configures the following:

  • Disables power management features to provide consistent compute performance
  • Enables data transfer prioritization settings between subsystems
  • Enables timestamps for I/Os and Ethernet controllers

The following sections describe the settings that Intel® TCC Mode configures:

System Power Management

| Feature | Setting | Description |
| --- | --- | --- |
| C states | Disabled | When disabled, prohibits the core from entering low-power states. |
| Intel SpeedStep® | Disabled | When disabled, turns off Intel SpeedStep® technology. The technology enables the management of processor power consumption via performance state (P-state) transitions. |
| ACPI D3Cold Support | Disabled | When disabled, prohibits some device power management states. |
| Low Power S0 Idle Capability | Disabled | When disabled, prohibits S0ix states. S0ix states shut off parts of the SoC when they are not in use. |
| SA GV | Disabled | When disabled, turns off SA GV. SA GV dynamically scales the work point (V/F) by applying Dynamic Voltage Frequency Scaling (DVFS) based on memory bandwidth utilization and/or the latency requirement of the various workloads for better energy efficiency at the System Agent. |
| Page Close Idle Timeout | Disabled | When disabled, turns off memory power management. |
| Power Down Mode | Disabled | When disabled, turns off memory power management. |
| Intel® Speed Shift Technology | Disabled | When disabled, turns off Intel® Speed Shift Technology, which manages processor power via hardware performance state (P-state) transitions. |
| Intel® Turbo Boost Max Technology 3.0 | Disabled | When disabled, turns off Intel® Turbo Boost Max Technology 3.0, which enables certain CPU cores to reach higher turbo frequencies than others when running single-threaded workloads. |

Graphics Power Management

| Feature | Setting | Description |
| --- | --- | --- |
| RC6 (Render Standby) | Disabled | When disabled, prohibits the GPU from entering low-power state (RC6). |

I/O Power Management

| Feature | Setting | Description |
| --- | --- | --- |
| DMI Link ASPM Control | Disabled | When disabled, turns off Active State Power Management (ASPM) control for the DMI link. |
| IO Fabric Low Latency | Enabled | When enabled, turns off some power management in the PCH I/O fabrics. This option provides the most aggressive I/O fabric performance setting. S3 state is not supported. |
| Legacy I/O Low Latency | Enabled | When enabled, turns off PCH clock gating. |
| Platform Controller Hub (PCH) PCI Express* Configuration | For each root port: ASPM: Disabled; L1 Substates: Disabled | ASPM: When disabled, turns off Active State Power Management (ASPM). ASPM is an autonomous hardware-based, active state mechanism that enables power savings even when the connected components are in the D0 state. After a period of idle link time, an ASPM Physical-Layer protocol places the idle link into a lower power state. L1 Substates: When disabled, turns off PCIe L1 substates. The fundamental idea behind L1 substates is to use something other than the high-speed logic inside the PCIe* transceivers to wake the devices. The goal is to achieve near zero power consumption with an active state. |
| PCI Express Clock Gating | Disabled | When disabled, turns off PCIe clock gating. This feature gates platform reference clocks in certain PCI Express Link Power Management States. |

GT COS refers to the interaction of the Graphics Processing Unit (GPU) with Cache Allocation Technology (CAT), specifically which L3 cache ways the GPU can access. With Intel® TCC Mode, the GPU is limited to a smaller number of L3 cache ways, which allows processor cores to allocate more cache ways for critical workloads.

Data Direct I/O (DDIO) is a capability originally introduced with Intel® Xeon® processors that allows I/O devices to allocate data directly into specified ways of the L3 cache, providing lower latency for the CPU when processing the data.

A variant of this capability is available on select SKUs of 11th Gen Intel® Core™ processors as write cache or WRC.

You can further ensure optimal results by making sure that software consumes data as quickly as possible so that it consumes the data from the cache rather than memory. This setting is not available in Intel Atom® x6000E Series processors.

To prioritize critical reads and writes above low-priority transactions, Intel® platforms provide Virtual Channels (VCs) and Traffic Classes to help facilitate I/O prioritization. There are multiple VCs that differentiate between traffic types by distributing and ordering traffic between VCs, prioritizing VCs through arbitration, and tailoring VC attributes to each VC's specific purpose. To support real-time applications, the fabric in the SoC implements a low-bandwidth, high-priority VC that is generically and sometimes architecturally referred to as VCrt. 

VCs are implemented on a per-hop basis, so each IP and each sub-component within an IP must explicitly enable each VC. Intel® platforms support three types of VCs for real-time applications:

  • PCIe Virtual Channels - limited to the scope of PCIe controller, switches, and devices that make up the local PCIe hierarchy of a platform. 
  • Ethernet Virtual Channels - use the same PCIe VC concept for Ethernet endpoints by providing mechanisms to specify the transactions that should use VCrt.
  • Fabric Virtual Channels - VC implementation in the System-on-Chip (SoC) fabric. 

For more information about each VC implementation, refer to Chapter 7.4 Upstream Virtual Channels in the following guides:

Intel® TCC Mode enables time synchronization features by default with no additional action or platform tuning needed. These features are:

  • Time-Aware GPIO (TGPIO) pins, which you can program to toggle at a specific timestamp or to capture the timestamp when the pin is toggled. Because this is done in hardware, other software does not affect the precision. This feature is useful for time-synchronizing activities between devices by using TGPIO pins as signals, or for synchronizing your processor clock with an external signal.
  • Ethernet timestamps that synchronize the time between system clock (CLOCK_REALTIME) and a Precision Time Protocol Hardware Clock (PHC). This feature allows your application to extend precise time synchronization to other devices on the network beyond the compute node.

For sample applications that demonstrate the use of these features, visit the Intel® TCC Tools product page.

Intel® Board Support Package Configurations

Intel® Board Support Package consists of real-time configurations for the operating system, the Linux PREEMPT_RT patch, and a set of drivers and settings that support key real-time behaviours such as resource partitioning, task prioritization, and operating system optimization to help achieve temporal isolation in real-time systems. These capabilities can be activated through kernel parameters, boot commands, and command-line parameters in the system.

Intel uses Linux with the PREEMPT_RT patch to enable real-time features such as high-resolution timers and CPU isolation for real-time workloads. The following parameters were tested on Intel® platforms to provide the most optimized real-time kernel, as indicated by Cyclic Test and Real-Time Compute Performance (RTCP).

| Parameter | Value | Description |
| --- | --- | --- |
| CONFIG_GENERIC_IRQ_MIGRATION | Y | Supports migrating generic IRQs off a CPU before the CPU goes offline. |
| CONFIG_HIGH_RES_TIMERS | Y | Enables high-resolution timer support. Only applicable for tick functionality. |
| CONFIG_CPU_ISOLATION | Y | Ensures that CPUs running critical tasks are not disturbed by any source of "noise". Use the isolcpus parameter to select CPUs for critical tasks. |
| CONFIG_RCU_NOCB_CPU | Y | Reduces OS jitter for aggressive HPC or real-time workloads. Use the rcu_nocbs boot parameter to specify the CPUs whose RCU callbacks are offloaded. |
| CONFIG_SMP | Y | Enables symmetric multiprocessing support on multiprocessor systems. |
| CONFIG_MIGRATION | Y | Allows the physical location of a process's pages to migrate while the virtual addresses stay unchanged. |
| CONFIG_PCIEPORTBUS | Y | Enables PCIe Port Bus to support Native Hot-Plug, Advanced Error Reporting, power management events, and Downstream Port Containment. |
| CONFIG_PCIE_PTM | Y | Enables PCIe Precision Time Measurement (PTM) support. |
| CONFIG_EXPERT | Y | Configures standard kernel features (expert users). |
| CONFIG_PREEMPT_RT | Y | Turns the kernel into a real-time kernel by replacing various locking primitives (spinlocks, rwlocks, etc.) with preemptible priority-inheritance aware variants, enforcing interrupt threading, and introducing mechanisms to break up long nonpreemptible sections. |
| CONFIG_PREEMPT_RT_FULL | Y | Turns the kernel into a real-time kernel by replacing various locking primitives (spinlocks, rwlocks, etc.) with preemptible priority-inheritance aware variants, enforcing interrupt threading, and introducing mechanisms to break up long nonpreemptible sections. |
| CONFIG_CPU_FREQ | N | Do not include the CPU frequency scaling feature. |
| CONFIG_SCHED_MC_PRIO | N | Do not include the CPU core priorities scheduler support feature. |
| CONFIG_PREEMPT_RCU | Y | Selects the RCU implementation designed for very large SMP systems that have many CPUs and require real-time response. |
| CONFIG_HUGETLBFS | Y | Includes HugeTLB file system support. |
| CONFIG_EFI | Y | Enables the kernel to use available EFI runtime services. |

Kernel parameters references: Kconfig file in linux-intel-lts github  
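
One way to confirm that a kernel was built with the options in the table is to compare them against the kernel's config file. The following is a minimal sketch, assuming the config is available as a plain-text file (commonly /boot/config-$(uname -r) on many distributions; the location on your BSP image may differ). check_kconfig is a hypothetical helper name used only for illustration and is not part of the Intel® Board Support Package.

```shell
#!/bin/sh
# check_kconfig is a hypothetical helper, not part of the BSP.
# Usage: check_kconfig CONFIG_FILE OPTION=VALUE...
# Prints one line per option and returns non-zero if any option mismatches.
check_kconfig() {
    config_file=$1
    shift
    rc=0
    for expected in "$@"; do
        option=${expected%%=*}
        value=${expected#*=}
        if [ "$value" = "n" ]; then
            # A disabled option is absent or written as "# OPTION is not set".
            if grep -q "^${option}=" "$config_file"; then
                echo "MISMATCH ${option} (expected not set)"
                rc=1
            else
                echo "OK       ${option} is not set"
            fi
        elif grep -q "^${option}=${value}\$" "$config_file"; then
            echo "OK       ${option}=${value}"
        else
            echo "MISMATCH ${option} (expected ${value})"
            rc=1
        fi
    done
    return "$rc"
}

# Example call (the path is an assumption about your system):
#   check_kconfig "/boot/config-$(uname -r)" \
#       CONFIG_PREEMPT_RT=y CONFIG_HIGH_RES_TIMERS=y CONFIG_CPU_FREQ=n
```

The helper only reads the file, so it can run as an unprivileged user. Extend the argument list with whichever options from the table matter for your workload.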

Intel recommends using the following kernel command-line parameters and values to boot the operating system for different platforms.

| Parameter | Intel® Xeon® W-11000E Series | 11th Gen Intel® Core™ Processors | Intel Atom® x6000E Series | Notes |
| --- | --- | --- | --- | --- |
| processor.max_cstate | 0 | 0 | 0 | |
| intel_idle.max_cstate | 0 | 0 | 0 | |
| clocksource | tsc | tsc | tsc | |
| tsc | reliable | reliable | reliable | |
| nmi_watchdog | 0 | 0 | 0 | |
| nosoftlockup | No value required | No value required | No value required | |
| intel_pstate | disable | disable | disable | |
| idle | poll | poll | poll | |
| noht | No value required | No value required | No value required | This can cause the timer tick to be enabled even with nohz_full set. |
| nohz | No value required | No value required | No value required | Intel has found that nohz negatively affects the real-time performance of these processors and recommends omitting nohz in the kernel boot parameters. Intel Atom® x6000E Series only: in instances where the workload has frequent kernel interactions or is event driven, this parameter may impact performance compared to a workload running in user space. |
| nohz_full | X-Y | X-Y | 2-3 | X is the first CPU and Y is the last CPU in a consecutive list of CPUs allocated to real-time applications. |
| isolcpus | X-Y | X-Y | 2-3 | X is the first CPU and Y is the last CPU in a consecutive list of CPUs allocated to real-time applications. |
| rcu_nocbs | X-Y | X-Y | 2-3 | X is the first CPU and Y is the last CPU in a consecutive list of CPUs allocated to real-time applications. |
| irqaffinity | X-Y | X-Y | 0 | X is the first CPU and Y is the last CPU in a consecutive list of CPUs allocated to handling interrupts. |
| hugepages | 1024 | 1024 | 1024 | |
| cpufreq.off | 1 | 1 | 1 | Option may not be applicable to all kernel versions and may only benefit performance of some real-time workloads. On some kernel versions, it may be appropriate to handle after boot to ensure frequency is locked at max. |
| i915.enable_rc6 | 0 | 0 | 0 | Intel® Xeon® W-11000E Series: option is only required when the platform includes integrated graphics and may not be available to all kernel versions; may only benefit performance of some real-time workloads. Other platforms: option may not be applicable to all kernel versions and may only benefit performance of some real-time workloads. |
| i915.enable_dc | 0 | 0 | 0 | Option is only required when the platform includes integrated graphics. |
| i915.disable_power_well | 0 | 0 | 0 | Option is only required when the platform includes integrated graphics. |
| mce | off | off | off | |
| hpet | disable | disable | disable | |
| numa_balancing | disable | disable | disable | |
| efi | runtime | runtime | runtime | |
| nowatchdog | No value required | No value required | No value required | Option may only benefit performance of some real-time workloads. Recommended on a case-by-case basis. |
| iommu | pt | pt | pt | Option may only benefit performance of some real-time workloads. Recommended on a case-by-case basis. |
| art | virtallow | virtallow | virtallow | Option may only benefit performance of some real-time workloads. Recommended on a case-by-case basis. |
| rcupdate.rcu_cpu_stall_suppress | 1 | 1 | 1 | Option may only benefit performance of some real-time workloads. Recommended on a case-by-case basis. |
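
To illustrate how the table translates into an actual kernel command line, the sketch below assembles a fragment from a subset of the recommended values. It assumes the usual name=value syntax for kernel parameters; rt_cmdline is a hypothetical helper name, and the CPU ranges passed to it are examples that you would replace with your own core partitioning. The case-by-case options from the table are left out on purpose.

```shell
#!/bin/sh
# rt_cmdline is a hypothetical helper that prints a kernel command-line fragment.
# Usage: rt_cmdline RT_CPUS IRQ_CPUS   (for example: rt_cmdline 2-3 0)
rt_cmdline() {
    rt_cpus=$1     # CPUs reserved for real-time applications
    irq_cpus=$2    # CPUs that handle interrupts
    # Fixed values taken from the table above (a subset, not the full list).
    printf '%s ' \
        processor.max_cstate=0 intel_idle.max_cstate=0 \
        clocksource=tsc tsc=reliable nmi_watchdog=0 nosoftlockup \
        intel_pstate=disable idle=poll hugepages=1024
    # CPU-specific values depend on how you partition the cores.
    printf '%s\n' \
        "isolcpus=${rt_cpus} rcu_nocbs=${rt_cpus} nohz_full=${rt_cpus} irqaffinity=${irq_cpus}"
}

# Print the fragment for real-time CPUs 2-3 with interrupts handled on CPU 0.
rt_cmdline 2-3 0
```

The printed string can be appended to the kernel command line in your bootloader configuration. How that is done depends on the bootloader that your image uses.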

You can use a combination of the kernel command-line parameter isolcpus and the taskset command to section off cores for real-time workloads and leave other cores for best-effort work and system management. isolcpus prevents the Linux scheduler from placing tasks on the designated cores by default, while taskset lets you pin a specific program to a specific core. When you use taskset to pin a program to a core that isolcpus has isolated, the scheduler does not place other tasks on that core unless they are also explicitly pinned there.

Example of using isolcpus to isolate certain CPU cores for real-time workloads during boot up:

isolcpus=nohz,domain,1-3
rcu_nocbs=1-3
nohz_full=1-3

The isolcpus option isolates CPU cores 1-3: the domain flag removes them from the general scheduler's load balancing, so Linux does not schedule tasks on them unless explicitly asked to, and the nohz flag disables the scheduler tick on these cores. The rcu_nocbs option offloads Read-Copy-Update (RCU) callbacks from CPU cores 1-3, and nohz_full stops the timer tick on these cores whenever a core has at most one runnable task.
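
After rebooting, it is worth confirming that the isolation parameters actually reached the kernel. The sketch below reads them back from the kernel command line; cmdline_value is a hypothetical helper name, and its optional second argument exists only so the function can be tried on a sample string instead of /proc/cmdline.

```shell
#!/bin/sh
# cmdline_value is a hypothetical helper that prints the value of a kernel
# command-line parameter, or nothing if the parameter is absent.
# Usage: cmdline_value NAME [CMDLINE]   (CMDLINE defaults to /proc/cmdline)
cmdline_value() {
    name=$1
    cmdline=${2:-$(cat /proc/cmdline)}
    # Split the command line on whitespace and look for NAME=VALUE.
    for word in $cmdline; do
        case $word in
            "$name"=*) printf '%s\n' "${word#*=}" ;;
        esac
    done
}

# Example on a sample string that matches the boot options shown above.
sample='isolcpus=nohz,domain,1-3 rcu_nocbs=1-3 nohz_full=1-3'
for param in isolcpus rcu_nocbs nohz_full; do
    echo "$param -> $(cmdline_value "$param" "$sample")"
done
```

On kernels that expose them, the files /sys/devices/system/cpu/isolated and /sys/devices/system/cpu/nohz_full report the resulting CPU lists directly, which is a useful cross-check against the parsed values.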

Example of using taskset to pin a real-time application to CPU 0 at runtime:

taskset -c 0 ./robotic-arm-tcc

 

Intel provides Linux OS driver support through its Yocto Project–based BSP. These drivers implement support for native Linux APIs to enable software coordination between integrated and PCIe-connected devices with dedicated time synchronization capabilities, such as PCIe devices capable of Precision Time Measurement (PTM), Time-aware General-Purpose I/O (TGPIO), and Ethernet interfaces enabled by Time-Sensitive Networking (TSN). The drivers use virtual channels to prioritize real-time data flows within the platform to the Ethernet interface where available. Read the release notes to learn more about the full capabilities of the BSP:

You can modify the behaviour of the scheduler to prioritize some tasks over others by binding a process or a thread to a core. You can use Linux* tools such as taskset and numactl to bind a process to specific cores.
Example of binding all processes generated by test.sh to the CPUs of NUMA node 0. With this command, the processes generated by test.sh have access to localized memory, resulting in better latency than if the processes ran on separate NUMA nodes.

numactl --cpunodebind=0 test.sh

The chrt command can also be used to modify scheduling attributes, but it may conflict with the isolcpus settings.
Example of using chrt to set the SCHED_FIFO policy with priority 99 on a given process ID:

chrt -f -p 99 <pid>
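
The affinity and scheduling-policy steps can also be combined when starting a program, rather than applied to a running process ID. The following is a minimal sketch: rt_launch and the DRY_RUN variable are hypothetical names introduced here for illustration, and the CPU number and priority in the example are placeholders. Setting a SCHED_FIFO policy typically requires root privileges or the CAP_SYS_NICE capability.

```shell
#!/bin/sh
# rt_launch is a hypothetical wrapper that starts a command pinned to one CPU
# under the SCHED_FIFO policy.
# Usage: rt_launch CPU PRIORITY COMMAND [ARGS...]
# Set DRY_RUN=1 to print the command line instead of executing it.
rt_launch() {
    cpu=$1
    prio=$2
    shift 2
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "taskset -c $cpu chrt -f $prio $*"
    else
        # taskset sets the CPU affinity, then chrt sets the policy and priority
        # before the target program starts.
        taskset -c "$cpu" chrt -f "$prio" "$@"
    fi
}

# Show what would be run for the application from the taskset example above.
DRY_RUN=1 rt_launch 2 80 ./robotic-arm-tcc
```

Choosing a priority below 99 leaves headroom for any kernel threads that run at the highest real-time priority. Whether that matters depends on your workload, so treat the value as a starting point to measure against.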

Intel® TCC Tools

Intel® TCC Tools provide APIs and tools that can optimize the system at a more granular level, such as balancing power consumption with real-time performance or optimizing data streams between memory and other subsystems. The tools also provide measurement and analysis capabilities to identify bottlenecks in your applications.
Intel® TCC Tools consist of the following features:

  1. Data Stream Optimizer (DSO): A command-line tool that automates real-time platform tuning and addresses workload latency between the CPU, memory, and PCIe endpoints. Using this tool reduces the effort of manually testing and tuning the system because it takes in a set of requirements and automatically finds the best system configurations that meet your defined real-time requirements.
  2. Cache Allocation:
    1. Cache Configurator - a command-line tool that manages cache memory resources across various components such as CPU, GPU, and I/O. It provides a list of preset cache partitioning schemes with varying levels of cache isolation and software SRAM to choose from. 
    2. Cache Allocation Library - a collection of APIs to allocate specific buffers for real-time processes. The library is useful when there is a need to bound the worst-case execution time (WCET) of a particular function in the application and it is not achievable with malloc due to memory access jitter. 
  3. Measurement and analysis: Use these tools and APIs to debug and troubleshoot your real-time system.
    1. Real-time Readiness Checker - a diagnostic tool that verifies whether the system is configured with the features that support real-time performance.
    2. Measurement Library - a set of functions that you can add to your application code to measure latency statistics and deadline violations with low runtime overhead.

Accelerate Your Development with Intel Partners

Intel is committed to cultivating an ecosystem dedicated to building and enhancing real-time solutions. By collaborating with different solution providers, Intel provides a list of Intel partners to help accelerate your real-time system development and time to market. Go to Intel® Real-Time Ecosystem Highlights to find out more about how Intel partners benefit from integrating Intel® TCC into their solutions.