Microsoft* SQL Server* Tuning Guide for Online Analytical Processing Workload on 4th Generation Intel® Xeon® Scalable Processors Based Platform

ID 766473
Updated 11/7/2022
Version Latest
Public

author-image

By

Introduction

This guide is for users who are already familiar with Microsoft* SQL Server*. It provides recommendations for configuring hardware and software that will provide the best performance in most situations. However, please note that we rely on the users to carefully consider these settings for their specific scenarios, since Microsoft SQL Server can be deployed in multiple ways.

Microsoft SQL Server is a relational database management system developed by Microsoft. SQL Server is offered in many different editions, but this guide will focus on the SQL Server Enterprise edition. SQL Server’s database architecture is based on a Client-Server architecture model. This tuning guide will focus on Online Analytical Processing (OLAP) workloads (TPC-H like) while using SQL Server for Windows*.

4th Gen Intel® Xeon® Scalable processors deliver workload-optimized performance with built-in acceleration for AI, encryption, HPC, storage, database systems, and networking. They feature unique security technologies to help protect data on-premises or in the cloud.

  • New built-in accelerators for AI, HPC, Networking, Security, Storage and Analytics
  • Intel® Ultra Path Interconnect (Intel® UPI)
  • Intel® Speed Select Technology
  • Hardware-enhanced security
  • New flex bus I/O interface (PCIe 5.0 + CXL)
  • New flexible I/O interface up to 20 HSIO lanes (PCI 3.0)
  • Increased input/output (I/O) bandwidth with PCIe 5.0 (up to 80 lanes)
  • Increased memory bandwidth with DDR5
  • Increased multi-socket bandwidth with UPI 2.0 (up to 16 GT/s)

Server Configuration

Hardware

*The configuration described in this article is based on 4th Generation Intel® Xeon® processor hardware. The server platform, memory, hard drives, and network interface cards can be determined according to your usage requirements.

Hardware Model
Server Platform Intel Archer City Reference Platform
CPU 4th Gen Intel Xeon Scalable processor 8490
BIOS EGSDCRB1.86B.0090.D03
Memory 4096 GB (16x256GB DDR5 4800MT/s)
Storage/Disks 1x 960GB Intel® D3-S4510 SSD
  8x 7.68TB Intel® D7-P5510 SSD
NIC Intel® Ethernet Network Adapter X722-2

Software

Software Version
Operating System Windows Server 2022
Kernel Build 20348.1129
Binary Version SQL Server 2022 RTM (16.0.1000.6)
Workload Kit TPC-H Stepmaster, version 2.7.0-1005

BIOS Setting

Please note, that all BIOS settings outlined below are based on Intel’s software development platforms designed for the 3rd Gen Intel Xeon Scalable processors. Begin by resetting your BIOS to default setting, then follow the suggestion below for changes:

BIOS Configuration Item BIOS Location Recommended Value
energy efficient turbo EDKII Menu->Socket Configuration->Advanced Power Management Configuration->CPU P State Control->Energy Efficient Turbo Disable
turbo mode EDKII Menu->Socket Configuration->Advanced Power Management Configuration->CPU P State Control->Turbo Mode Enable
boot perf mode EDKII Menu->Socket Configuration->Advanced Power Management Configuration->CPU P State Control->Boot performance Mode Max perf*
Hardware P state EDKII Menu->Socket Configuration->Advanced Power Management Configuration->Hardware PM State Control->Hardware P State Disabled
Power perfromance tuning EDKII Menu->Socket Configuration->Advanced Power Management Configuration->CPU – Advanced PM Tuning->Energy Perf BIAS->Power Performance Tuning BIOS control EPB
Energy perfCFGmode EDKII Menu->Socket Configuration->Advanced Power Management Configuration->CPU – Advanced PM Tuning->Energy Perf BIAS->ENERGYPERFBIASCFGmode Performance
Workload configuration EDKII Menu->Socket Configuration->Advanced Power Management Configuration->CPU – Advanced PM Tuning->Energy Perf BIAS->Workload Configuration Leave Default
Package C state EDKII Menu->Socket Configuration->Advanced Power Management Configuration->CPU C State Control->Package C State Control->Packge C State C0\C1 state
Enhanced Halt State (C1E) EDKII Menu->Socket Configuration->Advanced Power Management Configuration->CPU C State Control->Enhanced Halt State (C1E) Disable
CPU C6 Report EDKII Menu->Socket Configuration->Advanced Power Management Configuration->CPU C State Control->CPU C6 Report Disable
C1 Auto Demotion EDKII Menu->Socket Configuration->Advanced Power Management Configuration->CPU C State Control->CPU C1 auto demotion Disable
C1 Auto UnDemotion EDKII Menu->Socket Configuration->Advanced Power Management Configuration->CPU C State Control->CPU C1 auto undemotion Disable
UmaBasedClustering EDKII Menu->Socket Configuration->Common RefCode Configuration->UMA-Based Clustering Disable (All2All)
Sub Numa Cluster EDKII Menu->Socket Configuration->Uncore Configuration->Uncore General Configuration->SNC (Sub NUMA) Enable SNC2 (2-clusters)
(Windows Server 2019 OS) VTD disable Socket Configuration -> Processor Configuration -> Extended APIC Disable
(Windows Server 2019 OS) VTD disable Socket Configuration -> IIO Configuration -> Intel VT for Directed I/O (VT-d) -> Intel VT for Directed I/O Disable

Memory Configuration/Settings

This workload runs best with at ratio of at least 40% memory capacity to scale factor size. For example, a 10TB scale factor TPC-H like run would perform optimally with 4TB+ of available memory. 1DPC is ideal to achieve maximum frequency capable for this platform. Workload is memory frequency sensitive, and as such it is recommended to run with the highest memory frequency available (DDR5 5600 MTS at the time of this writing). For specifics consult your platform’s manual.

Storage/Disk Configuration/Settings

The table below details the type and number of drives used. Each type of drive sets (DB, Temp DB, etc.…) are JBOD, with the initial 8 disks being split into 3 volumes with equal size allocations for each volume. Note: This configuration does not address redundancy concerns that may be requied for your organization.

Drive No Drive Description Volume 1 Volume 2 Volume 3
Drive 1 7.68TB Intel SSD D7-P5510 data01 tempdb01 backup01
Drive 2 7.68TB Intel SSD D7-P5510 data02 tempdb02 backup02
Drive 3 7.68TB Intel SSD D7-P5510 data03 tempdb03 backup03
Drive 4 7.68TB Intel SSD D7-P5510 data04 tempdb04 backup04
Drive 5 7.68TB Intel SSD D7-P5510 data05 tempdb05 backup05
Drive 6 7.68TB Intel SSD D7-P5510 data06 tempdb06 backup06
Drive 7 7.68TB Intel SSD D7-P5510 data07 tempdb07 backup07
Drive 8 7.68TB Intel SSD D7-P5510 data08 tempdb08 backup08
Drive 9 7.68TB Intel SSD D7-P5510 log N/A N/A
Drive 10 960GB Intel D3-S4510 OS N/A N/A

Network Configuration/Setting

This workload runs best with the following settings for all adapters connecting the Server to the Client:

The following PowerShell script is used as an example with ‘Ethernet 2’:

Set-NetAdapterAdvancedProperty -Name "Ethernet 2" -RegistryKeyword '*JumboPacket' –
RegistryValue '9014'
Set-NetAdapterRss
-Name "Ethernet 2" -Enable 1 -BaseProcessorGroup 0 -BaseProcessorNumber 0 -MaxProcessors 16 -NumaNode 65535 -MaxProcessorGroup 0 -MaxProcessorNumber 16 -NumberOfReceiveQueues 8  -Profile Closest

In addition, the following must be set in SQL Server using the SQL Server sp_configure command.

sp_configure network_packet_size,8192 
go
RECONFIGURE WITH OVERRIDE
go

Software Tuning

Software configuration tuning is essential. From the Operating System to SQL Server configuration settings, they are all designed for general purpose applications and default settings are almost never tuned for best performance.

Microsoft Windows Server* 2022 settings

Configuration Item Recommended Value
Windows Firewall Control Panel -> Windows Firewall -> 'Turn Windows Firewall on or off' (left side pane) => select "Turn off Windows Firewall" for both private and public networks
Visual Performance Setting Computer -> properties -> advanced system settings -> advanced -> performance -> settings -> visual: adjust best performance
Virtual Memory Setting Computer -> properties -> advanced tab, virtual memory change: custom size: 4096 initial and maximum (not system managed)
Power Profile Control Panel -> Hardware and Sound -> Power Options -> High Performance Option
Large Page Enable Large Page Enable (Reboot to take effect): Control Panel -> tools -> local security policy -> local policy -> user rights assignment Lock pages in memory properties —add user or group Administrators (or the user to be used to launch SQL server)
Remove Windows Defender Powershell command: Remove-WindowsFeature -Name Windows-Defender-Features

OLAP Architecture

The TPC-H like workload consists of business-oriented ad-hoc queries and concurrent data modification. It examines large volumes of data, executes queries with a high degree of complexity, and it is a widely trusted industry standard benchmark. TPC-H is a phased workload consisting of a power phase and a throughput phase. The power phase has an initial period of high disk IO activity, followed by the throughput phase, consisting of longer periods of compute intenstive activity. Memory capacity and memory bandwidth are key contributers of performance for this workload.

Example of an OLAP benchmarking hardware configuration:

Tuning SQL Server for OLAP Workload

The following sp_configure commands should be used to configure SQL Server:

sp_configure show_advanced_options,1
go
RECONFIGURE WITH OVERRIDE
go
sp_configure max_degree_of_parallelism,<This number should be equal to number of logical processors/threads>
go
RECONFIGURE WITH OVERRIDE
go
sp_configure min server memory,<This number should be equal to 90% of system memory>
go
RECONFIGURE WITH OVERRIDE
go
sp_configure max server memory,<This number should be equal to 90% of system memory>
go
RECONFIGURE WITH OVERRIDE
go
sp_configure network_packet_size,8192
go
RECONFIGURE WITH OVERRIDE
go

The following flags should be used on the command line when starting up SQL Server prior to running the OLAP workload.

Sqlservr.exe -c -x -T834
Related Tools
Microsoft Windows Performance Monitor tool (Perfmon) can be used to monitor overall system and disk performance metrics.

Best Practices for Testing and Verification

This workload is best run just after a fresh restoration of the database from backup and restart of the system. This workload should drive CPU Utilization to 90%+ across all logical processors (in the throughput phase). CPU kernel utilization should be sub 1%. This is a phased workload with a period of intense read dominate sequential I/O followed by a compute intensive phase requiring high amounts of memory capacity and bandwidth. Typically, this workload reaches peak performance with memory size equal to about 40% of the database size. For verification of the database integrity, refer to the TPC-H kit mentioned in resources for details. The TPC-H kit provides verification tools suitable for an audit per the TPC-H benchmark spec.

Conclusion

We have shared our best-known methods to optimally benchmark our 4th Generation Intel Xeon Scalable Processors using an OLAP benchmark. We have covered both software and hardware configuration considerations to get the best performance.

Additional Resources

Microsoft SQL Server

Transaction Processing Council

TPC-H benchmark kit version 3.0.1