The Intel(R) Speed Select Technology (Intel(R) SST) provides a powerful new collection of features that give more granular control over CPU performance. With Intel(R) SST, one server can be configured for power and performance for a variety of diverse workload requirements.
These capabilities are further enhanced in some of the newer generations of server platforms where these features can be enumerated and controlled dynamically without pre-configuring via BIOS setup options. This dynamic configuration is done via mailbox commands to the hardware. One way to enumerate and configure these features is by using the Intel Speed Select utility.
This document explains how to use the Intel Speed Select tool to enumerate and control Intel(R) SST features. This document gives example commands and explains how these commands change the power and performance profile of the system under test. Using this tool as an example, customers can replicate the messaging implemented in the tool in their production software.
intel-speed-select configuration tool¶
Most Linux distributions may include the “intel-speed-select” tool as a package. If not, it can be built by downloading the Linux kernel tree from kernel.org. Once downloaded, the tool can be built without building the full kernel.
From the kernel tree, run the following commands:
# cd tools/power/x86/intel-speed-select/
# make
# make install
Getting Help¶
To get help with the tool, execute the command below:
# intel-speed-select --help
The top-level help describes arguments and features. Notice that there is a multi-level help structure in the tool. For example, to get help for the feature “perf-profile”:
# intel-speed-select perf-profile --help
To get help on a command, another level of help is provided. For example, for the command “info”:
# intel-speed-select perf-profile info --help
Summary of platform capability¶
To check the current platform and driver capabilities, execute:
# intel-speed-select --info
For example on a test system:
# intel-speed-select --info
Intel(R) Speed Select Technology
Executing on CPU model: X
Platform: API version : 1
Platform: Driver version : 1
Platform: mbox supported : 1
Platform: mmio supported : 1
Intel(R) SST-PP (feature perf-profile) is supported
TDP level change control is unlocked, max level: 4
Intel(R) SST-TF (feature turbo-freq) is supported
Intel(R) SST-BF (feature base-freq) is not supported
Intel(R) SST-CP (feature core-power) is supported
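When this check is scripted, the names of the supported features can be pulled out of the report. The following is a minimal sketch, assuming the report keeps the "(feature NAME) is supported" wording shown in the example output; the function name sst_supported_features is hypothetical and not part of the tool.

```shell
# Print the feature name from every "(feature NAME) is supported" line
# read on standard input. Lines reporting "is not supported" are skipped.
# Assumption: the report wording matches the example output in this document.
sst_supported_features() {
    sed -n 's/.*(feature \([a-z-]*\)) is supported.*/\1/p'
}
```

A possible use is "intel-speed-select --info 2>&1 | sst_supported_features". The 2>&1 redirection is only a precaution in case the tool writes part of the report to standard error.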
Intel(R) Speed Select Technology - Performance Profile (Intel(R) SST-PP)¶
This feature allows configuration of a server dynamically based on workload performance requirements. This helps users during deployment as they do not have to choose a specific server configuration statically. This Intel(R) Speed Select Technology - Performance Profile (Intel(R) SST-PP) feature introduces a mechanism that allows multiple optimized performance profiles per system. Each profile defines a set of CPUs that need to be online, with the rest offline, to sustain a guaranteed base frequency. Once the user issues a command to use a specific performance profile and meets the CPU online/offline requirement, the user can expect the base frequency to change dynamically. This feature is called “perf-profile” when using the Intel Speed Select tool.
Number of performance levels¶
There can be multiple performance profiles on a system. To get the number of profiles, execute the command below:
# intel-speed-select perf-profile get-config-levels
Intel(R) Speed Select Technology
Executing on CPU model: X
package-0
  die-0
    cpu-0
      get-config-levels:4
package-1
  die-0
    cpu-14
      get-config-levels:4
On this system under test, there are 4 performance profiles in addition to the base performance profile (which is performance level 0).
Lock/Unlock status¶
Even if there are multiple performance profiles, it is possible that they are locked. If they are locked, users cannot issue a command to change the performance state. There may be a BIOS setup option to unlock them; otherwise, check with your system vendor.
To check if the system is locked, execute the following command:
# intel-speed-select perf-profile get-lock-status
Intel(R) Speed Select Technology
Executing on CPU model: X
package-0
  die-0
    cpu-0
      get-lock-status:0
package-1
  die-0
    cpu-14
      get-lock-status:0
In this case, lock status is 0, which means that the system is unlocked.
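A script that intends to change the performance level may want to test this status first. The following is a minimal sketch, assuming the "get-lock-status:N" output format shown above with one such line per package; the function name sst_all_unlocked is hypothetical.

```shell
# Succeed (exit status 0) only if at least one "get-lock-status" line was
# read from standard input and every one of them reports 0 (unlocked).
# Assumption: the output format matches the example in this document.
sst_all_unlocked() {
    awk -F: '
        /get-lock-status:/ { seen = 1; if ($2 + 0 != 0) locked = 1 }
        END { if (seen && !locked) exit 0; exit 1 }
    '
}
```

For example, "intel-speed-select perf-profile get-lock-status 2>&1 | sst_all_unlocked && echo unlocked" would print "unlocked" only when no package reports a lock.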
Properties of a performance level¶
To get the properties of a specific performance level (level 0 in the example below), execute:
# intel-speed-select perf-profile info -l 0
Intel(R) Speed Select Technology
Executing on CPU model: X
package-0
  die-0
    cpu-0
      perf-profile-level-0
        cpu-count:28
        enable-cpu-mask:000003ff,f0003fff
        enable-cpu-list:0,1,2,3,4,5,6,7,8,9,10,11,12,13,28,29,30,31,32,33,34,35,36,37,38,39,40,41
        thermal-design-power-ratio:26
        base-frequency(MHz):2600
        speed-select-turbo-freq:disabled
        speed-select-base-freq:disabled
        ...
        ...
Here the -l option is used to specify a performance level.
If the -l option is omitted, then this command will print information about all the performance levels. The above command prints the properties of performance level 0.
For this performance profile, the CPUs displayed by “enable-cpu-mask/enable-cpu-list” are the maximum set of CPUs that can be online. When that condition is met, the base frequency of 2600 MHz can be maintained. To understand more, execute “intel-speed-select perf-profile info” for performance level 4:
# intel-speed-select perf-profile info -l 4
Intel(R) Speed Select Technology
Executing on CPU model: X
package-0
  die-0
    cpu-0
      perf-profile-level-4
        cpu-count:28
        enable-cpu-mask:000000fa,f0000faf
        enable-cpu-list:0,1,2,3,5,7,8,9,10,11,28,29,30,31,33,35,36,37,38,39
        thermal-design-power-ratio:28
        base-frequency(MHz):2800
        speed-select-turbo-freq:disabled
        speed-select-base-freq:unsupported
        ...
        ...
There are fewer CPUs in the “enable-cpu-mask/enable-cpu-list”. Consequently, if the user only keeps these CPUs online and the rest “offline,” then the base frequency is increased to 2.8 GHz compared to 2.6 GHz at performance level 0.
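The mask and list fields carry the same information. The following is a minimal sketch of decoding a mask into a CPU list, assuming the mask is a comma-separated series of 32-bit hexadecimal words with the most significant word first (which is consistent with the two outputs above); the function name mask_to_cpu_list is hypothetical.

```shell
# Convert a CPU mask such as "000000fa,f0000faf" into a comma-separated
# CPU list. Words are consumed from the right, so the last word covers
# CPUs 0-31, the one before it CPUs 32-63, and so on.
# Note: variables are global because POSIX sh has no "local".
mask_to_cpu_list() {
    mask=$1
    base=0
    list=""
    while [ -n "$mask" ]; do
        word=${mask##*,}                 # least significant remaining word
        case $mask in
            *,*) mask=${mask%,*} ;;      # drop the word just taken
            *)   mask="" ;;
        esac
        value=$((0x$word))
        bit=0
        while [ "$bit" -lt 32 ]; do
            if [ $(( (value >> bit) & 1 )) -eq 1 ]; then
                list="${list:+$list,}$((base + bit))"
            fi
            bit=$((bit + 1))
        done
        base=$((base + 32))
    done
    echo "$list"
}
```

Calling "mask_to_cpu_list 000000fa,f0000faf" reproduces the enable-cpu-list reported for performance level 4.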
Get current performance level¶
To get the current performance level, execute:
# intel-speed-select perf-profile get-config-current-level
Intel(R) Speed Select Technology
Executing on CPU model: X
package-0
  die-0
    cpu-0
      get-config-current_level:0
First verify that the base_frequency displayed by the cpufreq sysfs is correct:
# cat /sys/devices/system/cpu/cpu0/cpufreq/base_frequency
2600000
This matches the base-frequency (MHz) field value displayed from the “perf-profile info” command for performance level 0 (cpufreq frequency is in kHz).
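Because the two values use different units, a scripted comparison has to convert one of them. The following is a minimal sketch; the function names khz_to_mhz and base_freq_matches are hypothetical.

```shell
# Convert a cpufreq value in kHz to MHz (integer division).
khz_to_mhz() {
    echo $(( $1 / 1000 ))
}

# Succeed if a cpufreq base_frequency value (kHz, first argument) equals
# a perf-profile base-frequency value (MHz, second argument).
base_freq_matches() {
    [ "$(khz_to_mhz "$1")" -eq "$2" ]
}
```

For example, base_freq_matches "$(cat /sys/devices/system/cpu/cpu0/cpufreq/base_frequency)" 2600 succeeds on the system under test at performance level 0.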
To check if the average frequency is equal to the base frequency for a 100% busy workload, disable turbo:
# echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo
Then run a busy workload on all CPUs, for example:
# stress -c 64
To verify the base frequency, run turbostat:
# turbostat -c 0-13 --show Package,Core,CPU,Bzy_MHz -i 1
Package Core    CPU     Bzy_MHz
        -       -       2600
0       0       0       2600
0       1       1       2600
0       2       2       2600
0       3       3       2600
0       4       4       2600
.       .       .       .
Changing performance level¶
To change the performance level to 4, execute:
# intel-speed-select -d perf-profile set-config-level -l 4 -o
Intel(R) Speed Select Technology
Executing on CPU model: X
package-0
  die-0
    cpu-0
      perf-profile
        set_tdp_level:success
In the command above, “-o” is optional. If it is specified, then it will also offline CPUs which are not present in the enable_cpu_mask for this performance level.
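The effect of “-o” can be pictured as a set difference: CPUs of the package that are missing from the target level's enable-cpu-list go offline. The following is a minimal sketch of that computation only, not of the tool's implementation; the function name cpus_to_offline and its two-list interface are assumptions made for illustration.

```shell
# Print, space separated, the CPUs that appear in the second list but
# not in the first. Both arguments are comma-separated CPU lists.
#   $1: enable-cpu-list of the target performance level
#   $2: CPUs currently considered (for example the level 0 list)
cpus_to_offline() {
    enabled=",$1,"
    out=""
    for cpu in $(echo "$2" | tr ',' ' '); do
        case $enabled in
            *",$cpu,"*) ;;                       # stays online
            *) out="${out:+$out }$cpu" ;;        # not enabled at this level
        esac
    done
    echo "$out"
}
```

Without “-o”, each CPU printed this way could be taken offline through the standard Linux CPU hotplug file /sys/devices/system/cpu/cpuN/online by writing 0 to it.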
Now if the base_frequency is checked:
# cat /sys/devices/system/cpu/cpu0/cpufreq/base_frequency
2800000
This shows that the base frequency has now increased from 2600 MHz at performance level 0 to 2800 MHz at performance level 4. As a result, any workload that can use fewer CPUs can see a boost of 200 MHz compared to performance level 0.
Changing performance level via BMC Interface¶
It is possible to change the SST-PP level using an out-of-band (OOB) agent (via some remote management console, through the BMC “Baseboard Management Controller” interface). This mode is supported from the Sapphire Rapids processor generation. The kernel and tool changes to support this mode were added in Linux kernel version 5.18. To enable this feature, the kernel config “CONFIG_INTEL_HFI_THERMAL” is required. The minimum version of the tool that supports this feature is “v1.12”, which is part of Linux kernel version 5.18.
To support such a configuration, this tool can be used as a daemon. Add the command line option --oob:
# intel-speed-select --oob
Intel(R) Speed Select Technology
Executing on CPU model:143[0x8f]
OOB mode is enabled and will run as daemon
In this mode the tool will online/offline CPUs based on the new performance level.
Check presence of other Intel(R) SST features¶
Each of the performance profiles also specifies whether there is support for the other two Intel(R) SST features (Intel(R) Speed Select Technology - Base Frequency (Intel(R) SST-BF) and Intel(R) Speed Select Technology - Turbo Frequency (Intel(R) SST-TF)).
For example, from the output of “perf-profile info” above, for level 0 and level 4:
- For level 0:
  speed-select-turbo-freq:disabled
  speed-select-base-freq:disabled
- For level 4:
  speed-select-turbo-freq:disabled
  speed-select-base-freq:unsupported
Given these results, the “speed-select-base-freq” (Intel(R) SST-BF) in level 4 changed from “disabled” to “unsupported” compared to performance level 0.
This means that at performance level 4, the “speed-select-base-freq” feature is not supported. However, at performance level 0, this feature is “supported”, but currently “disabled”, meaning the user has not activated this feature. In contrast, “speed-select-turbo-freq” (Intel(R) SST-TF) is supported at both performance levels, but is currently not activated by the user.
The Intel(R) SST-BF and the Intel(R) SST-TF features are built on a foundation technology called Intel(R) Speed Select Technology - Core Power (Intel(R) SST-CP). The platform firmware enables this feature when Intel(R) SST-BF or Intel(R) SST-TF is supported on a platform.
Intel(R) Speed Select Technology Core Power (Intel(R) SST-CP)¶
Intel(R) Speed Select Technology Core Power (Intel(R) SST-CP) is an interface that allows users to define per core priority. This defines a mechanism to distribute power among cores when there is a power constrained scenario. This defines a class of service (CLOS) configuration.
The user can configure up to 4 class of service configurations. Each CLOS group configuration allows definitions of parameters, which affect how the frequency can be limited and how power is distributed. Each CPU core can be tied to a class of service and hence an associated priority. The granularity is at the core level, not at the per CPU level.
Enable CLOS based prioritization¶
To use the CLOS based prioritization feature, firmware must be informed to enable and use a priority type. There is a default per platform priority type, which can be changed with an optional command line parameter.
To enable and check the options, execute:
# intel-speed-select core-power enable --help
Intel(R) Speed Select Technology
Executing on CPU model: X
Enable core-power for a package/die
        Clos Enable: Specify priority type with [--priority|-p]
                 0: Proportional, 1: Ordered
There are two priority types:
Ordered
Priority for ordered throttling is defined based on the index of the assigned CLOS group, where CLOS0 gets the highest priority (throttled last).
Priority order is: CLOS0 > CLOS1 > CLOS2 > CLOS3.
Proportional
When proportional priority is used, there is an additional parameter called frequency_weight, which can be specified per CLOS group. The goal of proportional priority is to provide each core with the requested min., then distribute all remaining (excess/deficit) budgets in proportion to a defined weight. This proportional priority can be configured using the “core-power config” command.
To enable with the platform default priority type, execute:
# intel-speed-select core-power enable
Intel(R) Speed Select Technology
Executing on CPU model: X
package-0
  die-0
    cpu-0
      core-power
        enable:success
package-1
  die-0
    cpu-6
      core-power
        enable:success
The scope of this enable is per package, or per die when a package contains multiple dies. To check if CLOS is enabled and get the priority type, the “core-power info” command can be used. For example, to check the status of the core-power feature on CPU 0, execute:
# intel-speed-select -c 0 core-power info
Intel(R) Speed Select Technology
Executing on CPU model: X
package-0
  die-0
    cpu-0
      core-power
        support-status:supported
        enable-status:enabled
        clos-enable-status:enabled
        priority-type:proportional
package-1
  die-0
    cpu-24
      core-power
        support-status:supported
        enable-status:enabled
        clos-enable-status:enabled
        priority-type:proportional
Configuring CLOS groups¶
Each CLOS group has its own attributes including min, max, freq_weight and desired. These parameters can be configured with the “core-power config” command. Defaults will be used if the user skips setting a parameter, except for the clos id, which is mandatory. To check core-power config options, execute:
# intel-speed-select core-power config --help
Intel(R) Speed Select Technology
Executing on CPU model: X
Set core-power configuration for one of the four clos ids
        Specify targeted clos id with [--clos|-c]
        Specify clos Proportional Priority [--weight|-w]
        Specify clos min in MHz with [--min|-n]
        Specify clos max in MHz with [--max|-m]
For example:
# intel-speed-select core-power config -c 0
Intel(R) Speed Select Technology
Executing on CPU model: X
clos epp is not specified, default: 0
clos frequency weight is not specified, default: 0
clos min is not specified, default: 0 MHz
clos max is not specified, default: 25500 MHz
clos desired is not specified, default: 0
package-0
  die-0
    cpu-0
      core-power
        config:success
package-1
  die-0
    cpu-6
      core-power
        config:success
The user has the option to change defaults. For example, the user can change the “min” and set it to the base frequency so that the guaranteed base frequency is always available.
Get the current CLOS configuration¶
To check the current configuration, “core-power get-config” can be used. For example, to get the configuration of CLOS 0:
# intel-speed-select core-power get-config -c 0
Intel(R) Speed Select Technology
Executing on CPU model: X
package-0
  die-0
    cpu-0
      core-power
        clos:0
        epp:0
        clos-proportional-priority:0
        clos-min:0 MHz
        clos-max:Max Turbo frequency
        clos-desired:0 MHz
package-1
  die-0
    cpu-24
      core-power
        clos:0
        epp:0
        clos-proportional-priority:0
        clos-min:0 MHz
        clos-max:Max Turbo frequency
        clos-desired:0 MHz
Associating a CPU with a CLOS group¶
To associate a CPU with a CLOS group, the “core-power assoc” command can be used:
# intel-speed-select core-power assoc --help
Intel(R) Speed Select Technology
Executing on CPU model: X
Associate a clos id to a CPU
        Specify targeted clos id with [--clos|-c]
For example, to associate CPU 10 with CLOS group 3, execute:
# intel-speed-select -c 10 core-power assoc -c 3
Intel(R) Speed Select Technology
Executing on CPU model: X
package-0
  die-0
    cpu-10
      core-power
        assoc:success
Once a CPU is associated, its sibling CPUs are also associated with the same CLOS group. Once associated, avoid changing Linux “cpufreq” subsystem scaling frequency limits.
To check the existing association for a CPU, the “core-power get-assoc” command can be used. For example, to get the association of CPU 10, execute:
# intel-speed-select -c 10 core-power get-assoc
Intel(R) Speed Select Technology
Executing on CPU model: X
package-1
  die-0
    cpu-10
      get-assoc
        clos:3
This shows that CPU 10 is part of CLOS group 3.
Disable CLOS based prioritization¶
To disable, execute:
# intel-speed-select core-power disable
Some features like Intel(R) SST-TF can only be enabled when CLOS based prioritization is enabled. For this reason, disabling while Intel(R) SST-TF is enabled can cause Intel(R) SST-TF to fail. The “disable” command therefore displays an error if Intel(R) SST-TF is already enabled. To disable CLOS based prioritization, the Intel(R) SST-TF feature must be disabled first.
Intel(R) Speed Select Technology - Base Frequency (Intel(R) SST-BF)¶
The Intel(R) Speed Select Technology - Base Frequency (Intel(R) SST-BF) feature lets the user control base frequency. If some critical workload threads demand constant high guaranteed performance, then this feature can be used to execute the thread at a higher base frequency on specific sets of CPUs (high priority CPUs) at the cost of a lower base frequency on the other CPUs (low priority CPUs). This feature does not require offlining the low priority CPUs.
The support of Intel(R) SST-BF depends on the Intel(R) Speed Select Technology - Performance Profile (Intel(R) SST-PP) performance level configuration. It is possible that only certain performance levels support Intel(R) SST-BF. It is also possible that only the base performance level (level = 0) supports Intel(R) SST-BF. Consequently, first select the desired performance level to enable this feature.
In the system under test here, Intel(R) SST-BF is supported at the base performance level 0, but currently disabled. For example, for level 0:
# intel-speed-select -c 0 perf-profile info -l 0
Intel(R) Speed Select Technology
Executing on CPU model: X
package-0
  die-0
    cpu-0
      perf-profile-level-0
        ...
        speed-select-base-freq:disabled
        ...
Before enabling Intel(R) SST-BF and measuring its impact on workload performance, execute some workload and measure its performance to get a baseline to compare against.
Here the user wants more guaranteed performance. For this reason, it is likely that turbo is disabled. To disable turbo, execute:
# echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo
Based on the output of “intel-speed-select perf-profile info -l 0”, the guaranteed base frequency is 2600 MHz.
Measure baseline performance for comparison¶
To compare, pick a multi-threaded workload where each thread can be scheduled on separate CPUs. The “Hackbench pipe” test is a good example of how to improve performance using Intel(R) SST-BF.
Below, the workload is measuring average scheduler wakeup latency, so a lower number means better performance:
# taskset -c 3,4 perf bench -r 100 sched pipe
# Running 'sched/pipe' benchmark:
# Executed 1000000 pipe operations between two processes
     Total time: 6.102 [sec]
       6.102445 usecs/op
         163868 ops/sec
While running the above test, the turbostat output shows that 2 of the CPUs are busy and reaching max. frequency (which would be the base frequency as the turbo is disabled). The turbostat output:
# turbostat -c 0-13 --show Package,Core,CPU,Bzy_MHz -i 1
Package Core    CPU     Bzy_MHz
0       0       0       1000
0       1       1       1005
0       2       2       1000
0       3       3       2600
0       4       4       2600
0       5       5       1000
0       6       6       1000
0       7       7       1005
0       8       8       1005
0       9       9       1000
0       10      10      1000
0       11      11      995
0       12      12      1000
0       13      13      1000
From the above turbostat output, both CPU 3 and 4 are very busy and reaching the full guaranteed frequency of 2600 MHz.
Intel(R) SST-BF Capabilities¶
To get the capabilities of Intel(R) SST-BF for the current performance level 0, execute:
# intel-speed-select base-freq info -l 0
Intel(R) Speed Select Technology
Executing on CPU model: X
package-0
  die-0
    cpu-0
      speed-select-base-freq
        high-priority-base-frequency(MHz):3000
        high-priority-cpu-mask:00000216,00002160
        high-priority-cpu-list:5,6,8,13,33,34,36,41
        low-priority-base-frequency(MHz):2400
        tjunction-temperature(C):125
        thermal-design-power(W):205
The above capabilities show that there are some CPUs on this system that can offer a base frequency of 3000 MHz compared to the standard base frequency at this performance level. Nevertheless, these CPUs are fixed, and they are presented via high-priority-cpu-list/high-priority-cpu-mask. If the Intel(R) SST-BF feature is selected, the low priority CPUs (which are not in high-priority-cpu-list) can only offer up to 2400 MHz. As a result, if this clipping of low priority CPUs is acceptable, then the user can enable the Intel(R) SST-BF feature, particularly for the above “sched pipe” workload: since only two CPUs are used, they can be scheduled on high priority CPUs and can get a boost of 400 MHz.
Enable Intel(R) SST-BF¶
To enable Intel(R) SST-BF feature, execute:
# intel-speed-select base-freq enable -a
Intel(R) Speed Select Technology
Executing on CPU model: X
package-0
  die-0
    cpu-0
      base-freq
        enable:success
package-1
  die-0
    cpu-14
      base-freq
        enable:success
In this case, the -a option is optional. It not only enables Intel(R) SST-BF, but also adjusts the priority of cores using Intel(R) Speed Select Technology Core Power (Intel(R) SST-CP) features. This option sets the minimum performance of each Intel(R) Speed Select Technology - Performance Profile (Intel(R) SST-PP) class to maximum performance so that the hardware will give the maximum performance possible for each CPU.
If the -a option is not used, then the following steps are required before enabling Intel(R) SST-BF:
- Discover Intel(R) SST-BF and note low and high priority base frequency
- Note the high priority CPU list
- Enable CLOS using core-power feature set
- Configure CLOS parameters. Use CLOS.min to set to minimum performance
- Subscribe desired CPUs to CLOS groups
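The steps above can be sketched as a dry run that only prints the commands involved, using the subcommands and options shown earlier in this document. This is an illustration under stated assumptions, not the sequence the tool itself performs with -a: the choice of CLOS 0 for high priority CPUs and CLOS 3 for low priority CPUs, the use of the noted base frequencies as CLOS min values, and the function name sst_bf_manual_steps are all hypothetical.

```shell
# Print (do not execute) a possible manual command sequence.
#   $1: high priority CPU list noted from "base-freq info"
#   $2: remaining (low priority) CPU list
#   $3: high priority base frequency in MHz
#   $4: low priority base frequency in MHz
# Assumption: CLOS 0 holds high priority CPUs, CLOS 3 the others.
sst_bf_manual_steps() {
    echo "intel-speed-select core-power enable"
    echo "intel-speed-select core-power config -c 0 --min $3"
    echo "intel-speed-select core-power config -c 3 --min $4"
    echo "intel-speed-select -c $1 core-power assoc -c 0"
    echo "intel-speed-select -c $2 core-power assoc -c 3"
    echo "intel-speed-select base-freq enable"
}
```

For the system under test, the arguments would come from the “base-freq info” output: the high-priority-cpu-list, the remaining CPUs, 3000 and 2400. Reviewing the printed commands before running any of them is advisable, since the CLOS numbering here is only an assumption.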
With this configuration, execute the same workload pinned to high priority CPUs (CPU 5 and 6 in this case):
# taskset -c 5,6 perf bench -r 100 sched pipe
# Running 'sched/pipe' benchmark:
# Executed 1000000 pipe operations between two processes
     Total time: 5.627 [sec]
       5.627922 usecs/op
         177685 ops/sec
This way, by enabling Intel(R) SST-BF, the performance of this benchmark is improved (latency reduced) by about 7.8%. From the turbostat output, it can be observed that the high priority CPUs reached 3000 MHz compared to 2600 MHz. The turbostat output:
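The latency reduction can be recomputed from the two usecs/op figures reported by the benchmark runs. The following is a minimal sketch; the function name latency_gain_pct is hypothetical.

```shell
# Percentage reduction from a baseline latency ($1) to a new latency ($2),
# printed with two decimals. Both values must use the same unit
# (usecs/op in the benchmark output).
latency_gain_pct() {
    awk -v base="$1" -v new="$2" \
        'BEGIN { printf "%.2f\n", (base - new) / base * 100 }'
}
```

With the figures from the two runs, "latency_gain_pct 6.102445 5.627922" prints 7.78, which is roughly a 7.8% reduction in latency.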
# turbostat -c 0-13 --show Package,Core,CPU,Bzy_MHz -i 1
Package Core    CPU     Bzy_MHz
0       0       0       2151
0       1       1       2166
0       2       2       2175
0       3       3       2175
0       4       4       2175
0       5       5       3000
0       6       6       3000
0       7       7       2180
0       8       8       2662
0       9       9       2176
0       10      10      2175
0       11      11      2176
0       12      12      2176
0       13      13      2661
Disable Intel(R) SST-BF¶
To disable the Intel(R) SST-BF feature, execute:
# intel-speed-select base-freq disable -a
Intel(R) Speed Select Technology - Turbo Frequency (Intel(R) SST-TF)¶
This feature enables the ability to set different “All core turbo ratio limits” to cores based on the priority. By using this feature, some cores can be configured to get a higher turbo frequency by designating them as high priority at the cost of lower or no turbo frequency on the low priority cores.
For this reason, this feature is only useful when the system is busy utilizing all CPUs, but the user wants some configurable option to get high performance on some CPUs.
The support of Intel(R) Speed Select Technology - Turbo Frequency (Intel(R) SST-TF) depends on the Intel(R) Speed Select Technology - Performance Profile (Intel(R) SST-PP) performance level configuration. It is possible that only a certain performance level supports Intel(R) SST-TF. It is also possible that only the base performance level (level = 0) has the support of Intel(R) SST-TF. Hence, first select the desired performance level to enable this feature.
In the system under test here, Intel(R) SST-TF is supported at the base performance level 0, but currently disabled:
# intel-speed-select -c 0 perf-profile info -l 0
Intel(R) Speed Select Technology
package-0
  die-0
    cpu-0
      perf-profile-level-0
        ...
        ...
        speed-select-turbo-freq:disabled
        ...
        ...
To check if performance can be improved using the Intel(R) SST-TF feature, get the turbo frequency properties with Intel(R) SST-TF enabled and compare them to the base turbo capability of this system.
Get Base turbo capability¶
To get the base turbo capability of performance level 0, execute:
# intel-speed-select perf-profile info -l 0
Intel(R) Speed Select Technology
Executing on CPU model: X
package-0
  die-0
    cpu-0
      perf-profile-level-0
        ...
        ...
        turbo-ratio-limits-sse
          bucket-0
            core-count:2
            max-turbo-frequency(MHz):3200
          bucket-1
            core-count:4
            max-turbo-frequency(MHz):3100
          bucket-2
            core-count:6
            max-turbo-frequency(MHz):3100
          bucket-3
            core-count:8
            max-turbo-frequency(MHz):3100
          bucket-4
            core-count:10
            max-turbo-frequency(MHz):3100
          bucket-5
            core-count:12
            max-turbo-frequency(MHz):3100
          bucket-6
            core-count:14
            max-turbo-frequency(MHz):3100
          bucket-7
            core-count:16
            max-turbo-frequency(MHz):3100
Based on the data above, when all the CPUs are busy, the max. frequency of 3100 MHz can be achieved. If there is some busy workload on CPUs 0-11 (e.g. stress), execute the “hackbench pipe” workload on CPUs 12 and 13:
# taskset -c 12,13 perf bench -r 100 sched pipe
# Running 'sched/pipe' benchmark:
# Executed 1000000 pipe operations between two processes
     Total time: 5.705 [sec]
       5.705488 usecs/op
         175269 ops/sec
The turbostat output:
# turbostat -c 0-13 --show Package,Core,CPU,Bzy_MHz -i 1
Package Core    CPU     Bzy_MHz
0       0       0       3000
0       1       1       3000
0       2       2       3000
0       3       3       3000
0       4       4       3000
0       5       5       3100
0       6       6       3100
0       7       7       3000
0       8       8       3100
0       9       9       3000
0       10      10      3000
0       11      11      3000
0       12      12      3100
0       13      13      3100
Based on the turbostat output, the performance is limited by the frequency cap of 3100 MHz. To check if the hackbench performance can be improved for CPU 12 and CPU 13, first check the capability of the Intel(R) SST-TF feature for this performance level.
Get Intel(R) SST-TF Capability¶
To get the capability, the “turbo-freq info” command can be used:
# intel-speed-select turbo-freq info -l 0
Intel(R) Speed Select Technology
Executing on CPU model: X
package-0
  die-0
    cpu-0
      speed-select-turbo-freq
        bucket-0
          high-priority-cores-count:2
          high-priority-max-frequency(MHz):3200
          high-priority-max-avx2-frequency(MHz):3200
          high-priority-max-avx512-frequency(MHz):3100
        bucket-1
          high-priority-cores-count:4
          high-priority-max-frequency(MHz):3100
          high-priority-max-avx2-frequency(MHz):3000
          high-priority-max-avx512-frequency(MHz):2900
        bucket-2
          high-priority-cores-count:6
          high-priority-max-frequency(MHz):3100
          high-priority-max-avx2-frequency(MHz):3000
          high-priority-max-avx512-frequency(MHz):2900
        speed-select-turbo-freq-clip-frequencies
          low-priority-max-frequency(MHz):2600
          low-priority-max-avx2-frequency(MHz):2400
          low-priority-max-avx512-frequency(MHz):2100
Based on the output above, there is an Intel(R) SST-TF bucket for which there are two high priority cores. If only two high priority cores are set, then max. turbo frequency on those cores can be increased to 3200 MHz. This is 100 MHz more than the base turbo capability for all cores.
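Reading the buckets can be sketched as a lookup: pick the first bucket whose high priority core count is large enough for the number of cores to be prioritized. This is a minimal sketch under the assumption that each bucket gives the frequency available for up to that many high priority cores; the function name sst_tf_max_mhz and the "cores:MHz" argument format are hypothetical.

```shell
# Print the high priority max frequency (MHz) for a wanted number of
# high priority cores, or fail if no bucket is large enough.
#   $1:   wanted number of high priority cores
#   $2..: buckets as "cores:MHz", in ascending order of core count
sst_tf_max_mhz() {
    want=$1
    shift
    for bucket in "$@"; do
        cores=${bucket%%:*}
        mhz=${bucket##*:}
        if [ "$want" -le "$cores" ]; then
            echo "$mhz"
            return 0
        fi
    done
    return 1
}
```

With the buckets above, "sst_tf_max_mhz 2 2:3200 4:3100 6:3100" prints 3200, while a request for 4 cores prints 3100.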
In turn, for the hackbench workload, two CPUs can be set as high priority and the rest as low priority. One side effect is that once enabled, the low priority cores will be clipped to a lower frequency of 2600 MHz.
Enable Intel(R) SST-TF¶
To enable Intel(R) SST-TF, execute:
# intel-speed-select -c 12,13 turbo-freq enable -a
Intel(R) Speed Select Technology
Executing on CPU model: X
package-0
  die-0
    cpu-12
      turbo-freq
        enable:success
package-0
  die-0
    cpu-13
      turbo-freq
        enable:success
package--1
  die-0
    cpu-63
      turbo-freq --auto
        enable:success
In this case, the option “-a” is optional. If set, it enables the Intel(R) SST-TF feature and also sets the CPUs to high and low priority using Intel(R) Speed Select Technology Core Power (Intel(R) SST-CP) features. The CPU numbers passed with the “-c” argument are marked as high priority, including their siblings.
If the -a option is not used, then the following steps are required before enabling Intel(R) SST-TF:
- Discover Intel(R) SST-TF and note buckets of high priority cores and maximum frequency
- Enable CLOS using core-power feature set
- Configure CLOS parameters
- Subscribe desired CPUs to CLOS groups making sure that high priority cores are set to the maximum frequency
If the same hackbench workload is executed, schedule hackbench threads on high priority CPUs:
# taskset -c 12,13 perf bench -r 100 sched pipe
# Running 'sched/pipe' benchmark:
# Executed 1000000 pipe operations between two processes
     Total time: 5.510 [sec]
       5.510165 usecs/op
         180826 ops/sec
This improves performance by around 3.3% on a busy system. Here the turbostat output will show that CPU 12 and CPU 13 are getting a 100 MHz boost. The turbostat output:
# turbostat -c 0-13 --show Package,Core,CPU,Bzy_MHz -i 1
Package Core    CPU     Bzy_MHz
...
0       12      12      3200
0       13      13      3200