cray_pm, cray_rapl
provide access to Cray Power Management (PM) and Running Average Power Limit (RAPL) counters
- Author:
Hewlett Packard Enterprise Development LP.
- Copyright:
Copyright 2019,2021,2023-2024 Hewlett Packard Enterprise Development LP.
- Manual section:
5
DESCRIPTION
The PAPI Cray RAPL component provides socket-level access to Running Average Power Limit (RAPL) measurement counters, while the similar PAPI Cray Power Management (PM) counters provide compute node-level access to additional power management counters. Together, these counters enable you to monitor and report energy usage during program execution.
These counters are only available on HPE Cray EX systems.
Execute the papi_component_avail utility on a compute node to verify that these counters are available.
CrayPat supports experiments that use both sets of counters. The counters are selected through the PAT_RT_PERFCTR set of runtime environment variables. When cray_rapl counters are specified, one core per socket collects and records the specified events. When cray_pm counters are specified, one core per compute node collects and records the specified events. The resulting metrics appear in text reports.
To list the available events, run the papi_native_avail utility on a compute node and filter for the desired PAPI component. For example:
$ aprun papi_native_avail -i cray_rapl
$ aprun papi_native_avail -i cray_pm
For complete lists of the cray_pm and cray_rapl power management events currently supported by a processor family, execute pat_help, select counters, select your processor type, and then select native.
NOTES
In cray_rapl events, the term “package” refers to the physical socket. pat_report reports energy values in joules, whereas values acquired through a direct call to PAPI_read are returned in nanojoules. When a user’s program collects cray_rapl counters through the PAPI interface, the values are expressed in each counter's respective units. When an instrumented executable collects them, the values are recorded as raw counter values and are not converted. To determine how raw counter values are converted to their respective units, see the Intel Software Developer Manuals or the AMD Processor Programming Reference listed in the SEE ALSO section.
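The difference between the nanojoule values returned by PAPI_read and the joule values shown by pat_report is a fixed scale factor. The following is a minimal sketch of that conversion, assuming the caller has already obtained two nanojoule readings and the elapsed time between them; the function names are hypothetical and are not part of PAPI or CrayPat.

```python
# Illustrative helpers only; these names are not part of PAPI or CrayPat.
NANOJOULES_PER_JOULE = 1_000_000_000


def nanojoules_to_joules(nanojoules):
    """Convert an energy value in nanojoules to joules."""
    return nanojoules / NANOJOULES_PER_JOULE


def average_power_watts(start_nj, end_nj, elapsed_seconds):
    """Average power between two nanojoule readings taken elapsed_seconds apart."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return nanojoules_to_joules(end_nj - start_nj) / elapsed_seconds


if __name__ == "__main__":
    # Hypothetical readings: 150 J consumed over one second.
    print(nanojoules_to_joules(2_500_000_000))
    print(average_power_watts(0, 150_000_000_000, 1.0))
```

This sketch does not apply to raw values recorded by an instrumented executable, which require the processor-specific conversion described in the vendor manuals.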
PP0_ENERGY is supported on AMD Zen processors only.
The PM_FRESHNESS, PM_GENERATION, PM_STARTUP, PM_SCAN_HZ, and PM_VERSION events are intended for those users who are using the PAPI API to access the counters directly, and provide no useful information when requested through CrayPat.
WARNINGS
The cray_rapl energy-based counter events (those measured in nanojoules or joules) are collected in MSRs that are 32 bits wide and are updated every 1 millisecond. These counters wrap around after approximately 60 seconds when power consumption is high. Consequently, for instrumented executables that run longer than 60 seconds and perform sampling experiments, selected cray_rapl energy-based counter events may reflect this wraparound and may not accurately represent the energy consumed. The same effect applies to programs that call the PAPI functions directly to set up and acquire cray_rapl energy-based counter events.
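The wraparound behavior of a 32-bit counter can be illustrated with modular arithmetic. The sketch below assumes two successive raw readings with at most one wraparound between them; if the counter wraps more than once between reads, the lost energy cannot be recovered this way. The energy unit (2^-16 joules per count) and the 1000 W power draw in the example are illustrative assumptions, not values taken from this man page; consult the vendor manuals for the actual unit on a given processor.

```python
# Illustrative sketch; function names and example values are assumptions.
COUNTER_BITS = 32
COUNTER_MODULUS = 1 << COUNTER_BITS


def raw_delta(previous_raw, current_raw):
    """Counts elapsed between two raw reads, assuming at most one wraparound."""
    return (current_raw - previous_raw) % COUNTER_MODULUS


def wrap_time_seconds(joules_per_count, watts):
    """Time for a 32-bit energy counter to wrap at a constant power draw."""
    return COUNTER_MODULUS * joules_per_count / watts


if __name__ == "__main__":
    # The counter wrapped between these two hypothetical reads.
    print(raw_delta(4_294_967_000, 200))
    # Assumed energy unit of 2**-16 J per count at an assumed 1000 W draw.
    print(wrap_time_seconds(2 ** -16, 1000.0))
```

Reading the counter more often than the wrap time and accumulating the deltas in a wider integer is one way to keep long measurements meaningful.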
The cray_rapl counters are collected by processor zero on each socket upon which the application is scheduled and executing. If the application is not scheduled to run on processor zero of a socket, no cray_rapl counters are collected to represent the application on that socket.
The cray_pm counters are collected by processor zero on each compute node upon which the application is scheduled and executing. If the application is not scheduled to run on processor zero of a compute node, no cray_pm counters are collected to represent the application on that compute node.
When collecting cray_rapl or cray_pm counters, the application must be launched so that MPI ranks are bound to specific sockets. If using the PBS workload manager, launch the application with the aprun -cc cpu option. If using the Slurm workload manager, launch the application with the srun --exclusive --cpu_bind=rank options. For more information, see the aprun(1) or srun(1) man pages.
Collecting either cray_rapl or cray_pm counters is a high-overhead operation. It adds significant overhead to instrumented executables and to programs that use the PAPI API.
The Cray Hardware Supervisory System (HSS) samples node power counters at a rate of approximately 10 Hz and makes the resulting values available to user-space applications via /sysfs special files. Therefore, the contents of the /sysfs files may not represent accurate data for programs or individual functions that complete in less than one sampling interval (approximately 100 milliseconds).
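To judge whether a measured region is long enough for the node power counters to be meaningful, compare its duration with the sampling interval. The sketch below assumes the approximate 10 Hz rate stated above and simply counts how many whole sampling intervals fit in a region; the function names are hypothetical.

```python
import math

# Approximate HSS sampling rate stated in this man page.
SAMPLE_RATE_HZ = 10.0


def sample_period_seconds(rate_hz=SAMPLE_RATE_HZ):
    """Length of one sampling interval in seconds."""
    return 1.0 / rate_hz


def expected_updates(duration_seconds, rate_hz=SAMPLE_RATE_HZ):
    """Whole sampling intervals that fit inside a region of the given duration."""
    return math.floor(duration_seconds * rate_hz)


if __name__ == "__main__":
    print(sample_period_seconds())
    print(expected_updates(0.05))  # region shorter than one interval
    print(expected_updates(2.0))
```

A region for which expected_updates returns 0 may observe no counter update at all, so its reported energy should be treated with caution.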
SEE ALSO
app2(1), intro_craypat(1), pat_build(1), pat_help(1), pat_report(1), pat_run(1)
accpc(5), cray_pm(5), cray_rapl(5), hwpc(5), cray_cassini(5), uncore(5), papi_counters(5)
Intel 64 and IA-32 Architectures Software Developer Manuals (link to URL http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html)
Processor Programming Reference (PPR) for AMD Family 19h Model 01h, Revision B1 Processors Manual (link to URL https://www.amd.com/en/support/tech-docs)
MSR Safe (link to URL https://github.com/LLNL/msr-safe)