intro_papi

introduces the Performance API (PAPI)

Author:

Hewlett Packard Enterprise Development LP.

Copyright:

Copyright 2019,2021-2024 Hewlett Packard Enterprise Development LP.

Manual section:

3

DESCRIPTION

The Performance Application Programming Interface (PAPI) provides a standard application programming interface for accessing hardware performance counters on microprocessor-based computer systems. The version of PAPI used on Cray systems is supplied, documented, and supported by the PAPI Group, http://icl.cs.utk.edu/papi/.

High-level Functions

PAPI provides simple interfaces for instrumenting applications written in C or Fortran. The Fortran-only functions are distinguished by the PAPIF_ prefix. These functions are described in the following man pages.

For general information on the Fortran interface, see the PAPIF(3) man page.

The high-level functions are self-initializing. You may mix high- and low-level functions, but low-level functions must follow either a high-level function call or a call to PAPI_library_init(3).

PAPI_num_counters(3), PAPIF_num_counters(3)

Return the number of hardware counters available on the system.

PAPI_flips(3), PAPIF_flips(3)

Return the Mflips rate (millions of floating-point instructions per second), real time, and CPU time.

PAPI_flops(3), PAPIF_flops(3)

Return the Mflops rate (millions of floating-point operations per second), real time, and CPU time.

PAPI_ipc(3), PAPIF_ipc(3)

Return the instructions per cycle, real time, and CPU time.

PAPI_epc(3), PAPIF_epc(3)

Return arbitrary events per cycle, real and processor time, and reference and core cycles.

PAPI_accum_counters(3), PAPIF_accum_counters(3)

Add the current counts to an array and reset the counters.

PAPI_read_counters(3), PAPIF_read_counters(3)

Copy the current counts to an array and reset the counters.

PAPI_start_counters(3), PAPIF_start_counters(3)

Start counting hardware events.

PAPI_stop_counters(3), PAPIF_stop_counters(3)

Stop counting events and return the current counts.

Low-level Functions

PAPI also provides an advanced interface for instrumenting applications. The PAPI library must be initialized before calling any of these functions; initialization can be done by issuing either a high-level function call or a call to PAPI_library_init(3). The Fortran-only functions are distinguished by the PAPIF_ prefix. See the individual man pages for more information.
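As a rough illustration of the initialization rule above, the following C sketch initializes the library if needed, counts one preset event around a workload, and then releases the event set. The helper names count_cycles and sample_work and the SKETCH_* status codes are hypothetical and not part of PAPI. The sketch also assumes that the PAPI_TOT_CYC preset is available on the target processor, and it compiles to a stub when papi.h is not visible to the compiler.

```c
#include <stddef.h>

/* Use the real PAPI header when it is visible to the compiler; otherwise
 * fall back to a stub so the sketch still compiles without PAPI. */
#if defined(__has_include)
#  if __has_include(<papi.h>)
#    include <papi.h>
#    define SKETCH_HAVE_PAPI 1
#  endif
#endif

/* Hypothetical status codes used only by this sketch (not part of PAPI). */
#define SKETCH_BAD_ARG     (-1001)
#define SKETCH_NO_PAPI     (-1002)
#define SKETCH_BAD_VERSION (-1003)

/* Small placeholder workload to measure. */
void sample_work(void)
{
    volatile double x = 0.0;
    for (int i = 0; i < 100000; i++)
        x += i * 0.5;
}

/* Run work() and store the total cycle count in *cycles.
 * Returns 0 on success, a negative status otherwise. */
int count_cycles(void (*work)(void), long long *cycles)
{
    if (work == NULL || cycles == NULL)
        return SKETCH_BAD_ARG;

#ifdef SKETCH_HAVE_PAPI
    int event_set = PAPI_NULL;
    int rc;

    /* Low-level calls are valid only once the library is initialized. */
    if (PAPI_is_initialized() == PAPI_NOT_INITED) {
        rc = PAPI_library_init(PAPI_VER_CURRENT);
        if (rc != PAPI_VER_CURRENT)
            return rc < 0 ? rc : SKETCH_BAD_VERSION;
    }

    if ((rc = PAPI_create_eventset(&event_set)) != PAPI_OK)
        return rc;

    rc = PAPI_add_event(event_set, PAPI_TOT_CYC);  /* preset: total cycles */
    if (rc == PAPI_OK)
        rc = PAPI_start(event_set);
    if (rc == PAPI_OK) {
        work();
        rc = PAPI_stop(event_set, cycles);
    }

    /* An event set must be emptied before it can be destroyed. */
    PAPI_cleanup_eventset(event_set);
    PAPI_destroy_eventset(&event_set);
    return rc;
#else
    return SKETCH_NO_PAPI;
#endif
}
```

A caller might invoke count_cycles(sample_work, &cycles) and print cycles when the return value is 0. When PAPI is present, the program must also be linked against the PAPI library, typically with -lpapi.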

PAPI_accum(3), PAPIF_accum(3)

Accumulate and reset hardware events from an event set.

PAPI_add_event(3), PAPIF_add_event(3)

Add a single PAPI preset or a native hardware event to an event set.

PAPI_add_events(3), PAPIF_add_events(3)

Add an array of PAPI preset or native hardware events to an event set.

PAPI_add_named_event(3), PAPIF_add_named_event(3)

Add an event by name to a PAPI event set.

PAPI_assign_eventset_component(3), PAPIF_assign_eventset_component(3)

Assign a component index to an existing but empty event set.

PAPI_attach(3), PAPIF_attach(3)

Attach a PAPI event set to the specified thread ID.

PAPI_cleanup_eventset(3), PAPIF_cleanup_eventset(3)

Remove all PAPI events from an event set.

PAPI_create_eventset(3), PAPIF_create_eventset(3)

Create a new, empty, PAPI event set.

PAPI_descr_error(3), PAPIF_descr_error(3)

Return the PAPI error code description string.

PAPI_destroy_eventset(3), PAPIF_destroy_eventset(3)

Deallocate memory associated with an empty PAPI event set.

PAPI_detach(3), PAPIF_detach(3)

Detach the specified event set from a previously specified process or thread ID.

PAPI_disable_component(3), PAPIF_disable_component(3)

Disable the specified component before library initialization.

PAPI_disable_component_by_name(3), PAPIF_disable_component_by_name(3)

Disable a component by name before library initialization.

PAPI_enum_cmp_event(3), PAPIF_enum_cmp_event(3)

Return the event code for the next available component event.

PAPI_enum_event(3), PAPIF_enum_event(3)

Return the event code for the next available preset or native event.

PAPI_event_code_to_name(3), PAPIF_event_code_to_name(3)

Translate an integer PAPI event code into an ASCII PAPI preset or native event name.

PAPI_event_name_to_code(3), PAPIF_event_name_to_code(3)

Translate an ASCII PAPI preset or native event name into an integer PAPI event code.

PAPI_get_cmp_opt(3), PAPIF_get_cmp_opt(3)

Get component-specific PAPI options.

PAPI_get_component_index(3), PAPIF_get_component_index(3)

Return the component index for the named component.

PAPI_get_component_info(3), PAPIF_get_component_info(3)

Get information about a specific software component.

PAPI_get_dmem_info(3), PAPIF_get_dmem_info(3)

Get dynamic memory usage information.

PAPI_get_event_component(3), PAPIF_get_event_component(3)

Return the component an event belongs to.

PAPI_get_event_info(3), PAPIF_get_event_info(3)

Get the name and description of a given preset or native event code.

PAPI_get_eventset_component(3), PAPIF_get_eventset_component(3)

Return the index of the component to which an event set is assigned.

PAPI_get_executable_info(3), PAPIF_get_exe_info(3)

Get the address space information of the executable.

PAPI_get_hardware_info(3), PAPIF_get_hardware_info(3)

Get information about the system hardware.

PAPI_get_multiplex(3), PAPIF_get_multiplex(3)

Get the multiplexing status of the specified event set.

PAPI_get_opt(3), PAPIF_get_opt(3)

Get the option settings of the PAPI library or of a specific event set.

PAPI_get_overflow_event_index(3), PAPIF_get_overflow_event_index(3)

Decompose an overflow vector into an event index array.

PAPI_get_real_cyc(3), PAPIF_get_real_cyc(3)

Get the total number of cycles since the starting point.

PAPI_get_real_nsec(3), PAPIF_get_real_nsec(3)

Get the total number of nanoseconds since the starting point.

PAPI_get_real_usec(3), PAPIF_get_real_usec(3)

Get the total number of microseconds since the starting point.

PAPI_get_shared_lib_info(3), PAPIF_get_shared_lib_info(3)

Get information about shared libraries used by the process.

PAPI_get_thr_specific(3), PAPIF_get_thr_specific(3)

Get a pointer to a thread-specific stored data structure.

PAPI_get_virt_cyc(3), PAPIF_get_virt_cyc(3)

Get the process cycle count since the starting point.

PAPI_get_virt_nsec(3), PAPIF_get_virt_nsec(3)

Get the process time in nanoseconds since the starting point.

PAPI_get_virt_usec(3), PAPIF_get_virt_usec(3)

Get the process time in microseconds since the starting point.

PAPI_is_initialized(3), PAPIF_is_initialized(3)

Return the initialization state of the PAPI library.

PAPI_library_init(3), PAPIF_library_init(3)

Initialize the PAPI library.

PAPI_list_events(3), PAPIF_list_events(3)

List the events that are members of an event set.

PAPI_list_threads(3), PAPIF_list_threads(3)

List the registered thread IDs.

PAPI_lock(3), PAPIF_lock(3)

Lock one of two PAPI internal user mutex variables.

PAPI_multiplex_init(3), PAPIF_multiplex_init(3)

Initialize multiplex support in the PAPI library.

PAPI_num_cmp_hwctrs(3), PAPIF_num_cmp_hwctrs(3)

Return the number of hardware counters for the specified component.

PAPI_num_components(3), PAPIF_num_components(3)

Return the number of components available on the system.

PAPI_num_events(3), PAPIF_num_events(3)

Return the number of events in an event set.

PAPI_num_hwctrs(3), PAPIF_num_hwctrs(3)

Return the number of hardware counters.

PAPI_overflow(3), PAPIF_overflow(3)

Set up an event set to begin registering overflows.

PAPI_perror(3), PAPIF_perror(3)

Convert PAPI error codes to strings.

PAPI_profil(3), PAPIF_profil(3)

Generate PC histogram data where hardware counter overflow occurs.

PAPI_query_event(3), PAPIF_query_event(3)

Query if a PAPI event exists.

PAPI_query_named_event(3), PAPIF_query_named_event(3)

Query if a named PAPI event exists.

PAPI_read(3), PAPIF_read(3)

Read hardware events from an event set, with no reset.

PAPI_read_ts(3), PAPIF_read_ts(3)

Timestamped read of hardware events.

PAPI_register_thread(3), PAPIF_register_thread(3)

Inform PAPI of the existence of a new thread.

PAPI_remove_event(3), PAPIF_remove_event(3)

Remove a hardware event from a PAPI event set.

PAPI_remove_named_event(3), PAPIF_remove_named_event(3)

Remove a named hardware event from a PAPI event set.

PAPI_remove_events(3), PAPIF_remove_events(3)

Remove an array of hardware events from a PAPI event set.

PAPI_reset(3), PAPIF_reset(3)

Reset the hardware event counts in an event set.

PAPI_set_cmp_domain(3), PAPIF_set_cmp_domain(3)

Set the default counting domain for new event sets bound to the specified component.

PAPI_set_cmp_granularity(3), PAPIF_set_cmp_granularity(3)

Set the default counting granularity for event sets bound to the specified component.

PAPI_set_debug(3), PAPIF_set_debug(3)

Set the current debug level for PAPI.

PAPI_set_domain(3), PAPIF_set_domain(3)

Set the default execution domain for new event sets.

PAPI_set_granularity(3), PAPIF_set_granularity(3)

Set the default granularity for new event sets.

PAPI_set_multiplex(3), PAPIF_set_multiplex(3)

Convert a standard event set to a multiplexed event set.

PAPI_set_opt(3), PAPIF_set_opt(3)

Change the option settings of the PAPI library or a specific event set.

PAPI_set_thr_specific(3), PAPIF_set_thr_specific(3)

Save a pointer as a thread-specific stored data structure.

PAPI_shutdown(3), PAPIF_shutdown(3)

Finish using PAPI and free all related resources.

PAPI_sprofil(3), PAPIF_sprofil(3)

Generate hardware counter profiles from multiple code regions.

PAPI_start(3), PAPIF_start(3)

Start counting hardware events in an event set.

PAPI_state(3), PAPIF_state(3)

Return the counting state of an event set.

PAPI_stop(3), PAPIF_stop(3)

Stop counting hardware events in an event set and return the current counts.

PAPI_strerror(3), PAPIF_strerror(3)

Return a pointer to the error message corresponding to a specified error code.

PAPI_thread_id(3), PAPIF_thread_id(3)

Get the thread identifier of the current thread.

PAPI_thread_init(3), PAPIF_thread_init(3)

Initialize thread support in the PAPI library.

PAPI_unlock(3), PAPIF_unlock(3)

Unlock one of two PAPI internal user mutex variables.

PAPI_unregister_thread(3), PAPIF_unregister_thread(3)

Inform PAPI that a previously registered thread is disappearing.

PAPI_write(3), PAPIF_write(3)

Write counter values into counters.

PAPI Utilities

The PAPI utilities provide more information about the system and the PAPI events that can be examined. See the individual man pages for more information about each utility.

papi_avail(1)

Returns information regarding the availability of PAPI preset events.

papi_clockres(1)

Measures and reports clock latency and resolution for PAPI timers.

papi_command_line(1)

Executes PAPI preset or native events from the command line.

papi_component_avail(1)

Reports information about the components with which PAPI was built and whether they are enabled for the processor upon which the utility was executed.

papi_cost(1)

Computes execution time costs for basic PAPI operations.

papi_decode(1)

Decodes PAPI preset events into a CSV format suitable for use with PAPI_encode_events.

papi_error_codes(1)

Lists all currently defined PAPI error codes.

papi_event_chooser(1)

Given a list of named events, lists other events that can also be counted at the same time.

papi_mem_info(1)

Returns information regarding the memory architecture of the current processor.

papi_multiplex_cost(1)

Computes the execution time costs for basic PAPI operations on multiplexed event sets.

papi_native_avail(1)

Returns detailed information for PAPI native events.

papi_version(1)

Reports version information about the current PAPI installation.

ENVIRONMENT VARIABLES

PAPI_VERBOSE

Issues additional messages about error conditions encountered during processing of performance counter events. Set to any value to activate.
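An application that wants to report whether the extra messages were requested could check for the variable itself. The following C sketch shows one way to do so. The helper names are hypothetical, and treating an empty string as "set" is an assumption based on the "any value" wording above, not confirmed PAPI behavior.

```c
#include <stdlib.h>

/* Interpret a raw environment value: any value, even an empty one,
 * counts as "set" in this sketch; NULL means the variable is absent. */
int verbose_value_is_set(const char *value)
{
    return value != NULL;
}

/* Check whether PAPI_VERBOSE is present in the current environment. */
int papi_verbose_requested(void)
{
    return verbose_value_is_set(getenv("PAPI_VERBOSE"));
}
```

Splitting the check into two helpers keeps the interpretation of the value separate from the environment lookup, which makes the rule easy to exercise without changing the process environment.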

NOTES

If no attributes are specified for a given event, a default is provided. This default is usually the ALL attribute, but it varies by event. In some cases, summing the sub-events with the ALL attribute may not be appropriate. Specify event attributes explicitly to ensure that you collect the counts that provide the best insight into the performance of your application.

If you cannot access an event, use the papi_component_avail utility to verify that the component is available and supported on your system. For example, if you are having difficulty accessing the CUDA counters, use this utility to verify that the cuda component is present and supported on your system.

The papi_native_avail command lists the names of all the performance counter events available for selection and use with the PAT_RT_PERFCTR environment variable. In most cases, an event name may take one or more event attributes that further specify how the counts for the event are collected. Unit masks are one such attribute, each shown with its own name. If no unit mask name is specified (by appending :uname to the event name) and no implied default unit mask is defined, the event is accepted as valid, but the resulting count reflects the number of CPU cycles. Unfortunately, the papi_native_avail command gives no indication of what the default unit mask is for a given event. The papi_command_line command also accepts such an event as valid, but returns cycle counts for the given command. This anomaly also occurs when using the PAPI API directly, such as with PAPI_create_eventset, PAPI_add_event, and PAPI_read. Always append a unit mask name (if one exists for the event) to the event name to ensure that the proper counts are collected.

PAT_RT_PERFCTR cannot be set to the name of a derived metric, only to the names of counter events or groups of counter events. A derived metric appears in the report if all of the events required for that metric are collected.
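The unit-mask advice in these notes can be sketched in C as a pair of string helpers: one builds an "event:uname" name, and the other checks whether a name already carries an attribute. The function names and the EXAMPLE_EVENT and EXAMPLE_UMASK strings are hypothetical placeholders. Skipping a leading "component:::" prefix is an assumption about how component events are named.

```c
#include <stdio.h>
#include <string.h>

/* Return 1 if the event name already carries an attribute such as ":uname".
 * A leading "component:::" prefix is skipped (assumed naming convention). */
int event_has_attribute(const char *event)
{
    const char *sep = strstr(event, ":::");
    const char *name = sep ? sep + 3 : event;
    return strchr(name, ':') != NULL;
}

/* Write "event:umask" into out.
 * Returns 0 on success, -1 if out is too small to hold the result. */
int qualify_event(char *out, size_t out_size,
                  const char *event, const char *umask)
{
    int n = snprintf(out, out_size, "%s:%s", event, umask);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

A name built this way could then be passed to a name-based call such as PAPI_add_named_event(3). Use real event and unit mask names reported by papi_native_avail in place of the placeholders.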