reveal

performance analysis and code restructuring assistant

Author:

Hewlett Packard Enterprise Development LP.

Copyright:

Copyright 2019-2024 Hewlett Packard Enterprise Development LP.

Manual section:

1

SYNOPSIS

reveal [program_library.pl [experiment_data_directory]]

DESCRIPTION

Reveal is Cray’s next-generation integrated performance analysis and code optimization tool. Reveal extends Cray’s existing performance measurement, analysis, and visualization technology by combining runtime performance statistics and program source code visualization with the Cray Compiling Environment (CCE) compile-time optimization feedback.

Reveal supports source code navigation using whole-program analysis data provided by the Cray Compiling Environment, coupled with performance data collected during program execution by the Cray performance tools, to help identify which high-level serial loops could benefit from improved parallelism. Reveal provides enhanced loopmark listing functionality and dependency information for targeted loops, and it assists users in optimizing code by providing variable scoping feedback and suggested compiler directives.

The reveal utility supports the following options:

program_library.pl

Reveal takes a program_library directory as input to enable browsing source with compiler optimization information. When Reveal is invoked with just a program_library directory specified, it provides enhanced loopmark functionality.

A program_library directory is generated by specifying the -hpl=/path/program_library argument on the CCE compiler command line for Fortran, or -fcray-program-library-path=/path/program_library for C/C++. See the crayftn(1) or clang(1) man pages for more information on how to generate a program library and the “Examples” section of this man page for step-by-step examples.

experiment_data_directory

Reveal can also take an experiment_data_directory containing loop work estimates to assist with navigation to loops that are good candidates for parallelization. The experiment_data_directory is generated by instrumenting and executing a program using CrayPat. An example of how to generate an experiment_data_directory with loop work estimates is provided in the “Examples” section of this man page.

To begin using Reveal, load the perftools-base module and then run the reveal command:

$ module load perftools-base
$ reveal

If no files are specified on the command line, the user can open an existing program library by selecting the “File → Open” option in the menu bar.

To launch Reveal and specify a program_library file:

$ reveal my_program_library.pl

To launch Reveal and specify both a program_library and an experiment_data_directory:

$ reveal my_program_library.pl experiment_data_directory

Reveal includes an integrated help system. All other information about using Reveal is presented in the help system, which is accessible whenever Reveal is running by selecting Help from the menu bar and choosing a topic.

Reveal is a GUI tool that requires your workstation to support the X Window System. Depending on your system configuration, you may need to use the ssh -X option to enable X Window System support in your shell session. Depending on your workstation configuration, you may also need to enable X Window System hosting on your workstation or run an X server such as Xming.
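Because Reveal needs a working X display, it can help to check for one before launching. The following is a minimal sketch, assuming a POSIX shell. The function names have_display and launch_reveal are hypothetical helpers and are not part of Reveal or perftools.

```shell
#!/bin/sh
# Hypothetical helper (not part of Reveal): succeeds only when the
# DISPLAY environment variable is set to a non-empty value.
have_display() {
    [ -n "${DISPLAY:-}" ]
}

# Hypothetical wrapper: refuse to start the GUI when no X display is
# available, otherwise pass all arguments through to reveal unchanged.
launch_reveal() {
    if ! have_display; then
        echo "DISPLAY is not set; reconnect with ssh -X or start an X server" >&2
        return 1
    fi
    reveal "$@"
}
```

With these definitions sourced, launch_reveal my_program_library.pl behaves like the plain reveal command when a display is available. A non-empty DISPLAY does not guarantee that the X connection actually works, so this is only a first-line check.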

NOTES

The perftools-base module must be loaded before you can use Reveal. An instrumentation module is also required when instrumenting programs and running performance analysis experiments.

You must keep the program library with the program source. Moving just the program_library directory to another location and then opening it with Reveal is not supported.
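The module requirement in the first note can be checked from a script. The sketch below assumes that the site's module system exports the colon-separated LOADEDMODULES environment variable, as Environment Modules and Lmod commonly do. The function name module_loaded is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a module named $1 (with or without a
# /version suffix) appears in the colon-separated LOADEDMODULES list.
# Assumption: the module system in use exports LOADEDMODULES.
module_loaded() {
    case ":${LOADEDMODULES:-}:" in
        *":$1:"*|*":$1/"*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, module_loaded perftools-base || module load perftools-base loads the module only when it is missing. If LOADEDMODULES is not available on your system, the output of module list is the authoritative source.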

EXAMPLES

Generating Loop Work Estimates

Loop work estimates are generated using the perftools-lite-loops instrumentation module. This instrumentation module invokes the CCE compiler with the -h profile_generate option and then instruments the program for tracing. Loop work over time is not supported in full-trace mode (PAT_RT_SUMMARY set to 0).

To generate a loop work estimate, follow these steps.

Make sure the following modules are loaded:

$ module load PrgEnv-cray
$ module load perftools-base
$ module load perftools-lite-loops

The perftools-base module does not affect program behavior and can be left loaded when not collecting performance data.

Compile and link the program with the CCE.

$ ftn -c my_program.f
$ ftn -o my_program my_program.o

Note: The -h profile_generate option, which perftools-lite-loops applies, disables most automatic compiler optimizations, which is why Cray recommends generating this data separately from generating the program_library. The program_library is most useful when generated from fully optimized code.

Note: perftools-lite-loops disables all OpenMP optimizations, including API calls such as omp_get_wtime(). To compile code containing such OpenMP API calls, use conditional compilation. For implementations supporting a preprocessor, this can be done using the _OPENMP macro. For example:

#if defined(_OPENMP)
   time = omp_get_wtime();
#endif

To conditionally compile Fortran code, use the conditional compilation sentinels defined by the OpenMP standard (for example, a line beginning with !$ in free-form source).

The resulting binary, my_program, is instrumented to collect work estimates.

Execute the instrumented executable.

$ srun -n pes ./my_program

This generates an experiment data directory and writes a loops report to stdout. The experiment_data_directory can then be passed to Reveal as input.
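After several runs, more than one experiment data directory may exist. The following sketch picks the most recently modified directory whose name starts with a given prefix. It assumes that the experiment data directory name begins with the executable name, which should be confirmed against the output of your own run. The function name newest_dir is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: print the most recently modified directory whose
# name starts with the prefix given in $1. Prints nothing if none match.
# Assumption: directory names contain no newline characters.
newest_dir() {
    # The trailing slash restricts the glob to directories; ls -t sorts
    # newest first; sed strips the trailing slash from the result.
    ls -td -- "$1"*/ 2>/dev/null | head -n 1 | sed 's:/*$::'
}
```

For example, reveal my_program.pl "$(newest_dir my_program+)" would open the latest matching directory, where the my_program+ prefix is itself an assumption about the naming scheme.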

Generating a Program Library

To generate a program_library.pl directory, make sure the perftools-lite-loops module is unloaded and the Cray (CCE) programming environment module is loaded. Then use the CCE compiler option -hpl= for Fortran or -fcray-program-library-path= for C/C++ to generate the program_library in your current working directory. The perftools-base module should remain loaded in order to provide access to Reveal.

$ module load PrgEnv-cray
$ module unload perftools-lite-loops
$ ftn -O3 -hpl=my_program.pl -c my_program_file1.f90
$ ftn -O3 -hpl=my_program.pl -c my_program_file2.f90
$ ftn -O3 -hpl=my_program.pl -c my_program_file3.f90
$ CC -O3 -fcray-program-library-path=my_program.pl -c my_program_file1.C
$ CC -O3 -fcray-program-library-path=my_program.pl -c my_program_file2.C

Note: The -h profile_generate option disables most automatic compiler optimizations, which is why Cray recommends generating the program_library separately from the loop work estimate. The program_library is most useful when generated from fully optimized code.
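When a program has many Fortran source files, the per-file compile commands shown above can be generated rather than typed. The sketch below only prints the commands, using the same -O3, -hpl=, and -c options as the example. The function name pl_compile_cmds is a hypothetical helper, and it assumes file names without spaces or shell metacharacters.

```shell
#!/bin/sh
# Hypothetical helper: print one ftn compile command per source file,
# with every command writing to the same program library ($1).
# Nothing is compiled; the commands are only written to stdout.
pl_compile_cmds() {
    pl=$1
    shift
    for src in "$@"; do
        printf 'ftn -O3 -hpl=%s -c %s\n' "$pl" "$src"
    done
}
```

For example, pl_compile_cmds my_program.pl my_program_file1.f90 my_program_file2.f90 prints two commands that can be reviewed first and then piped to sh.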

Exploring the Results

After you have collected performance data from program execution and generated a program_library, launch Reveal and use it to integrate the results and explore opportunities for code optimization. The perftools-base module provides access to Reveal.

$ module load PrgEnv-cray
$ reveal my_program.pl experiment_data_directory

Note: The PrgEnv-cray module must be loaded in order to perform automatic OpenMP scoping of loops.
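Before launching Reveal with both inputs, it can be useful to confirm that each path is an existing directory. The following is a minimal sketch that accepts the same positional arguments as the SYNOPSIS. The function name check_reveal_args is a hypothetical helper and not part of Reveal.

```shell
#!/bin/sh
# Hypothetical pre-launch check (not part of Reveal). Accepts the same
# positional arguments as the SYNOPSIS:
#   [program_library.pl [experiment_data_directory]]
# Returns 2 for too many arguments, 1 if any argument is not an
# existing directory, and 0 otherwise.
check_reveal_args() {
    if [ "$#" -gt 2 ]; then
        echo "expected at most two arguments" >&2
        return 2
    fi
    for dir in "$@"; do
        if [ ! -d "$dir" ]; then
            echo "not a directory: $dir" >&2
            return 1
        fi
    done
    return 0
}
```

For example, check_reveal_args my_program.pl experiment_data_directory && reveal my_program.pl experiment_data_directory launches Reveal only when both directories exist. The check does not verify that the directories contain valid program library or experiment data.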

SEE ALSO

intro_craypat(1), pat_build(1), pat_opts(1), pat_help(1), pat_report(1), pat_run(1), grid_order(1), reveal(1)

perftools-base(4), perftools-lite(4), perftools-preload(4)

accpc(5), cray_pm(5), cray_rapl(5), hwpc(5), cray_cassini(5), uncore(5), papi_counters(5)