sanitizers4hpc Man Page

sanitizers4hpc User Reference

DESCRIPTION

Sanitizers4hpc is an aggregation tool that collects and analyzes LLVM Sanitizers output at scale. Currently, the Clang AddressSanitizer, LeakSanitizer, and ThreadSanitizer tools are supported. Support for AMD’s GPU Sanitizer library and Nvidia’s Compute Sanitizer tool is also available.

Options

The sanitizers4hpc tool frontend accepts the following options:

-e

Print AddressSanitizer error and stack section headers.

-l, --launcher-args=LAUNCHER_ARGS

Arguments to pass to the system launcher, not including the target binary and its arguments.

-r, --errors-from=REGEX

Only display Sanitizer output occurring in this source file, binary, or library. A regex can be used to display output from frames matching a pattern. To provide multiple patterns, use multiple instances of this argument.

-a, --asan-options=ASAN_OPTIONS

Supply additional AddressSanitizer runtime options to the launched job.

-o, --lsan-options=LSAN_OPTIONS

Supply additional LeakSanitizer runtime options to the launched job.

-t, --tsan-options=TSAN_OPTIONS

Supply additional ThreadSanitizer runtime options to the launched job.

-u, --msan-options=MSAN_OPTIONS

Supply additional MemorySanitizer runtime options to the launched job. Also supports the suppressions=<file> option to suppress function or library output.

-m, --cuda-sanitizer=COMPUTE_SANITIZER_PATH

Path to the CUDA Compute Sanitizer binary if using CUDA Sanitizer mode.

-c, --cuda-options=COMPUTE_SANITIZER_OPTIONS

Supply additional CUDA Sanitizer runtime options to the launched job. Sanitizers4hpc is only compatible with the Memcheck tool. --log-file is used internally to capture CUDA Sanitizer output and will not be applied if it is passed via --cuda-options.

-f, --force-clang-san

Bypass Sanitizers4hpc’s check that a binary is instrumented with a sanitizer.

-n, --force-mpi

Bypass Sanitizers4hpc’s check that a binary is linked with MPI. This only has an effect on PALS systems, where apps must be MPI apps to be launched by Sanitizers4hpc.

EXAMPLES

Sanitizers4hpc manages the launch of your job through the system’s workload manager. Start the tool by supplying any Sanitizers4hpc options, including your workload manager’s job launch arguments, followed by a double dash -- and the target binary with its arguments. For example, to run the binary a.out with four ranks:

sanitizers4hpc --launcher-args="-n4" -- ./a.out binary_argument

Notice that the workload manager arguments are provided in --launcher-args, and the target binary and its arguments are listed after the double dash --.

After encountering a memory error, AddressSanitizer produces an error report for each affected rank. Sanitizers4hpc processes these error reports and aggregates them for easier analysis. In this example run, the application has encountered an invalid read off the end of a buffer on all ranks. Sanitizers4hpc prints a single error report, noting that the error occurs on all four ranks and is reported by the AddressSanitizer library at the same place in the source file.

RANKS: <0-3>
AddressSanitizer: heap-buffer-overflow on address at pc bp sp
READ of size 4 at 0x61d000002680 thread T0
    ...
    #1 0x328dc3 in main /source.c:37:22
    ...

SUMMARY: AddressSanitizer: heap-buffer-overflow /source.c:52:15 in main
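
To narrow a report like the one above to frames of interest, the --errors-from option can be repeated. The following is a minimal sketch, assuming two hypothetical patterns (source\.c and libsolver) that stand in for names from your own application. It prints the command and runs it only where sanitizers4hpc is installed.

```shell
# Hypothetical frame patterns; replace them with names from your own application.
set -- \
    --launcher-args="-n4" \
    --errors-from='source\.c' \
    --errors-from='libsolver' \
    -- ./a.out binary_argument

# Show the full command line, then run it only if the tool is available.
echo "sanitizers4hpc $*"
if command -v sanitizers4hpc >/dev/null 2>&1; then
    sanitizers4hpc "$@"
fi
```

Each --errors-from instance adds one pattern, so the report is limited to frames matching any of them.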

CPU Sanitizers

To use Sanitizers4hpc, your application must be built with Sanitizer support for your desired tool. Sanitizers4hpc supports the Sanitizer libraries included with both the Cray CCE and the GNU GCC compilers.

- AddressSanitizer detects memory access errors, including out-of-bounds reads and writes, as well as invalid frees or stack usage. It is enabled with the compiler flag -fsanitize=address. More information and runtime flags can be found at https://clang.llvm.org/docs/AddressSanitizer.html.
- LeakSanitizer detects runtime memory leaks and can be used in conjunction with AddressSanitizer. It is enabled with the compiler flag -fsanitize=leak. More information and runtime flags can be found at https://clang.llvm.org/docs/LeakSanitizer.html.
- ThreadSanitizer detects possible race conditions in multithreaded code. It is enabled with the compiler flag -fsanitize=thread. More information and runtime flags can be found at https://clang.llvm.org/docs/ThreadSanitizer.html.
- MemorySanitizer detects use of uninitialized memory. It is enabled with the compiler flags -fsanitize=memory -fno-omit-frame-pointer. More information and runtime flags can be found at https://clang.llvm.org/docs/MemorySanitizer.html.

If your MSan-instrumented application calls uninstrumented library functions, you will likely see false positive output from MemorySanitizer. Clang MemorySanitizer does not natively support suppression files. However, Sanitizers4hpc implements suppression functionality similar to that of AddressSanitizer and LeakSanitizer.

The default suppression file suppresses all function calls in libraries matching libmpi*.so. You can set your own suppression file with SANITIZERS4HPC_MSAN_SUPPRESSIONS=<path>, or with the Clang-style option MSAN_OPTIONS=suppressions=<path>. Set SANITIZERS4HPC_MSAN_SUPPRESSIONS= to an empty value to disable the default suppression file. A suppression file can contain the following entry types:

- interceptor_via_fun: Filter functions matching this value. Wildcards are supported with *
- interceptor_via_lib: Filter files (libraries, source files) matching this value. Wildcards are supported with *
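
As an illustration, a custom suppression file could be created and selected as sketched below. The one-entry-per-line type:pattern syntax is an assumption modeled on the AddressSanitizer suppression format, and the file location and the legacy_io_* function pattern are hypothetical.

```shell
# Hypothetical location for the suppression file.
supp_file="${TMPDIR:-/tmp}/msan_suppressions.txt"

# Entry syntax (type:pattern, one per line) is assumed to follow the
# AddressSanitizer suppression format; legacy_io_* is a made-up function pattern.
cat > "$supp_file" <<'EOF'
interceptor_via_lib:libmpi*.so
interceptor_via_fun:legacy_io_*
EOF

# Point Sanitizers4hpc at the custom file instead of the default one.
export SANITIZERS4HPC_MSAN_SUPPRESSIONS="$supp_file"
```

The first entry mirrors what the default file is described as doing, so MPI library output stays suppressed after the default file is replaced.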

AMD GPU Sanitizers

Some versions of the AMD AFAR compiler support AddressSanitizer for HIP code running in a GPU kernel. Refer to the compiler documentation for details on building and running HIP code with Sanitizers enabled.

AMD’s GPU Sanitizer implementation is based on AddressSanitizer and can detect memory errors inside running GPU kernels. GPU code must be built with a version of AMD’s AFAR compiler that provides GPU Sanitizer support. It is enabled with the same compiler flag as AddressSanitizer: -fsanitize=address.

After building your target HIP application with GPU Sanitizers enabled, no further modifications are required to run Sanitizers4hpc with GPU Sanitizer support. Additionally, Sanitizers4hpc automatically applies a custom leak suppression file that will clean up a number of false-positive leaks from the HIP runtime library.

GPU Sanitizer output includes the memory error type (e.g. invalid read or write) and the kernel coordinates:

ERROR: AddressSanitizer: heap-buffer-overflow on amdgpu device 0 at pc ...
READ of size 4 in workgroup id (328,0,0)
:0:rocdevice.cpp            :2616: 101892175517 us: 62816: [tid:0x7f288d4ff700] Device::callbackQueue aborting with error : HSA_STATUS_ERROR_MEMORY_FAULT: Agent attempted to access an inaccessible address. code: 0x2b

The AMD GPU Sanitizer library is in active development, and while error reporting details are limited, they will improve. Current limitations include reporting of only the top frame of the stack.

Nvidia GPU Sanitizers

Nvidia’s Compute Sanitizer tool is included in your system’s CUDA toolkit. Compute Sanitizer can detect and attribute out-of-bounds and misaligned memory access errors in CUDA applications. The tool also reports hardware exceptions encountered by the GPU.

Sanitizers4hpc analyzes your target application to determine whether Compute Sanitizer support is required. Use the -m, --cuda-sanitizer=COMPUTE_SANITIZER_PATH option to specify the path to your system’s Compute Sanitizer binary.

Upon encountering a memory error in a CUDA application, Sanitizers4hpc will invoke Compute Sanitizer’s Memcheck tool and analyze its results to display an aggregated error report.
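
A launch in CUDA Sanitizer mode might look like the following sketch. It assumes the Compute Sanitizer binary is named compute-sanitizer and is reachable on PATH (for example after loading a CUDA module). The launcher arguments and a.out are placeholders.

```shell
# Assumption: the CUDA toolkit's bin directory is on PATH, so the
# compute-sanitizer binary can be located with a PATH lookup.
cs_path="$(command -v compute-sanitizer 2>/dev/null || true)"

if [ -z "$cs_path" ]; then
    echo "compute-sanitizer not found on PATH; load your CUDA toolkit first" >&2
elif command -v sanitizers4hpc >/dev/null 2>&1; then
    # Pass the resolved binary path with the --cuda-sanitizer option.
    sanitizers4hpc --cuda-sanitizer="$cs_path" --launcher-args="-n4" -- ./a.out
fi
```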

CRAY MPI WRAPPERS

Memory errors can occur as a result of running an MPI command, such as MPI_Send.

int *sendbuf = malloc(sizeof(int) * 4);
// Out-of-bounds read inside MPI_Send
MPI_Send(sendbuf, 5, MPI_INT, 1, 1, MPI_COMM_WORLD);

Because the invalid memory access occurs as part of an MPI operation, it may not always be caught by AddressSanitizer unless the MPI library itself is rebuilt and instrumented with Sanitizer support.

If your target application is built against Cray MPI, Sanitizers4hpc will generate an application-specific wrapper library that performs correctness checking when running common MPI operations. This library is then preloaded into the application during launch, enabling reliable memory bounds checking with MPI functions.

For example, the above invalid MPI_Send operation will result in the following memory error reported by Sanitizers4hpc:

RANKS: <0>
AddressSanitizer: unknown-crash on address ...
WRITE of size 8 at ... thread T0
    #0 is_invalid_store ./libS4hMPIWrapper.c:161
    #1 MPI_Send ./libS4hMPIWrapper.c:379
    #2 test_invalid_MPI_Send ./source.c:307:2
    #3 main ./source.c:751:7

To disable automatic building and loading of the MPI wrapper library, use the -d option when launching Sanitizers4hpc, e.g. sanitizers4hpc -d ./a.out

The target application must be instrumented with AddressSanitizer, but can be either statically or dynamically linked against the Sanitizer library.

Automatic library build

The wrapper library must be built against the same Cray MPI and AddressSanitizer header versions as the target application. Because of this, Sanitizers4hpc will attempt to automatically detect the correct include directories for your application’s MPI header files, and generate the AddressSanitizer headers.

To override the automatically detected header locations, set the following environment variables as needed:

- SANITIZERS4HPC_MPI_INCLUDE to the MPI include directory containing mpi.h
- SANITIZERS4HPC_ASAN_INCLUDE to the Clang Sanitizers include directory that contains the subdirectory sanitizer. That subdirectory should contain the file asan_interface.h.
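
For illustration, the overrides could be set as in the sketch below before launching Sanitizers4hpc. Both paths are placeholders rather than real install locations. The loop only warns if the expected header files are missing.

```shell
# Placeholder paths; substitute the include directories used on your system.
export SANITIZERS4HPC_MPI_INCLUDE="/path/to/cray-mpich/include"
export SANITIZERS4HPC_ASAN_INCLUDE="/path/to/clang/include"

# Check for the header files that each directory is expected to provide.
for required in \
    "$SANITIZERS4HPC_MPI_INCLUDE/mpi.h" \
    "$SANITIZERS4HPC_ASAN_INCLUDE/sanitizer/asan_interface.h"
do
    [ -f "$required" ] || echo "warning: $required not found" >&2
done
```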

Listing of wrapped MPI functions

MPI_Send
MPI_Recv
MPI_Isend
MPI_Irecv
MPI_Sendrecv
MPI_Sendrecv_replace
MPI_Bsend
MPI_Ibsend
MPI_Ssend
MPI_Rsend
MPI_Bcast
MPI_Gather
MPI_Scatter
MPI_Allgather
MPI_Allreduce
MPI_Reduce
MPI_Alltoall
MPI_Gatherv
MPI_Scatterv
MPI_Reduce_scatter
MPI_Put
MPI_Get
MPI_Accumulate
MPI_Get_accumulate
MPI_Fetch_and_op
MPI_Compare_and_swap
MPI_Pack
MPI_Unpack

NOTES

Troubleshooting job launch

Sanitizers4hpc relies on the Cray debugger support library, CTI, to launch jobs across a variety of system and workload manager configurations. If you are encountering problems, such as hangs or launch errors when running a job with Sanitizers4hpc, you may need to set certain CTI_ environment variables as referenced in the launch error message.

For a detailed explanation of CTI launch errors and solutions, refer to the CTI manpage by running:

module load cray-cti
man cti

MPI and SHMEM debugging

AddressSanitizer and LeakSanitizer tools do not natively support debugging memory errors for memory managed by MPI or SHMEM libraries. If your program uses MPI or SHMEM operations, Sanitizers4hpc will still be able to run your program and detect memory errors in the local heap, but it will not be able to detect memory errors that occur inside an MPI or SHMEM operation.

For example, if an MPI_Recv operation writes past the end of a buffer during receive, no error will be caught by AddressSanitizer. On the other hand, a memcpy from a SHMEM mapped region to a local heap-allocated buffer will still raise an AddressSanitizer error.

For Sanitizer tools to catch memory errors inside MPI operations, the MPI library itself must be rebuilt with the compiler flag -fsanitize=address. This will add the proper instrumentation to the MPI operations needed for AddressSanitizer and LeakSanitizer correctness checking.

Preloading `libfabric`

When AddressSanitizer is enabled before libfabric is loaded, it can interfere with libfabric initialization and lead to a segfault. Sanitizers4hpc will work around this by automatically preloading libfabric before starting the application. To disable this, set SANITIZERS4HPC_PRELOAD_LIBFABRIC=0.

AMD HIP runtime

To properly support AMD GPU Sanitizers, the active HIP runtime needs to have been instrumented. Sanitizers4hpc will detect if an application is linked against the HIP runtime, and attempt to automatically prepend the directory containing the instrumented runtime to LD_LIBRARY_PATH.

If during the course of running an application with AMD GPU Sanitizers you see the following error:

Hostcall: no handler found for service ID 4

This indicates that the instrumented HIP runtime was not successfully loaded.

For official ROCm builds, AMD provides instrumented HIP runtimes in the subdirectory asan of the normal HIP runtime location. Find the directory containing an instrumented libamdhip64.so and prepend it to LD_LIBRARY_PATH. Then, re-run Sanitizers4hpc.
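
The manual steps above can be sketched as follows. The helper function find_asan_hip_dir is hypothetical, and using ROCM_PATH with a fallback of /opt/rocm as the search root is an assumption about your installation.

```shell
# Hypothetical helper: print the directory of the first libamdhip64.so
# found beneath an "asan" subdirectory of the given root.
find_asan_hip_dir() {
    s4h_hip_lib="$(find "$1" -path '*/asan/*' -name 'libamdhip64.so' 2>/dev/null | head -n 1)"
    if [ -n "$s4h_hip_lib" ]; then
        dirname "$s4h_hip_lib"
    fi
}

# Assumption: ROCm is installed under $ROCM_PATH, or /opt/rocm if unset.
rocm_root="${ROCM_PATH:-/opt/rocm}"
asan_dir="$(find_asan_hip_dir "$rocm_root")"

if [ -n "$asan_dir" ]; then
    # Prepend so the instrumented runtime takes precedence over the regular one.
    export LD_LIBRARY_PATH="${asan_dir}${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    echo "Prepended $asan_dir to LD_LIBRARY_PATH"
else
    echo "No instrumented libamdhip64.so found under $rocm_root" >&2
fi
```

If the search finds nothing, check the ROCm documentation for where your release installs its instrumented libraries.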