HPE Cray Programming Environment User Guide: CSM

About

The HPE Cray Programming Environment User Guide: CSM includes programming environment and user access concepts, configuration information, component overviews, and relevant examples.

IMPORTANT: This guide assumes an HPE Cray Supercomputing EX system running the HPE Cray Supercomputing Operating System (COS). Technical details for a system running SUSE or Red Hat may differ.

This publication is intended for software developers, engineers, scientists, and other programming environment users.

HPE Cray Programming Environment Components

The HPE Cray Programming Environment (CPE) provides tools designed to maximize developer productivity, application scalability, and code performance. It includes compilers, analyzers, optimized libraries, and debuggers. It also provides a variety of parallel programming models that allow users to make appropriate choices based on the nature of existing and new applications.

CPE uses build environment containers, which provide the ability to compile applications and to launch jobs and track their status. Containers enable users to store and retrieve files from both local and shared system storage. Users can access CPE on User Access Nodes (UANs) or through User Access Instances (UAIs).

CPE components include:

  • HPE Cray Compiling Environment (CCE) - CCE consists of compilers that perform code analysis during compilation and automatically generate highly optimized code. Compilers support numerous command-line arguments to provide manual control over compiler behavior and optimization. Supported languages include Fortran, C and C++, and UPC (Unified Parallel C).

  • HPE Cray Scientific and Math Libraries (CSML) - CSML is a set of high-performance libraries that provide portability for scientific applications by implementing APIs for array data (NetCDF), dense linear algebra (BLAS, LAPACK, ScaLAPACK), and fast Fourier transforms (FFTW).

  • HPE Cray Message Passing Toolkit (CMPT) - CMPT is a collection of software libraries enabling data transfers between nodes running in parallel applications. CMPT comprises the Message Passing Interface (MPI) and OpenSHMEM parallel programming models. CMPT libraries support practical, portable, efficient, and flexible mechanisms for performing data transfers between parallel processes.

  • HPE Cray Environment Setup and Compiling Support (CENV) - CENV provides libraries that support code compilation and setting up the development environment. It comprises compiler drivers, hugepages utility, and the CPE API, which is a software package used for building module files.

  • HPE Cray Performance Measurement & Analysis Tools (CPMAT) - CPMAT provides tools to analyze the performance and behavior of programs that are run on Cray systems, and a Performance API (PAPI).

  • HPE Cray Debugging Support Tools (CDST) - CDST provides debugging tools, including gdb4hpc, Valgrind4hpc and Sanitizers4hpc.

Cray Environment Setup and Compiling Support

HPE Cray Environment (CENV) provides libraries that support code compilation and development environment setup. It comprises compiler drivers, utilities, and the CPE API (craype-api), which is a software package used for building module files.

Modules and modulefiles

CPE Environment Modules enables users to modify their environment dynamically by using modulefiles. The module command provides a user interface to the Modules package. The module command system interprets modulefiles, which contain Tool Command Language (Tcl) code, and dynamically modifies shell environment variables such as PATH and MANPATH.

Sites can alternatively enable Lmod to handle modules. Both module systems use the same module names and syntax shown in command-line examples.
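Typical module operations look like the following sketch; the module names shown are examples from this guide, and module avail reports what a given site actually provides:

```shell
# List modulefiles available on the system
module avail

# Show modules currently loaded in the shell session
module list

# Load a programming environment and a library module
module load PrgEnv-cray
module load cray-libsci

# Swap one loaded module for another
module swap PrgEnv-cray PrgEnv-gnu

# Remove a module from the environment
module unload cray-libsci
```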

Tip: Use either Environment Modules or Lmod on a per-system basis. The systems are mutually exclusive and cannot both run on the same system.

The /etc/cray-pe.d/cray-pe-configuration.sh and /etc/cray-pe.d/cray-pe-configuration.csh configuration files allow sites to customize the default environment. The system administrator can also create modulefiles for a product set to support user-specific needs. For more information about the Environment Modules software package, see the module(1) and modulefile(4) manpages.

Lmod

In addition to the default Environment Modules system, CPE offers support for Lmod as an alternative module management system. Lmod is a Lua-based module system that loads and unloads modulefiles, handles path variables, and manages library and header files. The CPE implementation of Lmod is hierarchical, managing module dependencies and ensuring any module a user has access to is compatible with other loaded modules. Features include:

  • Lmod is set up to automatically load a default set of modules. The default set includes one each of compiler, network, CPU, and MPI modules. Users may choose to load different modules. However, it is recommended that, at minimum, a compiler, network, CPU, and an MPI module be loaded, to ensure optimal assistance from Lmod.

  • Lmod supports loading multiple different compiler modules concurrently by loading a dominant core-compiler module and one or more supporting mixed-compiler modules. See Lmod Mixed Compiler Support for details.

  • Lmod uses “families” of modules to flag circular conflicts, which is most apparent when module details are displayed through module show and when users attempt to load conflicting modules.

  • Environment Modules and Lmod modules use the same names, so all command examples work similarly.

Tip: Environment Modules and Lmod are mutually exclusive, and both cannot run on the same system. Contact the system administrator about setting Lmod as the default module management system.

For more Lmod information, see The User Guide for Lmod.

About Hugepages

Note: This hugepages implementation works only with the HPE Cray Supercomputing Operating System (COS). On other Linux distributions, use the hugepages implementation appropriate for that distribution.

Hugepages are virtual memory pages that are larger than the default base page size of 4K bytes. Hugepages can improve memory performance for common access patterns on large data sets. Hugepages also increase the maximum size of data and text in a program accessible by the high-speed network. Access to hugepages is provided through a virtual file system called hugetlbfs. Every file on this file system is backed by hugepages and is directly accessed with mmap() or read().

The libhugetlbfs library allows an application to use hugepages more easily by directly accessing the hugetlbfs file system. A user may enable libhugetlbfs to back application text and data segments.

Use hugepages for the following:

  • For SHMEM applications, map the static data and/or private heap onto huge pages.

  • For applications written in Unified Parallel C and other languages based on the PGAS programming model, map the static data and/or private heap onto huge pages.

  • For MPI applications, map the static data and/or heap onto huge pages.

  • For applications using shared memory that are concurrently registered with high-speed network drivers for remote communication.

  • For applications doing heavy I/O.

  • To improve memory performance for common access patterns on large data sets.

Note: On x86 processors, the only page sizes supported by the processor are 4K, 2M, and 1G.

See the intro_hugepages(3) man page for more details.

To use hugepages, load the appropriate craype-hugepages module at link time. Possible values are:

  • craype-hugepages128K

  • craype-hugepages512K

  • craype-hugepages2M

  • craype-hugepages4M

  • craype-hugepages8M

  • craype-hugepages16M

  • craype-hugepages32M

  • craype-hugepages64M

  • craype-hugepages128M

  • craype-hugepages256M

  • craype-hugepages512M

  • craype-hugepages1G

  • craype-hugepages2G

For example, to use 2 megabyte hugepages:

user@hostname> module load craype-hugepages2M
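A typical workflow loads the hugepages module before linking and keeps it loaded in the job environment at run time. A minimal sketch, in which the application name and the srun launch are placeholders for a site's actual program and workload manager:

```shell
# Load the 2M hugepages module before compiling/linking
module load craype-hugepages2M

# Link with the compiler wrapper; the wrapper adds the
# hugepages linkage automatically
cc -o myapp myapp.c

# Keep the module loaded in the job environment so the
# same page size is used at run time
srun ./myapp
```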

About Cray Compiling Environment

Module: PrgEnv-cray

Command: ftn, cc, CC

Compiler-specific manpages: crayftn(1), craycc(1), crayCC(1) - available only when the compiler module is loaded

Online help: ftn -help, cc -help, CC -help

Documentation: See Additional Resources

To use the Cray Compiling Environment (CCE), load the PrgEnv-cray module.

user@hostname> module load PrgEnv-cray

CCE provides Fortran, C and C++ compilers that perform substantial analysis during compilation and automatically generate highly optimized code. The compilers support numerous command-line arguments that enable manual control over compiler behavior and optimization. For more information about the Cray Fortran, C, and C++ compiler command-line arguments, see the crayftn(1), craycc(1), and crayCC(1) manpages, respectively.

PrgEnv modules provide wrappers (cc, CC, ftn) for both CCE and third-party compiler drivers. These wrappers call the correct compiler with appropriate options to build and link applications with the relevant libraries, as required by the loaded modules. (Only dynamic linking is supported.) These wrappers replace direct calls to compiler drivers in Makefiles and build scripts.
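For example, a Makefile or build script might replace direct compiler calls with the wrappers as follows; the source file names here are placeholders:

```shell
# Instead of invoking gfortran/gcc/g++ or crayftn/craycc/crayCC directly,
# use the CPE wrappers, which pick up the currently loaded PrgEnv:
ftn -c solver.f90                      # Fortran compile
cc  -c utils.c                         # C compile
CC  -o app main.cpp solver.o utils.o   # C++ compile and link
```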

One of the most useful compiler features is the ability to generate annotated loopmark listings showing what optimizations were performed and their locations. Together with compiler messages, these listings can help locate areas in the code that are compiling without error but are not fully optimized. For more detailed information about generating and reading loopmark listings, see crayftn(1), craycc(1), and crayCC(1) manpages, the Cray Fortran Reference Manual (S-3901), and HPE Performance Analysis Tools User Guide (S-8014). See Additional Resources for direct links to these publications.
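For example, a loopmark listing can be requested at compile time. The flag spellings below are those documented in the crayftn(1) and craycc(1) manpages for recent CCE releases; verify them against the installed version:

```shell
# Cray Fortran: -h list=a writes an annotated listing file
# (prog.lst) that includes loopmark information
ftn -h list=a prog.f90

# Cray C/C++ (Clang-based): -fsave-loopmark writes a loopmark file
cc -fsave-loopmark prog.c
```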

In many cases, code that is not properly optimizing can be corrected without substantial recoding by applying the right pragmas or directives. For more information about compiler pragmas and directives, see the intro_directives(1) man page.

Third-Party Compilers

CPE supports third-party compilers, including:

  • AOCC

  • AMD ROCm

  • Intel

  • GNU

  • NVIDIA

The compilers and their respective dependencies, including wrappers and mappings (for example, mapping cc to gcc in PrgEnv-gnu), are loaded using the module load <modulename> command. For example,

user@hostname> module load PrgEnv-gnu

About AOCC

Module: PrgEnv-aocc

Command: ftn, cc, CC

Documentation: AOCC Documentation

CPE enables, but does not bundle, the AMD Optimizing C/C++ Compiler (AOCC). CPE provides a bundled package of support libraries to install into the programming environment to enable AOCC and CPE utilities such as debuggers and performance tools.

  • If not available on the system, contact a system administrator to install AOCC and the support bundle.

  • To use AOCC, load the PrgEnv-aocc module:

    user@hostname> module load PrgEnv-aocc
    

About the AMD ROCm compiler

Module: PrgEnv-amd

Command: ftn, cc, CC

Documentation: https://rocmdocs.amd.com/en/latest/

CPE enables, but does not bundle, the AMD ROCm Compiler. CPE provides a bundled package of support libraries to install into the programming environment to enable this compiler and CPE utilities, such as debuggers and performance tools. Contact your system administrator to install ROCm and the support bundle if these resources are not available on the system.

The “amd” module provided by CPE is auto loaded when PrgEnv-amd is loaded. This module supports AMD ROCm C/C++/Fortran Compilers. The AMD compiler module enables access to AMD compatible libraries.

The “rocm” toolkit provided by CPE is optional and must be loaded by the user. The ROCm toolkit module extends the AMD compiler module to enable support for ROC Profiler, ROC Tracer, HIP, and ROCm. The ROCm module enables access to AMD accelerators for all programming environments.

Load the PrgEnv-amd module to use AMD:

user@hostname> module load PrgEnv-amd

About the Intel compiler

Module: PrgEnv-intel

Command: ftn, cc, CC

Documentation: Intel oneAPI Website

CPE enables, but does not bundle, the Intel® oneAPI for Linux compiler. CPE provides a bundled package of support libraries to install into the programming environment to enable the Intel compiler and CPE utilities such as debuggers and performance tools.

Intel oneAPI includes their “classic” compilers (icc, icpc and ifort) as well as new versions of each of those (icx, icpx, and ifx). Because ifx is an experimental Fortran compiler, Intel encourages users to stay with the “classic” Fortran (ifort) compiler along with their new C/C++ (icx and icpx).

Because all of the Intel compilers come in the same package, the PrgEnv-intel meta-module now has three options for the “intel” sub-module. They are:

  1. intel ( icx, icpx, ifort ) - PrgEnv-intel defaults to this given it is Intel’s recommendation

  2. intel-classic ( icc, icpc, ifort ) - All “classic” Intel compilers

  3. intel-oneapi ( icx, icpx, ifx ) - All “new” Intel compilers, where ifx is “beta” per Intel

    • To use the Intel DPC++/C++ compiler, load the PrgEnv-intel module:

      user@hostname> module load PrgEnv-intel
      
    • To use the Intel C++ Compiler Classic instead, switch to the intel-classic module:

      user@hostname> module swap intel intel-classic
      

About GNU

Module: PrgEnv-gnu

Command: ftn, cc, CC

Compiler-specific manpages: gcc(1), gfortran(1), g++(1) - available only when the compiler module is loaded.

Documentation: GCC Online Documentation

CPE bundles and enables the open-source GNU Compiler Collection (GCC).

  • To use GCC, load the PrgEnv-gnu module:

    user@hostname> module load PrgEnv-gnu
    

About the NVIDIA compiler

Module: PrgEnv-nvidia

Command: ftn, cc, CC

Documentation: NVIDIA HPC Compilers User’s Guide

CPE enables, but does not bundle, the Nvidia Compilers. CPE provides a bundled package of support libraries to install into the programming environment to enable this compiler and CPE utilities, such as debuggers and performance tools. Contact your system administrator to install Nvidia and the support bundle if these resources are not available on the system.

The “nvidia” module provided by CPE is auto loaded when PrgEnv-nvidia is loaded. This module supports Nvidia C/C++/Fortran Compilers. This compiler module enables access to Nvidia compatible libraries.

The “cuda” toolkit module extends the Nvidia compiler module to enable support for the NVIDIA CUDA compiler, libraries, debuggers, profilers, and other utilities for developing applications targeting NVIDIA GPUs. This module is required to interface with NVIDIA accelerators for all programming environments.

Load the PrgEnv-nvidia module to use Nvidia:

user@hostname> module load PrgEnv-nvidia

Programming languages

The following programming languages are bundled with and supported by the CCE:

  • Fortran - The CCE Fortran compiler supports the Fortran 2018 standard (ISO/IEC 1539:2018), with some exceptions and deferred features as noted elsewhere.

    • Documentation is available in HPE Cray Fortran Reference Manual (S-3901) and also in manpages, beginning with the crayftn(1) manpage. Where information in the manuals differs from the manpage, the information in the manpage is presumed to be more current.

    • For the current direct link to the reference manual, see Additional Resources.

  • C/C++ - The default C/C++ compiler is based on Clang/LLVM.

    • Supports Unified Parallel C (UPC), an extension of the C programming language designed for high performance computing on large-scale parallel systems.

    • Documentation is provided in Clang Documentation, clang(1) man page, and HPE Cray Clang C and C++ Quick Reference (S-2179).

    • For the current direct link to the quick reference, see Additional Resources.

The following third-party programming languages are bundled with the Programming Environment:

  • Python

  • R

HPE Cray Scientific and Math Libraries

Modules: cray-libsci, cray-libsci_acc, cray-fftw, cray-hdf5, cray-hdf5-parallel, cray-netcdf, cray-netcdf-hdf5parallel

Manpages: intro_libsci(3s), intro_libsci_acc(3s), intro_fftw3(3s) - available only when the associated module is loaded.

The HPE Cray Scientific and Math Libraries (CSML) are a collection of numerical routines optimized for best performance on HPE Cray Supercomputer systems. These libraries satisfy dependencies for many commonly used applications on HPE Cray systems for a wide variety of domains. If the module for a CSML package is loaded, all relevant headers and libraries for these packages are added to the compile and link lines of the cc, ftn, and CC CPE drivers. You must load the cray-hdf5 module (a dependency) before loading the cray-netcdf module.
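For example, to make the NetCDF headers and libraries visible to the compiler wrappers, load the modules in dependency order; the source file name below is a placeholder:

```shell
# cray-hdf5 is a dependency and must be loaded first
module load cray-hdf5
module load cray-netcdf

# The wrappers now add the relevant include and link flags,
# so no explicit -I/-L/-l options are needed:
cc -o convert convert.c
```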

The CSML collection contains the following scientific libraries:

  • BLAS (Basic Linear Algebra Subprograms)

  • CBLAS (Collection of wrappers providing a C interface to the Fortran BLAS library)

  • LAPACK (Linear Algebra PACKage)

  • LAPACKE (C interfaces to LAPACK Routines)

  • BLACS (Basic Linear Algebra Communication Subprograms)

  • ScaLAPACK (Scalable Linear Algebra PACKage)

  • FFTW3 (the Fastest Fourier Transforms in the West, release 3)

  • HDF5 (Hierarchical Data Format)

  • NetCDF (Network Common Data Format)

HPE Cray Message Passing Toolkit

Module: cray-mpich

Manpage: intro_mpi(3) - Available only when the associated module is loaded.

Website: MPI Forum

The HPE Cray Message Passing Toolkit (CMPT) is a collection of message passing libraries to aid in parallel programming.

MPI is a widely used parallel programming model that establishes a practical, portable, efficient, and flexible standard for passing messages between ranks in parallel processes. Cray MPI is derived from Argonne National Laboratory MPICH and implements the MPI-3.1 standard as documented by the MPI Forum in MPI: A Message Passing Interface Standard, Version 3.1.

MPI supports both OpenFabrics Interfaces (OFI) and Unified Communication X (UCX) network modules, with OFI typically the default. The two versions are binary compatible, so switching between them requires no recompiling or relinking of the application. Beyond general performance differences, where one module may perform better than the other for a given application, the OFI version has a known limitation when establishing initial connections for applications that use an all-to-all communication pattern, or a many-to-one pattern at very high scale. In these situations, try the UCX network module, rerun, and compare performance against the OFI version.
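Switching between the network modules is a module swap; because the versions are binary compatible, the application does not need to be relinked. The module names below are those used by recent CPE releases (confirm with module avail), and the srun launch is a placeholder for the site's workload manager:

```shell
# Swap the network module from OFI to UCX
module swap craype-network-ofi craype-network-ucx

# Swap in the matching MPI implementation
module swap cray-mpich cray-mpich-ucx

# Rerun the unmodified binary and compare performance
srun ./myapp
```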

Support for MPI varies depending on system hardware. To see which functions and environment variables the system supports, check the intro_mpi(3) manpage. Because the OFI and UCX MPI versions are quite different, different intro_mpi man pages are displayed depending on which module is loaded. These man pages are a good source of additional information for the currently running network module, including the different runtime environment variables that can further control performance.

OpenSHMEMX

Module: cray-openshmemx

Manpage: intro_shmem(3) - available only when the associated module is loaded.

Website: Cray OpenSHMEMX

OpenSHMEM is a Partitioned Global Address Space (PGAS) library interface specification. OpenSHMEM provides a standard Application Programming Interface (API) for SHMEM libraries to aid portability and facilitate uniform predictable results of OpenSHMEM programs by explicitly stating the behavior and semantics of the OpenSHMEM library calls.

SHMEM has a long history as a parallel programming model. Over the past two decades, SHMEM library implementations have evolved through several generations.

OpenSHMEMX is a proprietary OpenSHMEM implementation that is compliant with version 1.4 of the OpenSHMEM specification. Refer to the intro_shmem(3) manpage for more details.

DSMML

Module: cray-dsmml

Manpage: intro_dsmml(3) - available only when the associated module is loaded.

Website: Cray DSMML

Distributed Symmetric Memory Management Library (DSMML) is a proprietary memory management library. DSMML is a standalone library for maintaining distributed shared symmetric memory heaps for top-level PGAS languages and libraries such as Coarray Fortran, UPC, and OpenSHMEM. DSMML allows user libraries to create multiple symmetric heaps and share information with other libraries, enabling interoperability between PGAS programming models.

Refer to the intro_dsmml(3) manpage for more details.

Debugger Support Tools

CPE ships with numerous debugging tools.

HPE Tools

A number of tools are included:

  • Gdb4hpc - A command-line interactive parallel debugger that allows debugging of an application at scale. A good all-purpose debugger for tracking down bugs, analyzing hangs, and determining the causes of crashes.

  • Valgrind4hpc - A parallel debugging tool used to detect memory leaks and parallel application errors.

  • Sanitizers4hpc - A parallel debugging tool used to detect memory access or leak issues at runtime using information from LLVM Sanitizers.

  • Stack Trace Analysis Tool (STAT) - A single merged stack backtrace tool to analyze application behavior at the function level. Helps track down the cause of crashes.

  • Abnormal Termination Processing (ATP) - A scalable core file generation and analysis tool for analyzing crashes, with a selection algorithm to determine which core files to dump. Helps determine the cause of crashes.

  • Cray Comparative Debugger (CCDB) - CCDB is not a traditional debugger, but rather a tool to run and step through two versions of the same application side by side to help determine where they diverge.

All CPE debugger tools support C/C++, Fortran, and Unified Parallel C (UPC).

Third-Party Debugging Tools

There are two third-party debugging tools available:

Tool Infrastructure

CPE provides several tools for tool developers to enhance their own debuggers for use with the CPE:

  • Common Tools Interface (CTI) - Offers a simple, WLM agnostic API to support tools across all Cray systems.

  • Multicast Reduction Network (MRNET) - Provides a scalable communication tool for libraries.

  • Dyninst - Provides dynamic instrumentation libraries.

Cray Performance Measurement and Analysis Tools

The Cray Performance Measurement and Analysis Tools (CPMAT) suite reduces the time needed to port and tune applications. It provides an integrated infrastructure for measurement, analysis, and visualization of computation, communication, I/O, and memory utilization to help users optimize programs for faster execution and more efficient computing resource usage.

The toolset allows developers to perform sampling, profiling, and tracing experiments on executables, extracting information at the program, function, loop, and line level. Programs written in Fortran, C/C++ (including UPC), and HIP, using MPI, SHMEM, OpenMP, CUDA, or a combination of these programming models, are supported. It supports profiling applications built with CCE, AMD, and GNU compilers.

Performance analysis consists of three basic steps:

  1. Instrument the program to specify what kind of data to collect under what conditions.

  2. Execute the instrumented executable to generate and capture data.

  3. Analyze the resulting data.

Programming interfaces include:

  • perftools-lite-* - Simple interface that produces reports to stdout. There are four perftools-lite submodules:

    • perftools-lite - Lowest-overhead sampling experiment; identifies key program bottlenecks.

    • perftools-lite-events - Produces a summarized trace; a good tool for detailed MPI statistics, including synchronization overhead.

    • perftools-lite-loops - Provides loop work estimates (must be used with CCE).

    • perftools-lite-hbm - Reports memory traffic information (CCE, x86-64 systems only). See the perftools-lite(4) manpage for details.

  • perftools - Advanced interface provides full-featured data collection and analysis capability, including full traces with timeline displays. It includes the following components:

    • pat_build - Utility instruments programs for performance data collection.

    • pat_report - After using pat_build to instrument the program, setting runtime environment variables, and executing the program, use pat_report to generate text reports from the resulting data and to export the data for use in other applications. See the pat_report(1) manpage for details.

    • CrayPat runtime library - Collects specified performance data during program execution. See the intro_craypat(1) manpage for details.

  • pat_run - Launches a dynamically linked program instrumented for performance analysis. After a successful run, collected data can be explored further with the pat_report and Cray Apprentice2 tools. See the pat_run(1) manpage for details.
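The three-step workflow with the advanced perftools interface can be sketched as follows; the program name, the srun launch, and the experiment-data name are placeholders, and the exact data file or directory name produced at run time varies by release:

```shell
# 1. Instrument: build normally, then instrument with pat_build
module load perftools-base perftools
cc -o myapp myapp.c
pat_build myapp            # produces the instrumented binary myapp+pat

# 2. Execute the instrumented binary to collect performance data
srun ./myapp+pat           # writes experiment data alongside the binary

# 3. Analyze the collected data (or open it in Cray Apprentice2)
pat_report myapp+pat*
```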

Also included:

  • PAPI - The PAPI library, from the Innovative Computing Laboratory at the University of Tennessee, Knoxville, is distributed with the performance tools. PAPI allows applications or custom tools to interface with hardware performance counters made available by the processor, network, or accelerator vendor. Performance tools components use PAPI internally for CPU, GPU, and network performance counter collection for derived metrics, observations, and performance reporting. A simplified user interface is provided for accessing counters, which avoids the source code modification required to use PAPI directly.

  • Cray Apprentice2 - An interactive X Window System tool for visualizing and manipulating performance analysis data captured during program execution.

  • pat_view - Aggregates and presents multiple sampling experiments for program scaling analysis. See the pat_view(1) manpage for more information.

  • Reveal - Extends performance tools technology by combining performance statistics and program source code visualization with compiler optimization feedback to better identify and exploit parallelism, and to pinpoint memory bandwidth sensitivities in an application. Using the program library provided by CCE and the collected performance data, users can navigate source code to highlighted dependencies or bottlenecks and identify which high-level loops could benefit from loop-level optimizations such as OpenMP parallelism or exposing vector parallelism. Reveal provides dependency and variable-scoping information for those loops and assists the user with creating parallel directives.

Use performance tools to:

  • Identify bottlenecks

  • Find load-balance and synchronization issues

  • Find communication overhead issues

  • Identify loops for parallelization

  • Map memory bandwidth utilization

  • Optimize vectorization within application code

  • Collect application energy consumption information

  • Collect scaling information for application code

  • Interpret performance data

More information is available in HPE Performance Analysis Tools User Guide (S-8014). For the current direct link to this publication, see Additional Resources.

About CPE Deep Learning Plugin

Modules: craype-dl-plugin-py3, craype-dl-plugin-py2

Commands: import dl_comm as cdl, help(cdl), help(cdl.gradients)

Manpage: intro_dl_plugin(3)

The CPE Deep Learning Plugin (CPE DL Plugin) is a highly tuned communication layer for performing distributed deep learning training. The CPE DL Plugin provides a high performance gradient-averaging operation and routines to facilitate process identification, job size determination, and broadcasting of initial weights and biases. The routines can be accessed through the plugin’s C or Python APIs. The Python API provides support for TensorFlow, PyTorch, Keras, and NumPy.

For more information about CPE DL Plugin directives, see the intro_dl_plugin(3) manpage.

User Access Service

The User Access Service (UAS) is a containerized service managed by Kubernetes that enables application developers to create and run user applications. UAS runs on a non-compute node (NCN) that is acting as a Kubernetes worker node.

Users launch a User Access Instance (UAI) using the cray command. Users can also transfer data between the Cray system and external systems using the UAI.

When a user requests a new UAI, the UAS service returns status and connection information to the newly created UAI. External access to UAS is routed through a node that hosts gateway services.

The time zone inside the UAI container matches the time zone of the host on which it is running. For example, if the time zone on the host is set to CDT, the UAIs on that host are also set to CDT.

Table: UAS Components

  Component                    Function/Description
  ---------                    --------------------
  User Access Instance (UAI)   An instance of a UAS container
  uas-mgr                      Manages UAI life cycles

Table: UAS Container Contents

  Container Element   Components
  -----------------   ----------
  Operating system    SLES15 SP1
  kubectl command     Utility to interact with Kubernetes
  cray command        Command that allows users to create, describe, and delete UAIs

Use cray uas list to list the following parameters for a UAI.

The example values below are used throughout the UAS procedures as examples only. Substitute site-specific values.

Table: UAS Parameters

  Parameter            Description                     Example value
  ---------            -----------                     -------------
  uai_connect_string   The UAI connection string       ssh user@203.0.113.0 -i ~/.ssh/id_rsa
  uai_img              The UAI image ID                registry.local/cray/cray-uas-sles15sp1-slurm:latest
  uai_name             The UAI name                    uai-user-be3a6770
  uai_status           The state of the UAI            Running: Ready
  username             The user who created the UAI    user
  uai_age              The age of the UAI              11m
  uai_host             The node hosting the UAI        ncn-m001
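The surrounding procedures manage UAIs with the cray CLI. A typical legacy-mode sequence is sketched below; the key path and UAI name are illustrative, and the full option set is documented in the CSM UAS procedures:

```shell
# Authenticate the cray CLI
cray auth login

# Create a UAI using your SSH public key
cray uas create --publickey ~/.ssh/id_rsa.pub

# List UAIs and their parameters (including uai_connect_string)
cray uas list

# Delete a UAI when finished
cray uas delete --uai-list uai-user-be3a6770
```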

UAS Limitations

Functionality Not Currently Supported by the User Access Service

  • Lustre (lfs) commands within the UAS service pod.

  • Executing Singularity containers within the UAS service.

  • Building Docker containers within the UAS environment.

  • Building containerd containers within the UAS environment.

  • dmesg cannot run inside a UAI due to container security limitations.

  • Users cannot ssh from ncn-m001 to a UAI. This is because UAIs use LoadBalancer IPs on the Customer Access Network (CAN) instead of NodePorts and the LoadBalancer IPs are not accessible from ncn-m001.

Other Limitations

  • A known issue exists where X11 traffic may not forward DISPLAY correctly if the user logs into an NCN node before logging into a UAI.

  • The cray uas uais commands are not restricted to the user authenticated with cray auth login.

Limitations Related To Restarts

Changes made to a running UAI will be lost if the UAI is restarted or deleted. The only changes in a UAI that will persist are those written to an externally mounted file system (such as Lustre or NFS).

A UAI may restart due to an issue on the physical node, scheduled node maintenance, or intentional restarts by a site administrator. In this case, any running processes (such as compiles), Slurm interactive jobs, or changes made to the UAI (such as package installations) are lost.

If a UAI restarts on a node that was recently rebooted, some of the configured volumes may not be ready and it could appear that content in the UAI is missing. In this case, restart the UAI.

Log in to an HPE Cray Supercomputing EX System

Users access an HPE Cray Supercomputing EX system through either a User Access Instance (UAI) or a User Access Node (UAN).

Log in to a User Access Instance (UAI)

UAIs are containerized environments managed by Kubernetes and hosted on a non-compute node (NCN) acting as a worker (such as ncn-m001). The user must create a UAI before they are able to log in.

To log into an existing UAI, use the uai_connect_string created in Create a UAI from a Specific Image in Legacy Mode:

user@hostname> ssh <USERNAME@UAI_IP_ADDRESS> -i ~/.ssh/id_rsa

When challenged for a password, enter the passphrase specified when the ssh key was generated.

Log in to a User Access Node (UAN)

UANs are bare metal nodes configured by the system administrator, running an image based on the compute node image. To log in to a UAN, use the ssh command. When challenged, enter the password for your username:

user@hostname> ssh <uan_ip>
Last login: Thu Aug 26 09:28:38 CDT 2021 from 172.96.255.255

This node is running Cray's Linux Environment version 1.3.0

############################################################################

 .d8888b.                                888     888       d8888 888b    888
d88P  Y88b                               888     888      d88888 8888b   888
888    888                               888     888     d88P888 88888b  888
888        888d888 8888b.  888  888      888     888    d88P 888 888Y88b 888
888        888P"      "88b 888  888      888     888   d88P  888 888 Y88b888
888    888 888    .d888888 888  888      888     888  d88P   888 888  Y88888
Y88b  d88P 888    888  888 Y88b 888      Y88b. .d88P d8888888888 888   Y8888
 "Y8888P"  888    "Y888888  "Y88888       "Y88888P" d88P     888 888    Y888
                                888
                           Y8b d88P
                            "Y88P"

You have logged into a Cray Shasta Premium User Access Node

Hostname:     uan01
Distribution: SLES 15.1
CPUS:         128
Memory:       257.5GB
Configured:   2021-08-26


Please contact your IT system admin for any support requests.
#############################################################################

Differences Between UAI and UAN

                    UAI                                           UAN

Type                Container managed by Kubernetes               Bare metal server

Single/Multiuser    Single user                                   Multiuser

Access              - SSH with user SSH keys                      - SSH with user login/password
                    - UAI must be initially created by user       - Always up, no need for user to create
                    - No fixed hostname or IP address             - Fixed hostname and IP address
                    - SSH uses non-standard ports                 - SSH uses standard ports

Image               - Default image based on compute image        - Default image based on compute image
                    - Each user can have their own image          - All users share the same image
                    - Each user is in their own process space     - All users share the same process space

craycli             Yes                                           Yes

Data Transfer       SCP using UAI IP address, port number,        SCP using UAN hostname and
                    and user SSH keys                             username/password

Troubleshooting     Kubernetes logs and techniques plus           Normal Linux logs and techniques
                    Linux techniques

Create a UAI from a Specific Image in Legacy Mode

PREREQUISITES

  • A public SSH key must be available.

  • The HPE Cray CLI for non-admin users must be initialized.

OBJECTIVE

This procedure details how to create a UAI that uses a specific, registered image.

PROCEDURE

  1. List available UAS images:

    ncn-m001# cray uas images list
    default_image = "registry.local/cray/cray-uas-sles15sp1-slurm:latest"
    image_list = [ "registry.local/cray/cray-uas-sles15sp1-slurm:latest",
    "registry.local/cray/cray-uas-sles15sp1:latest",]
    

    Troubleshooting: If the HPE Cray CLI has not been initialized, the CLI commands will not work. See Configure the Cray Command Line Interface (CLI) in the Cray System Management Administration Guide for more information. See HPE Cray Supercomputing EX software documentation links for additional information on this guide.

  2. Create a new UAI:

    ncn-m001# cray uas create --publickey ~/.ssh/id_rsa.pub
    uai_age = "0m"
    uai_connect_string = "ssh vers@10.102.10.249"
    uai_host = "ncn-m002"
    uai_img = "registry.local/cray/cray-uai-compute:latest"
    uai_ip = "10.102.10.249"
    uai_msg = "ContainerCreating"
    uai_name = "uai-vers-1dcf28e5"
    uai_status = "Waiting"
    username = "vers"
    [uai_portmap]
    

    To create a UAI with a non-default image, add the --imagename argument to the command above.

  3. Verify the UAI is in the “Running: Ready” state:

    ncn-m001# cray uas list
    uai_age = "1m"
    uai_connect_string = "ssh vers@10.102.10.249"
    uai_host = "ncn-m002"
    uai_img = "registry.local/cray/cray-uai-compute:latest"
    uai_ip = "10.102.10.249"
    uai_msg = "ContainerCreating"
    uai_name = "uai-vers-1dcf28e5"
    uai_status = "Running: Ready"
    username = "vers"
    
  4. Log in to the UAI with the connection string:

    user@hostname> ssh <USERNAME@UAI_IP_ADDRESS>
    

Access a UAI from Multiple Hosts

Add extra public SSH keys to a UAI to allow access from multiple hosts.

PREREQUISITES

  • A UAI must be running (see Create a UAI from a Specific Image in Legacy Mode).

  • A second SSH key pair must be available on the additional host.

OBJECTIVE

When a UAI is first created, it can only be accessed with one SSH key. This procedure adds more public keys to a UAI to allow access from multiple hosts.

PROCEDURE

  1. Identify the connection string for the UAI.

    ncn-m001$ cray uas list | grep connect_string
    uai_connect_string = "ssh user@203.0.113.0"
    
  2. Copy the new public key (~/.ssh/<second-key.pub>) to the UAI.

    ncn-m001$ scp -i ~/.ssh/id_rsa ~/.ssh/<second-key.pub> <USERNAME@UAI_IP_ADDRESS>:<second-key.pub>
    
  3. Add the new key to /etc/uas/ssh/authorized_keys in the UAI.

    ncn-m001$ ssh <USERNAME@UAI_IP_ADDRESS> 'cat ~/<second-key.pub> >> /etc/uas/ssh/authorized_keys'
    
  4. From the additional host, SSH to the UAI using the new key:

    user@hostname> ssh -i ~/.ssh/<second-key> <USERNAME@UAI_IP_ADDRESS>
    

Configuring the development environment with modules

Each modulefile contains information needed to configure the shell for an application. After the Modules package is initialized, the environment can be modified on a per-module basis using the module command. Typically, modulefiles instruct the module command to alter or set shell environment variables such as $PATH, $MANPATH, and so forth. Multiple users can share modulefiles on a system, and users can create their own to supplement or replace the shared modulefiles.
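Conceptually, loading a modulefile amounts to prepending tool-specific directories to variables such as $PATH and $MANPATH. A plain-shell illustration of that effect (the /opt/mytool/1.0 prefix is a hypothetical install location, not a real CPE path):

```shell
# Illustration only: the kind of environment change a modulefile performs.
# /opt/mytool/1.0 is a hypothetical install prefix.
PATH="/opt/mytool/1.0/bin:$PATH"
MANPATH="/opt/mytool/1.0/share/man:${MANPATH:-}"
export PATH MANPATH
# The tool's bin directory is now searched first:
echo "$PATH" | cut -d: -f1
```

Unloading a module reverses these changes, which is why the module command, rather than manual exports, is the supported way to manage the environment.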

Add or remove modulefiles from the current environment as needed. The environment changes contained in a modulefile can be summarized through the module command as well. If no arguments are given, a summary of the module usage and subcommands is shown. The subcommand describes the action for the module command to take and its associated arguments.

Unless noted otherwise, the commands described in this section work for both the default module system and Lmod. Also, modules and modulefiles listed in the examples in this section are for demonstration purposes only. Actual versions may differ from versions on the current system.

Getting started

After logging in, load the needed programming environment:

user@hostname> module load PrgEnv-<compiler>

Listing loaded modules

To list loaded modules:

user@hostname> module list
Currently Loaded Modules:
 1) craype-x86-rome           5) xpmem/2.6.2-2.5_2.27__gd067c3f.shasta     9) cray-mpich/8.1.28
 2) libfabric/1.15.2.0        6) cce/17.0.0                               10) cray-libsci/23.12.5
 3) craype-network-ofi        7) craype/2.7.30                            11) PrgEnv-cray/8.5.0
 4) perftools-base/23.12.0    8) cray-dsmml/0.2.2

Module versions are for example purposes only and may vary from those on the system.

Listing available programming modules

To list programming modules, enter:

user@hostname> module avail PrgEnv
--------------------------- /opt/cray/pe/modulefiles ---------------------------
PrgEnv-amd/8.6.0      (D)   PrgEnv-cray/8.6.0    (L,D)   PrgEnv-intel/8.6.0   (D)
PrgEnv-aocc/8.6.0     (D)   PrgEnv-gnu-amd/8.6.0 (D)     PrgEnv-nvidia/8.6.0  (D)
PrgEnv-cray-amd/8.6.0 (D)   PrgEnv-gnu/8.6.0     (D)

Module versions are for example purposes only and may vary from those on the system.

Listing available modules

To list available modules, enter:

user@hostname> module avail

To list all available modules of a certain type (for example module avail cce), enter:

user@hostname> module avail cce

--------------------------- opt/cray/pe/lmod/modulefiles/mix_compilers ---------------------------
  cce-mixed/16.0.1    cce-mixed/16.0.0    cce-mixed/17.0.0 (D)

------------------------------ /opt/cray/pe/lmod/modulefiles/core --------------------------------
  cce/16.0.0    cce/16.0.1    cce/17.0.0 (L,D)

Module versions are for example purposes only and may vary from those on the system.

Load Modules

To load the default version of a module, enter, for example:

user@hostname> module load cce

To load a specific version of a module, enter, for example:

user@hostname> module load cce/<version>

Unloading modules

To remove a module, enter, for example:

user@hostname> module unload cray-libsci

Changing module versions

To swap out the default module for a specific version, enter, for example:

user@hostname> module switch cce cce/<version>

Changing module versions using the cpe module

The cpe module specifies all CPE modules associated with a given monthly CPE release. The module name is cpe/<date>, where <date> is the release date in the format yy.mm. The purpose of cpe is to enable users to switch currently loaded CPE modules to the version provided in a given monthly release by using a single command. All subsequently loaded modules treat the version associated with cpe/<date> as the default version.

For example, if cray-mpich/8.1.3 and cray-libsci/21.03.1.1 are included in cpe/21.03 (March 2021) and the user currently has cray-mpich/8.1.2 and cray-libsci/20.12.1.2 loaded, switch both to the March 2021 versions by entering:

user@hostname> module load cpe/21.03

Unloading cpe does not restore the previously loaded module versions and, in fact, has no effect on currently loaded modules. To compensate for this deficiency, the cpe directory contains restore_system_defaults scripts:

user@hostname> source /opt/cray/pe/cpe/21.03/restore_system_defaults.sh

Changing programming environments

Use module swap to change between programming environments. For example:

user@hostname> module swap PrgEnv-cray PrgEnv-gnu

Displaying module information

To display information about module conflicts and links, enter, for example:

user@hostname> module show perftools
--------------------------------------------------------------------------------------------------------------
   /opt/cray/pe/lmod/modulefiles/perftools/23.05.0/perftools.lua:
--------------------------------------------------------------------------------------------------------------
prereq("perftools-base")
family("perftools")
help([[
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
This instrumentation module enables the full functionality of CrayPat, which
includes a wealth of performance measurement, analysis and presentation options
collected through pat_build and the CrayPat run time environment variables.
CrayPat supports sampling experiments, which capture values at intervals or
when a specified counter overflows, and tracing experiments, which count some
event such as the number of times a specific function is executed.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

]])
setenv("CRAYPAT_COMPILER_OPTIONS","1")

user@hostname> module show PrgEnv-cray
-------------------------------------------------------------------
family("PrgEnv")
help([[    The PrgEnv-cray modulefile loads the Cray Programming Environment,
    which includes the Cray Compiling Environment (CCE).
    This modulefile defines the system paths and environment variables
    needed to build an application using CCE for supported
    Cray systems.

    This module loads the following modules:
     - cray-dsmml
     - cray-mpich
     - cray-libsci

    NOTE: This list is defined in /etc/cray-pe.d/cray-pe-configuration.sh.]])
whatis("Enables the Programming Environment using the cray compilers.")
setenv("PE_ENV","CRAY")
load("craype")
load("cray-dsmml")
load("cray-mpich")
load("cray-libsci")

Swapping other programming environment components

Switching the module environment does not completely change the run-time environment for products containing dynamically linked libraries, such as MPI. This occurs because the runtime linker caches dynamic libraries as specified by /etc/ld.so.conf. To use a nondefault version of a dynamic library at run time, prepend CRAY_LD_LIBRARY_PATH to LD_LIBRARY_PATH.

To revert the environment to an earlier version of cray-mpich 8.0, enter:

user@hostname> module swap cray-mpich/8.0.5.4 cray-mpich/8.0.3
user@hostname> export LD_LIBRARY_PATH=$CRAY_LD_LIBRARY_PATH:$LD_LIBRARY_PATH

Change the run-time linking search path behavior by using the PE_LD_LIBRARY_PATH environment variable. For example, enter:

user@hostname> export PE_LD_LIBRARY_PATH=system
user@hostname> module swap cray-mpich/8.0.5.4 cray-mpich/8.0.3

If the PE_LD_LIBRARY_PATH environment variable is set to system, CPE modules directly interact with LD_LIBRARY_PATH. Otherwise, if the variable is not set or set to any other value, CPE modules retain the default behavior, leaving LD_LIBRARY_PATH under user control.

Lmod mixed compiler support

CPE Lmod supports loading multiple different compiler modules concurrently by loading a dominant core-compiler module and one or more supporting mixed-compiler modules. Core-compiler modules are located in the core directory. Loading a core-compiler module sets Lmod hierarchy variables. After the core-compiler module is loaded, module avail lists the mixed-compiler modules available for loading.

The CPE Lmod mixed compiler support provides the flexibility for users to choose which compiler modules (including user-generated compiler modules) to mix together; however, only compiler modules released by CPE are supported. Because users can generate their own compiler modules, CPE cannot guarantee that all mixed-compiler modules shown for core-compiler modules are compatible.

Example:

Loading the CCE and GCC modules together with CCE as the dominant core compiler and GCC as the supporting mixed compiler (command output is abbreviated, and module versions are generalized):

user@hostname> module load PrgEnv-cray
user@hostname> module avail
...
----- /opt/cray/pe/lmod/modulefiles/mix_compilers ----
   ...
    gcc-mixed/[version]
   ...

user@hostname> module load gcc-mixed
user@hostname> module list

Currently Loaded Modules:
  #)cce/[version] #)PrgEnv-cray/[version] #)gcc-mixed/[version]

Transfer Application Data

Copy Application Data from an External Workstation to a UAI

PREREQUISITES

  • UAS must be running.

  • The file to be transferred must exist on the external workstation.

OBJECTIVE

Move application data from an external workstation, such as a laptop, to a User Access Instance (UAI). Application data is stored on shared storage that is mounted onto the UAI on which the user is logged in.

PROCEDURE

  1. Log in to the NCN.

  2. Retrieve the UAI port number and IP address:

    user@ncn> cray uas list | grep connect_string
    
    uai_connect_string = "ssh <USERNAME@UAI_IP_ADDRESS>"
    
  3. Transfer data from the workstation to the UAI using the SSH key:

    user@hostname> scp -i ~/.ssh/id_rsa <fileName.txt> <USERNAME@UAI_IP_ADDRESS>:
    

    The system returns a message after the file is completely transferred:

    fileName.txt              100%   30     1.9KB/s   00:00
    
  4. Verify that the file transferred successfully:

    a. Connect to the UAI with the connection string:

    user@ncn> ssh <USERNAME@UAI_IP_ADDRESS>
    

    b. Switch to the home directory where the application data was saved on the HPE Cray system.

    c. List the files in the directory:

    [user@uai ~]$ ls -ltr <fileName.txt>
    -rw-r--r-- 1 user 1049 30 Sep  7 20:35 <fileName.txt>
    
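As an optional extra check, not part of the documented procedure, comparing checksums on both ends confirms the copy is intact. In this sketch, fileName.txt and its content are placeholders for the real transferred file:

```shell
# Sketch: verify a transferred file by comparing SHA-256 digests on both
# ends. fileName.txt and its content are placeholders.
printf 'example payload\n' > fileName.txt
digest=$(sha256sum fileName.txt | awk '{print $1}')
echo "$digest"    # run the same command on the other end; digests must match
```

If the two digests differ, repeat the transfer.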

Copy Application Data from a UAI to an External Workstation

PREREQUISITES

The user must have a running User Access Instance (UAI).

OBJECTIVE

This procedure details how to move application data from the User Access Instance (UAI) to an external workstation, such as a laptop. Application data is stored on shared storage that is mounted onto the UAI on which the user is logged in.

PROCEDURE

  1. Log in to the UAI with the connection string:

    $ ssh <USERNAME@UAI_IP_ADDRESS>
    
  2. Transfer data from the UAI to the workstation:

    [user@uai ~]$ scp <fileName.txt> <USERNAME>@<workstation-hostname>:<path>
    

    The system returns a message after the file is completely transferred:

    <fileName.txt>              100%   30     1.9KB/s   00:00
    
  3. Verify the file was transferred successfully:

    a. Log on to the workstation.

    b. Switch to the directory on the workstation where the application data was saved.

    c. List the files in the directory:

    $ ls -ltr <fileName.txt>
    -rw-r--r-- 1 <USERNAME> 1049 30 Sep  7 20:35 <fileName.txt>
    

Compiling an application

Build an MPI application

PREREQUISITES

  • CPE must be loaded.

OBJECTIVE

This procedure provides details on how to build an MPI application using the CPE compiler driver cc.

LIMITATIONS

  • Only dynamic linking is supported.

  • Vendor-specific compiler commands such as gcc are not supported.

PROCEDURE

  1. Verify that the correct modules are loaded. Note that module versions are for example purposes only and may vary from those on the system.

    user@hostname> module list
    Currently Loaded Modulefiles:
    1) cce/13.0.1
    2) craype/2.7.15
    3) craype-x86-rome
    4) cray-libsci/21.08.1.2
    5) craype-network-ofi
    6) cray-dsmml/0.2.1
    7) perftools-base/22.04.0
    8) xpmem/2.2.40-7.0.1.0_2.4__g1d7a24d.shasta
    
  2. Change to the directory where the application is located.

    user@hostname> cd /lus/<USERNAME>/
    
  3. Create an application.

    See Example MPI program source for a sample “Hello World” MPI application.

  4. Build the application.

    user@hostname> cc mpi_hello.c -o mpi_hello.x
    

Example MPI program source

mpi_hello.c

/* MPI hello world example */
#include <stdio.h>
#include <mpi.h>
int main(int argc, char **argv)
{
   int rank;
   MPI_Init(&argc, &argv);
   MPI_Comm_rank(MPI_COMM_WORLD, &rank);
   printf("Hello from rank %d\n", rank);
   MPI_Finalize();
   return 0;
}

Build an OpenSHMEM application

HPE Cray OpenSHMEMX is a proprietary implementation of the OpenSHMEM specification, compliant with version 1.4 of the OpenSHMEM API. It is designed to be modular to support different transport layers for communication and uses libfabric for both inter- and intra-node communication. The library man pages are in development; details on the base environment variables are documented in the intro_shmem man page.

PREREQUISITES

A UAI must be running (see Create a UAI from a Specific Image in Legacy Mode).

OBJECTIVE

This procedure provides details on how to compile an OpenSHMEM program with OpenSHMEMX.

LIMITATIONS

  • Only dynamic linking is currently supported.

  • OpenSHMEMX is not supported on CentOS.

  • Vendor-specific compiler commands, such as gcc, are not supported.

PROCEDURE

  1. Log in to the UAI with the connection string:

    user@hostname> ssh <USERNAME@UAI_IP_ADDRESS>
    
  2. Load CPE modules:

    [user@uai ~]$ module load cray-dsmml
    [user@uai ~]$ module load cray-openshmemx
    
  3. Verify the correct modules are loaded. Note that module versions below are examples only and may vary from those on the machine.

    [user@uai ~]$ module list
    
    Currently Loaded Modulefiles:
    
    1) craype-x86-rome           5) xpmem/2.6.2-2.5_2.27__gd067c3f.shasta     9) cray-mpich/8.1.28
    2) libfabric/1.15.2.0        6) cce/17.0.0                               10) cray-libsci/23.12.5
    3) craype-network-ofi        7) craype/2.7.30                            11) PrgEnv-cray/8.5.0
    4) perftools-base/23.12.0    8) cray-dsmml/0.2.2
    

  4. Change to the directory where the application is located:

    [user@uai ~]$ cd /lus/<USERNAME>
    
  5. Create an OpenSHMEM application. See Example OpenSHMEM program source for a sample Hello World OpenSHMEM application.

  6. Build the application:

    [user@uai ~]$ cc -dynamic shmem_hello.c -o shmem_hello.x
    

Example OpenSHMEM program source

shmem_hello.c

/* OpenSHMEM hello world example */
#include <stdio.h>
#include <shmem.h>
int main(void)
{
    shmem_init();
    int me   = shmem_my_pe();
    int npes = shmem_n_pes();
    printf("Hello World from PE #%d of %d\n", me, npes);
    shmem_finalize();
    return 0;
}

Build a DSMML application

Distributed Symmetric Memory Management Library (DSMML) is a proprietary memory management library. DSMML is a standalone memory management library for maintaining distributed shared symmetric memory heaps for top-level PGAS languages and libraries like Coarray Fortran, UPC, and OpenSHMEM. DSMML allows user libraries to create multiple symmetric heaps and share information with other libraries. Through DSMML, interoperability can be extracted between PGAS programming models.

PREREQUISITES

Ensure that a UAI is running (see Create a UAI from a Specific Image in Legacy Mode).

OBJECTIVE

This procedure provides details on how to compile a DSMML program with HPE Cray DSMML.

LIMITATIONS

  • Only dynamic linking is currently supported.

  • HPE Cray DSMML is not supported on CentOS.

  • Vendor-specific compiler commands, such as gcc, are not supported.

PROCEDURE

  1. Log in to the UAI with the connection string.

    user@hostname> ssh <USERNAME@UAI_IP_ADDRESS> -i ~/.ssh/id_rsa
    
  2. Load CPE modules.

    [user@uai ~]$ module load cray-dsmml
    
  3. Verify the correct modules are loaded. Note that module versions are for example purposes only and may vary from those on the machine.

    [user@uai ~]$ module list
    Currently Loaded Modulefiles:
    1) craype-x86-rome           5) xpmem/2.6.2-2.5_2.27__gd067c3f.shasta     9) cray-mpich/8.1.28
    2) libfabric/1.15.2.0        6) cce/17.0.0                               10) cray-libsci/23.12.5
    3) craype-network-ofi        7) craype/2.7.30                            11) PrgEnv-cray/8.5.0
    4) perftools-base/23.12.0    8) cray-dsmml/0.2.2
    
  4. Change to directory where the application is located.

    [user@uai ~]$ cd /lus/<USERNAME>
    
  5. Create an application.

    See Example HPE Cray DSMML program source for a sample DSMML application.

  6. Build the application.

    [user@uai ~]$ cc dsmml_example.c -o dsmml_example.x
    

Example HPE Cray DSMML program source

dsmml_example.c


/* Example DSMML segment creation operation */
#include <stdio.h>
#include <dsmml.h>
int main(void)
{
   dsmml_init_info_t req_attrs;
   dsmml_sheap_seg_info_t valid_seg;
   dsmml_sheap_seg_info_t *seg_info = NULL;

   req_attrs.mype     = 0;
   req_attrs.smp_mype = 0;
   req_attrs.smp_npes = 1;

   valid_seg.id        = 0;
   valid_seg.act_addr  = NULL;
   valid_seg.base_addr = (void *)0x30000000000;
   valid_seg.length    = 1024;
   valid_seg.type      = DSMML_MEM_SYS_DEFAULT;
   valid_seg.pagesize  = DSMML_HPSIZE_DEFAULT;
   valid_seg.mode      = DSMML_MODE_DEFAULT;
   valid_seg.smp_mype  = 0;
   valid_seg.smp_npes  = 1;
   valid_seg.smp_set   = 0;

   dsmml_init(&req_attrs);
   dsmml_create_sheap_seg(&valid_seg);
   dsmml_get_sheap_seg(1, &seg_info);

   fprintf(stderr, "ID %d len %ld act_addr %p\n",
           seg_info->id, seg_info->length,
           seg_info->act_addr);

   dsmml_finalize();
   return 0;
}

Running an application

You can control HPE Slingshot network resources on systems running Slurm or PBS Pro with PALS.

Running an application with Slurm in batch mode

PREREQUISITES

  • Slurm workload manager must be installed and configured.

  • The application must be compiled (see Build an MPI application).

OBJECTIVE

This procedure creates a launch script (either MPI or OpenSHMEM) and submits it as a job using Slurm within a User Access Instance (UAI).

PROCEDURE

  1. Log in to the UAI with the connection string:

    user@hostname> ssh <USERNAME@UAI_IP_ADDRESS>
    
  2. Load CPE modules:

    MPI: MPI modules are loaded by default.

    OpenSHMEM:

    [user@uai ~]$ module load cray-openshmemx
    

    Cray DSMML: Cray DSMML modules are loaded by default.

  3. Change to the directory where the application is located:

    [user@uai ~]$ cd /lus/<USERNAME>/
    
  4. Create a launch script.

    IMPORTANT: If your login shell does not match the batch script shell (for example, your login shell is tcsh, but the batch script uses bash), the module environment may not be initialized. To fix this, add -l to the first line of the batch script (for example, #!/bin/bash -l).

    To launch the application with sbatch, add srun to the launch script.

    MPI: This example launch script is specific to the Hello World MPI Application example code shown in Example MPI program source. It runs on four nodes with one PE per node.

    [user@uai ~]$ cat launch.sh
    #!/bin/sh
    #SBATCH --time=5
    #SBATCH --nodes=4
    #SBATCH --tasks-per-node=1
    ulimit -s unlimited ## in case not set by default
    srun ./mpi_hello.x
    

    OpenSHMEM: The example batch script below is specific to the Hello World openSHMEM Application example shown in Example OpenSHMEM program source. It runs on two nodes with four PEs per node.

    [user@uai ~]$ cat launch.sh
    #!/bin/sh
    #SBATCH --time=5
    #SBATCH --nodes=2
    #SBATCH --tasks-per-node=4
    module load cray-openshmemx
    srun -n8 ./shmem_hello.x
    

    Cray DSMML: The example batch script is specific to the Example HPE Cray DSMML Application code shown in Example HPE Cray DSMML program source. It runs on a single node with one PE.

    [user@uai ~]$ cat launch.sh
    #!/bin/sh
    #SBATCH --time=5
    #SBATCH --nodes=1
    #SBATCH --tasks-per-node=1
    srun -n1 ./dsmml_example.x
    
  5. Assign the required permissions to the launch.sh script to ensure it is executable:

    [user@uai ~]$ chmod u+x launch.sh
    
  6. Launch the batch script:

    [user@uai ~]$ sbatch launch.sh
    Submitted batch job 1065736
    
  7. Check job output:

    MPI:

    [user@uai ~]$ cat slurm-1065736.out
    Hello from rank 1
    Hello from rank 3
    Hello from rank 0
    Hello from rank 2
    

    Troubleshooting: Add ldd to the job script to ensure that the correct modules are loaded:

    [user@uai ~]$ ldd ./mpi_hello.x
    

Running an application with Slurm in interactive mode

PREREQUISITES

  • Slurm workload manager must be installed and configured.

  • The application must be compiled (see Build an MPI application).

OBJECTIVE

This procedure launches an application using the Slurm srun command.

PROCEDURE

  1. Log in to the UAI with the connection string:

    user@hostname> ssh <USERNAME@UAI_IP_ADDRESS>
    
  2. Load CPE modules:

    MPI: MPI modules are loaded by default.

    OpenSHMEM:

    [user@uai ~]$ module load cray-openshmemx
    
  3. Change to the directory where the application is located:

    [user@uai ~]$ cd /lus/<USERNAME>
    
  4. Execute the application with srun:

    MPI:

    [user@uai ~]$ srun -N<nodes> --ntasks-per-node=<number_tasks_per_node> ./mpi_hello.x
    

    Example:

    See also Example MPI program source.

    [user@uai ~]$ srun -N4 --ntasks-per-node=1 ./mpi_hello.x
    

    Example using UCX Instead of OFI:

    [user@uai ~]$ module swap craype-network-ofi craype-network-ucx
    [user@uai ~]$ module swap cray-mpich cray-mpich-ucx
    [user@uai ~]$ srun -N4 --ntasks-per-node=1 ./mpi_hello.x
    

    OpenSHMEM:

    [user@uai ~]$ srun -n<ranks> ./shmem_hello.x
    

    Example

    See also Example OpenSHMEM program source.

    [user@uai ~]$ srun -n8 ./shmem_hello.x
    

Running an application with PBS Pro in batch mode

PREREQUISITES

  • PBS Pro workload manager must be installed and configured.

  • The application must be compiled (see Build an MPI application).

OBJECTIVE

This procedure creates a launch script and submits it as a PBS job using PALS.

IMPORTANT: Due to the way PALS is integrated with PBS, the job-specific temporary directory (TMPDIR) is only created on the head node of the job. An application that tries to create temporary files or directories on other nodes can therefore fail. To work around this problem, add export TMPDIR=/tmp to the job script before calling aprun or mpiexec.
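The workaround can be sketched as the first lines of the job script; here mktemp stands in for an application creating temporary files:

```shell
# Workaround sketch: force a node-local TMPDIR at the top of the job
# script, before calling aprun or mpiexec. mktemp stands in for an
# application creating temporary files.
export TMPDIR=/tmp
scratch=$(mktemp -d)    # now created under /tmp on the node
echo "$scratch"
rmdir "$scratch"
```

With TMPDIR exported this way, temporary files land under /tmp on every node rather than only on the head node.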

PROCEDURE

  1. Change to the directory where the application is located:

    user@hostname> cd /lus/<USERNAME>
    
  2. Create a launch script launch.sh.

    IMPORTANT: If your login shell does not match the batch script shell (for example, your login shell is tcsh, but the batch script uses bash), the module environment might not be initialized. To fix this issue, add -l to the first line of the batch script (for example, #!/bin/bash -l).

    MPI: This example launch script is specific to the “Hello World” MPI application (see Example MPI program source) running on four nodes.

    #!/bin/bash
    #PBS -l walltime=00:00:30
    echo start job $(date)
    module load cray-pals
    echo "mpiexec hostname"
    mpiexec hostname
    echo "mpiexec -n4 /lus/<USERNAME>/mpi_hello.x"
    mpiexec -n4 /lus/<USERNAME>/mpi_hello.x
    echo end job $(date)
    exit 0
    
  3. Assign the required permissions to the launch.sh script to verify it is executable:

    user@hostname> chmod u+x launch.sh
    
  4. Launch the batch script:

    user@hostname> qsub -l select=4,place=scatter launch.sh
    
  5. Check job output:

    user@hostname> cat launch.sh.o426757
    Hello from rank 3
    Hello from rank 2
    Hello from rank 1
    Hello from rank 0
    

Running an application with PBS Pro in interactive mode

PREREQUISITES

  • PBS Pro workload manager must be installed and configured.

  • The application must be compiled (see Build an MPI application).

OBJECTIVE

This procedure interactively submits a job to PBS using the PALS mpiexec command.

PROCEDURE

  1. Initiate an interactive session:

    user@hostname> qsub -I
    qsub: waiting for job 4071.pbs-host to start
    qsub: job 4071.pbs-host ready
    user@hostname>
    
  2. Load the PrgEnv-cray, cray-pals, and cray-pmi modules:

    user@hostname> module load PrgEnv-cray; module load cray-pals; module load cray-pmi
    
  3. Acquire information about mpiexec:

    user@hostname> type mpiexec
    mpiexec is /opt/cray/pe/pals/<version>/bin/mpiexec
    
  4. Change to the directory where the application is located:

    user@hostname> cd /lus/<USERNAME>/
    
  5. Run the executable MPI program:

    user@hostname> mpiexec -n4 ./mpi_hello.x
    Hello from rank 1
    Hello from rank 2
    Hello from rank 3
    Hello from rank 0
    

    Example using UCX Instead of OFI:

    user@hostname> module swap craype-network-ofi craype-network-ucx
    
    Inactive Modules:
      1) cray-mpich
    
    user@hostname> module swap cray-mpich cray-mpich-ucx
    user@hostname> mpiexec -n4 ./mpi_hello.x
    Hello from rank 1
    Hello from rank 2
    Hello from rank 3
    Hello from rank 0
    

Using MPIxlate to run applications built with a non-MPICH compatible MPI library

HPE Cray MPIxlate is an MPI Application Binary Interface (ABI) translator. It allows applications built with supported MPI libraries that are not ABI-compatible with HPE Cray MPICH to run, without recompilation, on HPE Cray systems that use HPE Cray MPICH. HPE Cray MPIxlate translates Open MPI and SGI MPI ABIs into the MPICH ABI, which is used by HPE Cray MPICH.

Understanding the HPE Cray MPIxlate command line syntax

Run the MPIxlate translator by entering its command from the command line prompt. The command line syntax for typical HPE Cray MPIxlate usage includes three parts, and the HPE Cray MPIxlate command is located between the launcher and applications commands:

Command syntax:

  Command Part 1: Launcher command & options
  Command Part 2: mpixlate -s <src-ABI> -t <tgt-ABI>
  Command Part 3: Application command & options

Example command:

  Command Part 1: srun --ntasks=4
  Command Part 2: mpixlate -s ompi.40 -t cmpich.12
  Command Part 3: ./mpiapp.OMPI
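Put together, the three parts form a single command line. A minimal sketch, composing the example values above (illustrative only; the variable names are not part of mpixlate):

```shell
# Illustrative only: the three parts of a typical MPIxlate invocation,
# composed into a single command line using the example values above.
launcher="srun --ntasks=4"
translator="mpixlate -s ompi.40 -t cmpich.12"
app="./mpiapp.OMPI"
echo "$launcher $translator $app"
```

This prints the full command line in the form in which the job would be submitted.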

See the mpixlate(1) man-page included in the RPM download and the following sections for more information on using HPE Cray MPIxlate.

Understanding shared library ABI naming patterns used by HPE Cray MPIxlate

ABI naming patterns used by HPE Cray MPIxlate incorporate the ABI major version number. Note that the shared library ABI version differs from the corresponding product version number. ABI version numbers are part of the shared library full name. To find the shared library full name, enter:

user@hostname> ls -al libmpi.so*
lrwxrwxrwx 1 root root       17     June 4 2023     libmpi.so     ->   libmpi.so.40.30.5
lrwxrwxrwx 1 root root       17     June 4 2023     libmpi.so.40  ->   libmpi.so.40.30.5
-rwxr-xr-x 1 root root  1433032     June 4 2023     libmpi.so.40.30.5

user@hostname>

After you ascertain the shared library full name, determine the major (Maj) version number.

Determining the Major Version Number from the Library Name

In the example above, 40 is the major version number of the shared library libmpi.so.40.30.5.

Using source and target ABI identifiers

You must specify a source and target ABI when using HPE Cray MPIxlate:

  • Source ABI: Identifies the (non-MPICH compatible) MPI library with which the application was built.

  • Target ABI: Identifies the HPE Cray MPICH library which is used to run the application.

See the mpixlate(1) man-page included in the RPM download for supported source and target ABI combination details. Procedures to identify shared library major version numbers are briefly mentioned in the mpixlate(1) man-page and explained in Determining source and target ABI identifiers.

Determining source and target ABI identifiers

ABI identifiers used by HPE Cray MPIxlate follow the <MPI-implementation-name>.<version> naming pattern. Use this naming pattern to determine both the source and target ABI identifiers before running HPE Cray MPIxlate:

  1. Determine the source ABI:

    a. Determine which MPI library implementation was used to build the application. The <MPI-implementation-name> field is not reliably recorded in the application binary. After determining which MPI library implementation was used, the required value is available from the table of supported ABIs in the mpixlate(1) man-page included in the RPM download. For an application built using Open MPI, the field value is ompi.

    b. Determine the value for the <version> field. The <version> field is a number that uniquely identifies the ABI of an MPI implementation. Note that:

      - Some MPI implementations do not have ABI version numbers. In those situations, HPE Cray MPIxlate uses the corresponding MPI product version number. The mpixlate(1) man-page lists MPI implementations and values for the <version> field.
    
      - If the MPI library used to build the application has an ABI version, use the ABI major version number for the $<$version$>$ field. On x86-64 systems, applications in the ELF file format record the SONAME of the MPI library used to build the application in the dynamic section of the file. If that MPI library had an ABI version, the SONAME contains the ABI major version number, and it can be extracted using the `readelf(1)` utility. To obtain it for an application built using Open MPI, use, for example:
    
        user@hostname> readelf --wide --dynamic ./mpiapp.OMPI | egrep "NEEDED" | egrep "mpi"
        0x0000000000000001 (NEEDED) Shared library: [libmpi.so.40]
    
        user@hostname>
    
        In the above example, the ABI major version of the Open MPI shared library ABI is 40.
    

    c. Combine the above information to determine the source ABI identifier: ompi.40.

  2. Determine the target ABI:

    a. Determine the <MPI-implementation-name>. The <MPI-implementation-name> for HPE Cray MPI listed in the mpixlate(1) man-page in the RPM download is cmpich.

    b. Obtain the value for the <version> field. The <version> field for the MPIxlate ABI identifier comes from the SONAME of the library: determine the path to the library, then extract the SONAME using the readelf(1) utility. For example:

    ```screen
    user@hostname> echo ${CRAY_LD_LIBRARY_PATH} | tr ':' '\n' | egrep "mpich"
    /opt/cray/pe/mpich/8.1.20/ofi/cray/10.0/lib
    /opt/cray/pe/mpich/8.1.20/gtl/lib
    user@hostname>
    user@hostname>  readelf --wide --dynamic \
    /opt/cray/pe/mpich/8.1.20/ofi/cray/10.0/lib/libmpi.so | egrep -i "SONAME"
      0x000000000000000e (SONAME)             Library soname: [libmpi_cray.so.12]
    
    user@hostname>
    ```
    
    In the above example, the ABI major version number of the HPE Cray MPI library is `12`.
    

    c. Combine the information to determine the target ABI identifier. For this example, it is cmpich.12.
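The SONAME-to-identifier mapping used in steps 1 and 2 can be sketched as a small shell helper. This is an illustrative sketch only; the abi_id function is a hypothetical name, not part of HPE Cray MPIxlate:

```shell
# Hypothetical helper: build an MPIxlate-style ABI identifier from an
# implementation name and a shared library SONAME.
abi_id() {
  name="$1"                  # e.g. ompi or cmpich
  soname="$2"                # e.g. libmpi.so.40 or libmpi_cray.so.12
  major="${soname##*.so.}"   # drop everything through ".so."
  major="${major%%.*}"       # keep only the major component
  echo "${name}.${major}"
}

abi_id ompi libmpi.so.40          # prints ompi.40
abi_id cmpich libmpi_cray.so.12   # prints cmpich.12
```

Feeding in the SONAMEs found with readelf(1) reproduces the identifiers used in the examples above.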

Loading modules and running the application using HPE Cray MPIxlate

To run the application using HPE Cray MPIxlate:

  1. Load the following modules:

    a. The HPE Cray programming environment module, PrgEnv-<env> (for example, PrgEnv-cray).

    b. The HPE Cray MPI module, cray-mpich.

    c. The HPE Cray MPIxlate module, cray-mpixlate.

  2. (Optional) Load additional modules needed to run the application.

  3. Launch the application using HPE Cray MPIxlate.

See the mpixlate(1) man-page included with the RPM download and product release notes for additional information on using the HPE Cray MPIxlate translator.

Debugging an application

Debug a hung application using gdb4hpc

PREREQUISITES

Before completing this procedure, make sure that the targeted application has been:

  • Compiled with -g to display debugging symbols. Compiling with -O0 to disable compiler optimizations is advised but not required.

  • Launched with srun.

OBJECTIVE

The procedure details how to debug a hung application using gdb4hpc.

PROCEDURE

  1. Load the gdb4hpc module:

    $ module load gdb4hpc
    
  2. Launch gdb4hpc:

    $ gdb4hpc
    
  3. Attach to the application with gdb4hpc:

    a. Choose a process set handle. Every process set is represented as a named debugger variable. Debugger variables are prefixed with a $ in the form $<name> (for example, $app or $a). Use whatever variable name is easiest to remember.

    b. Determine the application identifier. For Slurm applications launched with srun, this identifier is jobid.stepid; determine the jobid with squeue and the stepid with sstat <jobid>. By default, jobs have only one step, with stepid starting at 0. For applications run with mpiexec, supply the pid of the mpiexec process instead.

    c. Using the process set handle and application identifier information in the previous steps, use the attach command to attach onto the running application. In the following example, $a is the process set handle and 1840118.0 is the application identifier.

    ```screen
    dbg all> attach $a 1840118.0
    ```
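    The application identifier is simply the jobid and stepid joined with a dot. A minimal sketch, with assumed example values (obtain real ones with squeue and sstat):

    ```shell
    # Illustrative only: compose the gdb4hpc application identifier
    # from a Slurm job ID and step ID (example values, not real ones).
    jobid=1840118
    stepid=0
    echo "${jobid}.${stepid}"   # prints 1840118.0
    ```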
    
  4. Conduct a traditional parallel debugging session. gdb4hpc commands include:

    • backtrace - Displays stack frames. Pass in an argument to limit the number of stack frames displayed (such as backtrace -5). See help backtrace.

    • frame - Displays only the current stack frame.

    • up - Moves the current stack frame up. Pass in an argument to move up the specified number of stack frames (up -3).

    • down - Moves the current stack frame down. Pass in an argument to move down the specified number of stack frames (down -5).

    • watchpoint - Sets an access or write watchpoint.

    • assign - Assigns a debugger convenience variable or application variable (see help assign for more details).

    • gdbmode - Drops directly into the gdb interpreter (see help gdbmode for more details).

    • kill - Kills the application with SIGKILL and ends the debug session.

    • release - Detaches and resumes the application. Ends the debug session.

    • quit - Exits gdb4hpc and releases the application.

  5. Fix any pinpointed bugs, recompile, and verify that the fix works.

    Tips:

    • Prevent printing of entry-frame backtrace information for all ranks when attaching to an application:

      dbg all> set print entry-frame false
      
    • Set the value to true to re-enable display of entry frames.

    • While gdb4hpc is running, use the help command to get more information about command usage. For example, to find information about the launch command, enter:

      $ help launch
      

Debug hung applications with STAT

PREREQUISITES

  • The STAT module must be loaded. To load the system default version:

    module load cray-stat
    

OBJECTIVE

This procedure debugs a hung application using the Cray Stack Trace Analysis Tool (STAT).

PROCEDURE

  1. Attach STAT to the hung application at the level of the workload manager job launcher, using either stat-gui or stat-cl. As seen in Figure 1, 19800 is the pid of the srun process.

    $ stat-cl 19800
    

Attach A Hung Application In STAT

STAT launches its daemons, gathers a stack trace from each process, and merges them into a prefix tree, as seen in Figure 2.

STAT Prefix Tree

  2. Analyze the merged backtrace using stat-view or stat-gui.

  3. Choose additional debugging steps based on the nature of the hang. Available stat-gui tools allow you to:

    • Narrow down to the trace steps exhibiting the bug by clicking Shortest Path (or Longest Path).

    • Adjust the sample size to look at the function level, or down to the function and line level, by clicking Sample.

    • Identify stack traces visited by the fewest or most tasks to identify outliers by clicking Least Task or Most Task.

    • Step through the temporal order of the stack trace by clicking Back TO or Forward TO. Right-click on tasks that have made the least progress to View Source code.

    • Gather X number of stack traces over time by clicking Sample Multiple.

    • Choose a subset of equivalence classes to feed to a debugger by clicking Eq C.

  4. Narrow down the search space to a specific function, and use a traditional debugger like gdb4hpc or valgrind4hpc (depending on the bug's nature) to find the bug.

Debug Crashed Applications with ATP

ATP is the first-line tool for diagnosing crashing applications. When a parallel application crashes, ATP can produce a merged stack trace tree, providing an overview of the entire job state. ATP also selectively produces core files from crashing processes (or ranks). If further debugger support is required, rerun the job under the Cray parallel debugger, gdb4hpc.

PREREQUISITES

  • ATP module must be loaded (default).

  • Target application must be dynamically or statically linked against the ATP support library.

  • Target application must have been compiled with -g to keep debugging symbols.

OBJECTIVE

This procedure details how to debug a crashed application using the HPE Cray Abnormal Termination Processing (ATP) debugger.

PROCEDURE

  1. Load the ATP module if not already loaded:

    $ module load atp
    
  2. Set ATP_ENABLED=1.

  3. (Optional) Set environmental variables.

    With the exception of ATP_ENABLED, ATP does not usually need other environment variables set. If necessary, however, runtime and output behavior can be modified with the following:

    • ATP_CONSOLE_OUTPUT: Default enabled. If enabled, ATP produces an overview of the crashed program and writes it to standard error. This overview provides rank information, the signal that caused the crash, and the crash location and assertion, if available.

    • ATP_HOLD_TIME: Default 0 minutes. If set to a nonzero value, ATP pauses for the specified number of minutes after detecting a crash. The job is held in a stopped state so that a debugger like gdb4hpc can attach for further debugging.

    • ATP_MAX_ANALYSIS_TIME: Default 300 seconds. After sending a crash analysis request to the ATP backend process, the ATP frontend process waits the given number of seconds for crash analysis to occur. If this timeout expires, ATP assumes that the backend process was unsuccessful and continues job termination.

    • ATP_MAX_CORES: Default 20 core files. After crash analysis completion, ATP selects a subset of ranks from which to dump core files. The maximum number of such files is limited by this variable. If set to 0, core file dumping is disabled.

    • ATP_CORE_FILE_DIRECTORY: Default current directory. ATP writes core files from the selected subset of ranks to the given directory.
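    Taken together, an illustrative environment setup might look like the following. Only ATP_ENABLED is required; the other values shown are examples, not recommendations:

    ```shell
    # Example ATP environment settings (values are illustrative).
    export ATP_ENABLED=1                           # required to enable ATP
    export ATP_HOLD_TIME=5                         # hold a crashed job 5 minutes
    export ATP_MAX_CORES=4                         # dump at most 4 core files
    export ATP_CORE_FILE_DIRECTORY=$PWD/atp-cores  # where core files are written
    ```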

  4. Run the application.

  5. Examine the ATP output.

    If handling a crash, ATP prints the following message:

    Application is crashing. ATP analysis proceeding...
    

    It proceeds to list each process and the reason for its failure, if crashed:

    Processes died with the following statuses:
     <0 > Reason: '<RUNNING>' Address: 0x7ffff7bab697 Assertion: ''
     <1 2 3 > Reason: 'SIGSEGV /SEGV_MAPERR' Address: 0x0 Assertion: ''
    

    In this example, process 0 did not crash and was still running. Processes 1, 2, and 3 experienced segfaults.

  6. Gather the merged backtrace and core files.

    After displaying a summary of the job status, ATP writes selected core files and a graph visualization of the complete stack trace tree:

    Producing core dumps for ranks 3 1 2
     3 cores written in /cray/css/users/adangelo/stash/atp/tests
     View application merged backtrace tree with: stat-view atpMergedBT.dot
     You may need to: module load cray-stat
    

    By default, ATP writes the files atpMergedBT.dot and atpMergedBT_line.dot in the current working directory. atpMergedBT.dot is a function-level view of the stack trace tree, and atpMergedBT_line.dot is a source-line level view of the stack trace tree. Core files can be analyzed using the GNU Debugger, gdb.

  7. Examine those files with stat-view or gdb4hpc:

    $ module load cray-stat
    $ stat-view atpMergedBT_line.dot
    
  8. Fix any pinpointed bugs, recompile, and verify that the fixes work.

Debug crashing applications with gdb4hpc

PREREQUISITES

The targeted application must be compiled with -g -O0 to keep debugging symbols and disable compiler optimizations.

OBJECTIVE

This procedure debugs a crashing application using gdb4hpc.

PROCEDURE

  1. Load the gdb4hpc module:

    $ module load gdb4hpc
    
  2. Before launching the application:

    a. Determine the process set handle. Choose a name that is easy to remember, such as $app or $a. The array syntax notation specifies the number of processing elements or threads, equivalent to the Slurm srun -n option.

    b. Determine additional WLM-specific settings.

    c. Determine any additional arguments. See help launch for available launcher arguments.

  3. Launch the application under gdb4hpc control:

    $ launch $a{<number of inferiors/ranks>} [--launcher-args="<optional WLM specific settings>"] [<optional_launch_args>] <path_to_executable>
    

    After launch is complete, the initial entry point is displayed. Note the process set notation of the application, {0..1023}; each element represents a processing element.

  4. Conduct a traditional parallel debugging session. gdb4hpc commands include:

    • backtrace - Displays stack frames. Pass in an argument to limit the number of stack frames displayed (such as backtrace -5). See help backtrace.

    • frame - Displays only the current stack frame.

    • up - Moves up the current stack frame. Pass in an argument to move up the specified number of stack frames (up -3).

    • down - Moves the current stack frame down. Pass in an argument to move down the specified number of stack frames (down -5).

    • watchpoint - Sets an access or write watchpoint.

    • assign - Assigns a debugger convenience variable or application variable.

    • gdbmode - Drops directly into the gdb interpreter.

    • kill - Kills the application with SIGKILL and ends the debug session.

    • release - Detaches and resumes the application. Ends the debug session.

    • quit - Exits gdb4hpc and releases the application.

  5. Fix any pinpointed bugs, recompile, and verify that the fix works.

Debug applications with valgrind4hpc to find common errors

PREREQUISITES

The target application must be:

  • Dynamically linked.

  • Compiled with -g to keep debugging symbols.

OBJECTIVE

Find common issues like memory leaks using valgrind4hpc.

Valgrind4hpc is a Valgrind-based debugging tool which detects memory leaks and errors in parallel applications. Valgrind4hpc aggregates any duplicate messages across ranks to help provide an understandable picture of program behavior. Valgrind4hpc manages starting and redirecting output from many copies of Valgrind, as well as deduplicating and filtering Valgrind messages.

PROCEDURE

  1. Load the valgrind4hpc module (if not already loaded):

    $ module load valgrind4hpc
    
  2. Run the memcheck tool to look for memory leaks:

    $ valgrind4hpc -n1024 --launcher-args="--exclusive --ntasks-per-node=32" \
    $PWD/build_cray/apps/transpose_matrix -- -c -M 31 -n 1000
    

    Use these common valgrind4hpc arguments:

    • -n, --num-ranks=<n> - Number of job ranks to pass to the workload manager (for example, Slurm).

    • -l, --launcher-args="<args>" - Additional workload manager arguments, such as rank distribution settings.

    • -o, --outputfile=<file> - Redirects all Valgrind4hpc error output to file.

    • -v, --valgrind-args="<args>" - Arguments to pass to the Valgrind instance.

      • For example, --valgrind-args="--track-origins=yes --leak-check=full" tracks the exact origin of every memory leak, at the cost of performance.

  3. Examine the Valgrind4hpc output.

    Valgrind4hpc detects a potential memory error, such as an uninitialized read/write or a memory leak, and displays an error block containing the affected ranks, and a backtrace of where the error occurred. Note that with errors stemming from an invalid use of system library routines, the backtrace will mention internal library functions.

  4. Fix any pinpointed bugs, recompile, and verify that the fixes worked.

Debugging applications with Sanitizers4hpc to find common errors

PREREQUISITES

The target application must be:

  • Built with instrumentation for LLVM or GPU sanitizers, e.g. -fsanitize=address.

  • Compiled with -g to keep debugging symbols.

OBJECTIVE

Find memory access or leak issues at runtime using information from LLVM Sanitizers.

Sanitizers4hpc is an aggregation tool to collect and analyze LLVM Sanitizers output at scale. The Clang AddressSanitizer, LeakSanitizer, and ThreadSanitizer tools are supported. Additionally, the AMD GPU Sanitizer library and the Nvidia Compute Sanitizer are also supported. Sanitizers4hpc manages the launch of your job through the currently running workload manager. See the sanitizers4hpc(1) man page for details.

PROCEDURE

  1. Load the sanitizers4hpc module:

    $ module load sanitizers4hpc
    
  2. Run the application by supplying workload manager job launch arguments and the target binary. For example, to run the binary a.out with four ranks:

    $ sanitizers4hpc --launcher-args="-n4" -- ./a.out binary_argument
    

    When a memory error is encountered, the linked LLVM Sanitizer produces error reports for each affected rank. Sanitizers4hpc processes these error reports and aggregates them for easier analysis. For example, when the application encounters an invalid read off the end of a buffer on four ranks, Sanitizers4hpc generates a single error report noting the error on all four ranks but reporting it, based on AddressSanitizer information, at the same place in the source file.

    RANKS: <0-3>
    AddressSanitizer: heap-buffer-overflow on address
    READ of size 4 at 0x61d000002680 thread T0
    ...
     #1 0x328dc3 in main /source.c:37:22
    ...
    SUMMARY: AddressSanitizer: heap-buffer-overflow /source.c:52:15 in main
    

The sanitizers4hpc(1) man page also includes information on where to find documentation for the supported sanitizer tools.

Debug two versions of the same application side-by-side with CCDB

PREREQUISITES

Two versions of the same application must be available for comparison.

OBJECTIVE

Debug an application by comparing it to a similar, working version to see differences in data at every stage of execution.

LIMITATIONS

Applications must be similar enough to step through execution steps in parallel, but different enough to see data changes through those execution steps.

PROCEDURE

  1. Load the CCDB module.

    $ module load cray-ccdb
    
  2. Launch CCDB.

    $ ccdb
    

    The CCDB window appears.

  3. Populate launch specifications for both applications.

    a. Enter Application, Launcher Args, and Number of PEs details if resources have already been allocated and a qsub session started.

    b. Enter a Batch Type in each Launch Specification if resources have not been allocated.

  4. Double-click the source file for each test application to view the source code.

  5. Generate an Assertion Script.

    a. Left-click on any line number.

    b. Click Build Assert.

    The Assertion Script Dialog window opens.

    c. Enter Name of the Source File, Line number, Variable, and Decomposition information for both App0 and App1.

    d. Click Add Assert.

    e. Select Save Script to save the Assertion script.

    f. Select Start to run the assertion script.

  6. Open an Assertion Script.

    a. Select View from the menu bar.

    b. Hover over Assertion Scripts in the View drop-down list.

    c. Select an Assertion Script.

    The CCDB Assertion Script dialog opens with the assertion loaded.

  7. Alternatively, use CCDB controls to step through the two applications to determine where they break down or diverge.

  8. Click the red FAILURE boxes to access failure details.

    Tip: For more information on using CCDB, click ? in the current window or Help from the main CCDB window menu bar.

  9. After narrowing down the search space, use a traditional debugger like gdb4hpc or valgrind4hpc (depending on the bug's nature) to find the bug. After fixing it, run the old and new versions side-by-side in CCDB to verify that the bug was fixed.

Profiling an application

Identify application bottlenecks

PREREQUISITES

  • CPE must be installed.

OBJECTIVE

This procedure instruments applications, runs them, and creates detailed output highlighting application bottlenecks.

PROCEDURE

  1. Load the perftools-base module if it is not already loaded:

    $ module load perftools-base
    
  2. Load the perftools-lite instrumentation module:

    $ module load perftools-lite
    
  3. Compile and link the program:

    $ make program
    
  4. Run the program:

    $ srun a.out
    

    After program execution completes, perftools-lite generates:

    • A text report to stdout, profiling program behavior, identifying where the program spends its execution time, and offering recommendations for further analysis and possible optimizations.

    • An experiment data directory, containing files which can be used to examine program behavior more closely using Cray Apprentice2 or pat_report.

    • A report file, data-directory/rpt-files/RUNTIME.rpt, containing the same information written to stdout.

  5. Review the profiling reports written to stdout. To get additional information without re-running, use the pat_report utility on the experiment directory (such as my_app.mpi+68976-16s) produced from a profiling run to generate new text reports. For example:

    $ pat_report -O calltree+src my_app.mpi+68976-16s
    

    Tip: For additional help, run pat_help from the command line. Also, refer to the HPE Performance Analysis Tools User Guide (S-8014).

Managing the environment lifecycle

List UAS Information

OBJECTIVE

Use the cray uas command to list descriptive information about the User Access Service (UAS): its version, its images, and details on running User Access Instances (UAIs).

List UAS Version with cray uas mgr-info list

ncn-m001# cray uas mgr-info list
service_name = "cray-uas-mgr",
version = "0.11.3"

List Available UAS Images with cray uas images list

ncn-m001# cray uas images list
default_image = "registry.local/cray/cray-uas-sles15sp1-slurm:latest"
image_list = [ "registry.local/cray/cray-uas-sles15sp1-slurm:latest",
"registry.local/cray/cray-uas-sles15sp1:latest",]

List UAI Information for Current User with cray uas list

ncn-m001# cray uas list
[[results]]
username = "user"
uai_host = "ncn-m001"
uai_status = "Running: Ready"
uai_connect_string = "ssh user@203.0.113.0 -i ~/.ssh/id_rsa"
uai_img = "registry.local/cray/cray-uas-sles15sp1-slurm:latest"
uai_age = "11m"
uai_name = "uai-user-be3a6770"

List UAIs on a Specific Host Node

ncn-m001# cray uas uais list --host ncn-m001
[[results]]
username = "user"
uai_host = "ncn-m001"
uai_status = "Running: Ready"
uai_connect_string = "ssh user@203.0.113.0 -i ~/.ssh/id_rsa"
uai_img = "registry.local/cray/cray-uas-sles15sp1-slurm:latest"
uai_age = "2h56m"
uai_name = "uai-user-f3b8eee0"
[[results]]
username = "user"
uai_host = "ncn-m001"
uai_status = "Running: Ready"
uai_connect_string = "ssh user@203.0.113.0 -i ~/.ssh/id_rsa"
uai_img = "registry.local/cray/cray-uas-sles15sp1-slurm:latest"
uai_age = "1d5h"
uai_name = "uai-user-f8671d33"


Delete a UAI

PREREQUISITES

A UAI must be up and running.

OBJECTIVE

The cray uas command allows users to manage UAIs. This procedure deletes one of the user's UAIs. To delete all UAIs on the system, see List and Delete UAIs in the Cray System Management Administration Guide. See HPE Cray Supercomputing EX software documentation links for additional information on that guide.

LIMITATIONS

The user must SSH to the system as root.

PROCEDURE

  1. Log in to an NCN as root.

  2. List existing UAIs:

    ncn-m001# cray uas list
    
    username = "user"
    uai_host = "ncn-m001"
    uai_status = "Running: Ready"
    uai_connect_string = "ssh user@203.0.113.0 -i ~/.ssh/id_rsa"
    uai_img = "registry.local/cray/cray-uas-sles15sp1-slurm:latest"
    uai_age = "0m"
    uai_name = "uai-user-be3a6770"
    
    username = "user"
    uai_host = "ncn-s001"
    uai_status = "Running: Ready"
    uai_connect_string = "ssh user@203.0.113.0 -i ~/.ssh/id_rsa"
    uai_img = "registry.local/cray/cray-uas-sles15sp1-slurm:latest"
    uai_age = "11m"
    uai_name = "uai-user-f488eef6"
    
  3. Delete a UAI:

    ncn-m001# cray uas delete --uai-list <UAI_NAME>
    results = [ "Successfully deleted uai-user-be3a6770",]
    

    Note that deleting a UAI does not cancel or clean up its WLM jobs.

Troubleshoot UAS Issues

This section provides examples of some commands that can be used to troubleshoot UAS related issues.

Troubleshoot Connection Issues

packet_write_wait: Connection to 203.0.113.0 port 30841: Broken pipe

If an error message related to broken pipes is returned, enable keep-alives on the client side. The administrator should update the /etc/ssh/sshd_config and /etc/ssh/ssh_config files to add:

TCPKeepAlive yes
ServerAliveInterval 120
ServerAliveCountMax 720

Invalid Credentials

ncn-m001# cray auth login --username <USER> --password <WRONGPASSWORD>
Usage: cray auth login [OPTIONS]
Try "cray auth login --help" for help.

Error: Invalid Credentials

To resolve this issue:

  • Log in to Keycloak and verify the user exists.

  • Make sure the username and password are correct.

Retrieve UAS Logs

To retrieve UAS and the remote execution service logs, the system administrator can enter:

ncn-m001# kubectl logs -n services -c cray-uas-mgr -l "app=cray-uas-mgr"

Troubleshoot Default Images Issues when Using the CLI

If the image name provided while creating a new UAI is not registered for use by the system, the system returns an error message. For example:

ncn-m001# cray uas create --publickey ~/.ssh/id_rsa.pub --imagename fred
Usage: cray uas create [OPTIONS]
Try "cray uas create --help" for help.

Error: Bad Request: Invalid image (fred). Valid images:
['dtr.dev.cray.com:443/cray/cray-uas-sles15sp1:latest'].
Default: dtr.dev.cray.com:443/cray/cray-uas-sles15sp1:latest

Retry the procedure by creating the UAI using the list of images and the name of the default image provided in the error message.

Verify that the User Access Instances (UAIs) are Running

The system administrator can use the kubectl command to check the status of the UAI:

ncn-m001# kubectl get pod -n user -l uas=managed -o wide
NAME                  READY  STATUS             RESTARTS  AGE   IP      NODE   NOMINATED NODE  READINESS GATES
uai-user-603-85d-zk6  0/1    ContainerCreating  0         109s  <none>  sms-2  <none>          <none>
uai-user-d7f-6db-7h5  0/1    ContainerCreating  0         116s  <none>  sms-2  <none>          <none>
uai-user-f6b-5dc-grb  0/1    ContainerCreating  0         113s  <none>  sms-2  <none>          <none>

If UAS pods are stuck in the Pending state, the administrator needs to ensure the Kubernetes cluster has nodes available for running UAIs. Check that nodes are labeled with uas=True and are in the Ready state:

ncn-m001# kubectl get nodes -l uas
NAME        STATUS     ROLES    AGE   VERSION
ncn-m001     Ready    master    11d   v1.13.3

If none of the nodes are found or if the nodes listed are marked as NotReady, the UAI pods will not be scheduled and will not start.

Troubleshoot kubectl Certificate Issues

While kubectl is supported in a UAI, a kubeconfig file to access a Kubernetes cluster is not provided. To use kubectl to interface with a Kubernetes cluster, the user must supply their own kubeconfig:

[user@uai ~]$ kubectl get nodes
The connection to the server localhost:8080 was refused - did you specify the right host or port?

For instructions to copy certificates into a UAI, see Copy Application Data from an External Workstation to a UAI.

Specify the location of the Kubernetes certificate with KUBECONFIG:

[user@uai ~]$ KUBECONFIG=/tmp/<CONFIG> kubectl get nodes
NAME       STATUS   ROLES    AGE   VERSION
ncn-m001   Ready    master   16d   v1.13.3
ncn-m002   Ready    master   16d   v1.13.3

Users must specify KUBECONFIG with every kubectl command or specify the kubeconfig file location for the life of the UAI. To do this, either set the KUBECONFIG environment variable or set the --kubeconfig flag.
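As a minimal sketch of the session-wide approach, assuming a kubeconfig copied to /tmp/config (a hypothetical path):

```shell
# Illustrative: point kubectl at a user-supplied kubeconfig for the
# rest of the session (the path below is an assumed example).
export KUBECONFIG=/tmp/config
echo "$KUBECONFIG"
```

With KUBECONFIG exported, subsequent kubectl commands in the same shell no longer need a per-command prefix or --kubeconfig flag.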

Troubleshoot X11 Issues

The system may return the following error if the user attempts to use an application that requires an X window (such as xeyes):

$ ssh <UAI_USERNAME@UAI_IP_ADDRESS>
   ______ ____   ___ __  __   __  __ ___     ____
  / ____// __ \ /   |\ \/ /  / / / //   |   /  _/
 / /    / /_/ // /| | \  /  / / / // /| |   / /
/ /___ / _, _// ___ | / /  / /_/ // ___ | _/ /
\____//_/ |_|/_/  |_|/_/   \____//_/  |_|/___/
[user@uai ~]$ xeyes
Error: Can't open display:

To resolve this issue, pass the -X option with the ssh command as shown below:

$ ssh <UAI_USERNAME@UAI_IP_ADDRESS> -X
   ______ ____   ___ __  __   __  __ ___     ____
  / ____// __ \ /   |\ \/ /  / / / //   |   /  _/
 / /    / /_/ // /| | \  /  / / / // /| |   / /
/ /___ / _, _// ___ | / /  / /_/ // ___ | _/ /
\____//_/ |_|/_/  |_|/_/   \____//_/  |_|/___/
/usr/bin/xauth:  file /home/users/user/.Xauthority does not exist
[user@uai ~]$ echo $DISPLAY
203.0.113.0

The warning, Xauthority does not exist, disappears with subsequent logins.

Troubleshoot SSH Host Key Issues

If strict host key checking is enabled on the user’s client, an error may appear when connecting to a UAI over ssh. For example:

WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED

This issue can occur in a few circumstances but is most likely to occur after the UAI container is restarted. If this occurs, remove the offending ssh hostkey from the local known_hosts file and try to connect again. The error message from ssh will contain the correct path to the known_hosts file and the line number of the problematic key.

Hard limits on UAI Creation

Each Kubernetes worker node has limits on how many pods it can run. Nodes are installed, by default, with a hard limit of 110 pods per node, but the number of pods may be further limited by memory and CPU utilization constraints. For a standard node the maximum number of UAIs per node is 110; if other pods are co-scheduled on the node, the number is reduced.
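The per-node headroom is simple arithmetic; a sketch with assumed counts (the number of co-scheduled pods is hypothetical):

```shell
# Illustrative: UAIs that still fit on a node with the default
# 110-pod limit when other pods are co-scheduled (counts assumed).
pod_capacity=110
other_pods=12
echo $(( pod_capacity - other_pods ))   # prints 98
```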

Determine the hard limit on Kubernetes pods with kubectl get node, and look for the capacity section:

ncn-m001# kubectl get node <NODE_NAME> -o yaml
...
capacity:
    cpu: "16"
    ephemeral-storage: 1921298528Ki
    hugepages-1Gi: "0"
    hugepages-2Mi: "0"
    memory: 181009640Ki
    pods: "110"
...

When UAIs are created, some might be left in the Pending state because the Kubernetes scheduler is unable to schedule them to a node due to CPU, memory, or pod limit constraints. Use kubectl describe pod to check why a pod is pending. For example, this pod is pending because one node has reached its pod limit and the remaining nodes do not match the node selector:

ncn-m001# kubectl describe pod UAI-POD
Warning  FailedScheduling  21s (x20 over 4m31s)  default-scheduler  0/4 nodes are available:
1 Insufficient pods, 3 node(s) didn't match node selector.