Using Toolkit Module Workflows

The CPE programming environments for AMD and NVIDIA each provide a toolkit module (rocm and cuda, respectively). These toolkit modules can be loaded with or without a compiler module. On Lmod systems, a compiler and its associated toolkit follow a distinct set of rules when they interact. These additional rules are necessary because standard Lmod behavior does not provide this flexibility between related modules.

CPE Compiler and Toolkit Lmod Workflow Rules:

  • Rule 1: Compiler modules can be loaded on their own.

  • Rule 2: Toolkit modules can act independently and be loaded on their own.

  • Rule 3: A toolkit that is initially loaded after a compiler is unloaded when that compiler is unloaded (like an Lmod hierarchy-dependent module).

  • Rule 4: A toolkit that is loaded before a compiler stays loaded when that compiler is unloaded. (The toolkit acts like an independent module.)

  • Rule 5: Compiler and toolkit module versions must be compatible.
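
Rules 3 and 4 depend only on whether the toolkit was loaded before or after the compiler. As a rough illustration, the sketch below classifies a toolkit from the order of entries in a colon-separated module list, such as the LOADEDMODULES variable that Lmod maintains. The function toolkit_mode is a hypothetical helper, not part of Lmod or CPE, and it assumes that the list order reflects the load order, as the numbered output of ml does in the examples in this section. The actual dependency tracking is internal to the CPE modulefiles.

```shell
# Sketch only: toolkit_mode is a hypothetical helper, not an Lmod or CPE command.
# Usage: toolkit_mode MODULE_LIST COMPILER TOOLKIT
#   MODULE_LIST  colon-separated list, e.g. "$LOADEDMODULES"
#   COMPILER     compiler module name, e.g. amd or nvidia
#   TOOLKIT      toolkit module name, e.g. rocm or cuda
# The body runs in a subshell so the IFS and globbing changes do not leak.
toolkit_mode() (
    IFS=:
    set -f                      # no pathname expansion while splitting
    seen_compiler=no
    for entry in $1; do
        case "$entry" in
            "$2"/*)
                seen_compiler=yes
                ;;
            "$3"/*)
                if [ "$seen_compiler" = yes ]; then
                    echo dependent      # Rule 3: unloads with the compiler
                else
                    echo independent    # Rule 2 or Rule 4: stays loaded
                fi
                exit 0
                ;;
        esac
    done
    echo not-loaded
)
```

For example, toolkit_mode "$LOADEDMODULES" amd rocm could be run before unloading the compiler to anticipate whether rocm would be unloaded with it.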

CPE AMD Compiler and Toolkit Module Examples

PREREQUISITES

System module environment setup must have been completed.

OBJECTIVE

This section demonstrates the CPE Lmod rules for compiler and toolkit modules in the CPE AMD programming environment.

EXAMPLES: CPE AMD COMPILER AND TOOLKIT RULES

Rule 1 Example (AMD). Compiler modules can be loaded on their own:

ncn-m001# ml load amd
ncn-m001# ml

Currently Loaded Modules:
1) amd/5.6.1

Rule 2 Example (AMD). Toolkit modules can act independently and be loaded on their own:

ncn-m001# ml load rocm
ncn-m001# ml

Currently Loaded Modules:
1) rocm/5.6.1

Rule 3 Example (AMD). A toolkit that is initially loaded after a compiler is unloaded when that compiler is unloaded (like an Lmod hierarchy-dependent module):

ncn-m001# module load amd rocm
ncn-m001# ml
Currently Loaded Modules:
1) amd/5.6.1 2) rocm/5.6.1
ncn-m001# module unload amd
ncn-m001# ml
No modules loaded

Rule 4 Example (AMD). A toolkit that is loaded before a compiler stays loaded when that compiler is unloaded (the toolkit acts like an independent module):

ncn-m001# module load rocm amd
ncn-m001# ml
Currently Loaded Modules:
1) rocm/5.6.1 2) amd/5.6.1
ncn-m001# module unload amd
ncn-m001# ml
Currently Loaded Modules:
1) rocm/5.6.1

Rule 5 Example (AMD). Compiler and toolkit module versions must be compatible:

Note: For the AMD compiler and ROCm toolkit modules, the versions must match to be compatible.

ncn-m001# ml rocm/5.7.0
ncn-m001# ml amd/5.7.0
ncn-m001# ml
Currently Loaded Modules:
1) rocm/5.7.0 2) amd/5.7.0
ncn-m001# ml swap rocm/5.7.0 rocm/5.6.1

Lmod has detected the following error: This rocm module exists but cannot be loaded as requested: rocm/5.6.1
ROCM module version must match the currently loaded AMD module version.

Note: Please use explicit module commands when loading a non-default rocm module.

To load rocm with the currently loaded amd/5.7.0 module please try:

$ module load rocm/5.7.0

While processing the following module(s):

Module fullname Module Filename
————— —————
rocm/5.6.1 /opt/cray/pe/lmod/modulefiles/core/rocm/5.6.1.lua

Changing the compiler version instead reloads the toolkit at the matching version:

ncn-m001# ml amd/5.6.1
The following have been reloaded with a version change:
1) amd/5.7.0 => amd/5.6.1 2) rocm/5.7.0 => rocm/5.6.1
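
The version-match requirement in the Rule 5 example can be checked before attempting a swap. The following is a minimal sketch, assuming the loaded modules are available as a colon-separated list such as the LOADEDMODULES variable that Lmod maintains. Both module_version and rocm_compatible are hypothetical helper names introduced for illustration; neither is part of Lmod or CPE.

```shell
# Sketch only: hypothetical helpers, not Lmod or CPE commands.

# Print the version of module NAME found in a colon-separated module list.
# Returns non-zero if NAME is not in the list.
# Usage: module_version MODULE_LIST NAME
module_version() (
    IFS=:
    set -f                      # no pathname expansion while splitting
    for entry in $1; do
        case "$entry" in
            "$2"/*)
                echo "${entry#*/}"   # strip the "name/" prefix
                exit 0
                ;;
        esac
    done
    exit 1
)

# Succeed when the requested rocm version equals the loaded amd version,
# or when no amd module is loaded (Rule 2: the toolkit is independent).
# Usage: rocm_compatible MODULE_LIST ROCM_VERSION
rocm_compatible() {
    amd_version=$(module_version "$1" amd) || return 0
    [ "$2" = "$amd_version" ]
}
```

With amd/5.7.0 loaded, rocm_compatible "$LOADEDMODULES" 5.6.1 would fail, which corresponds to the Lmod error shown above. Loading amd/5.6.1 first, as in the last command of the example, avoids the mismatch.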

CPE NVIDIA Compiler and Toolkit Module Examples

PREREQUISITES

System module environment setup must have been completed.

OBJECTIVE

This section demonstrates CPE Lmod rules for compiler and toolkit modules for the CPE NVIDIA programming environment.

EXAMPLES: CPE NVIDIA COMPILER AND TOOLKIT RULES

Rule 1 Example (NVIDIA). Compiler modules can be loaded on their own:

ncn-m001# ml load nvidia
ncn-m001# ml

Currently Loaded Modules:
1) nvidia/24.7

Rule 2 Example (NVIDIA). Toolkit modules can act independently and be loaded on their own:

ncn-m001# ml load cuda
ncn-m001# ml

Currently Loaded Modules:
1) cuda/11.0

Rule 3 Example (NVIDIA). A toolkit that is initially loaded after a compiler is unloaded when that compiler is unloaded (like an Lmod hierarchy-dependent module):

ncn-m001# module load nvidia cuda/12.5
ncn-m001# ml
Currently Loaded Modules:
1) nvidia/24.7 2) cuda/12.5
ncn-m001# module unload nvidia
ncn-m001# ml
No modules loaded

Rule 4 Example (NVIDIA). A toolkit that is loaded before a compiler stays loaded when that compiler is unloaded (the toolkit acts like an independent module):

ncn-m001# module load cuda nvidia
ncn-m001# ml
Currently Loaded Modules:
1) cuda/12.5 2) nvidia/24.7
ncn-m001# module unload nvidia
ncn-m001# ml
Currently Loaded Modules:
1) cuda/12.5

Rule 5 Example (NVIDIA). Compiler and toolkit module versions must be compatible:

ncn-m001# ml cuda/12.5
ncn-m001# ml nvidia/24.7
ncn-m001# ml
Currently Loaded Modules:
1) cuda/12.5 2) nvidia/24.7
ncn-m001# ml swap cuda/12.5 cuda/11.0

Lmod has detected the following error: This module exists but cannot be loaded as requested: cuda/11.0

Please ensure the Cuda version is supported by the currently active Nvidia compiler.

Note: Please use explicit module commands when loading a non-default
cuda/11.0 module.

To load cuda/11.0 with the currently loaded nvidia/24.7 module please try one of the following Cuda versions:
11.8, 12.5

While processing the following module(s):

Module fullname Module Filename
————— —————
cuda/11.0 /opt/cray/pe/lmod/modulefiles/core/cuda/11.0.lua

Changing the compiler version reloads the toolkit at a compatible version:

ncn-m001# ml nvidia/23.11
The following have been reloaded with a version change:
1) cuda/12.5 => cuda/12.3 2) nvidia/24.7 => nvidia/23.11
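
Unlike the AMD case, the NVIDIA compiler accepts a set of CUDA versions instead of requiring an exact match. The sketch below tests a requested version against a comma-separated list in the form printed by the Lmod error message above. The function cuda_in_list is a hypothetical helper, not part of Lmod or CPE. The list "11.8, 12.5" is taken from the example output for nvidia/24.7, so it is an assumption about one system and may differ elsewhere.

```shell
# Sketch only: cuda_in_list is a hypothetical helper, not an Lmod or CPE command.
# Succeeds when VERSION appears in a comma-separated list of versions.
# Usage: cuda_in_list VERSION "LIST"
cuda_in_list() (
    want=$1
    IFS=', '                    # split on commas and surrounding spaces
    set -f                      # no pathname expansion while splitting
    for version in $2; do
        [ "$version" = "$want" ] && exit 0
    done
    exit 1
)

# Versions reported as usable with nvidia/24.7 in the example error message.
# Assumption: copied from that output; check the message on the target system.
supported_with_24_7="11.8, 12.5"

if cuda_in_list 11.0 "$supported_with_24_7"; then
    echo "cuda/11.0 is listed for nvidia/24.7"
else
    echo "cuda/11.0 is not listed for nvidia/24.7"
fi
```

A check like this only mirrors the list that Lmod reports. The modulefiles remain the authority on which combinations actually load.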