Cray FFTW
Cray FFTW is a C subroutine library with Fortran interfaces for computing complex-to-complex, real-to-complex, complex-to-real, and real-to-real one-dimensional and multidimensional discrete Fourier transforms (DFTs). The library also includes routines to compute discrete cosine and sine transforms (DCTs/DSTs) on even and odd data, respectively.
To use FFTW 3.x, load the module cray-fftw or cray-fftw/<version>, e.g., cray-fftw/3.3.6.1.
In FFTW 3.x, the single- and double-precision routine names are unique, so both libraries automatically appear on the link line when the cray-fftw module is loaded, and a user’s program can call single- or double-precision routines. The single- and double-precision libraries are libfftw3f.a and libfftw3.a, respectively.
The Cray FFTW library will quickly generate highly optimized plans on Intel Xeon processor Scalable family, or Xeon Phi CPUs, and AMD Epyc
CPUs, for the following problem types: single-threaded, double precision 1D, 2D, 3D FFTs of data type Complex-to-Complex (forward and
backward), Real-to-Complex, and Complex-to-Real (particularly for, but not limited to, problem sizes that are 5-smooth, i.e., sizes whose only prime factors are 2, 3, and 5). The optimized
plans are generated with roughly the same minimal planning time as experienced when using the planner aggressiveness flag FFTW_ESTIMATE.
For requested problems with no highly-optimized plan available, the FFTW planner will operate normally with no additional overhead or
loss of performance. This feature is enabled by default, but may be disabled by setting the environment variable FFTW_CRAY_FASTPLAN=0.
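To make the 5-smooth condition concrete, here is a small sketch in C that tests whether a transform size has only 2, 3, and 5 as prime factors, and finds the next such size at or above a given length. The names is_5_smooth and next_5_smooth are hypothetical helpers for illustration only; they are not part of FFTW or the Cray library, and nothing here is required in order to use the optimized plans.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if n has no prime factors other than 2, 3, and 5, else 0.
 * By convention 1 counts as 5-smooth and 0 does not. */
int is_5_smooth(size_t n)
{
    static const size_t primes[] = {2, 3, 5};

    if (n == 0)
        return 0;
    for (size_t i = 0; i < 3; ++i) {
        /* Divide out every factor of the current prime. */
        while (n % primes[i] == 0)
            n /= primes[i];
    }
    return n == 1;
}

/* Returns the smallest 5-smooth size that is >= n (n must be positive).
 * Could be used to choose a padded transform length. */
size_t next_5_smooth(size_t n)
{
    assert(n > 0);
    while (!is_5_smooth(n))
        ++n;
    return n;
}
```

For example, 360 = 2^3 * 3^2 * 5 is 5-smooth, while 14 = 2 * 7 is not, and the next 5-smooth size at or above 1001 is 1024.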
Environment Variables
FFTW_CRAY_FASTPLAN
When enabled (the default), the FFTW planner returns optimized plans
for a large set of problems on Intel Xeon processor Scalable
family, or Xeon Phi CPUs, or AMD Epyc CPUs. This requires that
the planner aggressiveness flag is set to FFTW_ESTIMATE or
FFTW_MEASURE within the user code. To disable this feature,
set the environment variable to 0.
Default: enabled
Threaded Library Linking Options
Starting with cray-fftw/3.3.6.1, the option to choose between OpenMP and Pthreads when using threaded FFTW is available for all Cray
systems. By default, the OpenMP libraries will be linked in when the OpenMP compiler flag is set (e.g., -fopenmp for CCE-Clang).
Unlike the use of the FFTW Pthread libraries, the FFTW OpenMP libraries require the compiler-specific OpenMP compiler flag to be used
(as demonstrated in the examples below).
For all other architectures, the Pthreads library will be linked by default. If the non-default threaded library is desired, it will be available for use, but must be linked manually. See below for examples on selecting OpenMP versus Pthread libraries for specified architecture targets.
Threaded Library Linking Examples
Example 1: OpenMP for OpenMP default libraries
This is the default threaded library for OpenMP default architectures. To link the OpenMP libraries, simply load the craype architecture target, load the cray-fftw module, and compile as usual. Note that the compiler-specific OpenMP flag is necessary to properly link the FFTW OpenMP library. With the cray-fftw module loaded:
Compiler | Command Line |
---|---|
CCE-classic | |
GNU and CCE-clang | |
Intel | |
Example 2: Pthreads for OpenMP default libraries
This is the non-default threaded library for OpenMP default architectures. Use of the module will not result in linking the desired
threaded libraries. To link the Pthread libraries, unload the cray-fftw module, load the craype architecture target, and compile with the
explicit paths for the architecture target libraries. If compiling with CCE-classic, the -noomp flag is required to target the Pthread libraries.
With the cray-fftw module loaded:
cc myapp.c -I$FFTW_INC \
-L$FFTW_DIR -lfftw3_mpi -lfftw3_threads \
-lfftw3 -lfftw3f_mpi -lfftw3f_threads -lfftw3f
Additional Information
FFTW documentation: http://www.fftw.org