Cray FFTW

Cray FFTW is a C subroutine library with Fortran interfaces for computing complex-to-complex, real-to-complex, complex-to-real, and real-to-real discrete Fourier transforms (DFTs) in one or more dimensions. The library also includes routines to compute discrete cosine and sine transforms (DCTs and DSTs), which are DFTs of real data with even and odd symmetry, respectively.

To use FFTW 3.x, load the module cray-fftw or cray-fftw/<version>, e.g., cray-fftw/3.3.6.1.

In FFTW 3.x, the single- and double-precision routines have distinct names. Both libraries therefore appear on the link line automatically when the cray-fftw module is loaded, and a user's program can call single-precision routines, double-precision routines, or both. The single- and double-precision libraries are libfftw3f.a and libfftw3.a, respectively.

The Cray FFTW library quickly generates highly optimized plans on Intel Xeon processor Scalable family CPUs, Intel Xeon Phi CPUs, and AMD EPYC CPUs for the following problem types: single-threaded, double-precision 1D, 2D, and 3D FFTs of type Complex-to-Complex (forward and backward), Real-to-Complex, and Complex-to-Real. This applies particularly, but not exclusively, to problem sizes that are 5-smooth, that is, sizes with no prime factor larger than 5. The optimized plans are generated in roughly the same short planning time as with the planner aggressiveness flag FFTW_ESTIMATE. For requested problems with no highly optimized plan available, the FFTW planner operates normally, with no additional overhead or loss of performance. This feature is enabled by default, but may be disabled by setting the environment variable FFTW_CRAY_FASTPLAN=0.
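To make the 5-smooth condition concrete, the sketch below tests whether a transform size factors entirely into 2, 3, and 5, and finds the next such size at or above a given value. Both function names are hypothetical helpers, not part of FFTW or the Cray library, and whether an application can change its transform size at all depends on the problem being solved.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: true when n has no prime factor other than 2, 3, 5. */
bool is_5_smooth(unsigned long n)
{
    static const unsigned long primes[] = {2, 3, 5};

    if (n == 0)
        return false;
    for (int i = 0; i < 3; ++i) {
        while (n % primes[i] == 0)
            n /= primes[i];
    }
    return n == 1;
}

/* Hypothetical helper: smallest 5-smooth size greater than or equal to n. */
unsigned long next_5_smooth(unsigned long n)
{
    assert(n > 0);
    while (!is_5_smooth(n))
        ++n;
    return n;
}
```

For example, 360 = 2^3 * 3^2 * 5 is 5-smooth, while 14 = 2 * 7 is not.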

Environment Variables

FFTW_CRAY_FASTPLAN

    When this feature is enabled, the FFTW planner returns optimized
    plans for a large set of problems on Intel Xeon processor
    Scalable family CPUs, Intel Xeon Phi CPUs, and AMD EPYC CPUs.
    The user code must set the planner aggressiveness flag to
    FFTW_ESTIMATE or FFTW_MEASURE. To disable this feature, set the
    environment variable to 0.

    Default: enabled

Threaded Library Linking Options

Starting with cray-fftw/3.3.6.1, all Cray systems offer a choice between OpenMP and Pthreads when using threaded FFTW. By default, the OpenMP libraries are linked when the OpenMP compiler flag is set (e.g., -fopenmp for CCE-clang). Unlike the FFTW Pthread libraries, the FFTW OpenMP libraries require the compiler-specific OpenMP flag (as demonstrated in the examples below).

For all other architectures, the Pthreads library is linked by default. The non-default threaded library is still available, but it must be linked manually. See below for examples of selecting the OpenMP or Pthread libraries for specified architecture targets.

Threaded Library Linking Examples

Example 1: OpenMP for OpenMP default libraries

This is the default threaded library for OpenMP default architectures. To link the OpenMP libraries, load the craype architecture target, load the cray-fftw module, and compile as usual. Note that the compiler-specific OpenMP flag is required to link the FFTW OpenMP library properly. With the cray-fftw module loaded:

    Compiler             Command Line
    CCE-classic          cc -homp myapp.c
    GNU and CCE-clang    cc -fopenmp myapp.c
    Intel                cc -qopenmp myapp.c

Example 2: Pthreads for OpenMP default libraries

This is the non-default threaded library for OpenMP default architectures. Use of the module will not result in linking the desired threaded libraries. To link the Pthread libraries, unload the cray-fftw module, load the craype architecture target, and compile with the explicit paths for the architecture target libraries. If compiling with CCE-classic, the -noomp flag is required to target the Pthread libraries.

With the cray-fftw module loaded:

cc myapp.c -I$FFTW_INC \
    -L$FFTW_DIR -lfftw3_mpi -lfftw3_threads \
    -lfftw3 -lfftw3f_mpi -lfftw3f_threads -lfftw3f

Additional Information

FFTW documentation: http://www.fftw.org