Cray FFTW
Cray FFTW is a C subroutine library with Fortran interfaces for computing complex-to-complex, real-to-complex, complex-to-real, and real-to-real discrete Fourier transforms (DFTs) in one or more dimensions. The library also includes routines to compute discrete cosine and sine transforms (DCTs/DSTs) on even and odd data, respectively.
To use FFTW 3.x, load the module cray-fftw or cray-fftw/<version>, e.g., cray-fftw/3.3.6.1.
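As a minimal sketch, the module load step might look like this in a job or build script. It assumes the `module` command from Environment Modules or Lmod is available in the shell; the variable names FFTW_VERSION and fftw_module are hypothetical and used only for illustration.

```shell
# Assumption: leave FFTW_VERSION empty to take the site default,
# or set it to a version installed on your system (e.g. 3.3.6.1).
FFTW_VERSION=""

# Builds "cray-fftw" or "cray-fftw/<version>".
fftw_module="cray-fftw${FFTW_VERSION:+/$FFTW_VERSION}"

if command -v module >/dev/null 2>&1; then
    module load "$fftw_module"
else
    echo "module command not found; cannot load $fftw_module" >&2
fi
```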
In FFTW 3.x, the single- and double-precision routine names are distinct, so both libraries appear on the link line automatically when the cray-fftw module is loaded, and a program can call single- or double-precision routines. The single- and double-precision libraries are libfftw3f.a and libfftw3.a, respectively.
The Cray FFTW library will quickly generate highly optimized plans on Haswell, Broadwell, Intel Xeon processor Scalable family, or Xeon
Phi CPUs, and AMD Rome or Milan CPUs, for the following problem types: single-threaded, double precision 1D, 2D, 3D FFTs of data type
Complex-to-Complex (forward and backward), Real-to-Complex, and Complex-to-Real (particularly for, but not limited to, problem sizes that
are 5-smooth). The optimized plans are generated with roughly the same minimal planning time as experienced when using the planner
aggressiveness flag FFTW_ESTIMATE. For requested problems with no highly-optimized plan available, the FFTW planner will operate normally
with no additional overhead or loss of performance. This feature is enabled by default, but may be disabled by setting the environment
variable FFTW_CRAY_FASTPLAN=0.
Environment Variables
FFTW_CRAY_FASTPLAN
When this feature is enabled (the default), the FFTW planner
returns optimized plans for a large set of problems on Intel
Haswell, Broadwell, Intel Xeon processor Scalable family, or
Xeon Phi CPUs, or AMD Rome and Milan CPUs. The planner
aggressiveness flag must be set to FFTW_ESTIMATE or
FFTW_MEASURE within the user code. To disable this feature,
set the environment variable to 0.
Default: not set (on)
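As a minimal sketch of disabling fast planning for a single run rather than for the whole session, the helper below sets FFTW_CRAY_FASTPLAN=0 only in the environment of the command it launches. The function name run_without_fastplan and the application name ./myapp are hypothetical; only the variable name and the value 0 come from the description above.

```shell
# Hypothetical helper: run one command with Cray fast planning disabled.
# Only FFTW_CRAY_FASTPLAN and the value 0 come from the documentation above.
run_without_fastplan() {
    env FFTW_CRAY_FASTPLAN=0 "$@"
}

# Usage (./myapp is a placeholder for your FFTW application):
#   run_without_fastplan ./myapp
```

Because the variable is set only for the child process, later commands in the same shell keep the default behavior (variable not set, feature on).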
Threaded Library Linking Options
Starting with cray-fftw/3.3.6.1, the option to choose between OpenMP and Pthreads when using threaded FFTW is available for all Cray
systems. By default, when CRAY_CPU_TARGET is set to a Haswell or newer CPU (or ThunderX2 or newer for AArch64 systems), the OpenMP
libraries will be linked in when the OpenMP compiler flag is set (e.g., -fopenmp for CCE-clang). Unlike the use of the FFTW Pthread
libraries, the FFTW OpenMP libraries require the compiler-specific OpenMP compiler flag to be used (as demonstrated in the examples
below).
For all other architectures, the Pthreads library will be linked by default. If the non-default threaded library is desired, it will be available for use, but must be linked manually. See below for examples on selecting OpenMP versus Pthread libraries for specified architecture targets.
Threaded Library Linking Examples
Example 1: Pthreads for all architectures with Pthread default libraries (e.g., sandybridge, ivybridge)
This is the default threaded library for Pthread default architectures. To link the Pthread libraries, simply load the craype architecture target, load the cray-fftw module, and compile as usual. Note that the presence of an OpenMP flag will not affect the linking of the FFTW Pthread libraries:
cc myapp.c
Example 2: OpenMP for all architectures with Pthread default libraries
This is the non-default threaded library for CPU targets with default Pthread libraries. Use of the module will not result in linking the
desired OpenMP libraries. To link the OpenMP libraries, unload the cray-fftw module, load the craype architecture target, and compile with
the explicit paths for the architecture target libraries. Note that in addition to FFTW libraries, the compiler-specific OpenMP flag will
need to be used, demonstrated here with the CCE-classic flag of -homp.
On CLE 6.x, 7.x, or Urika-GX with CCE-classic:
cc -homp myapp.c -I/opt/cray/pe/fftw/<version>/$CRAY_CPU_TARGET/include \
-L/opt/cray/pe/fftw/<version>/$CRAY_CPU_TARGET/lib -lfftw3_mpi \
-lfftw3_omp -lfftw3 -lfftw3f_mpi -lfftw3f_omp -lfftw3f
Example 3: OpenMP for OpenMP default libraries
This is the default threaded library for OpenMP default architectures. To link the OpenMP libraries, simply load the craype architecture target, load the cray-fftw module, and compile as usual. Note that the compiler-specific OpenMP flag is necessary to properly link the FFTW OpenMP library.
Compiler | Command Line |
---|---|
CCE-classic | cc -homp myapp.c |
GNU and CCE-clang | cc -fopenmp myapp.c |
Intel | cc -qopenmp myapp.c |
Example 4: Pthreads for OpenMP default libraries
This is the non-default threaded library for OpenMP default architectures. Use of the module will not result in linking the desired
threaded libraries. To link the Pthread libraries, unload the cray-fftw module, load the craype architecture target, and compile with the
explicit paths for the architecture target libraries. If compiling with CCE-classic, the -noomp flag is required to target the Pthread
libraries.
On CLE 6.x, 7.x, or Urika-GX with CCE-classic:
cc myapp.c -I/opt/cray/pe/fftw/<version>/$CRAY_CPU_TARGET/include \
-L/opt/cray/pe/fftw/<version>/$CRAY_CPU_TARGET/lib -lfftw3_mpi -lfftw3_threads \
-lfftw3 -lfftw3f_mpi -lfftw3f_threads -lfftw3f
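The explicit link lines in Examples 2 and 4 differ only in the threaded library suffix (_omp versus _threads). A minimal sketch of assembling them from variables follows. The function name fftw_manual_flags and its arguments are hypothetical; the directory layout and library names are taken from the two examples. It only prints the flags and does not check that the directories exist on your system.

```shell
# Hypothetical helper: print the include/link flags used in Examples 2 and 4.
#   $1 = FFTW version directory name (site-specific)
#   $2 = CPU target, e.g. the value of $CRAY_CPU_TARGET
#   $3 = threaded library suffix: "omp" (Example 2) or "threads" (Example 4)
fftw_manual_flags() {
    fftw_version=$1
    fftw_target=$2
    fftw_suffix=$3
    case $fftw_suffix in
        omp|threads) ;;
        *) echo "suffix must be 'omp' or 'threads'" >&2; return 1 ;;
    esac
    fftw_root="/opt/cray/pe/fftw/$fftw_version/$fftw_target"
    echo "-I$fftw_root/include -L$fftw_root/lib" \
         "-lfftw3_mpi -lfftw3_$fftw_suffix -lfftw3" \
         "-lfftw3f_mpi -lfftw3f_$fftw_suffix -lfftw3f"
}

# Usage, with <version> replaced by the version installed on your system:
#   cc -homp myapp.c $(fftw_manual_flags <version> "$CRAY_CPU_TARGET" omp)
#   cc myapp.c $(fftw_manual_flags <version> "$CRAY_CPU_TARGET" threads)
```

Keeping the suffix as the only varying argument makes it harder to mix _omp and _threads libraries on one link line by accident.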
Additional Information
FFTW documentation: http://www.fftw.org