intro_directives
- Date:
11-19-2018
NAME
intro_directives - Introduction to Cray C/C++ compiler #pragmas and Cray Fortran Compiler directives
IMPLEMENTATION
Cray Linux Environment (CLE)
DESCRIPTION
Directives and pragmas are instructions that may be inserted into source code in order to specify certain kinds of special processing to be performed by the compiler during compilation. Directives are not Fortran statements. Pragmas are directives in C/C++.
This man page provides a high-level overview of how pragmas and directives are used in the Cray C/C++ Compiler and the Cray Fortran Compiler. A short description of each directive is given at the end of this man page. For specific information about using a particular directive, see the appropriate man page for that directive.
The directives are classified according to the following types:
General
Vectorization
Scalar
Inlining
PGAS
Using Directives: Cray C/C++ Compiler
#pragma directives are expressed in the following form:
#pragma [_CRI] identifier [arguments]
The _CRI specification is optional; it ensures that the compiler will issue a message concerning any directives that it does not recognize. Diagnostics are not generated for directives that do not contain the _CRI specification.
Macro expansion occurs on the directive line after the directive name. That is, macro expansion is applied only to arguments.
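As a minimal sketch of the form described above, the following function places a directive before a loop and passes a macro as its argument. The macro UNROLL_FACTOR and the function sum_ints are hypothetical names, and the numeric argument to unroll is an assumption that is not documented in this man page (see unroll(7)). The preprocessor guard keeps the example portable to compilers that do not define _CRAYC.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tuning constant; not taken from the man page. */
#define UNROLL_FACTOR 4

/* Sum n integers. On the Cray compiler the directive applies to the loop. */
long sum_ints(const int *v, size_t n)
{
    long total = 0;
    size_t i;

    assert(v != NULL || n == 0);

#ifdef _CRAYC
    /* Macro expansion applies only to the text after the directive name,
       so UNROLL_FACTOR is replaced by 4 while "unroll" is left alone.
       The numeric argument is an assumption; check unroll(7). */
#pragma _CRI unroll UNROLL_FACTOR
#endif
    for (i = 0; i < n; i++) {
        total += v[i];
    }
    return total;
}
```

Because _CRI is present, the Cray compiler would be expected to issue a message if the directive name were misspelled.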
Note: OpenMP #pragma directives are described in the Cray C and C++ Reference Manual.
Scope and Compiler
At the beginning of each section that describes a directive, information is included about the compilers that allow the use of the directive and the scope of the directive. Unless otherwise noted, the following default information applies to each directive:
- Compiler:
Cray C, Cray C++, or Cray Fortran compilers
- Scope:
Local and global
The scoping list may also indicate that a directive has a lexical block scope. A lexical block is the scope within which a directive is on or off and is bounded by the opening curly brace just before the directive was declared and the corresponding closing curly brace. Only applicable executable statements within the lexical block are affected as indicated by the directive. The lexical block does not include the statements contained within a procedure that is called from the lexical block.
Protecting Directives
To ensure that your directives are interpreted by the Cray C and C++ compilers, use the following coding technique in which directive represents the name of the directive:
#if _CRAYC
#pragma _CRI directive
#endif
This ensures that other compilers used to compile this code will not interpret the directive. Some compilers diagnose any directives that they do not recognize. The Cray C and C++ compilers diagnose directives that are not recognized only if the _CRI specification is used.
Objects and Functions
C++ prohibits referencing undeclared objects or functions. Objects and functions must be declared prior to using them in a #pragma directive. This is not always the case with C.
Some #pragma directives take function names as arguments (for example: #pragma _CRI weak, #pragma _CRI suppress, #pragma _CRI inline_always name[,name…], #pragma _CRI inline_never name[,name…]). Member functions and qualified names are allowed for these directives.
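The declaration-before-use rule can be illustrated with a short sketch. The function names checked_square and sum_of_squares are hypothetical, and placing the directive at file scope is an assumption; see the inline_never documentation for the permitted scope.

```c
#include <assert.h>

/* Declare the function before naming it in a directive. */
static int checked_square(int x);

#ifdef _CRAYC
/* Ask the compiler never to inline checked_square. */
#pragma _CRI inline_never checked_square
#endif

static int checked_square(int x)
{
    /* Keep the sum of two squares within a 32-bit int. */
    assert(x >= -32767 && x <= 32767);
    return x * x;
}

int sum_of_squares(int a, int b)
{
    return checked_square(a) + checked_square(b);
}
```

Declaring the function first satisfies C++ and is harmless in C, so the same ordering works for both languages.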
Loop Directives
Many directives apply to loops. Unless otherwise noted, these directives must appear before a for, while, or do while loop. These directives may also appear before a label for if…goto loops. If a loop directive appears before a label that is not the top of an if…goto loop, it is ignored.
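The two placements described above can be sketched as follows, using ivdep as the loop directive. The function names scale_for and scale_goto are hypothetical, and the guard keeps the code portable to compilers that do not define _CRAYC.

```c
#include <assert.h>

/* Scale x[0..n-1] by a; the directive precedes a for loop. */
void scale_for(double *x, int n, double a)
{
    int i;

    assert(n >= 0);
#ifdef _CRAYC
#pragma _CRI ivdep
#endif
    for (i = 0; i < n; i++) {
        x[i] *= a;
    }
}

/* Same computation written as an if...goto loop; the directive
   precedes the label that marks the top of the loop. */
void scale_goto(double *x, int n, double a)
{
    int i = 0;

    assert(n >= 0);
    if (n == 0) {
        return;
    }
#ifdef _CRAYC
#pragma _CRI ivdep
#endif
top:
    x[i] *= a;
    i++;
    if (i < n) {
        goto top;
    }
}
```

If the label were not the top of an if…goto loop, the directive would be ignored, as stated above.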
Alternative Directive form: _Pragma
Cray C/C++ Compiler directives can also be specified in the following form, which has the advantage that it can appear inside macro definitions:
_Pragma("_CRI identifier");
This form has the same effect as using the #pragma form, except that everything that appeared on the line following the #pragma must now appear inside the double quotation marks and parentheses. The expression inside the parentheses must be a single string literal; it cannot be a macro that expands into a string literal. _Pragma is an extension to the C and C++ standards.
The following is an example using the #pragma form:
#pragma _CRI concurrent
The following is the same example using the alternative form:
_Pragma("_CRI concurrent");
In the following example, the loop automatically vectorizes wherever the macro is used:
#define _str( _X ) # _X
#define COPY( _A, _B, _N )                             \
{                                                      \
    int i;                                             \
    _Pragma( "_CRI concurrent" )                       \
    _Pragma( _str( _CRI loop_info cache_nt( _B ) ) )   \
    for ( i = 0; i < _N; i++ ) {                       \
        _A[i] = _B[i];                                 \
    }                                                  \
}
void
copy_data( int *a, int *b, int n )
{
    COPY( a, b, n );
}
Macros are expanded in the string literal argument for _Pragma in an identical fashion to the general specification of a #pragma directive.
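One possible way to combine the _Pragma form with the protection technique is a wrapper macro that expands to nothing on other compilers. This is only a sketch: CRI_STR, CRI_PRAGMA, COPY_INTS, and copy_ints are hypothetical names that do not come from the Cray documentation.

```c
#include <assert.h>

#ifdef _CRAYC
/* Build the string literal "_CRI <args>" and hand it to _Pragma. */
#define CRI_STR(text)    #text
#define CRI_PRAGMA(args) _Pragma(CRI_STR(_CRI args))
#else
/* Other compilers never see the directive. */
#define CRI_PRAGMA(args)
#endif

/* Copy n ints from src to dst. The concurrent directive asserts that
   iterations are independent, so dst and src must not overlap. */
#define COPY_INTS(dst, src, n)                  \
    do {                                        \
        int i_;                                 \
        CRI_PRAGMA(concurrent)                  \
        for (i_ = 0; i_ < (n); i_++) {          \
            (dst)[i_] = (src)[i_];              \
        }                                       \
    } while (0)

void copy_ints(int *dst, const int *src, int n)
{
    assert(n >= 0);
    COPY_INTS(dst, src, n);
}
```

The do/while wrapper lets the macro be used like a single statement, and the stringizing helper mirrors the _str macro in the example above.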
Using Directives: Cray Fortran Compiler
A directive line begins with the characters CDIR$ or !DIR$. How you specify a directive depends on the source form you are using.
If you are using fixed source form, indicate a directive line by placing CDIR$ or !DIR$ in columns 1 through 5. If the compiler encounters a non-blank character in column 6, the line is assumed to be a directive continuation line. Columns 7 and beyond can contain one or more directives. Characters entered in columns beyond the default column width are ignored.
If you are using free source form, indicate a directive line by placing !DIR$ on the line, followed by a space and then one or more directives. If the position following the !DIR$ contains a character other than a blank, tab, or newline character, the line is assumed to be a continuation line. For example, the asterisk (*) in column 6 on the second line indicates that it is a continuation of the first line:
!DIR$ Nosideeffects
!DIR$*ab
The !DIR$ need not start in column 1, but it must be the first text on the line.
The FIXED and FREE directives must appear alone on a directive line and cannot be continued.
Do not use source preprocessor (#) directives within multiline compiler directives.
If you want to specify more than one directive on a line, separate each directive with a comma. Some directives require that you specify one or more arguments; when specifying a directive of this type, no other directive can appear on the line.
Spaces can precede, follow, or be embedded within a directive, regardless of the source form.
Range and Placement of Directives
FIXED and FREE directives can appear anywhere in your source code. All other directives must appear within a program unit.
The following directives must be placed in the declarative portion of a program unit and apply only to that program unit: CACHE, CACHE_NT, COPY_ASSUMED_SHAPE, IGNORE_TKR, MEMORY, NAME, NOSIDEEFFECTS, SAME_TBS, STACK, and WEAK.
The following directives toggle a compiler feature on or off at the point at which the directive appears in the code. These directives remain in effect until the opposite directive appears, until the directive is reset, or until the end of the program unit, at which time the command line settings become the default for the remainder of the compilation: [NO]AUTOTHREAD, [NO]BOUNDS, [NO]CLONE, [NO]COLLAPSE, [NO]FUSION, [NO]INLINE, [NO]PATTERN, [NO]PIPELINE, [NO]UNROLL, [NO]VECTOR.
RESET CLONE and RESET INLINE apply at the point at which they appear and set cloning or inlining back to the default.
The SUPPRESS directive applies at the point at which it appears.
The ID directive does not apply to any particular range of code. It adds information to the .o file generated from the source.
The following directives apply only to the next loop or block of code encountered lexically: BLOCKABLE, BLOCKINGSIZE|NOBLOCKING, CONCURRENT, HAND_TUNED, [NO]INTERCHANGE, IVDEP, NEXTSCALAR, NOFISSION, PERMUTATION, PREFERVECTOR, PROBABILITY, SAFE_ADDRESS, SAFE_CONDITIONAL, LOOP_INFO.
The following directives alter the status of entities in ways that affect compilation. They do not apply to particular ranges of code: IGNORE_TKR, INLINEALWAYS|INLINENEVER, CLONEALWAYS|CLONENEVER, NAME, NOSIDEEFFECTS.
The [NO]MODINLINE directives are in effect for the scope of the program unit in which they are specified, including all contained procedures. If one of these directives is specified in a contained procedure, the contained procedure’s directive overrides the containing procedure’s directive.
Interaction with the ftn and cc Command Line
Note the following interactions between directives and the ftn and cc command line options.
-x
    (ftn only) The -x option accepts one or more directives as arguments. Directives specified with the -x option are ignored during compilation. To ignore all compiler directives, specify -x all.
-h nopragma
    (cc only) The -h nopragma= option accepts one or more directives as arguments. Directives specified with the -h nopragma= option are ignored during compilation. To ignore all directives, specify -h nopragma=all.
-O 0
    The -O 0 option disables all compiler optimizations. All scalar optimization, vectorization, and tasking directives are ignored.
-O ipaN
    The -O ipa0 option disables all inlining and cloning optimizations. All inlining and cloning directives are ignored.
-O scalar0
    The -O scalar0 option disables all scalar optimizations. All scalar optimization and vectorization directives are ignored.
-O vector0
    The -O vector0 option disables vectorization. All vectorization directives are ignored.
Cray C/C++ Compiler and Cray Fortran Compiler Directives Man Pages
The following tables list the directives supported by the Cray C/C++ and Fortran compilers. Fortran directives appear in uppercase letters and C/C++ directives in lowercase. See the individual directive’s man page for more information about the use and effect of the directive.
Table 1. General Directives
[no]autothread, [NO]AUTOTHREAD
    Turn autothreading on and off for selected blocks of code.
blockable, BLOCKABLE
    Specifies that it is legal to cache block the subsequent loops.
blockingsize, BLOCKINGSIZE
    Assert that the loop following the directive either is or is not involved in a cache blocking.
[no]bounds, [NO]BOUNDS
    Specifies that pointer and array references are to be checked/not checked.
cache, CACHE
    Asserts that all memory operations with the specified symbols as the base are to be allocated in cache.
cache_nt, CACHE_NT
    Specifies objects that should use non-temporal reads and writes. Advisory.
duplicate, NAME
    Provides additional, externally visible names for specified functions.
FREE, FIXED
    (ftn only) Specify if the source code uses the free or fixed format.
[no]fusion, [NO]FUSION
    Direct the compiler to attempt or not attempt loop fusion on the following loop.
ident, ID
    Directs the compiler to store the string indicated by text into the object (.o) file.
IGNORE_TKR
    (ftn only) Ignore the type, kind, and rank of specified dummy arguments.
memory
    Place heap allocations or variables in specific types of memory. This is intended for use only on systems that support more than one type of explicitly addressable on-node memory: for example, systems having both normal DDR memory and high-bandwidth MCDRAM memory.
message
    (cc only) Directs the compiler to write the message defined by the text argument to stderr as a warning message.
[no]opt
    (cc only) Enables or disables automatic optimizations and accepts or ignores optimization directives.
optimize
    Enable optimization in the function or program unit in which it appears, overriding the optimization level set via the compiler command line.
prefetch, PREFETCH
    Directs the compiler to prefetch a variable or array element from local memory into cache prior to a reference or to store a value into cache prior to a local memory write.
PREPROCESS
    (ftn only) Allow an include file to be preprocessed.
probability, PROBABILITY, probability_almost_always, PROBABILITY_ALMOST_ALWAYS, probability_almost_never, PROBABILITY_ALMOST_NEVER
    Specify information used by interprocedural analysis (IPA) and the optimizer to produce faster code sequences. The specified probability is a hint, rather than a statement of fact.
STACK
    (ftn only) Allocate storage to the stack in the program unit containing the directive.
weak, WEAK
    Specifies an external identifier that may remain unresolved throughout the compilation.
Table 2. Vectorization Directives
COPY_ASSUMED_SHAPE
    (ftn only) Copy assumed-shape dummy array arguments into contiguous local temporary storage.
hand_tuned, HAND_TUNED
    Asserts that the code in the loop nest has been arranged by hand for maximum performance, and the compiler should restrict some of the more aggressive automatic expression rewrites.
ivdep, IVDEP
    Ignore vector dependencies, including explicit dependencies, when attempting to vectorize the first loop that follows the directive.
loop_info, LOOP_INFO
    Allows additional information to be specified about the behavior of a loop, including run time trip count and hints on cache allocation strategy.
LOOP_INFO PREFER_THREAD, LOOP_INFO PREFER_NOTHREAD
    (ftn only) Indicate a preference for turning threading on or off for selected loops.
NEXTSCALAR
    (ftn only) Disable vectorization for the first DO or DO WHILE loop following the directive.
[no]pattern, [NO]PATTERN
    Disables pattern matching for the loop immediately following the directive.
permutation, PERMUTATION
    Specifies that an integer array has no repeated values.
[no]pipeline, [NO]PIPELINE
    Enables/disables the compiler analysis of all vector loops and its automatic attempt to pipeline a loop if doing so can be expected to produce a significant performance gain.
prefervector, PREFERVECTOR
    Directs the compiler to vectorize the loop immediately following the directive if the loop contains more than one loop in the nest that can be vectorized.
safe_address, SAFE_ADDRESS
    Specifies that it is safe to speculatively execute memory references within all conditional branches of a loop.
safe_conditional, SAFE_CONDITIONAL
    Specifies that it is safe to execute all references and operations within all conditional branches of a loop.
SAME_TBS
    (ftn only) Specifies that assumed-shape array arguments are of the same type, bounds, and stride, allowing more efficient code.
[no]vector, [NO]VECTOR
    Controls vectorization of DO loops. May affect specific optimizations.
Table 3. Scalar Directives
[no]collapse, [NO]COLLAPSE
    Controls collapse of the immediately following loop nest.
concurrent, CONCURRENT
    Indicates that no data dependence exists between array references in different iterations of the loop that follows the directive.
[no]interchange, [NO]INTERCHANGE
    Specifies whether or not the order of the following two or more loops should be interchanged.
suppress, SUPPRESS
    Suppresses optimization determined by its use with global or local scope.
[no]unroll, [NO]UNROLL
    Control unrolling for individual loops or specifies no unrolling of a loop.
[no]fusion, [NO]FUSION
    Instructs the compiler to attempt/not attempt loop fusion on the following loop.
nofission, NOFISSION
    Instructs the compiler not to split the following loop. Fission is prevented only for the loop level specified; loops nested within the specified loop remain candidates for fission.
NOSIDEEFFECTS
    Declare that a called subprogram does not redefine selected variables.
Table 4. Inlining and Cloning Directives
clone_enable, clone_disable, clone_reset, [NO]CLONE, RESETCLONE
    Control whether cloning is attempted over a range of code. The clone_reset (RESETCLONE) directive returns the cloning state to the state specified on the compiler command line.
clone_always, clone_never, CLONEALWAYS, CLONENEVER
    The clone_always (CLONEALWAYS) directive identifies functions that the compiler should always attempt to clone. The clone_never (CLONENEVER) directive identifies functions that are never to be cloned.
inline_enable, inline_disable, inline_reset, [NO]INLINE, RESETINLINE
    The inline_enable (INLINE) directive tells the compiler to attempt to inline functions at call sites. The inline_disable (NOINLINE) directive tells the compiler to not inline functions at call sites. The inline_reset (RESETINLINE) directive returns the inlining state to the state specified on the command line (-h ipan).
inline_always, inline_never, INLINEALWAYS, INLINENEVER
    The inline_always directive identifies functions that the compiler should always attempt to inline. The inline_never directive identifies functions that are never to be inlined.
[NO]MODINLINE
    (ftn only) Enable or disable the creation of inlineable templates for specific module procedures.
Table 5. PGAS Directives
pgas buffered_async, PGAS BUFFERED_ASYNC
    Batch PGAS operations into buffers to be processed in bulk. No ordering guarantees are made until the next fence instruction. Fences provide ordering and visibility guarantees for BA operations.
pgas defer_sync, PGAS DEFER_SYNC
    Defer the synchronization of PGAS data until the next fence instruction.
SEE ALSO
autothread(7), blockable(7), blockingsize(7), bounds(7), buffered_async(7), cache(7), cache_nt(7), clone(7), clonealways(7), collapse(7), concurrent(7), copy_assumed_shape(7), defer_sync(7), duplicate(7), free(7), fusion(7), hand_tuned(7), ident(7), ignore_tkr(7), inline(7), inlinealways(7), instantiate(7), interchange(7), ivdep(7), loop_info(7), memory(7), message(7), nextscalar(7), nofission(7), nosideeffects(7), opt(7), optimize(7), pattern(7), permutation(7), pgo_loop_info(7), pipeline(7), prefervector(7), prefetch(7), preprocess(7), probability(7), safe_address(7), safe_conditional(7), same_tbs(7), shortloop(7), stack(7), suppress(7), unroll(7), vector(7), weak(7)
Cray C and C++ Reference Manual
Cray Fortran Reference Manual