intro_directives

Date:

11-19-2018

NAME

intro_directives - Introduction to Cray C/C++ compiler #pragmas and Cray Fortran Compiler directives

IMPLEMENTATION

Cray Linux Environment (CLE)

DESCRIPTION

Directives and pragmas are instructions that may be inserted into source code in order to specify certain kinds of special processing to be performed by the compiler during compilation. Directives are not Fortran statements. Pragmas are directives in C/C++.

This man page provides a high-level overview of how pragmas and directives are used in the Cray C/C++ Compiler and the Cray Fortran Compiler. A short description of each directive is given at the end of this man page. For specific information about using a particular directive, see the man page for that directive.

The directives are classified according to the following types:

  • General

  • Vectorization

  • Scalar

  • Inlining

  • PGAS

Using Directives: Cray C/C++ Compiler

#pragma directives are expressed in the following form:

#pragma [_CRI] identifier [arguments]

The _CRI specification is optional; it ensures that the compiler will issue a message concerning any directives that it does not recognize. Diagnostics are not generated for directives that do not contain the _CRI specification.

Macro expansion occurs on the directive line after the directive name. That is, macro expansion is applied only to arguments.
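The following sketch illustrates macro expansion in directive arguments. It assumes that the unroll directive accepts an integer count (see unroll(7) for the exact syntax); UNROLL_FACTOR and scale are hypothetical names chosen for illustration. The #if _CRAYC guard is the technique described under Protecting Directives.

```c
#include <stddef.h>

#define UNROLL_FACTOR 4   /* hypothetical tuning constant */

/* Multiplies each element of x by factor. */
void scale(double *x, size_t n, double factor)
{
    size_t i;
#if _CRAYC
    /* The directive name (unroll) is taken literally. Only the argument
       is macro-expanded, so the compiler should see "unroll 4"
       (assumed syntax). */
#pragma _CRI unroll UNROLL_FACTOR
#endif
    for (i = 0; i < n; i++) {
        x[i] *= factor;
    }
}
```

Changing UNROLL_FACTOR in one place then adjusts every directive that names it.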

Note: OpenMP #pragma directives are described in the Cray C and C++ Reference Manual.

Scope and Compiler

At the beginning of each section that describes a directive, information is included about the compilers that allow the use of the directive and the scope of the directive. Unless otherwise noted, the following default information applies to each directive:

Compiler:

Cray C, Cray C++, or Cray Fortran compilers

Scope:

Local and global

The scoping list may also indicate that a directive has a lexical block scope. A lexical block is the scope within which a directive is on or off and is bounded by the opening curly brace just before the directive was declared and the corresponding closing curly brace. Only applicable executable statements within the lexical block are affected as indicated by the directive. The lexical block does not include the statements contained within a procedure that is called from the lexical block.

Protecting Directives

To ensure that your directives are interpreted by the Cray C and C++ compilers, use the following coding technique in which directive represents the name of the directive:

#if _CRAYC
    #pragma _CRI directive
#endif

This ensures that other compilers used to compile this code will not interpret the directive. Some compilers diagnose any directives that they do not recognize. The Cray C and C++ compilers diagnose directives that are not recognized only if the _CRI specification is used.

Objects and Functions

C++ prohibits referencing undeclared objects or functions. Objects and functions must be declared prior to using them in a #pragma directive. This is not always the case with C.

Some #pragma directives take function names as arguments (for example: #pragma _CRI weak, #pragma _CRI suppress, #pragma _CRI inline_always name[,name…], #pragma _CRI inline_never name[,name…]). Member functions and qualified names are allowed for these directives.
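A minimal sketch of the declare-before-use rule, using the inline_always form shown above. The function names square and sum_of_squares are hypothetical, and placing the directive at file scope is an assumption; see inlinealways(7) for the supported scope.

```c
/* Declare the function before naming it in a directive. */
static int square(int x);

#if _CRAYC
/* The name refers to the declaration above (file-scope placement assumed). */
#pragma _CRI inline_always square
#endif

static int square(int x)
{
    return x * x;
}

/* Calls square() once per element; a candidate call site for inlining. */
int sum_of_squares(const int *v, int n)
{
    int i, total = 0;
    for (i = 0; i < n; i++) {
        total += square(v[i]);
    }
    return total;
}
```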

Loop Directives

Many directives apply to loops. Unless otherwise noted, these directives must appear before a for, while, or do while loop. These directives may also appear before a label for if…goto loops. If a loop directive appears before a label that is not the top of an if…goto loop, it is ignored.
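The placement rule for if…goto loops can be sketched as follows. The function double_all and the label top are hypothetical. The concurrent directive is used here only as a representative loop directive, on the assumption that it is accepted before a label in the same way as before a for loop.

```c
/* Doubles each element of a using an if...goto loop. */
void double_all(int *a, int n)
{
    int i = 0;
    if (n <= 0) {
        return;
    }
#if _CRAYC
    /* The directive precedes the label at the top of the if...goto loop. */
#pragma _CRI concurrent
#endif
top:
    a[i] = 2 * a[i];
    i++;
    if (i < n) {
        goto top;   /* branch back to the label closes the loop */
    }
}
```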

Alternative Directive form: _Pragma

Cray C/C++ Compiler directives can also be specified in the following form, which has the advantage that it can appear inside macro definitions:

_Pragma("_CRI identifier");

This form has the same effect as using the #pragma form, except that everything that appeared on the line following the #pragma must now appear inside the double quotation marks and parentheses. The expression inside the parentheses must be a single string literal; it cannot be a macro that expands into a string literal. _Pragma is an extension to the C and C++ standards.

The following is an example using the #pragma form:

#pragma _CRI concurrent

The following is the same example using the alternative form:

_Pragma("_CRI concurrent");

In the following example, the loop automatically vectorizes wherever the macro is used:

#define _str( _X ) # _X
#define COPY( _A, _B, _N )                          \
{                                                   \
  int i;                                            \
  _Pragma( "_CRI concurrent" )                      \
  _Pragma( _str( _CRI loop_info cache_nt( _B ) ) )  \
  for ( i = 0; i < _N; i++ ) {                      \
    _A[i] = _B[i];                                  \
  }                                                 \
}


void
copy_data( int *a, int *b, int n )
{
  COPY( a, b, n );
}

Macros are expanded in the string literal argument for _Pragma in an identical fashion to the general specification of a #pragma directive.
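The _Pragma form can also be combined with the protection technique from Protecting Directives. The following is a sketch only: CRI_STR, CRI_DIRECTIVE, and copy_ints are hypothetical names, and the stringizing pattern follows the COPY example above.

```c
#define CRI_STR(x) #x

#if _CRAYC
/* Cray compilers: build _Pragma("_CRI <args>") from the macro argument. */
#define CRI_DIRECTIVE(args) _Pragma(CRI_STR(_CRI args))
#else
/* Other compilers: the directive expands to nothing. */
#define CRI_DIRECTIVE(args)
#endif

/* Copies n ints from src to dst. */
void copy_ints(int *dst, const int *src, int n)
{
    int i;
    CRI_DIRECTIVE(concurrent)
    for (i = 0; i < n; i++) {
        dst[i] = src[i];
    }
}
```

With a wrapper of this kind, each directive needs only one line at the point of use rather than a three-line #if block.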

Using Directives: Cray Fortran Compiler

A directive line begins with the characters CDIR$ or !DIR$. How you specify a directive depends on the source form you are using.

  • If you are using fixed source form, indicate a directive line by placing CDIR$ or !DIR$ in columns 1 through 5. If the compiler encounters a non-blank character in column 6, the line is assumed to be a directive continuation line. Columns 7 and beyond can contain one or more directives. Characters entered in columns beyond the default column width are ignored.

  • If you are using free source form, indicate a directive by placing !DIR$, followed by a space, then one or more directives. If the position following the !DIR$ contains a character other than a blank, tab, or newline character, the line is assumed to be a continuation line. For example, the asterisk (*) in column 6 on the second line indicates that it is a continuation of the first line:

!DIR$ Nosideeffects
!DIR$*ab

  • The !DIR$ need not start in column 1, but it must be the first text on the line.

  • The FIXED and FREE directives must appear alone on a directive line and cannot be continued.

  • Do not use source preprocessor (#) directives within multiline compiler directives.

If you want to specify more than one directive on a line, separate each directive with a comma. Some directives require that you specify one or more arguments; when specifying a directive of this type, no other directive can appear on the line.

Spaces can precede, follow, or be embedded within a directive, regardless of the source form.

Range and Placement of Directives

FIXED and FREE directives can appear anywhere in your source code. All other directives must appear within a program unit.

The following directives must be placed in the declarative portion of a program unit and apply only to that program unit: CACHE, CACHE_NT, COPY_ASSUMED_SHAPE, IGNORE_TKR, MEMORY, NAME, NOSIDEEFFECTS, SAME_TBS, STACK, and WEAK.

The following directives toggle a compiler feature on or off at the point at which the directive appears in the code. These directives remain in effect until the opposite directive appears, until the directive is reset, or until the end of the program unit, at which time the command line settings become the default for the remainder of the compilation: [NO]AUTOTHREAD, [NO]BOUNDS, [NO]CLONE, [NO]COLLAPSE, [NO]FUSION, [NO]INLINE, [NO]PATTERN, [NO]PIPELINE, [NO]UNROLL, [NO]VECTOR.

RESET CLONE and RESET INLINE apply at the point at which they appear and set cloning or inlining back to the default.

The SUPPRESS directive applies at the point at which it appears.

The ID directive does not apply to any particular range of code. It adds information to the .o file generated from the source.

The following directives apply only to the next loop or block of code encountered lexically: BLOCKABLE, BLOCKINGSIZE|NOBLOCKING, CONCURRENT, HAND_TUNED, [NO]INTERCHANGE, IVDEP, NEXTSCALAR, NOFISSION, PERMUTATION, PREFERVECTOR, PROBABILITY, SAFE_ADDRESS, SAFE_CONDITIONAL, LOOP_INFO.

The following directives alter the status of entities in ways that affect compilation. They do not apply to particular ranges of code: IGNORE_TKR, INLINEALWAYS|INLINENEVER, CLONEALWAYS|CLONENEVER, NAME, NOSIDEEFFECTS.

The [NO]MODINLINE directives are in effect for the scope of the program unit in which they are specified, including all contained procedures. If one of these directives is specified in a contained procedure, the contained procedure’s directive overrides the containing procedure’s directive.

Interaction with the ftn and cc Command Line

Note the following interactions between directives and the ftn and cc command line options.

-x               (ftn only) The -x option accepts one or more
                 directives as arguments. Directives specified
                 with the -x option are ignored during
                 compilation. To ignore all compiler
                 directives, specify -x all.

-h nopragma      (cc only) The -h nopragma= option accepts one
                 or more directives as arguments. Directives
                 specified with the -h nopragma= option are
                 ignored during compilation. To ignore all
                 directives, specify -h nopragma=all.

-O 0             The -O 0 option disables all compiler
                 optimizations. All scalar optimization,
                 vectorization, and tasking directives are
                 ignored.

-O ipaN          The -O ipa0 option disables all inlining and
                 cloning optimizations. All inlining and
                 cloning directives are ignored.

-O scalar0       The -O scalar0 option disables all scalar
                 optimizations. All scalar optimization and
                 vectorization directives are ignored.

-O vector0       The -O vector0 option disables vectorization.
                 All vectorization directives are ignored.

Cray C/C++ Compiler and Cray Fortran Compiler Directives Man Pages

The following tables list the directives supported by the Cray C/C++ and Fortran compilers. Fortran directives appear in uppercase letters and C/C++ directives in lowercase. See the individual directive’s man page for more information about the use and effect of the directive.

                      Table 1. General Directives

[no]autothread, [NO]AUTOTHREAD     Turn autothreading on and off for
                                   selected blocks of code.
blockable, BLOCKABLE               Specifies that it is legal to cache
                                   block the subsequent loops.
blockingsize, BLOCKINGSIZE         Assert that the loop following the
                                   directive either is or is not
                                   involved in a cache blocking.
[no]bounds, [NO]BOUNDS             Specifies that pointer and array
                                   references are to be checked/not
                                   checked.
cache, CACHE                       Asserts that all memory operations
                                   with the specified symbols as the
                                   base are to be allocated in cache.
cache_nt, CACHE_NT                 Specifies objects that should use
                                   non-temporal reads and writes.
                                   Advisory.
duplicate, NAME                    Provides additional, externally
                                   visible names for specified
                                   functions.
FREE, FIXED                        (ftn only) Specify if the source code
                                   uses the free or fixed format.
[no]fusion, [NO]FUSION             Direct the compiler to attempt or not
                                   attempt loop fusion on the following
                                   loop.
ident, ID                          Directs the compiler to store the
                                   string indicated by text into the
                                   object (.o) file.
IGNORE_TKR                         (ftn only) Ignore the type, kind, and
                                   rank of specified dummy arguments.
memory, MEMORY                     Place heap-allocations or variables
                                   in specific types of memory. This is
                                   intended for use only on systems that
                                   support more than one type of
                                   explicitly addressable on-node
                                   memory: for example, systems having
                                   both normal DDR memory and high-
                                   bandwidth MCDRAM memory.
message                            (cc only) Directs the compiler to
                                   write the message defined by the text
                                   argument to stderr as a warning
                                   message.
[no]opt                            (cc only) Enables or disables
                                   automatic optimizations and accepts
                                   or ignores optimization directives.
optimize                           Enable optimization in the function
                                   or program unit in which it appears,
                                   overriding the optimization level set
                                   via the compiler command line.
prefetch, PREFETCH                 Directs the compiler to prefetch a
                                   variable or array element from local
                                   memory into cache prior to a
                                   reference or to store a value into
                                   cache prior to a local memory write.
PREPROCESS                         (ftn only) Allow an include file to
                                   be preprocessed.
probability, PROBABILITY,          Specify information used by
probability_almost_always,         interprocedural analysis (IPA) and
PROBABILITY_ALMOST_ALWAYS,         the optimizer to produce faster code
probability_almost_never,          sequences. The specified probability
PROBABILITY_ALMOST_NEVER           is a hint, rather than a statement of
                                   fact.
STACK                              (ftn only) Allocate storage to the
                                   stack in the program unit containing
                                   the directive.
weak, WEAK                         Specifies an external identifier that
                                   may remain unresolved throughout the
                                   compilation.

                   Table 2. Vectorization Directives

COPY_ASSUMED_SHAPE                 (ftn only) Copy assumed-shape dummy
                                   array arguments into contiguous local
                                   temporary storage.
hand_tuned, HAND_TUNED             Asserts that the code in the loop
                                   nest has been arranged by hand for
                                   maximum performance, and the compiler
                                   should restrict some of the more
                                   aggressive automatic expression
                                   rewrites.
ivdep, IVDEP                       Ignore vector dependencies, including
                                   explicit dependencies, when
                                   attempting to vectorize the first
                                   loop that follows the directive.
loop_info, LOOP_INFO               Allows additional information to be
                                   specified about the behavior of a
                                   loop, including run time trip count
                                   and hints on cache allocation
                                   strategy.
LOOP_INFO PREFER_THREAD, LOOP_INFO (ftn only) Indicate a preference for
PREFER_NOTHREAD                    turning threading on or off for
                                   selected loops.
NEXTSCALAR                         (ftn only) Disable vectorization for
                                   the first DO or DO WHILE loop
                                   following the directive.
[no]pattern, [NO]PATTERN           Enables or disables pattern matching
                                   for the loop immediately following
                                   the directive.
permutation, PERMUTATION           Specifies that an integer array has
                                   no repeated values.
[no]pipeline, [NO]PIPELINE         Enables/disables the compiler
                                   analysis of all vector loops and its
                                   automatic attempt to pipeline a loop
                                   if doing so can be expected to
                                   produce a significant performance
                                   gain.
prefervector, PREFERVECTOR         Directs the compiler to vectorize the
                                   loop immediately following the
                                   directive if the loop contains more
                                   than one loop in the nest that can be
                                   vectorized.
safe_address, SAFE_ADDRESS         Specifies that it is safe to
                                   speculatively execute memory
                                   references within all conditional
                                   branches of a loop.
safe_conditional, SAFE_CONDITIONAL Specifies that it is safe to execute
                                   all references and operations within
                                   all conditional branches of a loop.
SAME_TBS                           (ftn only) Specifies that assumed
                                   shape array arguments are of same
                                   type, bounds, and stride allowing
                                   more efficient code.
[no]vector, [NO]VECTOR             Controls vectorization of DO loops.
                                   May affect specific optimizations.

                       Table 3. Scalar Directives

[no]collapse, [NO]COLLAPSE         Controls collapse of the immediately
                                   following loop nest.
concurrent, CONCURRENT             Indicates that no data dependence
                                   exists between array references in
                                   different iterations of the loop that
                                   follows the directive.
[no]interchange, [NO]INTERCHANGE   Specifies whether or not the order of
                                   the following two or more loops
                                   should be interchanged.
suppress, SUPPRESS                 Suppresses optimization determined by
                                   its use with global or local scope.
[no]unroll, [NO]UNROLL             Control unrolling for individual
                                   loops or specifies no unrolling of a
                                   loop.
[no]fusion, [NO]FUSION             Instructs the compiler to attempt/not
                                   attempt loop fusion on the following
                                   loop.
nofission, NOFISSION               Instructs the compiler not to split
                                   the following loop. Fission is
                                   prevented only for the loop level
                                   specified; loops nested within the
                                   specified loop remain candidates for
                                   fission.
NOSIDEEFFECTS                      Declare that a called subprogram does
                                   not redefine selected variables.

                Table 4. Inlining and Cloning Directives

clone_enable, clone_disable,       Control whether cloning is attempted
clone_reset, [NO]CLONE, RESETCLONE over a range of code. The clone_reset
                                   (RESETCLONE) directive returns the
                                   cloning state to the state specified
                                   on the compiler command line.
clone_always, clone_never,         The clone_always (CLONEALWAYS)
CLONEALWAYS, CLONENEVER            directive identifies functions that
                                   the compiler should always attempt to
                                   clone. The clone_never (CLONENEVER)
                                   directive identifies functions that
                                   are never to be cloned.
inline_enable, inline_disable,     The inline_enable (INLINE) directive
inline_reset, [NO]INLINE,          tells the compiler to attempt to
RESETINLINE                        inline functions at call sites. The
                                   inline_disable (NOINLINE) directive
                                   tells the compiler to not inline
                                   functions at call sites. The
                                   inline_reset (RESETINLINE) directive
                                   returns the inlining state to the
                                   state specified on the command line
                                   (-h ipan).
inline_always, inline_never,       The inline_always directive
INLINEALWAYS, INLINENEVER          identifies functions that the
                                   compiler should always attempt to
                                   inline. The inline_never directive
                                   identifies functions that are never
                                   to be inlined.
[NO]MODINLINE                      (ftn only) Enable or disable the
                                   creation of inlineable templates for
                                   specific module procedures.

                        Table 5. PGAS Directives

pgas buffered_async, PGAS          Batch PGAS operations into buffers to
BUFFERED_ASYNC                     be processed in bulk. No ordering
                                   guarantees are made until the next
                                   fence instruction. Fences provide
                                   ordering and visibility guarantees
                                   for BA operations.
pgas defer_sync, PGAS DEFER_SYNC   Defer the synchronization of PGAS
                                   data until the next fence
                                   instruction.

SEE ALSO

autothread(7), blockable(7), blockingsize(7), bounds(7), buffered_async(7), cache(7), cache_nt(7), clone(7), clonealways(7), collapse(7), concurrent(7), copy_assumed_shape(7), defer_sync(7), duplicate(7), free(7), fusion(7), hand_tuned(7), ident(7), ignore_tkr(7), inline(7), inlinealways(7), instantiate(7), interchange(7), ivdep(7), loop_info(7), memory(7), message(7), nextscalar(7), nofission(7), nosideeffects(7), opt(7), optimize(7), pattern(7), permutation(7), pgo_loop_info(7), pipeline(7), prefervector(7), prefetch(7), preprocess(7), probability(7), safe_address(7), safe_conditional(7), same_tbs(7), shortloop(7), stack(7), suppress(7), unroll(7), vector(7), weak(7)

Cray C and C++ Reference Manual

Cray Fortran Reference Manual