prefetch
Date: 07-28-2012
NAME
prefetch, PREFETCH - a general directive that instructs the compiler to generate explicit prefetch instructions, which load data from memory into cache before a read or write access (x86 only).
SYNOPSIS
#pragma _CRI prefetch [([lines(num)][, level(num)] [, write][, nt])] var[, var]...
!DIR$ PREFETCH [([lines(num)][, level(num)] [, write][, nt])] var[, var]...
IMPLEMENTATION
Cray Linux Environment (CLE)
DESCRIPTION
The general prefetch directive instructs the compiler to generate explicit prefetch instructions, which load data from memory into cache before a read or write access. The memory location to be prefetched is defined by var, which can be any valid variable, member, or array element reference.
The prefetch directive supports the following options:
- lines(num)
Specifies the number of cache lines to be prefetched. num is an expression that evaluates to an integer constant at compilation time. By default, the number of cache lines prefetched is 1.
- level(num)
Specifies the level of cache into which data is loaded. num is an expression that evaluates to an integer constant at compilation time. The cache level defaults to 1, the level closest to the processing unit. This level specification has little effect for current x86 targets.
- write
Specifies that the prefetch is for data to be written. When data is to be written, a prefetch instruction can move a block into the cache so that the expected store will be to the cache. Prefetch for write generally brings the data into the cache in an exclusive or modified state. By default, the prefetch is for data to be read. If the target architecture does not support prefetch for write, the prefetch will automatically become a prefetch for read.
- nt
Specifies that the prefetch is for non-temporal data. By default, the prefetch is for temporal data. Data with temporal locality (persistence) is expected to be accessed multiple times; non-temporal data is not expected to be reused.
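As an illustration of how the options combine, the following C sketch requests several cache lines at a chosen cache level for a single variable. The function name sum_table and the values lines(4) and level(2) are hypothetical and were picked only for illustration; they are not tuned recommendations. A compiler that does not recognize the pragma will normally ignore it, possibly with a warning.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: sum the first n entries of a table. */
long sum_table(const long *table, size_t n)
{
    long total = 0;
    size_t i;

    assert(table != NULL);

    /* Request 4 cache lines starting at table[0], loaded into level 2 cache.
       Options that are not given keep their defaults: read access,
       temporal data. */
#pragma _CRI prefetch (lines(4), level(2)) table[0]

    for (i = 0; i < n; i++) {
        total += table[i];
    }
    return total;
}
```

Omitting the parenthesized option list entirely would request the defaults for every option: one cache line, level 1, read access, temporal data.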
DISCUSSION
The compiler issues the prefetch instruction when it encounters the prefetch directive. The directive allows the user to influence almost every aspect of prefetch behavior. By default, one cache line is prefetched into L1 cache for read access, and temporal locality is assumed.
The prefetch directive can be used inside and outside of loops, in a loop preamble, or before a function call to reduce cache-miss memory latency.
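A minimal sketch of use outside a loop, placed before a function call, might look like the following. The record type, the helper sum_values, the function process, and the lines(2) value are all hypothetical names and values introduced for illustration. The prefetched location is a member reference, which the DESCRIPTION section lists as a valid form of var.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record type, for illustration only. */
struct record {
    int    id;
    double values[64];
};

/* Hypothetical helper: adds up all values stored in the record. */
static double sum_values(const struct record *r)
{
    double s = 0.0;
    size_t i;

    for (i = 0; i < 64; i++) {
        s += r->values[i];
    }
    return s;
}

double process(const struct record *r)
{
    assert(r != NULL);

    /* Outside any loop: ask for the first two cache lines of the member
       array before the call, so that part of the memory latency may
       overlap with the call itself. */
#pragma _CRI prefetch (lines(2)) r->values[0]

    return sum_values(r);
}
```

Whether this placement helps depends on how much work separates the directive from the first access; the sketch only shows where the directive can be written.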
Optimization can create multiple prefetches to the same cache line; the compiler attempts to avoid such redundant prefetches.
All variables specified on the same prefetch directive line share the same behavior. If different behavior is needed for different variables, use multiple prefetch directive lines.
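The rule about separate directive lines can be sketched as follows: one array is only read and the other is only written, so each gets its own pragma with its own options. The function name scale_copy is hypothetical, and the prefetch distance of 16 elements is an assumption borrowed from Example 2 rather than a tuned value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: dst[i] = 2 * src[i] for i in [0, n). */
void scale_copy(double *restrict dst, const double *restrict src, size_t n)
{
    size_t i;

    assert(dst != NULL && src != NULL);

    for (i = 0; i < n; i++) {
        /* src is read once per element: read prefetch, non-temporal. */
#pragma _CRI prefetch (nt) src[i + 16]
        /* dst is stored to: a separate directive carries the write option. */
#pragma _CRI prefetch (write) dst[i + 16]
        dst[i] = 2.0 * src[i];
    }
}
```

Listing both arrays on one directive line would instead apply a single set of options to both of them.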
The general prefetch directive supersedes the effects of any relevant loop_info [no]prefetch directives and the -h [no]autoprefetch command line option.
The Cray Fortran compiler command line option -x prefetch can be used to disable all general prefetch directives in Fortran source code. The Cray C and C++ compiler command line option -h nopragma=prefetch can be used to disable all general prefetch directives in C and C++ source code.
EXAMPLES
Example 1: PREFETCH directive in Fortran code
      real*8 a(m,n), b(n,p), c(m,p), arow(n)
      ...
      do j = 1, p
!dir$ prefetch (lines(3), nt) arow(1),b(1,j)
        do k = 1, n, 4
!dir$ prefetch (nt) arow(k+24),b(k+24,j)
          c(i,j) = c(i,j) + arow(k) * b(k,j)
          c(i,j) = c(i,j) + arow(k+1) * b(k+1,j)
          c(i,j) = c(i,j) + arow(k+2) * b(k+2,j)
          c(i,j) = c(i,j) + arow(k+3) * b(k+3,j)
        enddo
      enddo
Example 2: prefetch pragma in C code
void
add( long * restrict a, long * restrict b, const int n )
{
    int i;

#pragma _CRI prefetch (lines(2)) b[0]
    for ( i = 0; i < n; i++ ) {
#pragma _CRI prefetch b[i+16]
        a[i] += b[i];
    }
    return;
}
SEE ALSO
intro_directives(7)
loop_info(7)
craycc(1)
crayftn(1)