upc_all_exchange

Date:

05-03-2013

NAME

upc_all_exchange - collectively exchanges shared memory

SYNOPSIS

#include <upc.h>
#include <upc_collective.h>

void upc_all_exchange( shared void * restrict dst, shared const void * restrict src,
                       size_t nbytes, upc_flag_t flags );

IMPLEMENTATION

Cray Linux Environment (CLE)

DESCRIPTION

The upc_all_exchange collective function copies the i-th block of memory from a shared memory area (src) that has affinity to thread j to the j-th block of a shared memory area (dst) that has affinity to thread i. The number of bytes in each block is nbytes, where nbytes > 0.

The upc_all_exchange function treats the src pointer and the dst pointer as if each pointed to a shared memory area of nbytes * THREADS bytes on each thread and therefore had type:

shared [nbytes * THREADS] char[nbytes * THREADS * THREADS]

The targets of the src and dst pointers must have affinity to thread 0. The src and dst pointers are treated as if they have phase 0.
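The layout implied by that type can be checked with a small plain C helper. This is a sketch only, not UPC code and not part of any UPC library: the names block_location and locate_byte are hypothetical, and the thread count is passed as an ordinary parameter instead of using THREADS. Given the index k of a byte in the array, it reports which thread the byte has affinity to, which nbytes-sized block on that thread contains it, and the offset inside that block.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper type (not part of UPC): position of one byte of an
   array laid out as shared [nbytes*THREADS] char[nbytes*THREADS*THREADS]. */
typedef struct {
    size_t thread;  /* thread the byte has affinity to */
    size_t block;   /* index of the nbytes-sized block on that thread */
    size_t offset;  /* byte offset inside that block */
} block_location;

/* Hypothetical helper: map byte index k to its location.
   'threads' stands in for the UPC THREADS value. */
block_location locate_byte(size_t k, size_t nbytes, size_t threads)
{
    size_t per_thread = nbytes * threads;  /* layout block size in bytes */
    block_location loc;

    assert(nbytes > 0 && threads > 0);
    assert(k < per_thread * threads);      /* k must lie inside the array */

    /* The array holds exactly 'threads' layout blocks, so layout block b
       has affinity to thread b. */
    loc.thread = k / per_thread;
    loc.block  = (k % per_thread) / nbytes;
    loc.offset = k % nbytes;
    return loc;
}
```

For example, with nbytes = 8 and 4 threads, each thread holds 32 bytes, so byte 47 falls on thread 1, in block 1, at offset 7.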

For each pair of threads i and j, the effect is equivalent to copying the i-th block of nbytes bytes that has affinity to thread j pointed to by src to the j-th block of nbytes bytes that has affinity to thread i pointed to by dst.
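The copy pattern described above is a transpose of blocks across threads. The following sequential plain C model illustrates the data movement only; it is not UPC code and says nothing about how the library implements the collective or its synchronization. The function name model_all_exchange is hypothetical, ordinary arrays stand in for shared memory, and row t of each array represents the nbytes * threads bytes that have affinity to thread t.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Sequential model only (hypothetical, not a UPC function).
   src and dst each hold 'threads' rows of threads * nbytes bytes.
   Row t stands in for the memory with affinity to thread t.
   src and dst must not overlap, mirroring the restrict qualifiers. */
void model_all_exchange(unsigned char *dst, const unsigned char *src,
                        size_t nbytes, size_t threads)
{
    size_t row = nbytes * threads;  /* bytes per thread */
    size_t i, j;

    assert(nbytes > 0 && threads > 0);

    for (i = 0; i < threads; i++) {
        for (j = 0; j < threads; j++) {
            /* i-th block on thread j of src goes to
               j-th block on thread i of dst */
            memcpy(dst + i * row + j * nbytes,
                   src + j * row + i * nbytes,
                   nbytes);
        }
    }
}
```

With nbytes = 2 and 2 threads, a src buffer holding the bytes 0 through 7 yields a dst buffer of 0, 1, 4, 5, 2, 3, 6, 7: block 1 of thread 0 and block 0 of thread 1 trade places, while the diagonal blocks stay put.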

Controlling Data Synchronization

The flags argument is of type upc_flag_t and specifies the data synchronization semantics for the collective function. The value of flags is formed by combining, with a bitwise OR, a constant of the form UPC_IN_XSYNC and a constant of the form UPC_OUT_YSYNC, where X and Y may be NO, MY, or ALL. If X is:

NO

The function may begin to read or write data when the first thread has entered the collective function call.

MY

The function may begin to read or write only data which has affinity to threads that have entered the collective function call.

ALL

The function may begin to read or write data only after all threads have entered the function call.

And if Y is:

NO

The function may read and write data until the last thread has returned from the collective function call.

MY

The function call may return in a thread only after all reads and writes of data with affinity to the thread are complete.

ALL

The function call may return only after all reads and writes of data are complete.

For further information, see upc_flag_t(3c).

EXAMPLES

Example 1: upc_all_exchange for the static THREADS translation environment.

#include <upc.h>
#include <upc_collective.h>
#define NELEMS 10
shared [NELEMS*THREADS] int A[THREADS][NELEMS*THREADS];
shared [NELEMS*THREADS] int B[THREADS][NELEMS*THREADS];
// Initialize A.
upc_barrier;
upc_all_exchange( B, A, NELEMS * sizeof(int), UPC_IN_NOSYNC | UPC_OUT_NOSYNC );
upc_barrier;

Example 2: upc_all_exchange for the dynamic THREADS translation environment.

#include <upc.h>
#include <upc_collective.h>
#define NELEMS 10
shared int *Adata, *Bdata;
shared [] int *myA, *myB;
int i;
Adata = upc_all_alloc(THREADS*THREADS, NELEMS*sizeof(int));
myA = (shared [] int *)&Adata[MYTHREAD];
Bdata = upc_all_alloc(THREADS*THREADS, NELEMS*sizeof(int));
myB = (shared [] int *)&Bdata[MYTHREAD];

// Adata and Bdata contain THREADS*THREADS*NELEMS elements.
// myA and myB are MYTHREAD's rows of Adata and Bdata, resp.

// Initialize MYTHREAD's row of Adata.
for (i=0; i<NELEMS*THREADS; i++)
    myA[i] = MYTHREAD*10 + i;
upc_all_exchange( Bdata, Adata, NELEMS*sizeof(int),
                     UPC_IN_ALLSYNC | UPC_OUT_ALLSYNC );

SEE ALSO

intro_pgas(7), upc_all_broadcast(3c), upc_all_gather(3c), upc_all_gather_all(3c), upc_all_permute(3c), upc_all_reduce(3c), upc_all_scatter(3c), upc_flag_t(3c)