cray_upc_team_reduce

Date:

09-23-2013

NAME

cray_upc_team_reduce - Cray UPC team reduce collective

SYNOPSIS

#include <upc_collective_cray.h>

cray_upc_return_t cray_upc_team_reduce( shared void *dst, shared void *src,
                                      size_t nelems, cray_upc_dtype_t dtype,
                                      upc_op_t op, int root, cray_upc_team_t team,
                                      upc_handle_t *handle );

IMPLEMENTATION

Cray Linux Environment (CLE)

DESCRIPTION

The cray_upc_team_reduce function collectively performs nelems independent reductions. Each rank contributes one value to each reduction, for nelems values per rank in total. The function is blocking unless a non-NULL pointer is passed as the handle argument, which indicates that the reduction should be non-blocking and will be synced later by the user.

Both the source and destination pointers must have affinity to the calling thread and are treated as if the blocksize were 0. They do not have to be symmetric across threads, and they must not overlap. However, if a NULL shared pointer is passed as the destination, an in-place reduction occurs and the source data on the root is overwritten with the results.

The collective implicitly uses UPC_IN_MYSYNC|UPC_OUT_MYSYNC semantics for memory ordering. The results are undefined if any thread references the source or destination buffers during the operation, or if ranks pass differing values of op, dtype, or root. The root argument must be a valid rank in the team, that is, at least 0 and less than the size of the team.
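The element-wise behavior described above can be sketched in plain C. This is a minimal model only, assuming UPC_ADD on CRAY_UPC_INT data; the function name model_team_reduce_add and the flat row-per-rank layout are hypothetical and are not part of the Cray API. It shows only which values are combined, not how the collective communicates or synchronizes.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical plain-C model, not part of the Cray UPC API.
 * contrib holds nranks rows of nelems ints, one row per rank.
 * out receives nelems sums: out[e] combines element e from every rank,
 * which models what the root would hold after a UPC_ADD reduction.
 */
void model_team_reduce_add(const int *contrib, size_t nranks,
                           size_t nelems, int *out)
{
    assert(nranks > 0);                    /* a team has at least one rank */
    for (size_t e = 0; e < nelems; e++) {
        int acc = contrib[e];              /* rank 0's value for element e */
        for (size_t r = 1; r < nranks; r++)
            acc += contrib[r * nelems + e];
        out[e] = acc;                      /* reduction e is independent of the others */
    }
}
```

With four ranks each contributing the pair 1, -1, this model yields 4 and -4, matching the THREADS and -THREADS results stated in EXAMPLES for a four-thread run.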

RETURN VALUES

The cray_upc_team_reduce function returns CRAY_UPC_SUCCESS if it successfully performed the reduction, or CRAY_UPC_ERROR if an error occurred.

NOTES

The implementation of the non-blocking form of this collective is deferred. Passing anything other than NULL as a handle is an error.

EXAMPLES

In both examples below, thread 0 (the root) receives THREADS as the first element and -THREADS as the second.

Separate source and destination:

#include <upc_collective_cray.h>

shared [2] int src[THREADS][2];
shared [2] int dst[THREADS][2];

src[MYTHREAD][0] = 1;
src[MYTHREAD][1] = -1;
cray_upc_team_reduce( &dst[MYTHREAD], &src[MYTHREAD], 2,
                      CRAY_UPC_INT, UPC_ADD, 0, CRAY_UPC_TEAM_ALL,
                      NULL );

In-place, results written back into source:

src[MYTHREAD][0] = 1;
src[MYTHREAD][1] = -1;
cray_upc_team_reduce( NULL, &src[MYTHREAD], 2, CRAY_UPC_INT,
                      UPC_ADD, 0, CRAY_UPC_TEAM_ALL, NULL );

SEE ALSO

cray_upc_return_t(3c), cray_upc_team_allreduce(3c), cray_upc_team_barrier(3c), cray_upc_team_t(3c), upc_dtype_t(3c), upc_op_t(3c)