Pencil rotation: sdecomp.transpose
This page lists the APIs that perform pencil rotations.
Constructor
construct
Creates a sdecomp_transpose_plan_t structure, which holds all the information needed to perform pencil rotations, and returns a pointer to it through the plan argument.

    int (* const construct)(
        const sdecomp_info_t * info,
        const sdecomp_pencil_t pencil_bef,
        const sdecomp_pencil_t pencil_aft,
        const size_t * glsizes,
        const size_t size_of_element,
        sdecomp_transpose_plan_t ** plan // out
    );
Example: create a transpose plan to rotate x1pencil to y1pencil, where the global array size is 256 x 512 x 1024 and each element is of type double:

    #define NDIMS 3
    const size_t glsizes[NDIMS] = {256, 512, 1024};
    sdecomp_transpose_plan_t *plan = NULL;
    sdecomp.transpose.construct(
        info,
        SDECOMP_X1PENCIL,
        SDECOMP_Y1PENCIL,
        glsizes,
        sizeof(double),
        &plan
    );

Note
If the arguments are invalid (e.g., trying to rotate from SDECOMP_X1PENCIL to SDECOMP_Z1PENCIL, which is not supported in this project), this function returns a non-zero value. It is strongly recommended to check the returned plan to confirm that the desired plan was actually created.
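The check recommended in the note can be sketched as follows. The helper plan_is_usable is hypothetical and not part of the library. It assumes that a return value of zero means success, which the note implies but does not state, and that the caller initialises plan to NULL before calling the constructor.

```c
#include <stddef.h>
#include <stdio.h>

// Hypothetical helper (not part of the library): decides whether a plan
// can be used, given the return value of the constructor and the pointer
// which the constructor wrote.
// Assumption: a return value of 0 means success.
static int plan_is_usable(const int retval, const void * plan){
  if(0 != retval){
    fprintf(stderr, "construct failed with return value %d\n", retval);
    return 0;
  }
  if(NULL == plan){
    fputs("construct returned 0 but the plan is still NULL\n", stderr);
    return 0;
  }
  return 1;
}

// Intended use, with info and glsizes prepared as in the example above:
//
//   sdecomp_transpose_plan_t *plan = NULL;
//   const int retval = sdecomp.transpose.construct(
//       info,
//       SDECOMP_X1PENCIL,
//       SDECOMP_Y1PENCIL,
//       glsizes,
//       sizeof(double),
//       &plan
//   );
//   if(!plan_is_usable(retval, plan)){
//     // stop here, since the plan must not be passed to execute
//   }
```

Checking both values keeps the caller safe even if only one of them signals the failure.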
Destructor
destruct
Destroys a plan created by sdecomp.transpose.construct.

    int (* const destruct)(
        sdecomp_transpose_plan_t * plan
    );

Example:

    sdecomp.transpose.destruct(plan);
Runner
execute
Executes a pencil rotation based on the plan created by sdecomp.transpose.construct.

    int (* const execute)(
        sdecomp_transpose_plan_t * restrict plan,
        const void * restrict sendbuf,
        void * restrict recvbuf
    );
Example: transpose x1pencil to y1pencil:

    // global domain size
    const size_t glsizes[NDIMS] = {256, 512, 1024};
    // you should allocate x1pencil and y1pencil beforehand
    size_t mysizes_x1[NDIMS] = {0};
    for(sdecomp_dir_t dir = 0; dir < NDIMS; dir++){
      sdecomp.get_pencil_mysize(info, SDECOMP_X1PENCIL, dir, glsizes[dir], mysizes_x1 + dir);
    }
    size_t mysizes_y1[NDIMS] = {0};
    for(sdecomp_dir_t dir = 0; dir < NDIMS; dir++){
      sdecomp.get_pencil_mysize(info, SDECOMP_Y1PENCIL, dir, glsizes[dir], mysizes_y1 + dir);
    }
    double *x1pencil = calloc(mysizes_x1[0] * mysizes_x1[1] * mysizes_x1[2], sizeof(double));
    double *y1pencil = calloc(mysizes_y1[0] * mysizes_y1[1] * mysizes_y1[2], sizeof(double));
    // initialise x1pencil:
    //
    // for(size_t k = 0; k < mysizes_x1[2]; k++){
    //   for(size_t j = 0; j < mysizes_x1[1]; j++){
    //     for(size_t i = 0; i < mysizes_x1[0]; i++){
    //       const size_t index =
    //         + k * mysizes_x1[0] * mysizes_x1[1]
    //         + j * mysizes_x1[0]
    //         + i;
    //       x1pencil[index] = ...;
    //     }
    //   }
    // }
    // the first argument is the plan created by sdecomp.transpose.construct
    sdecomp.transpose.execute(
        plan,
        x1pencil,
        y1pencil
    );
    // check y1pencil, now memory is contiguous in y direction
    //
    // for(size_t i = 0; i < mysizes_y1[0]; i++){
    //   for(size_t k = 0; k < mysizes_y1[2]; k++){
    //     for(size_t j = 0; j < mysizes_y1[1]; j++){
    //       const size_t index =
    //         + i * mysizes_y1[1] * mysizes_y1[2]
    //         + k * mysizes_y1[1]
    //         + j;
    //       ... = y1pencil[index];
    //     }
    //   }
    // }
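To make the change of memory layout concrete, the following is a minimal serial sketch of what the rotation does to the element order. It assumes a single process, so each pencil holds the whole array, and it does not call the library. The helper rotate_x1_to_y1_serial is hypothetical and only applies the two index formulas from the example above.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical serial stand-in for an x1 -> y1 pencil rotation
// (not part of the library, no communication involved).
// sizes[0], sizes[1] and sizes[2] are the extents in x, y and z.
static void rotate_x1_to_y1_serial(
    const size_t sizes[3],
    const double * restrict x1pencil,
    double * restrict y1pencil
){
  assert(NULL != x1pencil && NULL != y1pencil);
  const size_t nx = sizes[0];
  const size_t ny = sizes[1];
  const size_t nz = sizes[2];
  for(size_t k = 0; k < nz; k++){
    for(size_t j = 0; j < ny; j++){
      for(size_t i = 0; i < nx; i++){
        // x1pencil: x is the fastest index, followed by y and z
        const size_t index_x1 = k * nx * ny + j * nx + i;
        // y1pencil: y is the fastest index, followed by z and x
        const size_t index_y1 = i * ny * nz + k * ny + j;
        y1pencil[index_y1] = x1pencil[index_x1];
      }
    }
  }
}
```

In a parallel run the local sizes of the two pencils differ and the data are exchanged between processes, so this sketch illustrates only the ordering of the elements in memory.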