Pencil rotation: sdecomp.transpose

This page lists the APIs that perform pencil rotations.

Constructor

construct

Creates a sdecomp_transpose_plan_t structure, which holds all the information needed to perform a pencil rotation, and returns a pointer to it through the plan argument.

include/sdecomp.h
int (* const construct)(
    const sdecomp_info_t * info,
    const sdecomp_pencil_t pencil_bef,
    const sdecomp_pencil_t pencil_aft,
    const size_t * glsizes,
    const size_t size_of_element,
    sdecomp_transpose_plan_t ** plan // out
);
Details

Example: create a transpose plan that rotates an x1pencil to a y1pencil, where the global array size is 256 x 512 x 1024 and each element is of type double:

#define NDIMS 3
const size_t glsizes[NDIMS] = {256, 512, 1024};

sdecomp_transpose_plan_t *plan = NULL;
sdecomp.transpose.construct(
    info,
    SDECOMP_X1PENCIL,
    SDECOMP_Y1PENCIL,
    glsizes,
    sizeof(double),
    &plan
);

Note

If the arguments are invalid (e.g., rotating from SDECOMP_X1PENCIL to SDECOMP_Z1PENCIL, which is not supported in this project), this function returns a non-zero value. It is strongly recommended to check the returned plan to confirm that the desired plan was actually created.
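One way to act on this note is to route every return value through a small checker. The following is a minimal sketch, not part of sdecomp: call_failed is a hypothetical helper, and it assumes only what is stated above, namely that a non-zero return value signals failure.

```c
#include <assert.h>
#include <stdio.h>

// Hypothetical helper (not provided by sdecomp).
// Returns 1 and reports to stderr if retval is non-zero, otherwise returns 0.
static int call_failed(const int retval, const char * name){
    assert(NULL != name);
    if(0 != retval){
        fprintf(stderr, "%s failed (returned %d)\n", name, retval);
        return 1;
    }
    return 0;
}
```

With such a helper, the constructor call can be wrapped as if(call_failed(sdecomp.transpose.construct(...), "construct")){ ... }, and a NULL check on plan can follow as an extra safeguard.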

Destructor

destruct

Destroys a plan created by sdecomp.transpose.construct.

include/sdecomp.h
int (* const destruct)(
    sdecomp_transpose_plan_t * plan
);
Details

Example:

sdecomp.transpose.destruct(plan);

Runner

execute

Executes the pencil rotation described by a plan created by sdecomp.transpose.construct.

include/sdecomp.h
int (* const execute)(
    sdecomp_transpose_plan_t * restrict plan,
    const void * restrict sendbuf,
    void * restrict recvbuf
);
Details

Example: transpose x1pencil to y1pencil:

// global domain size
const size_t glsizes[NDIMS] = {256, 512, 1024};
// you should allocate x1pencil and y1pencil beforehand
size_t mysizes_x1[NDIMS] = {0};
for(sdecomp_dir_t dir = 0; dir < NDIMS; dir++){
    sdecomp.get_pencil_mysize(info, SDECOMP_X1PENCIL, dir, glsizes[dir], mysizes_x1 + dir);
}
size_t mysizes_y1[NDIMS] = {0};
for(sdecomp_dir_t dir = 0; dir < NDIMS; dir++){
    sdecomp.get_pencil_mysize(info, SDECOMP_Y1PENCIL, dir, glsizes[dir], mysizes_y1 + dir);
}
double *x1pencil = calloc(mysizes_x1[0] * mysizes_x1[1] * mysizes_x1[2], sizeof(double));
double *y1pencil = calloc(mysizes_y1[0] * mysizes_y1[1] * mysizes_y1[2], sizeof(double));

// initialise x1pencil:
//
//    for(size_t k = 0; k < mysizes_x1[2]; k++){
//        for(size_t j = 0; j < mysizes_x1[1]; j++){
//            for(size_t i = 0; i < mysizes_x1[0]; i++){
//                const size_t index =
//                    + k * mysizes_x1[0] * mysizes_x1[1]
//                    + j * mysizes_x1[0]
//                    + i;
//                x1pencil[index] = ...;
//            }
//        }
//    }

// plan: the x1pencil-to-y1pencil plan created by sdecomp.transpose.construct
sdecomp.transpose.execute(
    plan,
    x1pencil,
    y1pencil
);

// check y1pencil, now memory is contiguous in y direction
//
//    for(size_t i = 0; i < mysizes_y1[0]; i++){
//        for(size_t k = 0; k < mysizes_y1[2]; k++){
//            for(size_t j = 0; j < mysizes_y1[1]; j++){
//                const size_t index =
//                    + i * mysizes_y1[1] * mysizes_y1[2]
//                    + k * mysizes_y1[1]
//                    + j;
//                ... = y1pencil[index];
//            }
//        }
//    }
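
The index arithmetic in the two commented loops can be tried out without MPI. The following is a minimal serial sketch, assuming a single process so that the local sizes of both pencils equal the global sizes. The names index_x1, index_y1 and reorder_x1_to_y1 are hypothetical helpers, not part of sdecomp, and the sketch only illustrates the two memory orders; it does not show how the library exchanges data between processes.

```c
#include <assert.h>
#include <stddef.h>

#define NDIMS 3

// x1pencil order: i (x) varies fastest, then j (y), then k (z)
static size_t index_x1(const size_t * sizes, const size_t i, const size_t j, const size_t k){
    return k * sizes[0] * sizes[1] + j * sizes[0] + i;
}

// y1pencil order: j (y) varies fastest, then k (z), then i (x)
static size_t index_y1(const size_t * sizes, const size_t i, const size_t j, const size_t k){
    return i * sizes[1] * sizes[2] + k * sizes[1] + j;
}

// Serial stand-in for the rotation: copy src (x1 order) into dst (y1 order).
// Both arrays hold sizes[0] * sizes[1] * sizes[2] elements.
static void reorder_x1_to_y1(const size_t * sizes, const double * src, double * dst){
    assert(NULL != sizes && NULL != src && NULL != dst);
    for(size_t i = 0; i < sizes[0]; i++){
        for(size_t k = 0; k < sizes[2]; k++){
            for(size_t j = 0; j < sizes[1]; j++){
                dst[index_y1(sizes, i, j, k)] = src[index_x1(sizes, i, j, k)];
            }
        }
    }
}
```

After the copy, the element at (i, j, k) is read from dst with index_y1 and from src with index_x1, which mirrors the two loop nests shown in the comments above.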