Introduction¶
Usage¶
In this section, I will explain how you can incorporate this library into your project.
While it is technically feasible to install the library (see below), I recommend taking the simplest approach by copying the source files located in src/sdecomp and the corresponding header file, include/sdecomp.h, directly into your project directory:
src
├── your_source_file_1.c
├── your_source_file_2.c
└── sdecomp
    ├── construct.c
    ├── destruct.c
    ├── get
    │   ├── main.c
    │   └── process_and_pencil.c
    ├── internal.h
    ├── kernel.c
    ├── main.c
    ├── memory.c
    ├── sanitise.c
    └── transpose
        ├── 2d.c
        ├── 3d.c
        ├── internal.h
        ├── main.c
        └── memory.c
include
├── your_header_file.h
└── sdecomp.h
A typical Makefile would be as follows:
CC     := mpicc
CFLAG  := -std=c99 -Wall -Wextra -Werror -O3
INC    := -Iinclude
LIB    :=
SRCDIR := src
OBJDIR := obj
# quote the pattern so that the shell does not expand it before find sees it
SRC    := $(shell find $(SRCDIR) -type f -name "*.c")
OBJ    := $(patsubst %.c,$(OBJDIR)/%.o,$(SRC))
DEP    := $(patsubst %.c,$(OBJDIR)/%.d,$(SRC))
TARGET := a.out

help:
	@echo "all   : create \"$(TARGET)\""
	@echo "clean : remove \"$(TARGET)\" and object files under \"$(OBJDIR)\""
	@echo "help  : show this message"

all: $(TARGET)

clean:
	$(RM) -r $(OBJDIR) $(TARGET)

$(TARGET): $(OBJ)
	$(CC) $(CFLAG) $^ -o $@ $(LIB)

$(OBJDIR)/%.o: %.c
	@if [ ! -e $(dir $@) ]; then \
		mkdir -p $(dir $@); \
	fi
	$(CC) $(CFLAG) -MMD $(INC) -c $< -o $@

-include $(DEP)

.PHONY : all clean help
Include include/sdecomp.h in your source to use the APIs:
#include "sdecomp.h"
int main(void){
  ...
  return 0;
}
(Optional) Installation
While I suggest the approach above for simplicity, you can also build and install this library on your machine with the install.sh script and link it externally.
First, check out the repository.
To install,
$ bash install.sh install
To uninstall,
$ bash install.sh uninstall
By default, a dynamic library and a header file are installed under ${HOME}/.local.
Remember to set paths and link it properly, e.g. -I${HOME}/.local/include (or set C_INCLUDE_PATH) and -L${HOME}/.local/lib (or set LD_LIBRARY_PATH) with -lsdecomp.
Error check¶
In all APIs, the final parameter serves as a container for the output, while the return value indicates whether an error occurred (0 indicates success, any other value signifies failure).
Although error checks are omitted from the following examples for simplicity, users should always check the return value:
int retval = sdecomp.xxx(...);
if(0 != retval){
  goto err_hndl;
}
I have not provided any specifications regarding the contents of the output parameter in the event of API failure. Consequently, using the parameter in such cases results in undefined behavior.
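The checking pattern can be sketched in a self-contained form as follows. The functions demo_half and demo_quarter are hypothetical stand-ins that only mimic the calling convention described here (output through the final parameter, 0 on success); they are not part of this library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a library call (NOT part of sdecomp):
 * the result is written to the final parameter, and the return value
 * is 0 on success and non-zero on failure. */
int demo_half(const size_t value, size_t *result){
  if(0 != value % 2){
    /* failure: *result is left untouched and must not be used */
    return 1;
  }
  *result = value / 2;
  return 0;
}

/* Calls demo_half twice and jumps to a single error handler on failure.
 * Returns 0 on success; *quarter is only meaningful in that case. */
int demo_quarter(const size_t value, size_t *quarter){
  size_t half = 0;
  int retval = demo_half(value, &half);
  if(0 != retval){
    goto err_hndl;
  }
  retval = demo_half(half, quarter);
  if(0 != retval){
    goto err_hndl;
  }
  return 0;
err_hndl:
  fprintf(stderr, "demo_quarter failed for input %zu\n", value);
  return 1;
}
```

The caller of demo_quarter follows the same rule: it reads the output only after confirming that the return value is 0.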
Data types¶
In MPI, it is customary for arguments to be integers, even when only non-negative numbers are applicable (such as the number of processes).
However, in this library, an effort has been made to minimise the need for input sanitisation by accepting size_t or similar unsigned integer types in the APIs.
It is important to exercise caution and avoid passing an int * where a size_t * is expected, as this would be incorrect if sizeof(int) and sizeof(size_t) are not the same.
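One way to bridge the two conventions is to convert the value rather than cast the pointer. The helper below is a minimal sketch; demo_int_to_size is a hypothetical name and is not provided by this library. It assumes INT_MAX does not exceed SIZE_MAX, which holds on common platforms.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (NOT part of sdecomp): converts a non-negative int,
 * e.g. a process count reported by MPI, to size_t.
 * The result is written to the final parameter; 0 is returned on success.
 * Assumption: INT_MAX <= SIZE_MAX on the target platform. */
int demo_int_to_size(const int value, size_t *result){
  if(value < 0){
    /* negative values have no size_t counterpart */
    return 1;
  }
  *result = (size_t)value;
  return 0;
}
```

The converted value lives in its own size_t variable, whose address can then be passed wherever a size_t * is expected. Casting the address of an int variable to size_t * instead would reinterpret storage of the wrong size.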
Caveats¶
Limited rotations
For three-dimensional domains, there are 6 types of pencils, giving \(6 \times 5 = 30\) patterns to cover all possible rotations. This library, however, only considers the clockwise
\[0 \rightarrow 1 \rightarrow 2 \rightarrow 3 \rightarrow 4 \rightarrow 5 \rightarrow 0\]
and the counter-clockwise
\[0 \rightarrow 5 \rightarrow 4 \rightarrow 3 \rightarrow 2 \rightarrow 1 \rightarrow 0\]
rotations (see below).
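The counting argument and the two cycles can be illustrated with plain modular arithmetic. This is a sketch only: the demo_ names are hypothetical, pencils are represented by the bare indices 0 to 5 used in the cycles above, and nothing here calls the library.

```c
#include <assert.h>
#include <stddef.h>

/* Number of pencil types in three-dimensional domains */
#define DEMO_NPENCILS 6

/* Clockwise neighbour: 0 -> 1 -> 2 -> 3 -> 4 -> 5 -> 0 */
size_t demo_next_clockwise(const size_t pencil){
  return (pencil + 1) % DEMO_NPENCILS;
}

/* Counter-clockwise neighbour: 0 -> 5 -> 4 -> 3 -> 2 -> 1 -> 0 */
size_t demo_next_counter_clockwise(const size_t pencil){
  /* add DEMO_NPENCILS before subtracting to avoid unsigned wrap-around */
  return (pencil + DEMO_NPENCILS - 1) % DEMO_NPENCILS;
}

/* Counts all ordered (from, to) pairs with from != to: 6 x 5 = 30 */
size_t demo_count_all_rotations(void){
  size_t count = 0;
  for(size_t from = 0; from < DEMO_NPENCILS; from++){
    for(size_t to = 0; to < DEMO_NPENCILS; to++){
      if(from != to){
        count++;
      }
    }
  }
  return count;
}
```

Each of the 6 pencils has exactly one clockwise and one counter-clockwise neighbour, so the two cycles cover 12 of the 30 ordered pairs.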
Memory orders
By default (x1pencil), the memory is contiguous in the \(x\) direction, followed by the \(y\) direction, with the largest stride occurring in the \(z\) direction. When the pencils are rotated, both the direction of the pencils and the memory order are altered as follows.

Two-dimensional domains:

| Pencil | Contiguous | Sparse |
|--------|------------|--------|
| x1, x2 | x          | y      |
| y1, y2 | y          | x      |

Three-dimensional domains:

| Pencil | Contiguous | Intermediate | Sparse |
|--------|------------|--------------|--------|
| x1, x2 | x          | y            | z      |
| y1, y2 | y          | z            | x      |
| z1, z2 | z          | x            | y      |

The purpose is to improve cache efficiency for operations such as FFTs in the pencil direction. This rearrangement also simplifies the implementation.
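For three-dimensional domains, the memory orders listed above translate into the following flat-index calculations. This is a sketch under assumptions: the demo_ functions are hypothetical, the sizes nx, ny and nz stand for the local array extents of the pencil in question, and (i, j, k) are the local indices in the \(x\), \(y\) and \(z\) directions.

```c
#include <assert.h>
#include <stddef.h>

/* x pencils: contiguous in x, intermediate in y, largest stride in z */
size_t demo_index_xpencil(
    const size_t nx, const size_t ny,
    const size_t i, const size_t j, const size_t k
){
  return (k * ny + j) * nx + i;
}

/* y pencils: contiguous in y, intermediate in z, largest stride in x */
size_t demo_index_ypencil(
    const size_t ny, const size_t nz,
    const size_t i, const size_t j, const size_t k
){
  return (i * nz + k) * ny + j;
}

/* z pencils: contiguous in z, intermediate in x, largest stride in y */
size_t demo_index_zpencil(
    const size_t nz, const size_t nx,
    const size_t i, const size_t j, const size_t k
){
  return (j * nx + i) * nz + k;
}
```

In each case, incrementing the index in the pencil direction moves to the adjacent memory location, which is what makes one-dimensional operations along the pencil cache-friendly.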
Two pencil types in the same dimension
As shown above, there are two types of x pencils, y pencils, and z pencils. Notice the difference between x1pencil and x2pencil.
Reserved words
Achieving perfect encapsulation in C can be challenging. Functions in this library are implemented in separate files to enhance maintainability, and as a result the internal functions are visible to the linker. To prevent name collisions and ensure consistency, it is advised not to define variables that begin with sdecomp or SDECOMP in your implementation.