Created by Sebastian Ohlmann
Packing data into the send buffer and unpacking it from the receive buffer are both done by a kernel that implements a general reordering according to an index map. As before, the implementation supports both regular MPI and CUDA-aware MPI.
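To make the map-based reordering concrete, here is a minimal CPU reference sketch in C++ of what a pack and an unpack step could compute. It is not the actual kernel: the names `pack`, `unpack`, `send_map`, and `recv_map` are hypothetical, and the element type `double` is an assumption. In a GPU kernel, each loop iteration would typically be handled by one thread.

```cpp
#include <cstddef>
#include <vector>

// Pack (gather): buffer[i] = field[send_map[i]].
// Collects scattered field entries into a contiguous send buffer.
// Hypothetical helper; on a GPU, index i would map to one thread.
std::vector<double> pack(const std::vector<double>& field,
                         const std::vector<std::size_t>& send_map) {
    std::vector<double> buffer(send_map.size());
    for (std::size_t i = 0; i < send_map.size(); ++i) {
        buffer[i] = field[send_map[i]];
    }
    return buffer;
}

// Unpack (scatter): field[recv_map[i]] = buffer[i].
// Writes the contiguous receive buffer back to its target positions.
// Assumes recv_map contains no duplicate indices.
void unpack(std::vector<double>& field,
            const std::vector<double>& buffer,
            const std::vector<std::size_t>& recv_map) {
    for (std::size_t i = 0; i < recv_map.size(); ++i) {
        field[recv_map[i]] = buffer[i];
    }
}
```

With CUDA-aware MPI, the packed buffer can stay in device memory and its pointer can be passed to the MPI call directly. With regular MPI, the buffer has to be copied to host memory before sending and back to the device after receiving.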