src/basic/accel.F90 · 0eb428c93cea3a264183790dfbbfc521d5b06e96 · HPCSource / octopus · 极狐GitLab

Set threads per block to 256 by default · 0eb428c9

Authored by Sebastian Ohlmann on April 4, 2024

The CUDA best practices guide recommends this value, see
https://docs.nvidia.com/cuda/cuda-c-best-practices-guide/index.html#thread-and-block-heuristics
Also make sure that the limit of threads per block for the corresponding
kernel is respected (though it is unlikely to be smaller than 256).
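
The logic described in the message can be sketched roughly as follows. This is a host-only C++ illustration, not the code from accel.F90: the names `default_threads_per_block`, `choose_block_size`, and `choose_grid_size` are hypothetical, and `kernel_max_threads` stands in for the per-kernel limit that CUDA reports (for example `maxThreadsPerBlock` from `cudaFuncGetAttributes`).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Default from the commit message; all names in this sketch are hypothetical.
constexpr int default_threads_per_block = 256;

// Pick the block size: use the default, but never exceed the limit of the
// kernel being launched. kernel_max_threads is assumed to be the value
// queried from CUDA for that kernel.
int choose_block_size(int kernel_max_threads) {
    assert(kernel_max_threads > 0);
    return std::min(default_threads_per_block, kernel_max_threads);
}

// Number of blocks needed to cover n work items (ceiling division).
std::size_t choose_grid_size(std::size_t n, int block_size) {
    assert(block_size > 0);
    const std::size_t b = static_cast<std::size_t>(block_size);
    return (n + b - 1) / b;
}
```

With a kernel limit of 1024 the sketch keeps the default of 256; only a kernel whose limit is below 256 (say, 128 because of heavy register use) would get a smaller block.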