Towards Robust Algorithms for Current Deposition and Dynamic Load-Balancing in a GPU Particle In Cell Code

Rossi, Francesco; Londrillo, Pasquale; Sgattoni, A.; Sinigardi, Stefano; Turchetti, Giorgio

doi:10.1063/1.4773692

We present 'jasmine', an implementation of a fully relativistic, 3D, electromagnetic Particle-In-Cell (PIC) code capable of running simulations in various laser-plasma acceleration regimes on HPC clusters of Graphics Processing Units (GPUs). Standard energy/charge-preserving FDTD-based algorithms have been implemented using double precision and quadratic (or arbitrarily sized) shape functions for the particle weighting. When porting a PIC scheme to the GPU architecture (or, in general, to a shared-memory environment), the particle-to-grid operations (e.g. the evaluation of the current density) require special care to avoid memory inconsistencies and conflicts. Here we present a robust implementation of this operation that is efficient for any number of particles per cell and any particle shape function order. Our algorithm exploits the exposed GPU memory hierarchy and avoids the use of atomic operations, which can hurt performance, especially when many particles lie in the same cell. We show multi-GPU scalability results for the code and present a dynamic load-balancing algorithm. The code is written using a Python-based C++ meta-programming technique, which translates into a high level of modularity and allows for easy performance tuning and simple extension of the core algorithms to various simulation schemes.
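The abstract does not spell out the deposition algorithm, so the following is only a minimal serial sketch of one common way to avoid write conflicts in a particle-to-grid step, not the method used in 'jasmine'. It assumes a 1D periodic grid, positions given in cell units, and a quadratic (three-node) shape function. Particles are binned by tile, each tile accumulates into a private buffer with a one-node halo (standing in for GPU shared memory), and the buffers are then added to the global grid. The names `quadratic_weights`, `deposit_tiled` and `tile_size` are hypothetical.

```python
import math
from collections import defaultdict


def quadratic_weights(x):
    """Return the nearest node index and the three quadratic-spline weights
    for a particle at position x (in cell units)."""
    i = math.floor(x + 0.5)          # nearest grid node
    d = x - i                        # offset from that node, in [-0.5, 0.5)
    return i, (0.5 * (0.5 - d) ** 2, 0.75 - d * d, 0.5 * (0.5 + d) ** 2)


def deposit_tiled(positions, values, n_nodes, tile_size=4):
    """Deposit per-particle values (e.g. q*v) on a periodic 1D grid.

    Assumption: n_nodes is a multiple of tile_size.
    """
    assert n_nodes % tile_size == 0

    # Step 1: bin particles by the tile that owns their nearest node
    # (a stand-in for sorting particles by cell on the device).
    tiles = defaultdict(list)
    for x, v in zip(positions, values):
        i, _ = quadratic_weights(x)
        tiles[(i % n_nodes) // tile_size].append((x, v))

    grid = [0.0] * n_nodes
    for t, plist in tiles.items():
        # Step 2: accumulate into a private buffer with one halo node per side,
        # so no two tiles ever write to the same memory during deposition.
        local = [0.0] * (tile_size + 2)
        for x, v in plist:
            i, w = quadratic_weights(x)
            offset = (i % n_nodes) - t * tile_size + 1   # 1 .. tile_size
            for k in (-1, 0, 1):
                local[offset + k] += w[k + 1] * v

        # Step 3: add the buffer to the global grid. Serially this is trivially
        # safe; on a GPU the halo nodes shared by neighbouring tiles would need
        # an ordered or separate reduction pass.
        base = t * tile_size - 1
        for j, contribution in enumerate(local):
            grid[(base + j) % n_nodes] += contribution
    return grid
```

Because the three weights sum to one, the total deposited on the grid equals the sum of the per-particle values, which gives a quick sanity check for any variant of this scheme.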

F. Rossi, P. Londrillo, A. Sgattoni, S. Sinigardi, G. Turchetti (2012). Towards Robust Algorithms for Current Deposition and Dynamic Load-Balancing in a GPU Particle In Cell Code [10.1063/1.4773692].