Sequential Optimization and Shared and Distributed Memory Parallelization in Clusters: N-Body/Particle Simulation

Fernando G. Tinetti and Sergio M. Martin


High Performance Computing, Source Code Optimization, Parallel Computing, Cluster Computing, N-Body/Particle Simulation


The particle-particle method for N-Body problems is one of the most commonly used methods in computer-driven physics simulation. These algorithms are, in general, very simple to design and code, and highly parallelizable. In this article, we present the most important approaches for applying three areas of performance improvement to these algorithms when they run on high performance computing (HPC) clusters: 1) sequential optimization (a single core in one node of the cluster), 2) shared memory parallelism (a single node with multiple CPUs, as in a multiprocessor), and 3) distributed memory parallelism (the whole cluster). For each area we present the techniques employed and the performance gain obtained. We also show that some sequential (classical) code optimizations are almost essential for obtaining acceptable parallel performance and scalability.
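
As a rough illustration of why the particle-particle method is simple to code and lends itself to parallelization, the following is a minimal sketch of the direct-sum kernel. It is not the authors' implementation: the function name `accelerations`, the gravitational force law, the constant `G`, and the softening term `eps` are all assumptions made for this example.

```python
import math

def accelerations(pos, mass, G=1.0, eps=0.0):
    """Direct-sum (particle-particle) accelerations, O(N^2) pair interactions.

    pos  : list of (x, y, z) tuples, one per particle (hypothetical layout)
    mass : list of particle masses
    G    : gravitational constant (assumed 1.0 in simulation units)
    eps  : softening length (assumption; 0.0 disables softening)
    """
    n = len(pos)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    # Each iteration of the outer loop writes only acc[i], so iterations
    # are independent of each other and could be split across cores or nodes.
    for i in range(n):
        xi, yi, zi = pos[i]
        ax = ay = az = 0.0
        for j in range(n):
            if i == j:
                continue  # a particle exerts no force on itself
            dx = pos[j][0] - xi
            dy = pos[j][1] - yi
            dz = pos[j][2] - zi
            r2 = dx * dx + dy * dy + dz * dz + eps * eps
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            s = G * mass[j] * inv_r3
            ax += s * dx
            ay += s * dy
            az += s * dz
        acc[i] = [ax, ay, az]
    return acc
```

Calling `accelerations([(0, 0, 0), (1, 0, 0)], [1.0, 2.0])` gives each particle an acceleration pointing toward the other. Because the outer loop iterations do not depend on one another, the same structure can in principle be divided among threads within a node or among processes across a cluster.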
