The International Conference for High Performance Computing, Networking, Storage and Analysis
A Distributed-Memory Fast Multipole Method for Volume Potentials
Student: Dhairya Malhotra (University of Texas, Austin)
Supervisor: George Biros (University of Texas, Austin)
Abstract: We present a Fast Multipole Method (FMM) for computing volume potentials and use them to construct spatially adaptive solvers for the Poisson, Stokes, and Helmholtz problems. Conventional N-body methods apply to discrete particle interactions; with volume potentials, the sums over particles are replaced by volume integrals. We present new near interaction traversals and incorporate a cache optimized for interaction traversal. Finally, we use vectorization, including the AVX instruction set on the Intel Sandy Bridge architecture, to get over 50% of peak floating-point performance. We use task parallelism to employ the Xeon Phi on the Stampede platform at the Texas Advanced Computing Center (TACC). We achieve over 595 Gflop/s of double-precision performance on a single node. Our largest run, on Titan at ORNL, took 7.8 seconds on 16K nodes for a problem with 74 billion unknowns and a highly nonuniform particle distribution.
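The step from particle sums to volume integrals can be illustrated with a minimal sketch. It assumes the free-space 3D Laplace kernel G(x, y) = 1/(4π|x − y|), a unit-cube domain, and a plain midpoint quadrature rule; none of these choices is taken from the abstract, and the function names are hypothetical. Both routines evaluate the potential directly, at quadratic cost, so the sketch shows only the quantity an FMM would accelerate, not the FMM itself.

```python
import math
from itertools import product


def laplace_kernel(x, y):
    """Free-space 3D Laplace kernel G(x, y) = 1 / (4*pi*|x - y|) (assumed kernel)."""
    return 1.0 / (4.0 * math.pi * math.dist(x, y))


def particle_potential(target, sources, charges):
    """Discrete N-body sum: u(x) = sum_j G(x, y_j) * q_j."""
    return sum(q * laplace_kernel(target, y) for y, q in zip(sources, charges))


def volume_potential(target, density, n=8):
    """Midpoint-rule approximation of u(x) = integral over [0,1]^3 of G(x, y) f(y) dy.

    The cube is split into n^3 cells; each cell contributes the kernel at its
    center times the density there times the cell volume. The target must lie
    outside the cube (or at least off the cell centers) to avoid the singularity.
    """
    h = 1.0 / n
    total = 0.0
    for i, j, k in product(range(n), repeat=3):
        y = ((i + 0.5) * h, (j + 0.5) * h, (k + 0.5) * h)
        total += laplace_kernel(target, y) * density(y) * h ** 3
    return total


if __name__ == "__main__":
    # A uniform unit density seen from far away behaves like a point charge
    # of total mass 1 placed at the cube center (0.5, 0.5, 0.5).
    far_target = (10.0, 0.5, 0.5)
    print(volume_potential(far_target, lambda y: 1.0))
    print(particle_potential(far_target, [(0.5, 0.5, 0.5)], [1.0]))
```

Written this way, the volume integral reduces to a particle sum whose sources are quadrature nodes and whose charges are density values times quadrature weights, which is why N-body machinery carries over to volume potentials.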