The International Conference for High Performance Computing, Networking, Storage and Analysis
Improving Utilization and Application Performance in High-Performance GPGPU Clusters with GPGPU Assemblies
Authors: Alexander Merritt (Georgia Institute of Technology), Naila Farooqui (Georgia Institute of Technology), Vishakha Gupta (Intel Corporation), Magdalena Slawinska (Georgia Institute of Technology), Ada Gavrilovska (Georgia Institute of Technology), Karsten Schwan (Georgia Institute of Technology)
Abstract: High-end computing systems are increasingly characterized by machine configurations whose nodes comprise multiple CPUs and GPGPUs. Using such heterogeneous machines is challenging because applications’ resource needs, and their capacity to exploit those resources, may not match the fixed hardware configuration: (i) codes must be tuned and configured to match the underlying hardware; (ii) schedulers must map parallel jobs so that they use heterogeneous resources efficiently; and (iii) mismatched workloads and machine configurations may leave hardware under-utilized.
Our work introduces ‘GPGPU Assemblies’, a software abstraction enabling the construction and maintenance of logical machine configurations, managed as ‘slices’ of high-performance clusters. A novel method for extracting and applying application profiles enables our ‘Shadowfax’ runtime to manage these abstractions efficiently. On Keeneland, we demonstrate SHOC/S3D scaling up to 111 GPGPUs from a single machine, and improvements in cluster throughput of up to 4.6x and 5x on 48 and 72 nodes, respectively, using configurations of LAMMPS, SHOC/S3D, and NAS-LU.