The International Conference for High Performance Computing, Networking, Storage and Analysis
Optimizing the Barnes-Hut Algorithm for Multicore Clusters
Authors: Junchao Zhang (University of Illinois at Urbana-Champaign), Babak Behzad (University of Illinois at Urbana-Champaign), Marc Snir (University of Illinois at Urbana-Champaign)
Abstract: Today's supercomputers are multicore clusters with one-sided communication support, so it is important to explore new programming models for them. Partitioned global address space (PGAS) languages have drawn much attention in recent years, but there are few studies of how applications map onto them. In this paper, using the Barnes-Hut algorithm as a case study, we demonstrate a PGAS + X programming paradigm that specifically targets multicore clusters.
The novelty is that we integrate intranode multithreading with internode one-sided communication. We decompose the computation into tasks and hide network latency by descheduling tasks that are blocked on a remote access and reusing the core to run another, ready-to-execute task. We spawn multiple threads per process, each of which performs either communication or computation. We show how to manage tasks and synchronize threads. We are in the process of designing runtime abstractions.
We present the details of the design, particularly the task management, and use experimental results to show the benefits gained.
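The latency-hiding idea described in the abstract (deschedule a task that blocks on a remote access, and reuse the core for a ready task) can be illustrated with a minimal sketch. This is not the authors' runtime: Python generators stand in for tasks, a dictionary (`REMOTE_MEMORY`) stands in for data owned by another node, a `time.sleep` call stands in for network latency, and all names (`force_task`, `comm_thread`, `run`) are hypothetical.

```python
import queue
import threading
import time

# Hypothetical stand-in for data owned by another node.
REMOTE_MEMORY = {"cell_a": 1.0, "cell_b": 2.0, "cell_c": 3.0}


def force_task(key, results):
    """A task that needs one remote value; the yield marks the blocking point."""
    value = yield key              # descheduled here until the value arrives
    results[key] = value * 10.0    # local computation on the fetched value


def comm_thread(requests, ready):
    """Communication thread: serves simulated one-sided gets."""
    while True:
        item = requests.get()
        if item is None:           # shutdown signal
            return
        task, key = item
        time.sleep(0.01)           # simulated network latency (assumption)
        ready.put((task, REMOTE_MEMORY[key]))


def run(tasks):
    """Compute loop: resume ready tasks and park the ones that block."""
    requests, ready = queue.Queue(), queue.Queue()
    comm = threading.Thread(target=comm_thread, args=(requests, ready))
    comm.start()
    for task in tasks:
        ready.put((task, None))    # sending None starts a fresh generator
    pending = len(tasks)
    while pending:
        task, value = ready.get()
        try:
            key = task.send(value)      # run until the next remote access
            requests.put((task, key))   # park the task; the core is free again
        except StopIteration:
            pending -= 1                # task ran to completion
    requests.put(None)
    comm.join()
```

Calling `run([force_task(k, results) for k in REMOTE_MEMORY])` with an empty `results` dictionary fills it with one entry per key. In this sketch a single compute loop and a single communication thread share two queues; a real runtime would use several of each and a genuine one-sided get in place of the dictionary lookup.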