P-SLURM: Power-Aware Job Scheduling on a Hardware-Overprovisioned HPC Cluster

While traditional supercomputers are worst-case provisioned and guarantee peak power to every node, next-generation supercomputers will require hardware overprovisioning and intelligent power scheduling. Our results demonstrate that, on an overprovisioned cluster, choosing the optimal configuration (number of nodes, number of cores per node, power per node) based on individual application characteristics can improve performance under a power bound by up to 62% compared to worst-case provisioning. In this work, we propose P-SLURM, a power-aware resource manager for an overprovisioned cluster. P-SLURM takes job-level power bounds into account to implement policies that choose job configurations based on metrics such as average turnaround time and utilization. In addition, it facilitates reallocation of power at runtime as jobs enter and leave the system. We extend SLURM's EASY backfilling algorithm with seven power-aware policies and present early results obtained using job traces gathered from Lawrence Livermore National Laboratory.
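
As an illustration of configuration selection under a job-level power bound, the following minimal sketch picks the fastest configuration whose total power draw (nodes times power per node) fits within the bound. It is not P-SLURM's implementation: the Config type, the best_config helper, and all profile numbers are hypothetical and serve only to make the idea concrete.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Config:
    """One candidate job configuration (hypothetical type for illustration)."""
    nodes: int
    cores_per_node: int
    watts_per_node: float
    runtime_s: float  # assumed to come from prior profiling of the application

    @property
    def total_watts(self) -> float:
        # Total power the job would be allocated under this configuration.
        return self.nodes * self.watts_per_node


def best_config(configs: Iterable[Config], power_bound_w: float) -> Optional[Config]:
    """Return the fastest configuration that fits the power bound, or None."""
    feasible = [c for c in configs if c.total_watts <= power_bound_w]
    return min(feasible, key=lambda c: c.runtime_s, default=None)


# Made-up profile for a single application; the numbers are not measurements.
profile = [
    Config(nodes=8, cores_per_node=16, watts_per_node=115.0, runtime_s=100.0),
    Config(nodes=12, cores_per_node=12, watts_per_node=75.0, runtime_s=80.0),
    Config(nodes=16, cores_per_node=16, watts_per_node=65.0, runtime_s=90.0),
]

choice = best_config(profile, power_bound_w=1000.0)
```

With these made-up numbers, the 16-node configuration exceeds the 1000 W bound, so the sketch selects the 12-node configuration, which runs more nodes at lower per-node power than the 8-node, full-power alternative. A scheduler policy could apply the same feasibility filter with a different objective, such as turnaround time or utilization.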

