SC13 Presentation

SCHEDULE: NOV 16-22, 2013


Using Simulation to Explore Distributed Key-Value Stores for Extreme-Scale System Services

SESSION: Fault-Tolerant Computing


TIME: 2:30PM - 3:00PM


AUTHOR(S): Ke Wang, Abhishek Kulkarni, Michael Lang, Dorian Arnold, Ioan Raicu


Owing to the significantly high rate of component failures at extreme scales, system services will need to be failure-resistant, adaptive, and self-healing. A majority of HPC services are still designed around a centralized paradigm and hence are susceptible to scaling issues. Peer-to-peer services have proved themselves at scale for wide-area internet workloads. Distributed key-value stores (KVS) are widely used as a building block for these services, but are not prevalent in HPC services. In this paper, we simulate KVS for various service architectures and examine the design trade-offs as applied to HPC service workloads to support extreme-scale systems. The simulator is validated against existing distributed KVS-based services. Via simulation, we demonstrate how failure, replication, and consistency models affect performance at scale. Finally, we emphasize the general applicability of KVS to HPC services by feeding real HPC service workloads into the simulator and presenting a KVS-based distributed job launch prototype.
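
As a rough illustration of the kind of question such a simulation can ask, the sketch below estimates what fraction of keys stays readable when some nodes fail, for a given replication factor. It is not the authors' simulator: the hash-based placement, the successor-replica scheme, the static membership, and every name in it (home_node, replica_nodes, readable_fraction) are assumptions made for this example only.

```python
import hashlib
import random


def home_node(key: str, num_nodes: int) -> int:
    """Map a key to a node index with a stable hash (static membership assumed)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


def replica_nodes(key: str, num_nodes: int, replicas: int) -> list:
    """Place copies on the home node and its successors in index order.

    Assumes 1 <= replicas <= num_nodes (hypothetical placement scheme).
    """
    start = home_node(key, num_nodes)
    return [(start + i) % num_nodes for i in range(replicas)]


def readable_fraction(num_nodes: int, num_keys: int, replicas: int,
                      failed_fraction: float, seed: int = 0) -> float:
    """Fraction of keys with at least one replica on a surviving node."""
    rng = random.Random(seed)  # seeded so repeated runs are comparable
    num_failed = int(num_nodes * failed_fraction)
    failed = set(rng.sample(range(num_nodes), num_failed))
    readable = 0
    for i in range(num_keys):
        nodes = replica_nodes(f"key-{i}", num_nodes, replicas)
        if any(node not in failed for node in nodes):
            readable += 1
    return readable / num_keys


if __name__ == "__main__":
    # Compare replication factors under the same simulated failure set.
    for r in (1, 2, 3):
        frac = readable_fraction(num_nodes=1024, num_keys=10000,
                                 replicas=r, failed_fraction=0.1)
        print(f"replicas={r} readable={frac:.4f}")
```

Because the seed fixes the set of failed nodes and each larger replica set contains the smaller one, the readable fraction in this toy model cannot decrease as the replication factor grows. It says nothing about latency or consistency cost, which the paper studies through its own simulator.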

Chair/Author Details:

Pavan Balaji (Chair) - Argonne National Laboratory

Ke Wang - Illinois Institute of Technology

Abhishek Kulkarni - Indiana University

Michael Lang - Los Alamos National Laboratory

Dorian Arnold - University of New Mexico

Ioan Raicu - Illinois Institute of Technology


The full paper can be found in the ACM Digital Library