SC13 SCHEDULE: NOV 16-22, 2013

Enabling Comprehensive Data-Driven System Management for Large Computational Facilities

SESSION: Improving Large-Scale Computation and Data Resources

EVENT TYPE: Papers

TIME: 4:00PM - 4:30PM

SESSION CHAIR: Robert A. Ballance

AUTHOR(S): James C. Browne, Robert L. DeLeon, Charng-Da Lu, Thomas R. Furlani, Matthew D. Jones, Steven M. Gallo, Abani K. Patra, William L. Barth, John Hammond

ROOM: 205/207

ABSTRACT:
This paper presents a comprehensive system that meets the information requirements of all stakeholders (users, application developers, consultants, systems administrators and system management) of large cluster computers. Accounting, scheduler and event logs are integrated with data from a new measurement tool called TACC_Stats. TACC_Stats periodically records resource use, including many hardware counters, on each node executing each job and, by aggregation, provides system-level metrics. Analysis of this data by the XDMOD reporting system generates many analyses and reports that have not previously been available for open-source, Linux-based software systems. This paper systematically identifies the information each stakeholder class needs to carry out its role effectively, generates example reports and discusses their value propositions. We believe this to be the first fully comprehensive system for supporting the information needs of all stakeholders of cluster systems based on open-source software.

Chair/Author Details:

Robert A. Ballance (Chair) - Sandia National Laboratories

James C. Browne - University of Texas at Austin

Robert L. DeLeon - SUNY at Buffalo

Charng-Da Lu - SUNY at Buffalo

Thomas R. Furlani - SUNY at Buffalo

Matthew D. Jones - SUNY at Buffalo

Steven M. Gallo - SUNY at Buffalo

Abani K. Patra - SUNY at Buffalo

William L. Barth - University of Texas at Austin

John Hammond - Intel Corporation

The full paper can be found in the ACM Digital Library.