SCHEDULE: NOV 16-22, 2013

Scalable Parallel OPTICS Data Clustering Using Graph Algorithmic Techniques

SESSION: Graph Partitioning and Data Clustering

EVENT TYPE: Papers

TIME: 10:30AM - 11:00AM

SESSION CHAIR: Chung-Hsing Hsu

AUTHOR(S): Md. Mostofa Ali Patwary, Diana Palsetia, Ankit Agrawal, Wei-keng Liao, Fredrik Manne, Alok Choudhary

ROOM: 401/402/403

ABSTRACT:
OPTICS is a hierarchical density-based data clustering algorithm that discovers arbitrarily shaped clusters and eliminates noise using adjustable reachability distance thresholds. Parallelizing OPTICS is challenging because the algorithm exhibits a strongly sequential data access order. We present a scalable parallel OPTICS algorithm (POPTICS) designed using graph algorithmic concepts. To break the data access sequentiality, POPTICS exploits the similarities between the OPTICS algorithm and Prim's Minimum Spanning Tree algorithm. Additionally, we use the disjoint-set data structure to achieve high parallelism for distributed cluster extraction. Using high-dimensional datasets containing up to a billion floating-point numbers, we show scalable speedups of up to 27.5 for our OpenMP implementation on a 40-core shared-memory machine, and up to 3,008 for our MPI implementation on a 4,096-core distributed-memory machine. We also show that the quality of the results given by POPTICS is comparable to that given by the classical OPTICS algorithm.
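
The two graph ideas named in the abstract, Prim's Minimum Spanning Tree algorithm and the disjoint-set data structure, can be illustrated with a small sequential sketch. This is not the POPTICS implementation: it assumes plain Euclidean distance in place of OPTICS reachability distances, ignores core distances and the minimum-points parameter, and runs on a single thread. The names `prim_mst`, `DisjointSet`, `extract_clusters`, and the threshold `eps` are hypothetical and chosen only for illustration.

```python
import heapq
import math


def prim_mst(points):
    """Return MST edges (weight, u, v) over the complete Euclidean graph.

    Assumption: Euclidean distance stands in for reachability distance.
    """
    n = len(points)
    if n == 0:
        return []
    visited = [False] * n
    visited[0] = True
    # Candidate edges leaving the starting vertex 0.
    heap = [(math.dist(points[0], points[j]), 0, j) for j in range(1, n)]
    heapq.heapify(heap)
    edges = []
    while heap and len(edges) < n - 1:
        w, u, v = heapq.heappop(heap)
        if visited[v]:
            continue  # stale entry: v was already reached by a cheaper edge
        visited[v] = True
        edges.append((w, u, v))
        for j in range(n):
            if not visited[j]:
                heapq.heappush(heap, (math.dist(points[v], points[j]), v, j))
    return edges


class DisjointSet:
    """Union-find with path halving."""

    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra


def extract_clusters(points, eps):
    """Group points connected by MST edges no longer than eps."""
    ds = DisjointSet(len(points))
    for w, u, v in prim_mst(points):
        if w <= eps:
            ds.union(u, v)
    groups = {}
    for i in range(len(points)):
        groups.setdefault(ds.find(i), []).append(i)
    return sorted(groups.values())


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (50, 50)]
clusters = extract_clusters(points, eps=2.0)
```

Changing `eps` and rerunning only the union step over the same tree edges yields a different grouping without recomputing distances, which loosely mirrors the adjustable thresholds mentioned in the abstract.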

Chair/Author Details:

Chung-Hsing Hsu (Chair) - Oak Ridge National Laboratory

Md. Mostofa Ali Patwary - Northwestern University

Diana Palsetia - Northwestern University

Ankit Agrawal - Northwestern University

Wei-keng Liao - Northwestern University

Fredrik Manne - University of Bergen

Alok Choudhary - Northwestern University

The full paper can be found in the ACM Digital Library