SC13 Home > SC13 Schedule > SC13 Presentation - Mr. Scan: Extreme Scale Density-Based Clustering Using a Tree-Based Network of GPGPU Nodes

SCHEDULE: NOV 16-22, 2013

When viewing the Technical Program schedule, on the far righthand side is a column labeled "PLANNER." Use this planner to build your own schedule. Once you select an event and want to add it to your personal schedule, just click on the calendar icon of your choice (outlook calendar, ical calendar or google calendar) and that event will be stored there. As you select events in this manner, you will have your own schedule to guide you through the week.

Mr. Scan: Extreme Scale Density-Based Clustering Using a Tree-Based Network of GPGPU Nodes

SESSION: Engineering Scalable Applications

EVENT TYPE: Papers

TIME: 4:30PM - 5:00PM

SESSION CHAIR: Andrew M. Canning

AUTHOR(S):Benjamin R. Welton, Evan H. Samanas, Barton P. Miller

ROOM:401/402/403

ABSTRACT:
Density-based clustering algorithms are a widely-used class of data mining techniques that can find irregularly shaped clusters and cluster data without prior knowledge of the number of clusters it contains. DBSCAN is the most well-known density-based clustering algorithm. We introduce our version of DBSCAN, called Mr. Scan, which uses a hybrid parallel implementation that combines the MRNet tree-based distribution network with GPGPU-equipped nodes. Mr. Scan avoids the problems of existing implementations by effectively partitioning the point space and by optimizing DBSCAN's computation over dense data regions. We tested Mr. Scan on both a geolocated Twitter dataset and image data obtained from the Sloan Digital Sky Survey. At its largest scale, Mr. Scan clustered 6.5 billion points from the Twitter dataset on 8,192 GPU nodes on Cray Titan in 17.3 minutes. All other parallel DBSCAN implementations have only demonstrated the ability to cluster up to 100 million points.

Chair/Author Details:

Andrew M. Canning (Chair) - Lawrence Berkeley National Laboratory

Benjamin R. Welton - University of Wisconsin - Madison

Evan H. Samanas - University of Wisconsin - Madison

Barton P. Miller - University of Wisconsin - Madison

Add to iCal  Click here to download .ics calendar file

Add to Outlook  Click here to download .vcs calendar file

Add to Google Calendarss  Click here to add event to your Google Calendar

The full paper can be found in the ACM Digital Library