SC13 Denver, CO

The International Conference for High Performance Computing, Networking, Storage and Analysis

Building Scalable Data Management and Analysis Infrastructure for Metagenomics


Authors: Wei Tang (Argonne National Laboratory), Jared Wilkening (Argonne National Laboratory), Jared Bischof (Argonne National Laboratory), Wolfgang Gerlach (Argonne National Laboratory), Andreas Wilke (Argonne National Laboratory), Narayan Desai (Argonne National Laboratory), Folker Meyer (Argonne National Laboratory)

Abstract: Next-generation sequencing technology has dramatically reduced the cost of DNA sequencing and shifted the bottleneck in metagenomics from data generation to data analysis. For example, MG-RAST, a free, public metagenome annotation system, is receiving an ever-larger volume of data submitted for analysis, a situation that threatens efficient production operation. To address this, we developed a pair of open-source software products: a data management system named Shock and a workflow management system named AWE. Together, Shock and AWE can be used to build scalable infrastructure for biological sequence data management and analysis.

Poster: pdf
Two-page extended abstract: pdf

