Next-Generation Sequencing: HPC Challenges and Cloud-Based Solutions

Pharma companies face unprecedented amounts of genetic data coming out of the public domain, which they need to combine with their own sequencing datasets to gain a better understanding of disease, its translation across species, and the corresponding stratification of a given species population. Most of today's public datasets are stored in the cloud, and it is almost impossible to bring all the data in-house for analysis. Thus, new hybrid HPC solutions need to be implemented. This new cloud reality poses an interesting challenge to the risk-averse, regulated pharmaceutical industry. The question is not "if" but rather "how" we run computing in the cloud while still protecting patients' and donors' privacy (e.g., HIPAA, EU vs. US legislation). This presentation will discuss "how" we got there, describing the challenges we faced, the decisions we made, and the solutions we built along the way to support our next-generation sequencing efforts.

