
Browsing by Author "Lindell, Rony"

  • Lindell, Rony (2016)
    Next-generation sequencing (NGS) has evolved during the past 10 years to become the go-to method for genome-wide analysis projects. Based on parallelizable PCR methods adopted from traditional Sanger sequencing, NGS platforms can produce massive amounts of genetic information in a single run and read an entire DNA molecule within a day. The immense amount of nucleotide sequence data produced from a single sample has brought us to an era of algorithmic optimization for analysis and of figuring out parallelization schemes. Cohort projects generally use cloud-based systems because of their vast computing power requirements. Anduril is an integration and parallelization framework well suited for NGS analysis, as is shown in this study. After a brief review of the gold standard methods of NGS analysis, we describe the incorporation of the main tools into the new sequencing bundle for Anduril. The main focus is on tools for alignment (BWA, Bowtie), recalibration (GATK, Picard-tools) and variant calling (GATK, Samtools, VarScan). The Best Practices of the Broad Institute, creators of the Genome Analysis Toolkit (GATK), were a major inspiration in the creation of our sequencing pipeline. The evolution of the sequencing bundle tools into a pipeline is discussed through three separate project examples. First, a small group of 8 chronic myeloid leukemia patient samples was analysed after implementation of the main tools of the pipeline. The results were consistent with previous results, but no novel relevant mutations were found. Second, exome sequencing data from 180 breast cancers with controls available in TCGA (The Cancer Genome Atlas) were processed for use in various projects in our lab. The example showed the power of Anduril in large cohort analysis projects, where it provides automatic parallelization and intelligent workflow management. Third, we analysed exome data from 330 TCGA ovarian cancers with controls and created a prototypical set of database components for building a database of annotated variants for use in analytical queries. Compared to other integration frameworks (e.g. GATK, Crossbow and Hadoop), Anduril is a robust contender for the programming-oriented scientist. As cloud computing is increasingly becoming a requirement in large genome-wide analysis projects, Anduril provides an effective, generalizable framework for adding tools, creating pipelines and executing entire workflows on multi-node computing servers. As technology advances and available computational resources grow, fast multi-processor analysis can be increasingly incorporated into health care for the detection of disease-causing genes, polymorphisms that alter medication kinetics and cancer-driving mutations in an everyday setting.