Computational framework for systematic and scalable analysis of deep sequencing transcriptomics data

dc.date.accessioned 2012-11-28T11:40:56Z und
dc.date.accessioned 2017-10-24T12:24:04Z
dc.date.available 2012-11-28T11:40:56Z und
dc.date.available 2017-10-24T12:24:04Z
dc.date.issued 2012-11-28T11:40:56Z
dc.identifier.uri http://radr.hulib.helsinki.fi/handle/10138.1/2177 und
dc.identifier.uri http://hdl.handle.net/10138.1/2177
dc.title Computational framework for systematic and scalable analysis of deep sequencing transcriptomics data en
ethesis.department.URI http://data.hulib.helsinki.fi/id/225405e8-3362-4197-a7fd-6e7b79e52d14
ethesis.department Institutionen för datavetenskap sv
ethesis.department Department of Computer Science en
ethesis.department Tietojenkäsittelytieteen laitos fi
ethesis.faculty Matematisk-naturvetenskapliga fakulteten sv
ethesis.faculty Matemaattis-luonnontieteellinen tiedekunta fi
ethesis.faculty Faculty of Science en
ethesis.faculty.URI http://data.hulib.helsinki.fi/id/8d59209f-6614-4edd-9744-1ebdaf1d13ca
ethesis.university.URI http://data.hulib.helsinki.fi/id/50ae46d8-7ba9-4821-877c-c994c78b0d97
ethesis.university Helsingfors universitet sv
ethesis.university University of Helsinki en
ethesis.university Helsingin yliopisto fi
dct.creator Cervera Taboada, Alejandra
dct.issued 2012
dct.language.ISO639-2 eng
dct.abstract High-throughput technologies have had a profound impact on transcriptomics. Before microarrays, gene expression could not be measured in a massively parallel way. More recently, deep RNA sequencing (RNA-Seq) has been steadily gaining ground on microarrays in transcriptomics analysis. RNA-Seq promises several advantages over microarray technologies, but it also comes with its own set of challenges. Different approaches exist for each of the processing steps that RNA-Seq data require. The proposed solutions need to be carefully evaluated to find the best methods for the particularities of each dataset and the specific research questions being addressed. In this thesis I propose a computational framework that allows the efficient analysis of RNA-Seq datasets. The workflow was implemented on the Anduril framework, which handled the parallelization of tasks and the organization of the data files. Particular emphasis was placed on quality control of the RNA-Seq files: several measures were taken to prune low-quality bases and reads, which hamper the alignment step, from the data. Furthermore, various existing algorithms for transcript assembly and abundance estimation were tested. The best methods have been coupled into an automated pipeline that takes the raw reads and delivers expression matrices at the isoform and gene levels. Additionally, a module is included for obtaining sets of differentially expressed genes under different conditions or across a time course. en
dct.language en
ethesis.language.URI http://data.hulib.helsinki.fi/id/languages/eng
ethesis.language English en
ethesis.language englanti fi
ethesis.language engelska sv
ethesis.thesistype pro gradu-avhandlingar sv
ethesis.thesistype pro gradu -tutkielmat fi
ethesis.thesistype master's thesis en
ethesis.thesistype.URI http://data.hulib.helsinki.fi/id/thesistypes/mastersthesis
ethesis.degreeprogram Bioinformatics en
dct.identifier.urn URN:NBN:fi-fe2017112252164
dc.type.dcmitype Text

Files in this item

File Size Format
thesisMBItranscriptomics.pdf 1.258Mb PDF
