Browsing by Author "Cauchi, Daniel"

  • Cauchi, Daniel (2023)
    Alignment in genomics is the process of finding the positions where DNA strings fit best with one another, that is, where there are the fewest differences when they are placed side by side. This process remains very computationally intensive, even with recent algorithmic advances in the field. Pseudoalignment is emerging as an inexpensive alternative to full alignment, both in memory requirements and in power consumption. Instead of aligning, it checks whether substrings exist within the target DNA, and this has been shown to produce good results for many use cases. New methods for pseudoalignment are still evolving, and the goal of this thesis is to provide an implementation that massively parallelises the current state of the art, Themisto, by using all available resources. The most intensive parts of the pipeline are put on the GPU, while the components that run on the CPU are heavily parallelised. Files are also read and written in parallel, so that parallel I/O can be taken advantage of. Results on the Mahti supercomputer, using an NVIDIA A100, show a 10-fold end-to-end querying speedup over the best run of Themisto, while using half as many CPU cores, on the dataset used in this thesis.
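
The substring-existence idea behind pseudoalignment can be illustrated with a toy sketch. The abstract does not describe the exact method, so the following assumes a simple variant: every length-k substring (k-mer) of each reference is indexed, and a query is assigned to the references that contain all of its indexed k-mers. The function names, the value of k, and the sequences are hypothetical and chosen only for illustration; this is not Themisto's implementation or the thesis's GPU pipeline.

```python
from collections import defaultdict


def kmers(seq, k):
    """Yield every length-k substring of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_index(references, k):
    """Map each k-mer to the set of reference names that contain it."""
    index = defaultdict(set)
    for name, seq in references.items():
        for kmer in kmers(seq, k):
            index[kmer].add(name)
    return index


def pseudoalign(query, index, k):
    """Return the references that contain every indexed k-mer of the query.

    k-mers absent from the index are skipped (an assumption of this sketch).
    """
    hits = None
    for kmer in kmers(query, k):
        refs = index.get(kmer)
        if refs is None:
            continue  # k-mer not seen in any reference
        hits = set(refs) if hits is None else hits & refs
    return hits or set()


# Hypothetical toy data.
references = {"ref_a": "ACGTACGTGA", "ref_b": "TTGACGTGAC"}
index = build_index(references, k=4)
result = pseudoalign("ACGTGA", index, k=4)
```

No positions or edit distances are computed here, only set membership. Each query is processed independently of the others, so queries can be handled in parallel.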