Skip to main content
Login | Suomeksi | På svenska | In English

Browsing by Subject "data structures"

Sort by: Order: Results:

  • Kulbitski, Mikita (2023)
    Nowadays power consumption is a hot and actual domain. Efficient energy consumption allows you to use resources from the environment wisely and, moreover, switch to alternative energy sources where it is possible. This thesis is aimed at analyzing elevator’s power consumption data (average per every 5 minutes and 1 hour). The data has been gathered for several years, so it is a time series. This thesis includes review of time series models, which then can be used for the consequent analysis. Main directions are forecasting power consumption, capturing trends and anomalies. In addition, time series data may also be used for calculating average power consumption for each elevator inside the elevator group. As an outcome, spread of the power consumption across 4 elevators inside the elevator group may be seen. One of the thesis’ goals is to check whether it is even or not.
  • Heino, Lauri (2020)
    The suffix array is a space-efficient data structure that provides fast access to all occurrences of a search pattern in a text. Typically suffix arrays are queried with algorithms based on binary search. With a pre-computed index data structure that provides fast access to the relevant suffix array interval, querying can be sped-up, because the binary search process operates over a smaller interval. In this thesis a number different ways of implementing such an index data structure are studied, and the performance of each implementation is measured. Our experiments show that with relatively small data structures, one can reduce suffix array query times by almost 50%. There is a trade-off between the size of the data structure and the speed-up potential it offers.
  • Kinnunen, Lauri (2022)
    This thesis is a review of articles focusing on software assisted floor plan design for architecture. I group the articles into optimization, case based design, and machine learning, based on their use of prior examples. I then look into each category and further classify articles based on dimensions relevant to their overall approach. Case based design was a popular research field in the 1990s and early 2000s when several large research projects were conducted. However, since then the research has slowed down. Over the past 20 years, optimization methods to solve architectural floor plans have been researched extensively using a number of different algorithms and data models. The most popular approach is to use a stochastic optimization method such as a genetic algorithm or simulated annealing. More recently, a number of articles have investigated the possibility of using machine learning on architectural floor plans. The advent of neural networks and GAN models, in particular, has spurred a great deal of new research. Despite considerable research efforts, assisted floor plan design has not found its way into commercial applications. To aid industry adoption, more work is needed on the integration of computational design tools into the existing design workflows.
  • Kortelainen, Milla (2023)
    Sequence alignment is widely studied problem in the field of bioinformatics. The exact solution takes quadratic time to compute, and thus is not practical for long sequences. A number of heuristic approaches have been developed to conquer the quadratic time-complexity. This thesis reviews the average-case time analysis of two such heuristics, banded alignment by Ganesh and Sy in ''Near-Linear Time Edit Distance for Indel Channels'' WABI 2020, and seed-chain-extend by Shaw and Yu in ''Proving sequence aligners can guarantee accuracy in almost O(m log n) time through an average-case analysis of the seed-chain-extend heuristic'' Genome Research 2023. These heuristics reduce the quadratic average-case time complexity of the sequence alignment to log-linear. The approach of the thesis reviews is to outline the proofs of the original analysis, and provide supporting materials to aid the reader in studying the analysis. The experiments of this thesis compare four different approaches to compute the exact match anchors of the seed-chain-extend sequence alignment heuristic. A Bi-Directional Burrows-Wheeler Transformation (BDBWT), suffix tree based Mummer and Minimap2 based exact match anchors are computed. The anchors are then given to a chaining algorithm, to compare the performance of each anchoring technique. The qualities of the chains are compared using a Jaccard index applied to the sequences. The highest Jaccard index is obtained for the maximal exact match and the unique maximal exact match anchors of Mummer and BDBWT approaches. An increasing minimum length of the exact matches seem to increase the Jaccard index and reduce the running time of the chaining algorithm.
  • Kähkönen, Harri (2023)
    The volume of data generated by high-throughput DNA sequencing has grown to a magnitude that leads to substantial computational challenges in storing and searching the data. To tackle this problem, various computational methodologies have been developed in recent years to space-efficiently index collections of data sets and enable efficient searches. One of the most recent indexing methods, Spectral Burrows-Wheeler Transform (SBWT), presents all distinct k-mers of a DNA sequence using only 4 bits and a small additional space for the rank data structures per k-mer. In addition to being space-efficient, it also enables k-mer membership queries in linear time relative to k, and constant time relative to the number of distinct k-mers in the sequence. The queries rely on rank queries over bit vectors. Experiments run on a single CPU thread have shown that in one second, hundreds of thousands of k-mer membership queries can be performed over SBWT. By parallelizing the queries on a CPU, it is possible to execute millions of queries per second. However, Graphic Processing Units (GPUs) have much more parallelization potential. The main contribution of the thesis is an implementation of the k-mer membership queries over SBWT with GPU computing. Optimizing the queries to be performed on a GPU made it possible to perform over a billion queries per second. Furthermore, the thesis presents a new enhancement for the queries over SBWT called presearching, which doubles the speed of the original SBWT search query. The rank query needed for the membership queries is implemented using space-efficient poppy rank data structures, and its derivative cumulative-poppy data structure which is one of the contributions of the thesis.
  • Pulsa, Veikka (2023)
    The scholarship allotment problem describes the goal of strategically offering scholarships to prospective students of a university in a way that optimises the expected return for that investment. The goal of this thesis is to formulate the scholarship allotment problem in multiple variations of increasing complexity while also introducing algorithms to solve those variations optimally as efficiently as possible. The thesis also offers some insight into the way more complex variations and generalisations heighten the difficulty of finding an optimal solution in a reasonable amount of time. The main focus and the main tool used to tackle these problems is the classic knapsack algorithm and different variations of it, like multiple-choice knapsack and multidimensional knapsack. In addition to the theoretical side, the thesis contains an empirical study into the performance and feasibility of the algorithms introduced. Concrete implementations of the algorithms discussed are all available on a public GitHub repository online: https://github.com/SirVeggie/scholarship-allotment.