Browsing by master's degree program "Tietojenkäsittelytieteen maisteriohjelma"

  • Harhio, Säde (2022)
    The importance of software architecture design decisions has been known for almost 20 years. Knowledge vaporisation is a problem in many projects, especially in the current fast-paced culture, where developers often switch from one project to another. Documenting software architecture design decisions helps developers understand the software better and make informed decisions in the future. However, documenting architecture design decisions is highly undervalued. It does not create any revenue in itself, and it is often a disliked and therefore neglected part of the job. This literature review explores what methods, tools and practices are being suggested in the scientific literature, as well as what practitioners are recommending within the grey literature. What makes these methods good or bad is also investigated. The review covers the past five years and 36 analysed papers. The evidence gathered shows that most of the scientific literature concentrates on developing tools to aid the documentation process. Twelve out of nineteen grey literature papers concentrate on Architecture Decision Records (ADR). ADRs are small template files, which as a collection describe the architecture of the entire system. ADRs appear to be what practitioners have become used to over the past decade, as they were first introduced in 2011. A method or tool is seen as beneficial when it is low-cost and low-effort while producing concise, good-quality content. It is seen as a drawback when a method or tool is high-cost, high-effort and produces too much or badly organised content. The suitability of a method or tool depends on the project itself and its requirements.
  • Bankowski, Victor (2021)
    WebAssembly (WASM) is a binary instruction format for a stack-based virtual machine, originally designed for the Web but also capable of being run outside of browser contexts. The WASM binary format is designed to be fast to transfer, load and execute. WASM programs are designed to be safe to execute by running them in a memory-safe sandboxed environment. Combining dynamic linking with WebAssembly could allow the creation of adaptive modular applications that are cross-platform and sandboxed but still fast to execute and load. This thesis explores implementing dynamic linking in WebAssembly. Two artifacts are presented: a dynamic linking runtime prototype which exposes a POSIX-like host function interface for modules, and an Android GUI interfacing prototype built on top of the runtime. In addition, the results of measurements performed on both artifacts are presented. Dynamic linking does improve the memory usage and the startup time of applications when only some modules are needed. However, if all modules are needed immediately, then dynamically linked applications perform worse than statically linked applications. Based on the results, dynamically linking WebAssembly modules could be a viable technology for PC and Android. The poor performance of a Raspberry Pi in the measurements indicates that dynamic linking might not be viable for resource-constrained systems, especially if applications are performance critical.
  • Sinikallio, Laura (2022)
    The digitization and structuring of parliamentary materials for research use is an emerging field of research, in which several national projects are currently under way, for example in Europe. This thesis is part of the Semanttinen parlamentti (Semantic Parliament) project, in which the plenary session speeches of the Parliament of Finland are, for the first time, brought together as a unified, harmonized, machine-readable dataset spanning from the beginning of the parliament in 1907 to the present day. The speeches and their rich associated metadata have been published in two versions: in the Parla-CLARIN XML format used for describing parliamentary materials, and as a linked open data knowledge graph that connects the dataset to the wider national data infrastructure. The unified speech dataset offers unprecedented opportunities to examine Finnish parliamentarism over more than a hundred years in a nuanced and automated way. The dataset contains nearly one million individual speeches and is closely linked to biographical information on the actors of the parliament. This thesis describes the data models developed for representing the speeches and the process of collecting and transforming the speech data, and examines the challenges and opportunities of the process and the resulting dataset. To assess the usefulness of the data publication, the Parla-CLARIN data has already been used in digital humanities research on political culture. Based on the linked data, a semantic portal, Parlamenttisampo, has been developed for publishing and exploring the materials online.
  • Talonpoika, Ville (2020)
    In recent years, virtual reality devices have entered the mainstream with many gaming-oriented consumer devices. However, the locomotion methods utilized in virtual reality games are yet to gain a standardized form, and different types of games have different requirements for locomotion to optimize player experience. In this thesis, we compare some popular and some uncommon locomotion methods in different game scenarios. We consider their strengths and weaknesses in these scenarios from a game design perspective. We also make suggestions on which kinds of locomotion methods would be optimal for different game types. We conducted an experiment with ten participants, seven locomotion methods and five virtual environments to gauge how the locomotion methods compare against each other, utilizing game scenarios requiring timing and precision. Our experiment, while small in scope, produced results we could use to construct useful guidelines for selecting locomotion methods for a virtual reality game. We found that the arm swinger was a favourite for situations where precision and timing were required. Touchpad locomotion was also considered one of the best for its intuitiveness and ease of use. Teleportation is a safe choice for games not requiring a strong feeling of presence.
  • Harjunpää, Jonas (2022)
    Software engineering professionals need many kinds of competences. One of these competences is the capacity for lifelong learning, which is necessary in a broad and constantly changing field. With the demand for skilled professionals that has arisen in the ICT sector, the role of lifelong learning has become even more pronounced. The aim of this thesis has been to increase understanding of the role of lifelong learning from the perspective of the software engineering professional. The thesis seeks to identify which forms of learning are used and for what purposes, which components of lifelong learning competence are important, and what challenges are associated with lifelong learning. The research data was collected through semi-structured interviews with software engineering professionals. The results of these interviews were compared with the results of a literature review conducted for the thesis. Of the forms of learning, informal learning is used the most, particularly for smaller learning needs. Non-formal and formal learning, in turn, are used for larger needs, but less often. Motivation, information seeking and meta-learning stand out as key components of lifelong learning competence. Lack of time and self-motivation are perceived as the most common challenges regarding lifelong learning. Shortcomings related to information sources and an insufficient understanding of meta-learning are also perceived as making lifelong learning more difficult. The findings of the thesis support the central role of lifelong learning competence among software engineering professionals. However, there is still room for improvement in software engineering professionals' readiness for lifelong learning, for example regarding meta-learning. The findings are, however, based on the experiences of those who have worked as software engineering professionals for a shorter time, so more research is needed, especially among those who have worked in the field longer.
  • Walder, Daniel (2021)
    Cloud vendors have many data centers around the world and, in each data center, offer computational capacity for rent at prices that depend on the required power and time. Most vendors, for instance Amazon Web Services, offer flexible pricing, where prices can change hourly. According to those vendors, price changes depend highly on the current workload: the higher the workload, the higher the price. Specifically, this paper is about the offered spot services. To get the most out of this flexible pricing, we built a framework named ELMIT, which stands for Elastic Migration Tool. ELMIT’s job is to perform price forecasting and, where worthwhile, perform migrations to cheaper data centers. We monitored seven spot instances with ELMIT’s help. For three instances no migration was needed, because no other data center was ever cheaper. However, for the other four instances ELMIT performed 38 automatic migrations within around 42 days. Around $160 was saved. In detail, three out of four instances reduced costs by 14.35%, 4.73% and 39.6%. The fourth performed unnecessary migrations and in the end cost more money due to slight inaccuracies in the predictions: in total, around 50 cents more. Overall, the outcome of ELMIT’s monitoring job is promising. It gives reason to keep developing and improving ELMIT, to increase the savings even more.
  • Ikonen, Eetu (2023)
    The maximum constraint satisfaction problem (MaxCSP) is a combinatorial optimization problem in which the set of feasible solutions is expressed using decision variables and constraints on how the variables can be assigned. It can be used to represent a wide range of other combinatorial optimization problems. The maximum satisfiability problem (MaxSAT) is a restricted variant of the maximum constraint satisfaction problem with the additional restrictions that all variables must be Boolean variables, and all constraints must be logical Boolean formulas. Because of this, expressing problems using MaxSAT can be unintuitive. The known solving methods for the MaxSAT problem are more efficient than the known solving methods for MaxCSP. Therefore, it is desirable to express problems using MaxSAT. Fortunately, every MaxCSP instance that only has finite-domain variables can be encoded into an equivalent MaxSAT instance. Encoding a MaxCSP instance to a MaxSAT instance allows users to combine the strengths of both approaches by expressing problems using the more intuitive MaxCSPs but solving them using the more efficient MaxSAT solving methods. In this thesis, we overview three common MaxCSP to MaxSAT encodings, the sparse, log, and order encodings, that differ in how they encode an integer variable into a set of Boolean variables. We use correlation clustering as a practical example for comparing the encodings. We first represent correlation clustering problems using MaxCSPs, and then encode them into MaxSAT instances. State-of-the-art MaxSAT solvers are then used to solve the MaxSAT instances. We compare the encodings by measuring the time it takes to encode a MaxCSP instance into a MaxSAT instance and the time it takes to solve the MaxSAT instance. The scope of our experiments is too small to draw general conclusions, but in our experiments the log encoding was the best overall choice.
  • Vidjeskog, Martin (2022)
    The traditional way of computing the Burrows-Wheeler transform (BWT) has been to first build a suffix array, and then use this newly formed array to obtain the BWT. While this approach runs in linear time, the space requirement is far from optimal. When the length of the input string increases, the required working space quickly becomes too large for normal computers to handle. To overcome this issue, researchers have proposed many different types of algorithms for building the BWT. In 2009, Daisuke Okanohara and Kunihiko Sadakane presented a new linear time algorithm for BWT construction. This algorithm is relatively fast and requires far less working space than the traditional way of computing the BWT. It is based on a technique called induced sorting and can be seen as a state-of-the-art approach for internal memory BWT construction. However, a proper exploration of how to implement the algorithm efficiently has not been undertaken. One 32-bit implementation of the algorithm is known to exist, but due to the limitations of 32-bit programs, it can only be used for input strings under the size of 4 GB. This thesis introduces the algorithm from Okanohara and Sadakane and implements a 64-bit version of it. The implemented algorithm can in theory support input strings that are thousands of gigabytes in size. In addition to the explanation of the algorithm, the time and space requirements of the 64-bit implementation are compared to some other fast BWT algorithms.
  • Luopajärvi, Kalle (2022)
    In independent component analysis the data is decomposed into its statistically independent components. In recent years, statistical models have been developed that solve a non-linear version of the independent component analysis. This thesis focuses on the estimation methods of a particular non-linear independent component analysis model called iVAE. It is shown on simulated data that the generative adversarial networks can significantly improve the iVAE model estimation compared with the previously used default iVAE estimation method. The improved model estimation might enable new applications for the iVAE model.
  • Laakso, Atte (2023)
    This thesis conducts a systematic literature review on ethical issues of large language models (LLMs). These models are a very topical subject, as both their presence and demand have skyrocketed since the release of ChatGPT, a free-to-use generative language model. The literature review of 116 studies, both conceptual and empirical, identifies 39 recurring ethical issues. The issues range from methodological to fundamental ones, for example "Environmental impacts" and "Biased training data or outputs". These identified issues are analyzed based on the Ethics guidelines for trustworthy AI (Artificial Intelligence), released by the European Commission’s High-Level Expert Group on AI. The guidelines detail requirements that all trustworthy and ethical AI applications should adhere to, e.g., Human agency, Transparency, Accountability. All identified issues are mapped to these requirements, and the conclusion is that LLMs have significant challenges relating to each one. The findings indicate that the use of LLMs comes with significant issues, both demonstrated and theorized. While some methods for mitigating these issues are identified, many issues still remain unresolved. One of these unresolved issues is the most frequently identified one: inherent biases in LLMs. Since there is no universal understanding of biases, there is no way to make LLMs seem unbiased to everyone. This thesis collates the current talking points and issues identified with LLMs. It provides a comprehensive, but not exhaustive, list of these issues and shows that there is much discussion on the topic. The conclusion is that more discussion is required but, more vitally, even more (regulatory) action is needed along with it.
  • Pukkila, Eero (2022)
    The aim of the calculation for a defined-benefit pension arrangement is to determine the compatibility of the pension insurance policyholder's savings and pension plans, while also taking into account the covers and other costs included in the contract. As a result of legislative changes to voluntary pension contracts, this kind of calculation has become considerably more complex during the 2000s, and calculation models created for old systems do not always perform at the desired speed. The topic of this thesis is the optimization of Profit Software Oy's Profit Life & Pension insurance management system with respect to the calculation described above.
  • Auvo, Markus (2022)
    As everyday life becomes digital, more and more daily things are done online. In particular, the increased use of mobile devices has accelerated this development. People are increasingly leaving information online about themselves that can be used to identify a person. On 25 May 2018, the European Union’s General Data Protection Regulation, the GDPR, became applicable in the European Union, repealing the previous European Union Data Protection Directive. The GDPR sets out how personal information should be stored and who can process it. The thesis examined how the introduction of GDPR has affected the customer data storage solutions and IT processes of Finnish SMEs during 2018-2020. The companies were examined in three phases: before, during and after the introduction of the GDPR. The study looked at the number of data breaches in the EU and the penalties imposed for them, and compared these with the situation in Finland. In addition, Finnish SMEs were interviewed for the thesis. The interview was conducted as a questionnaire interview with 15 companies. The thesis found that Finland did not stand out in any way among other EU countries in GDPR violations. The answers received as a result of the survey revealed that there has been a clear variation in the interpretation of the content of the GDPR in Finland, which has affected the measures taken by companies. Based on the survey, the measures have also been influenced by the organization and organizational culture. However, the reliability of the results is affected by the small sample size.
  • Karis, Peter (2020)
    This thesis presents a user study to evaluate the usability and effectiveness of a novel search interface as compared to a more traditional solution. InnovationMap is a novel search interface by Khalil Klouche, Tuukka Ruotsalo and Giulio Jacucci (University of Helsinki). It is a tool for aiding the user in performing ‘exploratory searching’: a type of search activity where the user is exploring an information space unknown to them and thus cannot form a specific search phrase to perform a traditional ‘lookup’ search as with conventional search interfaces. In this user study InnovationMap is compared against TUHAT, a search portal that is currently in use at the University of Helsinki for searching for research works and research personnel from the university databases. The user evaluation is conducted as a qualitative within-subject study using volunteer users from the University of Helsinki. Each participant uses both systems in an alternating order over the course of two sessions. During the two sessions the volunteer user carries out information finding tasks defined in the experiment design, answers a SUS (System Usability Scale) questionnaire and participates in a semi-structured interview. The answers from the assigned tasks are then evaluated and scored by field experts. The combined results from these methods are then used to formulate an educated assessment of the usability, effectiveness and future development potential of the InnovationMap search system.
  • Aula, Kasimir (2019)
    Air pollution is considered to be one of the biggest environmental risks to health, causing symptoms from headache to lung diseases, cardiovascular diseases and cancer. To improve awareness of pollutants, air quality needs to be measured more densely. Low-cost air quality sensors offer one solution to increase the number of air quality monitors. However, they suffer from low accuracy of measurements compared to professional-grade monitoring stations. This thesis applies machine learning techniques to calibrate the values of a low-cost air quality sensor against a reference monitoring station. The calibrated values are then compared to a reference station’s values to compute the error after calibration. In the past, the evaluation phase has been carried out very lightly. A novel method of selecting data is presented in this thesis to ensure diverse conditions in training and evaluation data, which yields a more realistic impression of the capabilities of a calibration model. To better understand the level of performance, selected calibration models were trained with data corresponding to different levels of air pollution and meteorological conditions. Regarding pollution level, the error of a calibration model was found to be up to 85% lower when using homogeneous training and evaluation data than when using a diverse training and evaluation pollution environment. Also, using diverse meteorological training data instead of more homogeneous data was shown to reduce the size of the error and make the behavior of calibration models more stable.
  • Joswig, Niclas (2021)
    Simultaneous Localization and Mapping (SLAM) research is gaining a lot of traction as the available computational power and the demand for autonomous vehicles increase. A SLAM system solves the problem of localizing itself during movement (Visual Odometry) and, at the same time, creating a 3D map of its surroundings. Both tasks can be solved on the basis of expensive and spacious hardware like LiDARs and IMUs, but the subarea of visual SLAM research aims at replacing those costly sensors with, ultimately, inexpensive monocular cameras. In this work I applied the current state of the art in end-to-end deep learning-based SLAM to a novel dataset comprising images recorded from cameras mounted to an indoor crane from the Konecranes CXT family. One major aspect that is unique about our proposed dataset is the camera angle, which resembles a classical bird’s-eye view towards the ground. This orientation change, together with a novel scene structure, has a large impact on the subtask of mapping the environment, which in this work is done through monocular depth prediction. Furthermore, I assess which properties of the given industrial environments have the biggest impact on the system’s performance, to identify possible future research opportunities for improvement. The main performance impairments I examined, which are characteristic of most types of industrial premises, are non-Lambertian surfaces, occlusion and texture-sparse areas along the ground and walls.
  • Nieminen, Jeremi (2023)
    This thesis examines the render speeds of WebViews in React Native applications. React Native is a popular cross-platform framework for developing mobile applications, and WebViews allow embedding web content within mobile applications. While WebViews offer the advantage of bringing readily available web content into applications, the cost of using this technology in terms of application responsiveness is not well researched. The goal of this thesis is to evaluate this cost so that developers and stakeholders can make more informed decisions regarding the use of WebViews in React Native applications. A series of tests was performed using a React Native application that was developed for the purpose of this study. In these tests, we rendered WebViews and similar-looking views that consist of React Native components, and measured their mean render times. Our analysis of these results revealed that using React Native components instead of WebViews offers significant benefits in terms of rendering performance on both iOS and Android platforms. The use of WebViews in rendering user interfaces can be a notable disadvantage for user experience, especially on Android devices. These findings suggest that rendering native user interface components instead of WebViews should be preferred if we want to maximize user experience across different devices and platforms.
  • Kangas, Vilma (2020)
    Software testing is an important process when ensuring a program's quality. However, testing has not traditionally been a very substantial part of computer science education. Some attempts to integrate it into the curriculum have been made, but best practices still remain an open question. This thesis discusses multiple attempts at teaching software testing over the years. It also introduces CrowdSorcerer, a system for gathering programming assignments with tests from students. It has been used in introductory programming courses at the University of Helsinki. To study whether the students benefit from creating assignments with CrowdSorcerer, we analysed the number of assignments and tests they created and whether these correlate with their performance in a testing-related question in the course exam. We also gathered feedback from the students on their experiences of using CrowdSorcerer. Looking at the results, it seems that more research on how to teach testing would be beneficial. Improving CrowdSorcerer would also be a good idea.
  • Ahlfors, Dennis (2022)
    As the role of IT and computer science in society is on the rise, interest in computer science education is also on the rise. Research covering study success and study paths is important both for understanding student needs and for developing the educational programmes further. Using a data set covering student records from 2010 to 2020, this thesis aims to find key insights and provide base research on the topic of computer science study success and study paths at the University of Helsinki. Using novel visualizations and descriptive statistics, this thesis builds a picture of the evolution of study paths and student success during a 10-year timeframe, providing much-needed contextual information to be used as inspiration for future focused research into the phenomena discovered. The visualizations combined with statistical results show that certain student groups seem to have better study success and that there are differences in the study paths chosen by the student groups. It is also shown that the graduation rates from the Bachelor’s Programme in Computer Science are generally low, with some student groups showing higher than average graduation rates. Time from admission to graduation is longer than suggested and the sample study paths provided by the university are not generally followed, leading to the conclusion that the programme structure would need some assessment to better accommodate students with diverse academic backgrounds and differing personal study plans.