
Browsing by Subject "Machine Learning"

  • Louhi, Jarkko (2023)
    The rapid growth of artificial intelligence (AI) and machine learning (ML) solutions has created a need to develop, deploy, and maintain them in production reliably and efficiently. The MLOps (Machine Learning Operations) framework is a collection of tools and practices that aims to address this challenge. Within the MLOps framework, a concept called the feature store is introduced, serving as a central repository responsible for storing, managing, and facilitating the sharing and reuse of features extracted from raw data. This study first gives an overview of the MLOps framework, then delves deeper into feature engineering and feature data management and explores the challenges related to these processes. Further, feature stores are presented: what exactly they are and what benefits they bring to organizations and companies developing ML solutions. The study also reviews some of the currently popular feature store tools. The primary goal of this study is to provide recommendations for organizations to leverage feature stores as a solution to the challenges they currently encounter in managing feature data. Through an analysis of the current state of the art and a comprehensive study of organizations' practices and challenges, this research presents key insights into the benefits of feature stores in the context of MLOps. Overall, the thesis highlights the potential of feature stores as a valuable tool for organizations seeking to optimize their ML practices and achieve a competitive advantage in today's data-driven landscape. The research aims to explore and gather practitioners' experiences and opinions on the aforementioned topics through interviews conducted with experts from Finnish organizations.
  • Uvarova, Elizaveta (2024)
    Asteroids within our Solar System attract considerable attention for their potential impact on Earth and their role in elucidating the Solar System's formation and evolution. Understanding asteroids' composition is crucial for determining their origin and history, making spectral classification a cornerstone of asteroid categorization. Spectral classes, determined by asteroids' reflectance spectra, offer insights into their surface composition. Early attempts at classification, predating 1973, utilized photometric observations in ultraviolet and visible wavelengths. The Chapman-McCord-Johnson classification system of 1973 marked the beginning of formal asteroid taxonomy, employing reflectance spectrum slopes for classification. Subsequent developments included machine learning techniques, such as principal component analysis and artificial neural networks, for improved classification accuracy. The Gaia mission's Data Release 3 has significantly expanded asteroid datasets, allowing more extensive analyses. In this study, I examine the relationship between asteroid photometric slopes, spectra, and taxonomy using a feed-forward neural network trained on known spectral types to classify asteroids of unknown types. The classification achieved a mean accuracy of 80.4 ± 2.0 % over 100 iterations and successfully separated three asteroid taxonomic groups (C, S, and X) and the asteroid class D.
  • Alho, Riku (2021)
    Modularity is often used to manage the complexity of monolithic software systems. It reduces maintenance costs by minimizing the entanglement in software code and functionality. Modularity also lowers future development costs by enabling the reuse and stacking of different types of modular functionality and software code for different environments and software engineering problems. Although there are important differences between the problem-solving processes and practices of machine learning system developers and software engineering developers, machine learning system developers have been shown to be able to adopt a lot from traditional software engineering. A systematic literature review is used to identify 484 studies published in four electronic sources from January 1990 to October 2021. After examination of the papers, statistical and qualitative results are formed for 86 selected studies that provide sufficient information regarding the presence of modular operators and comparison to monolithic solutions. The selected studies addressed a wide range of tasks and domains, which saw performance benefits compared to monolithic machine learning and deep learning methods. Nearly two thirds of the studies found that Modular Neural Networks (MNNs) improved task accuracy when compared to monolithic solutions. Only 16.3% of studies reported efficiency values in their comparisons. Over 82.5% of the studies that reported their MNNs' efficiency found benefits in computation time, memory/size, and energy consumption when compared to monolithic solutions. The majority of studies were carried out in laboratory environments on single focused tasks with static requirements, which may have limited the visibility of modular operators. MNNs show promise for performance and efficiency in machine learning.
    More comparable studies are needed, especially from industry, that use MNNs under constantly changing requirements and thus apply multiple modular operators.
  • Malmivirta, Titti (2020)
    Continuous thermal imaging is a way to measure psycho-physiological signals in humans. Psycho-physiological signals refer to physical signals caused by some psychological or mental situation or change. One of the technical challenges with thermal psycho-physiological signal measurement is that the devices used to measure the temperature changes need to be sensitive and accurate enough to actually detect them. This is generally true for laboratory equipment, but the current relatively cheap and small mobile devices, including uncooled thermal cameras, could be used to make this kind of measurement cheaper, less intrusive, and mobile. Currently, consumer-priced mobile devices still tend to introduce a lot of noise and other inaccuracies into measurements. The focus of this thesis is to evaluate the usefulness of the FLIR One thermal camera integrated in the Caterpillar Cat S60 phone for cognitive load measurement and the possibility of improving the measurement accuracy with additional calibration correction. We developed a deep learning based calibration correction method as an attempt to improve the quite noisy initial measurements of the thermal camera. Then an experiment measuring cognitive load was organised. The calibration correction method was used to reduce errors in the data from the cognitive load experiment to see if the performance of the thermal camera can be improved enough for accurate cognitive load detection. Our results show that while our calibration correction method does improve the measurement accuracy when compared to the ground truth, the fluctuations in the measurements do not decrease enough to improve the performance of the thermal camera with regard to cognitive load sensing.
  • Muiruri, Dennis (2021)
    Ubiquitous sensing is transforming our societies and how we interact with our surrounding environment; sensors provide large streams of data while machine learning techniques and artificial intelligence provide the tools needed to generate insights from the data. These developments have taken place in almost every industry sector, with topics such as smart cities and smart buildings becoming key topical issues as societies seek more sustainable ways of living. Smart buildings are the main context of this thesis. These are buildings equipped with various sensors used to collect data from the surrounding environment, allowing the building to adapt itself and increase its operational efficiency. Previously, most efforts in realizing smart buildings have focused on energy management and automation, where the goal is to reduce costs associated with heating, ventilation, and air conditioning. A less studied area involves smart buildings and their indoor environments, especially relative to sub-spaces within a building. Increased developments in low-cost sensor technologies have created new opportunities to sense indoor environments in more granular ways that provide new possibilities to model finer attributes of spaces within a building. This thesis focuses on modeling indoor environment data obtained from a multipurpose building that serves primarily as a school. The aim is to explore the quality of the indoor environment relative to regulatory guidelines and to explore suitable predictive models for thermal comfort and indoor air quality. Additionally, design science methodology is applied in the creation of a proof-of-concept software system. This system is aimed at demonstrating the use of Web APIs to provide sensor data to clients that may use the data to render analytics, among other insights, to a building’s stakeholders.
    Overall, the main technical contributions of this thesis are twofold: (i) a potential web-application design for indoor air quality IoT data and (ii) an exposition of modeling of indoor air quality data based on a variety of sensors and multiple spaces within the same building. Results indicate that a software-based tool that supports monitoring the indoor environment of a building would be beneficial in maintaining the correct levels of various indoor parameters. Further, modeling data from different spaces within the building shows a need for heterogeneous models to predict variables in these spaces. This implies that the parameters used to predict thermal comfort and air quality differ between spaces, especially where the spaces differ in size, indoor climate control settings, and other attributes such as occupancy control.
  • Bolanos Mejia, Tlahui Alberto (2021)
    Credit rating is one of the core tools for risk management within financial firms. Ratings are usually provided by specialized agencies which perform an overall study and diagnosis of a given firm’s financial health. Dealing with unrated entities is a common problem, as several risk models rely on the ratings’ completeness, and agencies cannot realistically rate every existing company. To solve this, credit rating prediction has been widely studied in academia. However, research on this topic tends to separate models amongst the different rating agencies due to the difference in both rating scales and composition. This work uses transfer learning, via label adaptation, to increase the number of samples for feature selection, and appends these adapted labels as an additional feature to improve the predictive power and stability of previously proposed methods. Accuracy on exact label prediction improved from 0.30 in traditional models to 0.33 in the transfer learning setting. Furthermore, when measuring accuracy with a tolerance of 3 grade notches, accuracy increased by almost 0.10, from 0.87 to 0.96. Overall, transfer learning displayed better out-of-sample generalization.
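
The credit rating abstract above reports both exact-label accuracy and accuracy "with a tolerance of 3 grade notches". As a rough illustration of how such a tolerance-based metric can be computed, here is a minimal Python sketch. It is not taken from the thesis: the function name `notch_accuracy`, the encoding of rating grades as consecutive integers, and the sample data are all assumptions made for this example.

```python
from typing import Sequence


def notch_accuracy(
    y_true: Sequence[int], y_pred: Sequence[int], tolerance: int = 0
) -> float:
    """Share of predictions within `tolerance` notches of the true grade.

    Grades are assumed to be encoded as consecutive integers on an
    ordinal scale (hypothetical encoding, e.g. 0 for the best grade).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")
    hits = sum(abs(t - p) <= tolerance for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


# Made-up grades for illustration only; not data from the thesis.
true_grades = [0, 3, 5, 8, 10]
pred_grades = [0, 5, 9, 8, 6]

# tolerance=0 counts only exact matches (2 of the 5 samples here).
exact = notch_accuracy(true_grades, pred_grades)
# tolerance=3 also accepts predictions up to 3 notches off (3 of 5 here).
within_3 = notch_accuracy(true_grades, pred_grades, tolerance=3)
```

With `tolerance=0` the function reduces to ordinary exact-match accuracy, so the same helper covers both kinds of figure quoted in the abstract.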