Browsing by study line "no specialization" (ingen studieinriktning)

Now showing items 41-60 of 233

Comparing and Understanding Machine Learning Models with Visualization Methods

Hyvärinen, Linda (2023)

With the increased usage of machine learning models in various tasks and domains, the demand for understanding the models is emphasized. However, modern machine learning models are often difficult to understand and therefore do not inspire trust. Models can be understood by revealing their inner logic with explanations, but explanations can be difficult for non-expert users to interpret. We introduce an interactive visual interface to help non-expert users understand and compare machine learning models. The interface visualizes explanations for multiple models in order to help the user understand how the models generate predictions and whether the predictions can be trusted. We also explore current research in explainable AI visualizations, in order to compare our prototype to comparable systems in the literature. The contributions of this paper are a system description and a use case for an interactive visualization interface to compare and explain machine learning models, as well as an overview of the current state of research in explainable AI visualization systems and recommendations for future studies. We conclude that our system enables efficient visualizations for regression models, unlike the papers covered in our survey. Another conclusion is that the field lacks precise terminology.
Comparing descriptors for molecular clusters in unsupervised learning

Jääskeläinen, Matias (2020)

This thesis explores descriptors for atmospheric molecular clusters. Descriptors are needed for applying machine learning methods to molecular systems. A collection of descriptors is readily available in the DScribe library, developed at Aalto University for custom machine learning applications. The question of which descriptors to use is up to the user to decide. This study takes the first steps in integrating machine learning into the existing procedure of configurational sampling, which aims to find the optimal structure for any given molecular cluster of interest. The structure selection step forms a bottleneck in the configurational sampling procedure. A new structure selection method presented in this study uses k-means clustering to find structures that are similar to each other. The clustering results can be used to discard redundant structures more effectively than before, which leaves fewer structures to be calculated with more expensive computations. Altogether, that speeds up the configurational sampling procedure. To aid the selection of a suitable descriptor for this application, a comparison of four descriptors available in DScribe is made. A procedure for structure selection by representing atmospheric clusters with descriptors and labeling them into groups with k-means was implemented. The performance of the descriptors was compared with a custom score suitable for this application, and it was found that MBTR outperforms the other descriptors. This structure selection method will be utilized in the existing configurational sampling procedure for atmospheric molecular clusters, but it is not restricted to that application.
Comparison of Interactive Visualization Techniques for Origin-Destination Data Exploration

Nissilä, Viivi (2020)

Origin-Destination (OD) data is a crucial part of price estimation in the aviation industry, and an OD flight is any number of flights a passenger takes in a single journey. OD data is complex: it is both flow data and multidimensional data. In this work, the focus is to design interactive visualization techniques to support user exploration of OD data. The thesis work aims to find which of two menu designs suits OD data visualization better: a breadth-first or a depth-first menu design. The two menus follow Shneiderman’s Task by Data Type Taxonomy, a broader version of the Information Seeking Mantra. The first menu design is a parallel, breadth-first menu layout. The layout shows the variables in an open layout and is closer to the original data matrix. The second menu design is a hierarchical, depth-first layout. This layout is derived from the semantics of the data and is more compact in terms of screen space. The two menu designs are compared in an online survey study conducted with the potential end users. The results of the online survey study are inconclusive, and are therefore complemented with an expert review. Both the survey study and the expert review show that the Sankey graph is a good visualization type for this work, but the interaction of the two menu designs requires further improvements. Both of the menu designs received positive and negative feedback in the expert review. For future work, a solution that combines the positives of the two designs could be considered. ACM Computing Classification System (CCS): Human-centered computing → Visualization → Empirical studies in visualization; Human-centered computing → Interaction design → Interaction design process and methods → Interface design prototyping
Comparison of Two Open Source Feature Stores for Explainable Machine Learning

Rahikainen, Tintti (2023)

Machine learning operations (MLOps) tools and practices help us continuously develop and deploy machine learning models as part of larger software systems. Explainable machine learning can support MLOps, and vice versa. The results of machine learning models are dependent on the data and features the models use, so understanding the features is important when we want to explain the decisions of the model. In this thesis, we aim to understand how feature stores can be used to help understand the features used by machine learning models. We compared two existing open source feature stores, Feast and Hopsworks, from an explainability point of view to explore how they can be used for explainable machine learning. We were able to use both Feast and Hopsworks to aid us in understanding the features we extracted from two different datasets. The feature stores have significant differences, Hopsworks being a part of a larger MLOps platform, and having more extensive functionalities. Feature stores provide useful tools for discovering and understanding the features for machine learning models. Hopsworks can help us understand the whole lineage of the data – where it comes from and how it has been transformed – while Feast focuses on serving the features consistently to models and needs complementing services to be as useful from an explainability point of view.
Computational studies of dibenzotetraaza[14]annulene Ni(II) and Zn(II) complexes

Pruikkonen, Sanni (2021)

Stacking of antiaromatic molecules leads to enhanced stability and higher conductivity due to reversed antiaromaticity. It has been shown that cyclophanes consisting of antiaromatic Ni(II) norcorrole subunits have a vertical current-density flux between the two metal ions. The Ni(II) meso-substituted dibenzotetraaza[14]annulene complex fulfills the Hückel rule for being antiaromatic. Upon increasing the temperature above 13 K, the effective magnetic moment of solid state Ni(II) meso-substituted dibenzotetraaza[14]annulene changes from diamagnetic to paramagnetic. A suggested explanation for this is that there might be a weak interaction between the Ni atoms. In this study, the possibility of a vertical current-density flux between the two metal ions in the Ni(II) meso-substituted dibenzotetraaza[14]annulene is investigated. In addition, the effect of the Ni and N atoms in Ni(II) 1,5,9,13-tetraaza[16]annulene was studied by replacing Ni with Zn and N with O. Electronic motion in molecules that are under the influence of a magnetic field is investigated computationally, since at present there is no routine experimental method for doing that. TURBOMOLE, the Gauge-including Magnetically Induced Currents method and Paraview were employed in this study for structure optimization of the molecules, calculation of current-density flux and current strength in the molecules, and visualisation of the current-density pathways, respectively. The results of this study do not show any current transport between the subunits in the Ni(II) meso-substituted dibenzotetraaza[14]annulene complex. Both the Ni(II) 1,5,9,13-tetraaza[16]annulene and the Zn(II) 1,5,9,13-tetraaza[16]annulene are aromatic, but they were not stacked due to their distorted structure.
The (2Z,7Z,10Z,14Z)-1,9-dioxa-5,13-diazacyclohexadeca-2,7,10,14-tetraene-5,13-diide complexes with either Zn(II) or Ni(II) were both non-aromatic, as was the Ni(II) (2Z,7Z,10Z,14Z)-1,9-dioxa-5,13-diazacyclohexadeca-2,7,10,14-tetraene-5,13-diide dimer.
Conditional Neural Headline Generation for Finnish

Koppatz, Maximilian (2022)

Automatic headline generation has the potential to significantly assist editors charged with headlining articles. Approaches to automation in the headlining process can range from tools as creative aids, to complete end to end automation. The latter is difficult to achieve as journalistic requirements imposed on headlines must be met with little room for error, with the requirements depending on the news brand in question. This thesis investigates automatic headline generation in the context of the Finnish newsroom. The primary question I seek to answer is how well the current state of text generation using deep neural language models can be applied to the headlining process in Finnish news media. To answer this, I have implemented and pre-trained a Finnish generative language model based on the Transformer architecture. I have fine-tuned this language model for headline generation as autoregression of headlines conditioned on the article text. I have designed and implemented a variation of the Diverse Beam Search algorithm, with additional parameters, to perform the headline generation in order to generate a diverse set of headlines for a given text. The evaluation of the generative capabilities of this system was done with real world usage in mind. I asked domain-experts in headlining to evaluate a generated set of text-headline pairs. The task was to accept or reject the individual headlines in key criteria. The responses of this survey were then quantitatively and qualitatively analyzed. Based on the analysis and feedback, this model can already be useful as a creative aid in the newsroom despite being far from ready for automation. I have identified concrete improvement directions based on the most common types of errors, and this provides interesting future work.
Contrastive pretraining in discourse change detection

Lipsanen, Mikko (2022)

The thesis presents and evaluates a model for detecting changes in discourses in diachronic text corpora. Detecting and analyzing discourses that typically evolve over a period of time and differ in their manifestations in individual documents is a challenging task, and existing approaches like topic modeling are often not able to reach satisfactory results. One key problem is the difficulty of properly evaluating the results of discourse detection methods, due in large part to the lack of annotated text corpora. The thesis proposes a solution where synthetic datasets containing non-stable discourse patterns are generated from a corpus of news articles. Using the news categories as a proxy for discourses makes it possible both to control the complexity of the data and to evaluate the model results based on the known discourse patterns. The complex task of extracting topics from texts is commonly performed using generative models, which are based on simplifying assumptions regarding the process of data generation. The model presented in the thesis explores instead the potential of deep neural networks, combined with contrastive learning, to be used for discourse detection. The neural network model is first trained using a supervised contrastive loss function, which teaches the model to differentiate the input data based on the type of discourse pattern it belongs to. This pretrained model is then employed for both supervised and unsupervised downstream classification tasks, where the goal is to detect changes in the discourse patterns at the timepoint level. The main aim of the thesis is to find out whether contrastive pretraining can be used as a part of a deep learning approach to discourse change detection, and whether the information encoded into the model during contrastive training can generalise to other, closely related domains.
The results of the experiments show that contrastive pretraining can be used to encode information that directly relates to its learning goal into the end products of the model, although the learning process is still incomplete. However, the ability of the model to generalise this information in a way that could be useful in the timepoint level classification tasks remains limited. More work is needed to improve the model performance, especially if it is to be used with complex real world datasets.
Cost-Effective Decision Making in Weather Routing using Machine Learning-generated Simulation Data

Zhao, Zhao (2023)

This thesis aims to offer a practical solution for making cost-effective decisions regarding weather routing deployment to optimize computational costs. The study focuses on developing three collaborative model components that collectively address the challenge of rerouting decision-making. Model 1 involves training a neural network-based Ship Performance Model, which forms the foundation for the weather routing model. Model 2 is centered around constructing a time-dependent path-finding model that integrates real-time weather forecasts. This model optimizes routing within a designated experimental area, generating simulation training samples. Model 3 utilizes the outcomes of Model 2 to train a practical machine learning decision-making model. This model seeks to address the question: should the weather routing system be activated and the route be adjusted based on updated weather forecasts? The integration of these models supports informed maritime decision-making. While these methods represent a preliminary step towards optimizing weather routing deployment frequencies, they hold the potential for enhancing operational efficiency and responsible resource usage in the maritime sector.
Customer Segmentation with Subscription-based Online Media Customers

Haatanen, Henri (2022)

In the modern era, using personalization when reaching out to potential or current customers is essential for businesses to compete in their area of business. With large customer bases, this personalization becomes more difficult, so segmenting entire customer bases into smaller groups helps businesses focus better on personalization and targeted business decisions. These groups can be straightforward, like segmenting solely based on age, or more complex, like taking into account geographic, demographic, behavioral, and psychographic differences among the customers. In the latter case, customer segmentation should be performed with Machine Learning, which can help find more hidden patterns within the data. Often, the number of features in the customer data set is so large that some form of dimensionality reduction is needed. That is also the case with this thesis, which includes 12802 unique article tags that are desired to be included in the segmentation. A form of dimensionality reduction called feature hashing is selected for hashing the tags because it can accommodate new tags introduced in the future. Using hashed features in customer segmentation is a balancing act. With more hashed features, the evaluation metrics might give better results and the hashed features resemble the unhashed article tag data more closely, but with fewer hashed features the clustering process is faster and more memory-efficient, and the resulting clusters are more interpretable to the business. Three clustering algorithms, K-means, DBSCAN, and BIRCH, are tested with eight feature hashing bin sizes for each, with promising results for K-means and BIRCH.
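
The feature hashing step described above can be illustrated with a minimal sketch. It is not the thesis's implementation: the function name hash_tags, the choice of MD5 as the hash, the bin count of 16, and the example tags are all assumptions made for illustration. It shows why the approach copes with tags that appear later: any string maps to one of a fixed number of bins, so the vector length never changes.

```python
import hashlib


def hash_tags(tags, n_bins):
    """Return a length-n_bins count vector for a list of tag strings."""
    vector = [0] * n_bins
    for tag in tags:
        # Use a stable hash; Python's built-in hash() is salted per process.
        digest = hashlib.md5(tag.encode("utf-8")).digest()
        bin_index = int.from_bytes(digest[:8], "big") % n_bins
        vector[bin_index] += 1
    return vector


# Hypothetical customer who read articles carrying these tags.
features = hash_tags(["politics", "ice hockey", "politics"], n_bins=16)

# A tag never seen before still maps into the same 16 bins.
new_tag_features = hash_tags(["a brand new tag"], n_bins=16)
```

A smaller n_bins gives shorter vectors and faster clustering, at the price of more collisions between unrelated tags, which is the trade-off the abstract describes.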
Data Platform for Accelerating Machine Learning Workflows on Fusion Data

Jurinec, Fran (2023)

This thesis explores the applicability of open-source tools to addressing the challenges of data-driven fusion research. The issue is explored through a survey of the fusion data ecosystem and exploration of possible data architectures, which were used to derive the goals and requirements of a proof-of-concept data platform. This platform, developed using open-source software, namely InvenioRDM and Apache Airflow, enabled transforming existing machine learning (ML) workloads into reusable data-generating workflows, and the cataloging of resulting clean ML datasets. Through a survey of the fusion data ecosystem, a set of challenges and goals was established for the development of a fusion data platform. It was identified that many of the challenges for data-driven research stem from a heterogeneous and geographically scattered source data layer combined with a monolithic approach to ML research. These challenges could be alleviated through improved ML infrastructure, for which two approaches were identified: a query-based approach, which offers more data retrieval flexibility but requires improvements in querying functionality and source data access speeds, and a persisted dataset approach, which uses a centralized workflow to collect and clean data, but requires additional storage resources. Additionally, by cataloging metadata in a central location it would be possible to combine data discovery across heterogeneous sources, combining the benefits of various infrastructure developments. Building on these identified goals and the metadata-driven platform architecture, a proof-of-concept data platform was implemented and examined through a case study. This implementation used InvenioRDM as a metadata catalog to index and provide a dashboard for discovering ML-ready datasets, and Apache Airflow as a workflow orchestration platform to manage the data collection workflows.
The case study, grounded in real-world fusion ML research, showcased the platform's ability to convert existing ML workloads into reusable data-generating workflows and to publish clean ML datasets without introducing significant complexity into the research workflows.
Deep Learning Enhanced GNSS Tomography

Matakos, Alexandros (2024)

This thesis presents DeepGT, a 3D Convolutional Neural Network designed to enhance the spatial resolution of GNSS Tropospheric Tomography, a technique for estimating atmospheric water vapor distribution using GNSS signals. By utilizing Slant Wet Delays from dense GNSS networks and boundary meteorological data from Numerical Weather Prediction models, DeepGT refines low-resolution tomographic wet refractivity fields. The proposed method quadruples the horizontal resolution, while improving the accuracy of the tomographic reconstruction. Two experiments are conducted to validate this: one with real-world SWEPOS data and another with a hypothetical dense GNSS network. The results demonstrate the potential of deep learning models such as DeepGT in enhancing GNSS Meteorology, with implications for improved weather forecasting and climate studies.
Designing a Machine Learning Pipeline with Continuous Training for Time Series Forecasting

Koskinen, Jan (2024)

Machine Learning Operations (MLOps) emerged as a practice for applying DevOps practices and culture for machine learning (ML) systems to increase the speed and reliability of deployments. These practices include advocating for automation and monitoring at all steps of the ML system construction, including integration, testing, deployment, and infrastructure management. In addition to continuous integration (CI) and continuous delivery (CD), MLOps introduces continuous training (CT), which is unique to ML systems and is concerned with automatically training and serving ML models. Operating ML systems in production requires continuously adapting to the evolving input data. This is especially evident in time series data, which can experience frequent drifts. Moreover, implementing CT in practice is challenging and heavily dependent on the task and available data. Depending on the complexity of the model and the amount of data, the training process can be computationally costly. Using a scheduled interval for retraining is inefficient if the model still performs adequately. We designed an ML pipeline capable of efficient continuous training using an error-based trigger for retraining the model. The ML pipeline is designed for a time series forecasting task, where the data is prone to frequent drifts. We applied the design science research methodology to identify the problem, design and develop a solution artifact, and evaluate its utility and efficacy. The resulting solution utilizes an open-source MLOps platform that runs on Kubernetes. The solution includes a custom retrainer component to enable CT. We demonstrated the efficacy of the solution using real energy demand data from a university property in Finland. Our evaluation shows that the system is capable of efficient continuous training.
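
As a rough illustration of an error-based retraining trigger, the sketch below fires when the mean absolute forecast error over a recent window exceeds a threshold. The abstract does not specify the error metric, window size, or threshold, so the class name ErrorTrigger, the use of mean absolute error, and all numeric values here are assumptions.

```python
from collections import deque


class ErrorTrigger:
    """Signal retraining when the recent mean absolute error is too high."""

    def __init__(self, threshold, window=24):
        self.threshold = threshold
        # Only the most recent `window` errors are kept.
        self.errors = deque(maxlen=window)

    def update(self, actual, forecast):
        """Record one observation; return True if retraining should start."""
        self.errors.append(abs(actual - forecast))
        window_full = len(self.errors) == self.errors.maxlen
        mean_error = sum(self.errors) / len(self.errors)
        # Wait for a full window so a single early outlier cannot trigger.
        return window_full and mean_error > self.threshold


# Hypothetical (actual, forecast) pairs; the last one simulates a drift.
trigger = ErrorTrigger(threshold=1.0, window=3)
observations = [(10.0, 10.1), (11.0, 10.9), (12.0, 12.1), (13.0, 8.0)]
decisions = [trigger.update(actual, forecast) for actual, forecast in observations]
```

Compared with retraining on a fixed schedule, a trigger of this kind spends compute only when forecast quality has actually degraded.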
Designing an open-source cloud-native MLOps pipeline

Mäkinen, Sasu (2021)

Deploying machine learning models is found to be a massive issue in the field. DevOps and Continuous Integration and Continuous Delivery (CI/CD) have proven to streamline and accelerate deployments in the field of software development. Creating CI/CD pipelines in software that includes elements of Machine Learning (MLOps) has unique problems, and trail-blazers in the field solve them with the use of proprietary tooling, often offered by cloud providers. In this thesis, we describe the elements of MLOps. We study what the requirements are for automating the CI/CD of Machine Learning systems in the MLOps methodology. We study whether it is feasible to create a state-of-the-art MLOps pipeline with existing open-source and cloud-native tooling in a cloud provider agnostic way. We designed an extendable and cloud-native pipeline covering most of the CI/CD needs of Machine Learning systems. We motivated why Machine Learning systems should be included in the DevOps methodology. We studied what unique challenges machine learning brings to CI/CD pipelines, production environments and monitoring. We analyzed the pipeline’s design, architecture, and implementation details and its applicability and value to Machine Learning projects. We evaluate our solution as a promising MLOps pipeline that manages to solve many issues of automating a reproducible Machine Learning project and its delivery to production. We designed it as a fully open-source solution that is relatively cloud provider agnostic. Configuring the pipeline to fit the client's needs uses easy-to-use declarative configuration languages (YAML, JSON) that require minimal learning overhead.
Design of an automated pipeline to improve the process of cross-platform mobile building and deployment

Laaja, Oskari (2022)

Mobile applications have become common, and end-users expect to be able to use them on either of the major platforms: iOS or Android. The expectation of finding the application in their respective platform stores is strongly present. The process of publishing mobile applications to these application stores can be cumbersome. The frequency of mobile application updates can suffer from the heaviness of the process, reducing end-user satisfaction. As manually completed processes are prone to human errors, the robustness of the process decreases and the quality of the application may diminish. This thesis presents an automated pipeline to complete the process of publishing cross-platform mobile applications to the App Store and Play Store. The goal of this pipeline is to make the process faster to complete, more robust and more accessible to people without technical know-how. The work was done with design science methodology. Two artifacts result from this thesis: a model of a pipeline design to improve the process, and an implementation of said model to demonstrate that the design is feasible. The design is evaluated against requirements set by the company for which the implementation was done. As a result, the process used in the project in which the implementation was taken into use became faster and simpler, and non-development personnel became able to use it.
Detecting Anomalies in GNSS Signals with Complex-valued LSTM Networks

Savolainen, Outi (2022)

Today, Global Navigation Satellite Systems (GNSS) provide services that many critical systems [1], as well as normal users, need in everyday life. These signals are threatened by unintentional and intentional interference. The received satellite signals are complex-valued by nature; however, state-of-the-art anomaly detection approaches operate in the real domain. Moving the anomaly detection into the complex domain allows for preserving the phase component of the signal data. In this thesis, I developed and tested a fully complex-valued Long Short-Term Memory (LSTM) based autoencoder for anomaly detection. I also developed a method for scaling complex numbers that forces both the real and imaginary parts into the range [-1,1] and does not change the direction of a complex vector. The model is trained and tested both in the time and frequency domains, and the frequency domain is divided into two parts: real and complex domain. The developed model’s training data consists only of clean sample data, and the output of the model is the reconstruction of the model’s input. In testing, it can be determined whether the output is clean or anomalous based on the reconstruction error and the computed threshold value. The results show that the autoencoder model in the real domain outperforms the model trained in the complex domain. This does not indicate that anomaly detection in the complex domain does not work; rather, the model’s architecture needs improvements, and the amount of training data must be increased to reduce overfitting in the complex domain and thus improve the anomaly detection capability. It was also observed that some anomalous sample sequences contain a few large-valued spikes while other values in the same data snapshot are smaller. After scaling, the values other than the spikes get closer to zero. This phenomenon causes small reconstruction errors in the model and yields false predictions in the complex domain.
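
One way to obtain a scaling with the two stated properties is to divide every sample by a single positive real number, the largest absolute real or imaginary part in the snapshot. The sketch below does exactly that; it is an assumption about how such a scaling could work, not the method developed in the thesis, and the function name scale_complex and the sample values are hypothetical.

```python
def scale_complex(samples):
    """Scale complex samples so real and imaginary parts lie in [-1, 1].

    Dividing by one positive real factor leaves every phase unchanged.
    """
    peak = max(max(abs(z.real), abs(z.imag)) for z in samples)
    if peak == 0:
        # All-zero input is already within range.
        return list(samples)
    return [z / peak for z in samples]


# Hypothetical snapshot with one large spike (-8+2j).
snapshot = [3 + 4j, -8 + 2j, 1 - 1j]
scaled = scale_complex(snapshot)
```

The example also reproduces the effect noted in the abstract: because the spike sets the divisor, the remaining samples shrink toward zero after scaling.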
Detecting Bat Calls from Audio Recordings

Rannisto, Meeri (2020)

Bat monitoring is commonly based on audio analysis. By collecting audio recordings from large areas and analysing their content, it is possible to estimate distributions of bat species and changes in them. It is easy to collect a large amount of audio recordings by leaving automatic recording units in nature and collecting them later. However, it takes a lot of time and effort to analyse these recordings. Because of that, there is a great need for automatic tools. We developed a program for detecting bat calls automatically from audio recordings. The program is designed for recordings that are collected from Finland with the AudioMoth recording device. Our method is based on a median clipping method that has previously shown promising results in the field of bird song detection. We add several modifications to the basic method in order to make it work well for our purpose. We use real-world field recordings that we have annotated to evaluate the performance of the detector and compare it to two other freely available programs (Kaleidoscope and Bat Detective). Our method showed good results and got the best F2-score in the comparison.
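
For readers unfamiliar with median clipping, the basic idea as used in bird song detection can be sketched as follows: a spectrogram cell is kept only if it clearly exceeds both the median of its frequency row and the median of its time column. The factor of 3 is a commonly used value and is an assumption here, as are the function name median_clip and the toy spectrogram; the thesis's own modifications are not reproduced.

```python
from statistics import median


def median_clip(spectrogram, factor=3.0):
    """Return a binary mask marking cells that stand out from background.

    spectrogram is a list of rows (frequency bins) of non-negative values,
    each row holding one value per time frame.
    """
    row_medians = [median(row) for row in spectrogram]
    col_medians = [median(col) for col in zip(*spectrogram)]
    return [
        [
            1 if value > factor * row_medians[i] and value > factor * col_medians[j] else 0
            for j, value in enumerate(row)
        ]
        for i, row in enumerate(spectrogram)
    ]


# Toy 3x3 spectrogram with a single loud cell in the middle.
toy = [
    [1, 1, 1],
    [1, 10, 1],
    [1, 1, 1],
]
mask = median_clip(toy)
```

Using both the row and the column median suppresses constant-frequency noise as well as short broadband clicks, leaving localized events such as calls.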
Detecting Collusive Behavior in Government Procurement in Indonesia: An Evaluation of Anomaly Detection Techniques

Unknown author (2023)

This study focused on detecting horizontal and vertical collusion within Indonesian government procurement processes, leveraging data-driven techniques and statistical methods. Regarding horizontal collusion, we applied clustering techniques to categorize companies based on their supply patterns, revealing clusters with similar bidding practices that may indicate potential collusion. Additionally, we identified patterns where specific supplier groups consistently won procurements, raising questions about potential competitive advantages or strategic practices that need further examination for collusion. For vertical collusion, we examined the frequency of associations between specific government employees and winning companies. While high-frequency collaborations were observed, it is essential to interpret these results with caution as they do not definitively indicate collusion, and legitimate factors might justify such associations. Despite revealing important patterns, the study acknowledges its limitations, including the representativeness of the dataset and the reliance on quantitative methods. Nevertheless, our findings carry substantial implications for enhancing procurement monitoring, strengthening anti-collusion regulations, and promoting transparency in Indonesian government procurement processes. Future research could enrich these findings by incorporating qualitative methods, exploring additional indicators of collusion, and leveraging machine learning techniques to detect collusion.
Detecting spatial patterns of land cover and methane fluxes with remote sensing in Pallastunturi, Finland

Rauth, Ella (2022)

Northern peatlands are a large source of methane (CH4) to the atmosphere, and their emissions can vary strongly depending on local environmental conditions. However, few studies have mapped fine-grained CH4 fluxes at the landscape level. The aim of this study was to predict land cover and CH4 flux patterns in Pallastunturi, Finland, in a study area dominated by forests, peatlands, fells, and lakes. I used random forest models to map land cover types and CH4 fluxes with multi-source remote sensing data and upscaled CH4 fluxes based on land cover maps. The random forest classifier reliably detected the same land cover patterns as the CORINE Land Cover maps. The main differences between the land cover maps were forest type classification, misclassification between neighboring peatland types, and detection of sparsely vegetated areas on fells. The upscaled CH4 fluxes of sinks were very robust to changes in land cover classification, but shrub tundra and peatland CH4 fluxes were sensitive to the level of detail in the land cover classification. The random forest regression performed well (NRMSE 6.6%, R² 82%) and predicted similar CH4 flux patterns to the upscaled CH4 flux maps, despite predicting larger areas that act as CH4 sources than the upscaled CH4 flux maps. The random forest regressor also predicted CH4 fluxes in peatlands better, due to added information about soil moisture content from the remote sensing data. Random forests are a good model choice to detect landscape patterns and predict CH4 patterns in northern peatlands based on remote sensing and topographic data.
Determination of Amino Acids in Foods and Beverages

Sillanpää, Meri (2021)

The literature study of this thesis focuses on the different analytical methods used to analyse amino acids in food and beverage samples. Amino acids are essential organic molecules and their concentrations in foods and beverages constitute, inter alia, the product’s nutritional value, quality, freshness, and flavour. Amino acid analysis of foodstuff has various applications, which exploit several analytical methods. The reviewed methods are drawn from academic articles published during the past two decades. This literature review discusses the different sample matrices, sample preparation methods, ways to derivatize analytes, and different separation and detection methods utilized in recent amino acid studies. The experimental part of this thesis was a modification of the L-asparagine and L-aspartic acid test (L-Asp/L-AspAc) in the Thermo Fisher Scientific Oy industrial R&D laboratory. An enzymatic photometric method is used to determine L-Asp/L-AspAc amino acids in food samples. The modification process entailed pre-testing of several candidate methods, from which the most suitable one was selected. The feasibility of the chosen test was affirmed before verification and validation of the modified test.
Differentially Private Metropolis–Hastings Algorithms

Räisä, Ossi (2021)

Differential privacy has over the past decade become a widely used framework for privacy-preserving machine learning. At the same time, Markov chain Monte Carlo (MCMC) algorithms, particularly Metropolis-Hastings (MH) algorithms, have become an increasingly popular method of performing Bayesian inference. Surprisingly, their combination has not received much attention in the literature. This thesis introduces the existing research on differentially private MH algorithms, proves tighter privacy bounds for them using recent developments in differential privacy, and develops two new differentially private MH algorithms: an algorithm using subsampling to lower privacy costs, and a differentially private variant of the Hamiltonian Monte Carlo algorithm. The privacy bounds of both new algorithms are proved, and convergence to the exact posterior is proven for the latter. The performance of both the old and the new algorithms is compared on several Bayesian inference problems, revealing that none of the algorithms is clearly better than the others, but subsampling is likely only useful to lower computational costs.
