
Browsing by Subject "machine learning"


  • Viitanen, Pauliina (2023)
    The field of food control is currently facing challenges due to phenomena such as climate change, the globalization of food, and the scarcity of official control resources. One important approach to improving efficient control of the safety and quality of food chains is risk-based food control. In this study, data-driven approaches were utilized to provide insights into some of the factors that have affected Finnish food business operators’ (FBOs’) compliance with food safety requirements in recent years. Qlik Sense, a data analytics solution built on the concept of visual analytics, and Python, a programming language suitable for data analysis, were used to analyze food inspection data recorded by local food safety authorities. Interactive data visualizations built in Qlik Sense aimed at supporting open government data (OGD) policies and risk-based control of the Finnish food chain, and have so far received good feedback from end-users. Additional insights into FBOs’ food safety compliance were gained by further analyzing the inspection data in Python along with municipal data. A logistic regression model fit to a subset of the study data found multiple statistically significant predictor variables from both datasets, but its performance was weak. The factors that affected food safety compliance most significantly were the number of years an FBO had operated and the basis for conducting an inspection. Operating years showed a positive correlation with compliance, while a negative relationship was observed with a variety of unplanned inspections, especially when they were conducted based on suspected food poisonings, inspection requests, or other forms of contact. Out of all inspected food sectors, the one that increased the odds of compliance the most was the food transportation sector. The results of the study advocate for the potential of data-driven approaches in improving risk-based food control, as they are an effective way to gain insights into factors affecting the safety and quality of complex food chains.
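    The modeling step described above is straightforward to sketch in code. The following is a minimal illustration with scikit-learn, assuming hypothetical column names (operating_years, inspection_basis, sector, compliant) and a hypothetical CSV file rather than the thesis's actual data:

```python
# Minimal sketch of the logistic regression step described above.
# The CSV file and column names are hypothetical, not the thesis data.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

df = pd.read_csv("inspections.csv")  # hypothetical inspection dataset
X = df[["operating_years", "inspection_basis", "sector"]]
y = df["compliant"]  # 1 = compliant, 0 = non-compliant

pre = ColumnTransformer(
    [("cat", OneHotEncoder(handle_unknown="ignore"), ["inspection_basis", "sector"])],
    remainder="passthrough",  # keep operating_years as a numeric predictor
)
model = Pipeline([("pre", pre), ("clf", LogisticRegression(max_iter=1000))])

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
model.fit(X_tr, y_tr)
print("held-out accuracy:", model.score(X_te, y_te))
```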
  • Lohilahti, Jonne Antti Kristian (2022)
    Objectives. The aim of this study is to assess the feasibility of detecting emotions in everyday life using wearable devices and machine learning models. Emotional states play an important role in decision-making, perception, and behavior, which makes objective emotion detection a valuable goal, both for potential applications and for deepening our understanding of emotions. Emotional states are often accompanied by measurable physiological and behavioral changes, which makes it possible to train machine learning models to detect the emotional state that caused them. Most research on emotion detection has been conducted in laboratory conditions using emotion-eliciting stimuli or tasks, which raises the question of whether results obtained under these conditions generalize to everyday life. Although advances in wearable devices and mobile phone surveys have made it easier to study the topic in everyday life, research in this setting is still scarce. In this study, self-reported emotional states are predicted with machine learning models in order to determine which emotional states can be detected in everyday life. In addition, model interpretation methods are used to identify the associations the models rely on. Methods. The data for this thesis come from a study carried out as part of the Sisu at Work project of the University of Helsinki and VTT, in which 82 knowledge workers from four Finnish organizations were studied over a three-week period. During this period the participants wore measurement devices that recorded photoplethysmography (PPG), electrodermal activity (EDA), and accelerometer (ACC) signals, and they were asked about their experienced emotional states three times a day via a phone application. Signal processing methods were applied to correct motion artifacts and other problems in the signals. Features describing heart rate (HR) and heart rate variability (HRV) were extracted from the PPG signal, features describing physiological arousal from the EDA signal, and features describing movement from the ACC signal. Machine learning models were then trained to predict the reported emotional states from the extracted features. Model performance was compared against expected baseline values to determine which emotional states are detectable. In addition, permutation importance and Shapley additive explanations (SHAP) values were used to identify the associations important to the models. Results and conclusions. The models for the emotional states alert, focused, and enthusiastic performed above the expected baseline, and of these, the models for alert improved on it statistically significantly. Permutation importance highlighted the relevance of movement and HRV features, while inspection of the SHAP values pointed to the importance of low movement, low EDA, and high HRV for the models' predictions. These results are promising for the detection of high-arousal positive emotional states in everyday life and highlight potential associations for further research.
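    As a rough sketch of the interpretation step described above (permutation importance and SHAP values on a tree model), the snippet below uses scikit-learn and the shap library; the feature file, feature names, and the 'alert' label are assumptions, not the study's data:

```python
# Sketch of the model-interpretation step: permutation importance + SHAP.
# The feature table and the 'alert' label are hypothetical stand-ins.
import numpy as np
import pandas as pd
import shap
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

df = pd.read_csv("features.csv")  # hypothetical HR/HRV/EDA/ACC feature table
X, y = df.drop(columns=["alert"]), df["alert"]  # self-reported state as label
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

clf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X_tr, y_tr)

# Permutation importance: the drop in score when one feature is shuffled.
pi = permutation_importance(clf, X_te, y_te, n_repeats=20, random_state=0)
for name, imp in sorted(zip(X.columns, pi.importances_mean), key=lambda t: -t[1]):
    print(f"{name}: {imp:.4f}")

# SHAP values: per-prediction feature attributions for the tree ensemble.
sv = shap.TreeExplainer(clf).shap_values(X_te)
if isinstance(sv, list):      # some shap versions return one array per class
    sv = sv[1]
elif np.ndim(sv) == 3:        # others return (samples, features, classes)
    sv = sv[:, :, 1]
shap.summary_plot(sv, X_te)
```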
  • Koivisto, Maria (2020)
    Immunohistochemistry (IHC) is a widely used research tool for detecting antigens and can be used in medical and biochemical research. The co-localization of two separate proteins is sometimes crucial for analysis, requiring a double staining. This comes with a number of challenges, since staining results depend on the pre-treatment of the samples, the host species in which the antibodies were raised, and the spectral differentiation of the two proteins. In this study, the proteins GABAR-α2 and CAMKII were stained simultaneously to study the expression of the GABA receptor in hippocampal pyramidal cells. This was performed in PGC-1α transgenic mice, which possibly express GABAR-α2 excessively compared to wildtype mice. Staining optimization was performed regarding primary and secondary antibody concentration, section thickness, antigen retrieval, and detergent. Double staining was performed successfully, and the proteins of interest were visualized using a confocal microscope, after which image analyses were performed using two different methods: 1) a traditional image analysis based on the intensity and density of stained dots and 2) a novel convolutional neural network (CNN) machine learning approach. The traditional image analysis did not detect any differences in the stained brain slices, whereas the CNN model showed an accuracy of 72% in correctly categorizing the images as transgenic/wildtype brain slices. The results from the CNN model imply that GABAR-α2 is expressed differently in PGC-1α transgenic mice, which might impact other factors such as behaviour and learning. This protocol and the novel method of using a CNN as an image analysis tool can be of future help when performing IHC analyses in neuronal brain studies.
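    The CNN approach described above can be sketched compactly with Keras; the directory layout, image size, and architecture below are illustrative assumptions, not the thesis's actual setup:

```python
# Sketch of a small CNN for binary transgenic/wildtype image classification.
# Directory layout, image size, and architecture are illustrative assumptions.
import tensorflow as tf
from tensorflow.keras import layers

train_ds = tf.keras.utils.image_dataset_from_directory(
    "slices/", labels="inferred", label_mode="binary",
    image_size=(128, 128), batch_size=32)

model = tf.keras.Sequential([
    layers.Rescaling(1.0 / 255),
    layers.Conv2D(16, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),  # transgenic vs. wildtype
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(train_ds, epochs=10)
```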
  • Itkonen, Sami (2020)
    Multiword expressions are combinations of several words that are in some way fixed and/or idiomatic. This study examines Finnish verbal idioms using a word embedding method (word2vec). The data consists of Finnish-language books retrieved from Project Gutenberg. The study focuses mainly on idioms containing the Finnish word 'silmä' (eye). Their idiomaticity is measured in terms of compositionality (how well the meaning of the expression corresponds to the combination of the meanings of its components), and their fixedness with a lexical substitution test. The same tests are also carried out with the fastText algorithm, which takes the internal structure of words into account. A smallish labeled sentence set was also created from the Gutenberg corpus and classified with a neural network classifier. In addition, the study explores the effect of various features, such as grammatical case, on the meaning of an idiom. The results of the measurement methods are, on the whole, quite mixed. The performance of the fastText algorithm is generally somewhat better than that of the basic method, and the quality of its word embeddings is also better. The lexical substitution test gives the best results when only the nearest neighbor is taken into account. Grammatical case was found to be quite important for determining the meaning of an idiom. The weak results of the measurements may be due to several factors, such as the varying degree of semantic transparency of idioms. The word embedding method also does not normally take into account that multiword expressions, too, can have several meanings (literal and idiomatic/figurative). The rich morphology of Finnish poses additional challenges for the method. In conclusion, the word embedding method is somewhat useful for studying Finnish idioms. The usefulness of the tested measurement methods on their own is limited, but they might work better as part of a broader research framework.
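    The embedding-based tests described above can be sketched with gensim; the corpus file and preprocessing below are assumptions, and the nearest-neighbor query illustrates the lexical substitution test on the component word 'silmä':

```python
# Sketch of the embedding-based tests: train word2vec and fastText on a
# corpus and inspect nearest neighbors of an idiom component ('silmä').
# The corpus file and minimal preprocessing are illustrative assumptions.
from gensim.models import FastText, Word2Vec

with open("gutenberg_fi.txt", encoding="utf-8") as f:
    sentences = [line.lower().split() for line in f]

w2v = Word2Vec(sentences, vector_size=100, window=5, min_count=5)
ft = FastText(sentences, vector_size=100, window=5, min_count=5)

# Lexical substitution test: does the nearest neighbor preserve the meaning?
print(w2v.wv.most_similar("silmä", topn=1))
print(ft.wv.most_similar("silmä", topn=1))  # subword-aware variant
```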
  • Lehtonen, Leevi (2021)
    Quantum computing has an enormous potential in machine learning, where problems can quickly scale to be intractable for classical computation. A Boltzmann machine is a well-known energy-based graphical model suitable for various machine learning tasks. Plenty of work has already been conducted on realizing Boltzmann machines in quantum computing, all of which has somewhat different characteristics. In this thesis, we conduct a survey of the state of the art in quantum Boltzmann machines and their training approaches. Primarily, we examine the variational quantum Boltzmann machine, a variant of the quantum Boltzmann machine suitable for near-term quantum hardware. Moreover, as the variational quantum Boltzmann machine relies heavily on variational quantum imaginary time evolution, we analyze variational quantum imaginary time evolution in depth. Compared to previous work, we evaluate the execution of variational quantum imaginary time evolution with a more comprehensive collection of hyperparameters. Furthermore, we train variational quantum Boltzmann machines on the bars-and-stripes toy problem, which represents a more multimodal probability distribution than the Bell states and the Greenberger-Horne-Zeilinger states considered in earlier studies.
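    The bars-and-stripes toy distribution mentioned above is easy to reproduce; the following sketch (not the thesis code) enumerates all valid n x n patterns, i.e., grids whose rows or columns are uniform:

```python
# Generate the bars-and-stripes dataset: each valid pattern has either
# uniform columns (bars) or uniform rows (stripes). A sketch, not thesis code.
import itertools
import numpy as np

def bars_and_stripes(n):
    patterns = set()
    for bits in itertools.product([0, 1], repeat=n):
        row = np.array(bits)
        patterns.add(tuple(np.tile(row, (n, 1)).flatten()))           # bars
        patterns.add(tuple(np.tile(row[:, None], (1, n)).flatten()))  # stripes
    return np.array(sorted(patterns))

data = bars_and_stripes(2)
print(data.shape)  # (6, 4): 2**(n+1) - 2 distinct patterns for n = 2
```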
  • Peltonen, Henri (2024)
    Banking crises have been found to cause significant fiscal and real costs for the economy. For this reason, macroprudential policymakers have developed various analytical models for predicting new banking crises ahead of time. With this information policymakers can undertake targeted countermeasures to reduce the negative impacts. In this prediction exercise, binary regression models (especially logit) have long been the main analytical tool. However, due to the complex dynamics and rare occurrences, accurate crisis prediction remains a difficult task for these models. In line with the recent developments in technology and artificial intelligence, scholars have started investigating the possibilities of using machine learning methods in banking crisis prediction. Despite the promise of more flexible distributional assumptions and enhanced modeling of non-linear relationships, the early results on predictive performance have been mixed. One explanation for this could be the large variety of models and empirical setups that different authors have used. As a result, it remains unclear whether the results are driven by changes in the underlying empirical setups or by the superiority of the machine learning models themselves. To investigate this problem, this thesis collects out-of-sample prediction results from eleven banking crisis papers published between 2017 and 2023. After implementing a normalization procedure to enhance comparability between the papers, the results are pooled for analysis to gain insights into which machine learning models perform the best. Additional robustness checks are also carried out to investigate the stability of the results. This thesis makes two main contributions to the literature. The first is finding systematic differences in predictive performance between machine learning models. Neural network, random forest, and boosted/bagged tree models have on average delivered the best predictive performance in comparison to logit models. In contrast, k-nearest-neighbors, decision tree, and support vector machine models consistently underperform the logit benchmarks. The second contribution is creating novel connections between the banking crisis and machine learning literatures. The empirical results obtained in this thesis are contrasted with, and found to be aligned with, the machine learning literature. In addition, a critical review of the practical implications of using machine learning is conducted. Issues with interpretability, modeling, and class imbalance are highlighted.
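    The normalization and pooling step described above can be sketched as scaling each paper's out-of-sample performance by that paper's logit benchmark before averaging across papers; the numbers below are invented placeholders, not the collected results:

```python
# Sketch: normalize each paper's out-of-sample AUC by its logit benchmark,
# then pool by model family. The numbers below are invented placeholders.
import pandas as pd

results = pd.DataFrame([
    ("paper_a", "logit", 0.72), ("paper_a", "random_forest", 0.78),
    ("paper_a", "knn", 0.66), ("paper_b", "logit", 0.70),
    ("paper_b", "neural_net", 0.77), ("paper_b", "decision_tree", 0.64),
], columns=["paper", "model", "auc"])

benchmark = results[results.model == "logit"].set_index("paper")["auc"]
results["relative_auc"] = results["auc"] / results["paper"].map(benchmark)

print(results.groupby("model")["relative_auc"].mean().sort_values(ascending=False))
```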
  • Hämäläinen, Kreetta (2021)
    Personalized medicine tailors therapies for the patient based on predicted risk factors. Some tools used for making predictions on the safety and efficacy of drugs are genetics and metabolomics. This thesis focuses on identifying biomarkers for the activity level of the drug transporter organic anion transporting polypeptide 1B1 (OATP1B1) from data acquired by untargeted metabolite profiling. OATP1B1 transports various drugs, such as statins, from portal blood into the hepatocytes. OATP1B1 is a genetically polymorphic influx transporter, which is expressed in human hepatocytes. Statins are low-density lipoprotein cholesterol-lowering drugs, and decreased or poor OATP1B1 function has been shown to be associated with statin-induced myopathy. Based on genetic variability, individuals can be classified as having normal, decreased, or poor OATP1B1 function. These activity classes were employed to identify metabolomic biomarkers for OATP1B1. To find the most efficient way to predict the activity level and the biomarkers that associate with it, five different machine learning models were tested on a dataset that consisted of 356 fasting blood samples with 9152 metabolite features. The models included a random forest regressor and classifier, a gradient boosted decision tree regressor and classifier, and a deep neural network regressor. Hindrances specific to this type of data were the collinearity between features and the large number of features compared to the number of samples, which led to issues in determining the important features of the neural network model. To adjust to this, the features were clustered according to their Spearman rank-order correlations. Feature importances were calculated using two methods: for the neural network, they were calculated with permutation feature importance using mean squared error, while the random forest and gradient boosted decision trees used Gini impurity. The performance of each model was measured, and all classifiers had a poor ability to predict the decreased and poor function classes. All regressors performed very similarly to each other. The gradient boosted decision tree regressor performed best by a slight margin, with the random forest and neural network regressors performing nearly as well. The best features from all three models were cross-referenced with the features found by a y-aware PCA analysis. The y-aware PCA analysis indicated that the 14 best features cover 95% of the explained variance, so 14 features were picked from each model and cross-referenced with each other. Cross-referencing the highest-scoring features reported by the best models revealed multiple features that showed up as important in many models. Taken together, machine learning methods provide powerful tools to identify potential biomarkers from untargeted metabolomics data.
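    The collinearity workaround described above, clustering features by their Spearman rank correlations and keeping one representative per cluster, can be sketched with SciPy; the feature matrix below is a random stand-in for the metabolite table:

```python
# Sketch: cluster collinear features by Spearman correlation and keep one
# representative per cluster. The feature matrix is a random stand-in.
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.stats import spearmanr

X = np.random.rand(356, 200)  # stand-in for 356 samples x metabolite features

corr, _ = spearmanr(X)         # feature-by-feature rank correlation matrix
dist = 1 - np.abs(corr)        # strongly correlated -> small distance
Z = linkage(dist[np.triu_indices_from(dist, k=1)], method="average")
labels = fcluster(Z, t=0.2, criterion="distance")

# Keep the first feature of each cluster as its representative.
keep = [np.flatnonzero(labels == c)[0] for c in np.unique(labels)]
print(f"{len(keep)} representative features out of {X.shape[1]}")
```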
  • Kahilakoski, Marko (2022)
    Various Denial of Service (DoS) attacks are a common phenomenon on the Internet. They can consume server resources, congest networks, disrupt services, or even halt systems. There are many machine learning approaches that attempt to detect and prevent attacks at multiple levels of abstraction. This thesis examines and reports different aspects of creating and using a dataset for machine learning purposes to detect attacks in a web server environment. We describe the problem field, the origins and reasons behind the attacks, their typical characteristics, and various types of attacks. We detail ways to mitigate the attacks and provide a review of current benchmark datasets. For the dataset used in this thesis, network traffic was captured in a real-world setting, and the flow records were labeled. The experiments performed include selecting important features, comparing two supervised learning algorithms, and observing how a classifier model trained on network traffic from a specific date performs in detecting new malicious records over time in the same environment. The model was also tested with a recent benchmark dataset.
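    The supervised setup described above (labeled flow records, feature selection, classification) can be sketched as follows; the CSV file and flow-feature columns are assumptions, not the thesis dataset:

```python
# Sketch of classifying labeled flow records as benign or attack traffic.
# The CSV file and column names are assumptions, not the thesis dataset.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

flows = pd.read_csv("flows.csv")   # hypothetical labeled flow records
X = flows.drop(columns=["label"])  # e.g. duration, packets, bytes, flags
y = flows["label"]                 # 1 = attack, 0 = benign

pipe = Pipeline([
    ("select", SelectKBest(mutual_info_classif, k=10)),  # feature selection
    ("clf", RandomForestClassifier(n_estimators=200, random_state=0)),
])
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
pipe.fit(X_tr, y_tr)
print("held-out accuracy:", pipe.score(X_te, y_te))
```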
  • Huertas, Andres (2020)
    Investment funds are continuously looking for new technologies and ideas to enhance their results. Lately, with the success observed in other fields, wealth managers are taking a closer look at machine learning methods. Even if the use of ML is not entirely new in finance, leveraging new techniques has proved to be challenging and few funds succeed in doing so. The present work explores the use of reinforcement learning algorithms for stock portfolio management. The stochastic nature of stocks is well known, and aiming to predict the market is unrealistic; nevertheless, the question of how to use machine learning to find useful patterns in the data that enable small market edges remains open. Based on the ideas of reinforcement learning, a portfolio optimization approach is proposed. RL agents are trained to trade in a stock exchange, using portfolio returns as rewards for their RL optimization problem, thus seeking optimal resource allocation. For this purpose, a set of 68 stock tickers in the Frankfurt exchange market was selected, and two RL methods were applied, namely Advantage Actor-Critic (A2C) and Proximal Policy Optimization (PPO). Their performance was compared against three commonly traded ETFs (exchange-traded funds) to assess the algorithms' ability to generate returns compared to real-life investments. Both algorithms achieved positive returns in a year of testing (5.4% and 9.3% for A2C and PPO, respectively; a European ETF (VGK, Vanguard FTSE Europe Index Fund) reported 9.0% returns for the same period), as well as healthy risk-to-return ratios. The results do not aim to be financial advice or trading strategies, but rather explore the potential of RL for studying small to medium-sized stock portfolios.
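    A heavily simplified sketch of the training setup follows, using the stable-baselines3 implementations of A2C and PPO on a toy portfolio environment; the random price process and environment details are placeholders, not the thesis's actual market data or environment:

```python
# Sketch: train A2C and PPO on a toy portfolio environment where the action
# is an allocation over assets and the reward is the resulting step return.
# The price process here is random noise, a placeholder for market data.
import gymnasium as gym
import numpy as np
from stable_baselines3 import A2C, PPO

class ToyPortfolioEnv(gym.Env):
    def __init__(self, n_assets=5, horizon=250):
        self.n_assets, self.horizon = n_assets, horizon
        self.action_space = gym.spaces.Box(0.0, 1.0, (n_assets,), np.float32)
        self.observation_space = gym.spaces.Box(-np.inf, np.inf, (n_assets,), np.float32)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.t = 0
        return np.zeros(self.n_assets, dtype=np.float32), {}

    def step(self, action):
        weights = action / (action.sum() + 1e-8)  # normalize to a portfolio
        returns = self.np_random.normal(0, 0.01, self.n_assets).astype(np.float32)
        self.t += 1
        return returns, float(weights @ returns), self.t >= self.horizon, False, {}

for Algo in (A2C, PPO):
    model = Algo("MlpPolicy", ToyPortfolioEnv(), verbose=0)
    model.learn(total_timesteps=10_000)
```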
  • Romppainen, Jonna (2020)
    Surface diffusion in metals can be simulated with the atomistic kinetic Monte Carlo (KMC) method, where the evolution of a system is modeled by successive atomic jumps. The parametrisation of the method requires calculating the energy barriers of the different jumps that can occur in the system, which poses a limitation to its use. A promising solution to this is offered by machine learning methods, such as artificial neural networks, which can be trained to predict barriers based on a set of pre-calculated ones. In this work, an existing neural-network-based parametrisation scheme is enhanced by expanding the atomic environment of the jump to include more atoms. A set of surface diffusion jumps was selected and their barriers were calculated with the nudged elastic band method. Artificial neural networks were then trained on the calculated barriers. Finally, KMC simulations of nanotip flattening were run using barriers predicted by the neural networks. The simulations were compared to the KMC results obtained with the existing scheme. The additional atoms in the jump environment caused significant changes to the barriers, which cannot be described by the existing model. The trained networks also showed good prediction accuracy. However, the KMC results were in some cases as realistic as or more realistic than the previous results, but often worse. The quality of the results also depended strongly on the selection of training barriers. We suggest that, for example, active learning methods could be used in the future to select the training data optimally.
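    The barrier-prediction step described above can be sketched as a regression from the occupation pattern of lattice sites around a jump to the barrier energy; the encoding size, network architecture, and training data below are illustrative assumptions:

```python
# Sketch: a small neural network regressing migration barriers from the
# occupation pattern of lattice sites around a jump. The encoding size and
# random training data are illustrative assumptions, not the thesis setup.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

n_sites = 26  # assumed number of neighbor sites encoded per jump
X = np.random.randint(0, 2, size=(5000, n_sites))  # 1 = occupied, 0 = empty
E = np.random.uniform(0.3, 1.5, size=5000)         # stand-in NEB barriers (eV)

X_tr, X_te, E_tr, E_te = train_test_split(X, E, random_state=0)
net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=500, random_state=0)
net.fit(X_tr, E_tr)
print("R^2 on held-out barriers:", net.score(X_te, E_te))

# In the KMC step, rates follow from predicted barriers: k = nu * exp(-E/kT).
```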
  • Holopainen, Markus (2023)
    Context: Over the past few years, the development of machine learning (ML) enabled software has seen a rise in popularity. Alongside this trend, new challenges have been identified, such as growing concerns, including ethical ones, about the use of ML models, as misuse can lead to severe consequences for human beings. To alleviate this problem, more comprehensive model documentation has been suggested, but how can that documentation be made part of a modern, continuous development process? Objective: We design and develop a solution, consisting of a software artefact and its surrounding process, which enables and moderates continuous documentation of ML models. The solution needs to comply with the modern way of working in software development. Method: We apply the design science research methodology to divide the design and development into six separate tasks, i.e., problem identification, objective definition, design and development, demonstration, evaluation, and communication. Results: The solution uses model cards for storing model details. These model cards are tested automatically and manually, forming a quality gate and ensuring the integrity of the documentation. The software artefact is implemented in the form of a GitHub Action. Conclusion: We conclude that the software artefact supports and assures proper model documentation in the form of a model card. The artefact allows for customization by the user, thereby supporting domain-specific use cases.
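    The automated half of the quality gate described above can be sketched as a script that fails a CI run when required model-card sections are missing; the section names and file path are assumptions, not the artefact's actual checks:

```python
# Sketch of an automated model-card check suitable for a CI quality gate:
# exit non-zero if a required section is missing. The section names and
# file path are assumptions, not the thesis artefact's actual rules.
import pathlib
import sys

REQUIRED_SECTIONS = [
    "## Model Details", "## Intended Use", "## Training Data",
    "## Evaluation", "## Ethical Considerations",
]

card = pathlib.Path("MODEL_CARD.md").read_text(encoding="utf-8")
missing = [s for s in REQUIRED_SECTIONS if s not in card]

if missing:
    print("Model card is missing sections:", ", ".join(missing))
    sys.exit(1)  # fail the workflow run, blocking the quality gate
print("Model card passed automated checks.")
```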
  • Nygren, Saara (2020)
    A relational database management system's configuration is essential when optimizing database performance. Finding the optimal knob configuration for the database requires tuning multiple interdependent knobs. Over the past few years, relational database vendors have added machine learning models to their products, and Oracle announced the first autonomous (i.e., self-driving) database in 2017. This thesis clarifies the autonomous database concept and surveys the latest research on machine learning methods for relational database knob tuning. The study aimed to find solutions that can tune multiple database knobs and be applied to any relational database. The survey found three machine learning implementations that tune multiple knobs at a time: OtterTune, CDBTune, and QTune. OtterTune uses traditional machine learning techniques, while CDBTune and QTune rely on deep reinforcement learning. These implementations are presented in this thesis, along with a discussion of the features they offer. The thesis also presents basic autonomic-system concepts such as self-CHOP and the MAPE-K feedback loop, and a knowledge model that defines the knowledge needed to implement them. These can be used in the autonomous database context, along with Intelligent Machine Design and the Five Levels of AI-Native Database, to formulate requirements for the autonomous database.
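    The core loop of a system like OtterTune, regressing observed performance on knob settings and proposing a promising next configuration, can be sketched with a Gaussian process surrogate; the knob names, ranges, and measurements below are invented for illustration:

```python
# Sketch of a GP-based knob-tuning step in the spirit of OtterTune:
# fit a surrogate on observed (knobs -> latency) pairs and pick the
# candidate with the best predicted latency. All values are invented.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

# Observed configurations: (buffer_pool_mb, max_connections) -> latency (ms)
X_obs = np.array([[512, 100], [1024, 200], [2048, 100], [4096, 400]])
y_obs = np.array([38.0, 29.5, 24.1, 27.8])

gp = GaussianProcessRegressor(normalize_y=True).fit(X_obs, y_obs)

# Score random candidate configurations with the surrogate model.
rng = np.random.default_rng(0)
candidates = np.column_stack([
    rng.integers(256, 8192, 200),  # buffer_pool_mb
    rng.integers(50, 500, 200),    # max_connections
])
pred, std = gp.predict(candidates, return_std=True)
best = candidates[np.argmin(pred)]  # next configuration to try
print("suggested knobs:", dict(zip(["buffer_pool_mb", "max_connections"], best)))
```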
  • Kivi-Koskinen, Anni Sofia (2023)
    The rapid advancement of AI technologies has sparked inquiries across various legal domains, including intellectual property (IP) law. While the impact of AI on copyright and patents has received considerable attention, the influence of AI on EU trademark law is also anticipated. The World Intellectual Property Organization (WIPO) has acknowledged this impact cautiously, recognizing that AI may affect certain aspects of trademark law. One significant domain where AI will exert its influence on trademarks is online retail platforms. These platforms have already implemented various AI applications to provide consumers with highly personalized product recommendations. Moreover, AI-driven shopping experiences, facilitated by virtual assistants like Amazon's Alexa, have redefined the role of consumers in the purchasing process, diverging from traditional shopping practices. This thesis aims to explore how the emergence of AI technologies will impact fundamental doctrines of trademark law, including the functions of trademarks, the concept of the average consumer, and the assessment of trademark infringement. Additionally, it seeks to identify the types of AI applications deployed by e-commerce platforms and advocate for necessary actions that EU legislators must undertake to address these implications within trademark law. The findings of this thesis indicate that while many prominent doctrines of EU trademark law will remain relevant amidst the rise of AI technologies, some require re-examination in the context of advanced AI applications. The transformative nature of AI necessitates a comprehensive assessment and potential recalibration of these doctrines to ensure their efficacy in regulating AI-driven environments.
  • Mukhtar, Usama (2020)
    Sales forecasting is crucial for running any retail business efficiently. Profits are maximized if popular products are available to fulfill the demand. It is also important to minimize the loss caused by unsold stock. Fashion retailers face certain challenges that make sales forecasting difficult for their products, such as the short life cycle of products and the introduction of new products all year round. The goal of this thesis is to study forecasting methods for fashion. We use the product attributes of the products in one season to build a model that can forecast sales for all the products in the next season. Sales for different attributes were analysed over three years. Sales vary across attribute values, which indicates that a model fitted on product attributes may be used for forecasting sales. A series of experiments was conducted with multiple variants of the datasets. We implemented multiple machine learning models and compared them against each other. Empirical results are reported along with baseline comparisons to answer the research questions. Results from the first experiment indicate that the machine learning models do almost as well as the baseline model that uses mean values as predictions. The results may improve in the upcoming years when more data is available for training. The second experiment shows that models built for specific product groups are better than generic models used to predict sales for all kinds of products. Since we observed a heavy tail in the data, a third experiment was conducted using logarithmic sales for the predictions, but the results do not improve much compared to the previous methods. The conclusion of the thesis is that machine learning methods can be used for attribute-based sales forecasting in the fashion industry, but more data is needed, and modeling specific groups of products brings better results.
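    The attribute-based setup described above, training on one season's products and predicting the next season's sales, can be sketched as follows; the column names and season labels are assumptions, and the log1p target mirrors the third experiment on heavy-tailed sales:

```python
# Sketch of attribute-based sales forecasting: train on one season's
# products, predict the next season's. Column names and season labels are
# assumptions; the log1p target mirrors the third experiment on heavy tails.
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

df = pd.read_csv("products.csv")  # hypothetical product-attribute table
attrs = ["category", "color", "material", "price_band"]
train, test = df[df.season == "SS19"], df[df.season == "SS20"]

model = Pipeline([
    ("enc", ColumnTransformer([("cat", OneHotEncoder(handle_unknown="ignore"), attrs)])),
    ("reg", GradientBoostingRegressor(random_state=0)),
])
model.fit(train[attrs], np.log1p(train["units_sold"]))
pred = np.expm1(model.predict(test[attrs]))  # back to the unit scale

# Baseline from the thesis: predict the training-season mean for everything.
baseline = np.full(len(test), train["units_sold"].mean())
print("model MAE:", mean_absolute_error(test["units_sold"], pred))
print("mean-baseline MAE:", mean_absolute_error(test["units_sold"], baseline))
```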