
Browsing by master's degree program "Datatieteen maisteriohjelma"


  • Kailamäki, Kalle (2022)
    This thesis explores predicting current prices of individual agricultural fields in Finland based on historical data. The task is to predict field prices accurately with the available data while keeping model predictions interpretable and explainable. The research question is which of several candidate models is best suited to the task. The motivation behind this research is the growing agricultural land market and the lack of publicly available field valuation services that could assist market participants in determining reasonable asking prices. Previous studies on the topic have used standard statistics to establish relevant factors that affect field prices. Rather than creating a model whose predictions can be used on their own in every case, the primary purpose of previous works has been to identify information that should be considered in manual field valuation. We, on the other hand, focus on the predictive ability of models that require no manual labor. Our modelling approaches focus mainly but not exclusively on algorithms based on Markov chain Monte Carlo. We create a nearest neighbors model and four hierarchical linear models of varying complexity. Performance comparisons lead us to recommend a nearest-neighbors-type model for this task.
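    For illustration, a minimal sketch of a nearest-neighbours price regressor of the kind recommended above; the feature set and data here are invented stand-ins, not the thesis's data.

```python
# Minimal sketch of a nearest-neighbours field price model (hypothetical
# features; the thesis's actual feature set and data are not public).
import numpy as np
from sklearn.neighbors import KNeighborsRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Stand-in features: [latitude, longitude, field area (ha), soil quality score]
X = rng.uniform([60.0, 21.0, 0.5, 1.0], [65.0, 30.0, 50.0, 10.0], size=(1000, 4))
# Stand-in target: price in EUR/ha with a toy dependence on the features
y = 5000 + 300 * X[:, 3] - 200 * (X[:, 0] - 60) + rng.normal(0, 500, 1000)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = KNeighborsRegressor(n_neighbors=10, weights="distance")
model.fit(X_train, y_train)
print("Mean abs. error (EUR/ha):",
      np.abs(model.predict(X_test) - y_test).mean())
```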
  • Zetterman, Elina (2024)
    When studying galaxy formation and evolution, the relationship between galaxy properties and dark matter halo properties is important, since galaxies form and evolve within these halos. This relationship can be determined using numerical simulations, but unfortunately these are computationally expensive and require vast amounts of computational resources. This provides an incentive to use machine learning instead, since training a machine learning model requires significantly less time and fewer resources. If machine learning could be used to predict galaxy properties from halo properties, numerical simulations would still be needed to find the halo population, but the more expensive hydrodynamical simulations would no longer be necessary. In this thesis, we use data from the IllustrisTNG hydrodynamical simulation to train five different types of machine learning models. The goal is to predict four different galaxy properties from multiple halo properties and to measure how accurate and reliable the predictions are. We also compare the different types of models with each other to find out which ones perform best. Additionally, we calculate confidence intervals for the predictions to evaluate the uncertainty of the models. We find that out of the four galaxy properties, stellar mass is the easiest to predict, whereas color is the most difficult. Of the five model types, light gradient boosting is in all cases either the best performing model or almost as good as the best performing model. This, combined with the fact that training this type of model is extremely fast, gives light gradient boosting good potential to be utilized in practice.
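    A hedged sketch of the best-performing setup described above: a LightGBM regressor predicting stellar mass from halo properties. The toy relation and column choices are assumptions, not actual IllustrisTNG fields.

```python
# Sketch: predicting a galaxy property (stellar mass) from halo properties
# with LightGBM. Data is an illustrative stand-in, not simulation output.
import numpy as np
import lightgbm as lgb
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
halo_mass = rng.uniform(10, 15, 5000)       # log10(M_halo / M_sun)
halo_spin = rng.uniform(0.0, 0.1, 5000)     # dimensionless spin parameter
vmax = rng.uniform(50, 1500, 5000)          # max circular velocity (km/s)
X = np.column_stack([halo_mass, halo_spin, vmax])
stellar_mass = 0.7 * halo_mass + rng.normal(0, 0.3, 5000)  # toy relation

X_tr, X_te, y_tr, y_te = train_test_split(X, stellar_mass, random_state=1)
model = lgb.LGBMRegressor(n_estimators=300, learning_rate=0.05)
model.fit(X_tr, y_tr)
rmse = np.sqrt(np.mean((model.predict(X_te) - y_te) ** 2))
print(f"RMSE (dex): {rmse:.3f}")
```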
  • Harju, Esa (2019)
    Teaching programming is increasingly widespread and starts at the primary school level in some countries. Part of that teaching consists of students writing small programs that demonstrate learned theory and how various things fit together to form a functional program. Multiple studies indicate that programming is a difficult skill to learn and master. Some of the difficulty comes from the plethora of concepts that students are expected to learn in a relatively short time. Part of practicing to write programs involves feedback, which aids students' learning of an assignment's topic, and motivation, which encourages students to continue the course and their studies. For feedback, it would be helpful to know students' opinion of a programming assignment's difficulty. A few studies have attempted to find out whether there is a correlation between metrics obtained while students write a program and the difficulty they report for it. These studies apply statistical models to the data after the course is over. This raises the question of whether the same could be done while students are working on programming assignments. A machine learning model would be a possible solution, but as of now no such models exist. We therefore use the idea from one of these studies to create a model that can make such predictions. We then improve on that coarse model with two additional models that are more tailored for the job. Our main results indicate that models of this kind show promise in predicting the difficulty of a programming assignment from collected metrics. With further work, these models could indicate that a student is struggling with an assignment. Used as part of existing tools, such a model could offer a student subtle help before their frustration grows too much. Further down the road, such a model could be used to provide further exercises when a student needs them, or to let the student progress once they master a certain topic.
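    A minimal sketch of the kind of predictor discussed above, assuming hypothetical process metrics; the abstract does not specify the thesis's actual features or models.

```python
# Illustrative sketch: predicting students' reported assignment difficulty
# from working-process metrics. Feature names and data are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
# Stand-in metrics: [time spent (min), compile errors, runs, pauses > 5 min]
X = rng.poisson([30, 5, 10, 2], size=(500, 4)).astype(float)
# Stand-in label: 1 = student rated the assignment as difficult
y = (X[:, 0] + 5 * X[:, 1] + rng.normal(0, 10, 500) > 60).astype(int)

clf = LogisticRegression(max_iter=1000)
print("CV accuracy:", cross_val_score(clf, X, y, cv=5).mean())
```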
  • Keningi, Eino (2022)
    In a little over a decade, cryptocurrencies have become a highly speculative asset class in global financial markets, with Bitcoin leading the way. Throughout its relatively brief history, the price of bitcoin has gone through multiple cycles of growth and decline. As a consequence, Bitcoin has become a widely discussed – and polarizing – topic on Twitter. This work studies whether the sentiment of popular Bitcoin-related tweets can be used to predict the future price movements of bitcoin. In total, seven different algorithms are evaluated: Vector Autoregression, Vector Autoregression Moving-Average, Random Forest, XGBoost, LightGBM, Long Short-Term Memory, and Gated Recurrent Unit. By applying lexicon-based sentiment analysis and heuristic filtering of tweets, it was discovered that sentiment-based features of popular tweets improve the prediction accuracy over baseline features (open-high-low-close data) in five of the seven algorithms tested. The tree-based algorithms (Random Forest, XGBoost, LightGBM) generally had the lowest prediction errors, while the neural network algorithms (Long Short-Term Memory and Gated Recurrent Unit) had the poorest performance. The findings suggest that the sentiment of popular Bitcoin-related tweets can be an important feature in predicting the future price movements of bitcoin.
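    A sketch of the feature setup described above: open-high-low-close baseline features plus a lexicon-based tweet-sentiment score fed to a tree ensemble. All data here is synthetic, and Random Forest stands in for the full set of algorithms compared.

```python
# Sketch: OHLC baseline features plus a daily tweet-sentiment score,
# used to predict the next-step close price. Data is a synthetic stand-in.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(3)
n = 1000
close = np.cumsum(rng.normal(0, 100, n)) + 30000
ohlc = np.column_stack([close, close + 50, close - 50, close])  # toy OHLC
sentiment = rng.uniform(-1, 1, n)   # mean daily tweet sentiment in [-1, 1]
X = np.column_stack([ohlc, sentiment])[:-1]
y = close[1:]                       # next-step close as prediction target

model = RandomForestRegressor(n_estimators=200, random_state=3)
model.fit(X[:800], y[:800])
mae = np.abs(model.predict(X[800:]) - y[800:]).mean()
print(f"Test MAE: {mae:.1f}")
```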
  • Pirilä, Pauliina (2024)
    This thesis discusses short-term parking pricing in the context of Finnish shopping centre parking halls. The focus is on one shopping centre located in Helsinki where parking fees are high and there is constant pressure to raise prices. It is therefore important to have a strategy that maximises parking hall income without compromising the customers' interests. If the prices are too high, customers will choose to park elsewhere or reduce their parking in private parking halls. There is a lot of competition, with off-street parking competing against on-street parking and access parking, not to mention other parking halls. The main goal of this thesis is to identify problems in parking pricing and discuss how to find the most beneficial pricing method. To achieve this, the thesis project conducted an analysis of data from one Finnish shopping centre parking hall. This data was analysed to discover the average behaviour of the parkers and how raised parking fees affect both the number of parkers and the income of the parking hall. In addition, several pricing strategies from the literature and real-life examples were discussed and evaluated, and later combined with the analysis results. The results showed some similarities with results from the literature, but also some surprising outcomes. Higher average hourly prices appear to be correlated with longer stays, yet the parkers who tend to park longer have more inelastic parking habits than those who park for shorter durations. The calculated price elasticity of demand values show that, compared to other parking halls, parking is on average more elastic in the analysed parking hall. This further emphasises the importance of milder price increases, at least for the shorter parking durations. Moreover, there are noticeable but explainable characteristics in parker behaviour. Most parkers prefer to park for under one hour to take advantage of the first parking hour being free, which reduces both shopping centre and parking hall income. Therefore, a dynamic pricing strategy is suggested as one pricing option, since it adjusts prices automatically based on occupancy rates. Although there are some challenges with this particular method, in the long run it could turn out to be the most beneficial for both the parking hall owners and the parkers. To conclude, choosing a suitable pricing strategy and model for a parking hall is crucial, and the decisions should be based on findings from data.
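    A worked example of the price elasticity of demand referred to above, using the standard midpoint (arc) formula; the numbers are invented for illustration, not taken from the thesis data.

```python
# Worked example of price elasticity of demand with the midpoint formula.
def price_elasticity(q_old, q_new, p_old, p_new):
    """Arc elasticity: % change in demand over % change in price."""
    dq = (q_new - q_old) / ((q_new + q_old) / 2)
    dp = (p_new - p_old) / ((p_new + p_old) / 2)
    return dq / dp

# Hourly fee raised from 4.00 to 5.00 EUR; weekly parkings drop 10000 -> 8500.
e = price_elasticity(10000, 8500, 4.00, 5.00)
print(f"elasticity = {e:.2f}")  # ~ -0.73; values below -1 would be elastic
```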
  • Chelak, Ilia (2024)
    Recently, 3D reconstruction has become a popular topic due to applications in Virtual Reality, Augmented Reality, and historical heritage preservation. Yet, high-quality reconstruction is not available to the general public because of the cost of laser scanners. The goal of this thesis is to bring the democratization of 3D reconstruction closer through photogrammetry (reconstruction from multi-view images). However, current approaches are very slow and tend to oversmooth the geometry. Our method learns the scene via a neural representation, taking posed multi-view images as input. We note that state-of-the-art (SOTA) approaches rely on traditional Structure-from-Motion (SfM) algorithms to extract camera poses. We also observe that SfM can generate a coarsely correct mesh for the underlying object. Nevertheless, SOTA techniques start training the neural representation from a sphere. Therefore, we propose a novel initialization method that takes the mesh obtained from SfM and initializes the neural representation from it. We validate our approach through extensive experiments on the widely used DTU multi-view stereo dataset. We show that our method outperforms both traditional and SOTA neural techniques in terms of reconstruction quality. It manages to learn the underlying geometry and recover small details like cracks and dents. We also show that it speeds up convergence by 4 times. All the datasets, reconstructed meshes, and learned model weights are available at this link.
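    A hedged sketch of the initialization idea: instead of starting a neural signed-distance field from a sphere, pretrain it on signed distances sampled from the coarse SfM mesh. The network shape, file name, and sampling scheme are assumptions, not the thesis's implementation.

```python
# Sketch: pretraining a neural SDF to match a coarse SfM mesh, so later
# photometric training starts from the mesh rather than from a sphere.
import torch
import torch.nn as nn
import trimesh

mesh = trimesh.load("sfm_coarse.ply")  # coarse mesh from Structure-from-Motion

sdf = nn.Sequential(                   # small MLP; real networks are larger
    nn.Linear(3, 256), nn.Softplus(beta=100),
    nn.Linear(256, 256), nn.Softplus(beta=100),
    nn.Linear(256, 1),
)
opt = torch.optim.Adam(sdf.parameters(), lr=1e-4)

for step in range(200):
    pts = torch.rand(2048, 3) * 2 - 1  # assumes mesh normalized to [-1, 1]^3
    target = torch.tensor(
        -trimesh.proximity.signed_distance(mesh, pts.numpy()),
        dtype=torch.float32).unsqueeze(1)   # trimesh returns inside > 0
    loss = nn.functional.l1_loss(sdf(pts), target)
    opt.zero_grad(); loss.backward(); opt.step()
```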
  • Agiashvili, Georgi (2021)
    Unlike traditional machine learning approaches that rely solely on data, Bayesian machine learning models can utilize prior knowledge of the data generating process, for instance in the form of information about plausible outcomes. More importantly, Bayesian machine learning models use the prior information as base knowledge, on top of which learning from observations is built. The process of forming the prior distribution based on subjective probabilities is called prior elicitation, and it is the focus of this thesis. Although previous research has produced methods for prior elicitation, there has not been a general-purpose solution. In particular, previously introduced methods have focused on specific models. This has limited the applicability of prior elicitation and, in some cases, required the expert to have a deep understanding of different aspects of Bayesian modelling. Additionally, the more general predictive elicitation methods in previous research have not accounted for the uncertainty in experts' judgements. This is important, since even the most accurate elicitation methods cannot remove all imprecision in expert judgements. For these reasons, prior elicitation has remained somewhat underrated and underused in the modern Bayesian workflow. This thesis provides a theoretical basis and validation for a novel prior elicitation method first introduced by Hartmann et al. In particular, this principled statistical framework, called probabilistic predictive elicitation, 1) makes prior elicitation independent of the specific structure of the probabilistic model, 2) handles complex models with many parameters and potentially multivariate priors, 3) fully accounts for uncertainty in experts' probabilistic judgements on the data, and 4) provides a formal quality measure indicating whether the chosen predictive model is able to reproduce experts' probabilistic judgements. We extend the published work in multiple ways. First, we provide more thorough literature reviews on different prior elicitation approaches as well as on methods for expert elicitation. Second, we continue the discussion of the technicalities, implementation and applications of the proposed methodology. Third, we report two unpublished experiments using the proposed methodology. In addition, we discuss the methodology in the context of the modern Bayesian workflow.
  • Hotari, Juho (2024)
    Quantum computing has enormous potential in machine learning, where problems can quickly scale to be intractable for classical computation. Quantum machine learning is a research area that combines ideas from quantum computing and machine learning. Powerful and useful machine learning depends on large-scale datasets for training models that can solve real-life problems. Currently, quantum machine learning lacks the large-scale quantum datasets required to further develop models and test quantum machine learning algorithms. This lack of large datasets is limiting the quantum advantage in the field of quantum machine learning. In this thesis, the concept of quantum data and the different types of applied quantum datasets used to develop quantum machine learning models are studied. The research methodology is based on a systematic and comparative literature review of state-of-the-art articles in quantum computing and quantum machine learning from recent years. We classify datasets into inherent and non-inherent quantum data based on the nature of the data. The preliminary literature review identifies patterns in applied quantum machine learning: testing and benchmarking QML models primarily uses non-inherent quantum data, i.e. classical data encoded into a quantum system, while separate research focuses on generating inherent quantum datasets.
  • Lintulampi, Anssi (2023)
    Secure data transmissions are a crucial part of modern cloud services and data infrastructures. Securing the communication channel for data transmission is possible if the communicating parties can securely exchange a secret key. The secret key is used in a symmetric encryption algorithm to encrypt digital data that is transmitted over an unprotected channel. Quantum key distribution is a method that communicating parties can use to securely share a secret cryptographic key with each other. The security of quantum key distribution requires that the communicating parties are able to ensure the authenticity and integrity of the messages they exchange on the classical channel during the protocol. For this purpose, they use cryptographic authentication techniques such as digital signatures or message authentication codes. The development of quantum computers affects how traditional authentication solutions can be used in the future. For example, traditional digital signature algorithms will become vulnerable if a quantum computer is used to solve the underlying mathematical problems. The authentication solutions used in quantum key distribution should be safe even against adversaries with a quantum computer, to ensure the security of the protocol. This master's thesis studies quantum-safe authentication methods that could be used with quantum key distribution. Two different quantum-safe authentication methods were implemented for the quantum key distribution protocol BB84. The implemented authentication methods are compared based on their speed and the size of the authenticated messages. Security aspects related to the authentication are also evaluated. The results show that both authentication methods are suitable for use in quantum key distribution. The results also show that the implemented method using message authentication codes is faster than the method using digital signatures.
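    A minimal sketch of authenticating classical-channel BB84 messages with a message authentication code (the faster of the two implemented methods). HMAC-SHA256 stands in here for whatever MAC construction the thesis used, and the pre-shared key is a placeholder.

```python
# Sketch: MAC-based authentication of a classical BB84 protocol message.
# Assumes a pre-shared symmetric key, as MAC authentication requires.
import hmac
import hashlib

shared_key = b"pre-shared authentication key"  # placeholder key material

def authenticate(message: bytes) -> bytes:
    """Append an HMAC-SHA256 tag to a protocol message."""
    tag = hmac.new(shared_key, message, hashlib.sha256).digest()
    return message + tag

def verify(data: bytes) -> bytes:
    """Check the 32-byte tag; raise if the message was tampered with."""
    message, tag = data[:-32], data[-32:]
    expected = hmac.new(shared_key, message, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    return message

basis_announcement = b"bases:+x+xx++x"
print(verify(authenticate(basis_announcement)))
```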
  • Anttila, Kamilla (2020)
    Most machine learning projects consist of four distinct phases: data preparation, model training, model validation, and inference serving. Even though all of these phases are vital components of a successful machine learning project, the focus of most machine learning work is solely on the training of models. The other phases often need to be implemented with ad-hoc solutions, which can easily lead to technical debt. Technical debt is a metaphor for describing the quality of a software project. It describes the state of a project by comparing it to a financial loan. During software development, a loan can be taken to add value to the present state of the system. However, the loan comes with interest and has to be paid back. A loan can be taken, for example, by writing low-quality code to meet a deadline. The loan has to be paid back by rewriting the code later, or else it will start to accrue interest. The interest can be seen in the code functioning poorly or requiring substantial amounts of time to be understood. If a loan is not paid back, the interest keeps increasing, making it more and more difficult to pay the loan back later. In this thesis, we study the effect machine learning frameworks have on technical debt. We describe the machine learning project lifecycle and the various sources of technical debt associated with it. We review available machine learning frameworks and their mitigation strategies for the technical debt in machine learning projects. Our insights demonstrate how frameworks can be used to reduce the overall technical debt in machine learning projects.
  • Rämö, Miia (2020)
    In news agencies, there is a growing interest towards automated journalism. The majority of the systems applied are template- or rule-based, as they are expected to produce accurate and fluent output transparently. However, this approach often leads to output that lacks variety. To overcome this issue, I propose two approaches. In the lexicalization approach, new words are included in the sentences, and in the relexicalization approach, some existing words are replaced with synonyms. Both approaches utilize contextual word embeddings for finding suitable words. Furthermore, the above approaches require linguistic resources, which are only available for high-resource languages. Thus, I present variants of the (re)lexicalization approaches that allow their utilization for low-resource languages. These variants utilize cross-lingual word embeddings to access the linguistic resources of a high-resource language. The high-resource variants achieved promising results. However, the sampling of words should be further enhanced to improve reliability. The low-resource variants showed some promising results, but the quality suffered from the complex morphology of the example language. This is a clear next issue to address, and resolving it is expected to significantly improve the results.
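    A sketch of the relexicalization idea, with a masked language model standing in for the thesis's contextual-embedding lookup; the model choice and example sentence are illustrative.

```python
# Sketch: proposing in-context replacement words with a contextual model.
# A masked-LM stands in for the contextual-embedding approach above.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

sentence = "The unemployment rate [MASK] by two percentage points in May."
for candidate in fill(sentence, top_k=5):
    print(candidate["token_str"], round(candidate["score"], 3))
# Candidates such as "rose" or "fell" could replace the original verb to
# add variety to template-generated news text.
```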
  • Muilu, Petteri (2021)
    Acidic sulfate soils (a.s. soils) decrease soil pH and generate an acidic environment that is toxic to vegetation and aquatic life. Therefore, knowledge of potentially dangerous soils is necessary during, e.g., construction planning to reduce the loss of ecosystem services. Identifying a.s. soils accurately is a difficult task and requires expensive and laborious field and lab tests. To identify larger areas cost-effectively, remote sensing methods could bring benefits to the construction planning field. This work shows that point clouds acquired with aerial LiDAR scanning can be used to remotely sense acid sulfate soils with PointNet and PointNet++ deep learning models. However, additional research with larger data sets is required to improve the accuracy of the models for real-world applications. Therefore, the work also suggests further research directions and ideas for collecting data for such models.
  • Elmnäinen, Johannes (2020)
    The Finnish Environment Institute (SYKE) has at least two missions which require surveying large land areas: finding invasive alien species and monitoring the state of Finnish lakes. Various methods to accomplish these tasks exist, but they traditionally rely on manual labor by experts or citizen activism, and as such do not scale well. This thesis explores the use of computer vision to dramatically improve the scaling of these tasks. Specifically, the aim is to fly a drone over selected areas and use a convolutional neural network architecture (U-net) to create segmentations of the images. The method performs well on selected biomass estimation classes thanks to sufficiently large datasets and easily distinguishable core features of the classes. Furthermore, a qualitative study of the datasets was performed, yielding an estimated lower bound on the number of examples needed for a useful dataset. ACM Computing Classification System (CCS): CCS → Computing methodologies → Machine learning → Machine learning approaches → Neural networks
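    A compact U-net sketch of the architecture named above; the depth and channel counts are illustrative, not those used at SYKE.

```python
# Compact U-net for per-pixel segmentation of drone images (illustrative).
import torch
import torch.nn as nn

def block(c_in, c_out):
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU(inplace=True))

class MiniUNet(nn.Module):
    def __init__(self, n_classes=2):
        super().__init__()
        self.enc1, self.enc2 = block(3, 32), block(32, 64)
        self.pool = nn.MaxPool2d(2)
        self.mid = block(64, 128)
        self.up2 = nn.ConvTranspose2d(128, 64, 2, stride=2)
        self.dec2 = block(128, 64)
        self.up1 = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec1 = block(64, 32)
        self.head = nn.Conv2d(32, n_classes, 1)

    def forward(self, x):
        e1 = self.enc1(x)                     # encoder with skip connections
        e2 = self.enc2(self.pool(e1))
        m = self.mid(self.pool(e2))
        d2 = self.dec2(torch.cat([self.up2(m), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return self.head(d1)                  # per-pixel class logits

logits = MiniUNet()(torch.randn(1, 3, 256, 256))
print(logits.shape)  # torch.Size([1, 2, 256, 256])
```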
  • Jeskanen, Juuso-Markus (2021)
    Developing reliable, regulatory-compliant and customer-oriented credit risk models requires thorough knowledge of the credit risk phenomenon. Tight collaboration between stakeholders is necessary, and hence models need to be transparent, interpretable and explainable, as well as accurate, even for experts without a statistical background. In the context of credit risk, one can speak of explainable artificial intelligence (XAI). Hence, practice and market standards are also underlined in this study. So far, credit risk research has mainly focused on the estimation of the probability of default parameter. However, as systems and processes have evolved to comply with regulation in the last decade, recovery data has improved, which has raised loss given default (LGD) to the heart of credit risk. In the context of LGD, most studies have emphasized the estimation of one-stage models. In practice, however, market standards support a multi-stage approach which follows the institution's simplified recovery processes. Generally, multi-stage models are more transparent, have better predictive power, and comply more readily with the regulation. This thesis presents a framework to analyze and execute sensitivity analysis for a multi-stage LGD model. The main contribution of the study is to increase the knowledge of LGD modelling by giving insights into the sensitivity of discriminatory power across risk drivers, model components and the LGD score. The study aims to answer two questions. Firstly, how sensitive is the predictive power of a multi-stage LGD model to the correlations of risk drivers and individual components? Secondly, how can the most influential risk factors be identified that need to be considered in multi-stage LGD modelling to achieve an adequate LGD score? The experimental part of this thesis is divided into two parts. The first presents the motivation, study design and experimental setup used to execute the study. The second focuses on the sensitivity analysis of risk drivers, components and the LGD score. The sensitivity analysis presented in this study gives important knowledge of the behavior of multi-stage LGD models and of the dependencies between independent risk drivers, components and the LGD score with regard to correlations and model performance metrics. The introduced sensitivity framework can be utilised to assess the need and schedule for model calibrations in response to changes in the application portfolio. In addition, the framework and results can be used to recognize needs for updates to the monthly IFRS 9 ECL calculations. The study also gives input for model stress testing, where different scenarios and impacts are analyzed with regard to changes in macroeconomic conditions. Even though the focus of this study is on credit risk, the methods presented are also applicable in fields outside the financial sector.
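    A hedged sketch of a two-stage LGD decomposition of the kind discussed above: a cure-probability component and a loss-severity component combined into an LGD score. The risk drivers and data are synthetic stand-ins, not the thesis's portfolio or exact model structure.

```python
# Sketch: two-stage LGD = (1 - P(cure)) * expected loss severity.
import numpy as np
from sklearn.linear_model import LogisticRegression, LinearRegression

rng = np.random.default_rng(4)
X = rng.normal(size=(2000, 3))        # stand-in risk drivers
cured = (X[:, 0] + rng.normal(0, 1, 2000) > 0).astype(int)
severity = np.clip(0.5 - 0.2 * X[:, 1] + rng.normal(0, 0.1, 2000), 0, 1)

stage1 = LogisticRegression().fit(X, cured)                 # P(cure | drivers)
stage2 = LinearRegression().fit(X[cured == 0], severity[cured == 0])

p_cure = stage1.predict_proba(X)[:, 1]
lgd_score = (1 - p_cure) * np.clip(stage2.predict(X), 0, 1)
print("Mean LGD score:", lgd_score.mean().round(3))
```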
  • Lassila, Juuso (2024)
    Calculating sentence similarities is an essential task in natural language processing. It enables similarity search, where the sentence most similar to a query is retrieved from many candidates; it enables clustering text by semantic meaning; and the sentence embeddings used for calculating the similarities can also serve as input for text classification models. There is much room for improvement in sentence embedding model architectures and training methods, both in terms of accuracy and training efficiency. This thesis experiments with a novel unsupervised training method called Sentence Embeddings via Token Inference (SETI), which is efficient by design, to see if it can compete with other methods in accuracy. Using the same data, we train SETI and three existing methods: TSDAE, QuickThoughts, and generic MLM. We then compare the resulting models on different sentence similarity and downstream classification tasks. Based on our experiments, SETI is comparable to TSDAE and better than the generic MLM and QuickThoughts methods on sentence similarity tasks. However, TSDAE has the highest accuracy on downstream classification tasks, where SETI still beats the generic MLM and QuickThoughts models.
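    A sketch of the sentence-similarity evaluation setup: embed sentences and rank candidates by cosine similarity. The model name is an off-the-shelf stand-in, as SETI itself is not publicly packaged.

```python
# Sketch: ranking candidate sentences by cosine similarity to a query.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # stand-in embedding model
query = "How do I reset my password?"
candidates = [
    "Steps for recovering a forgotten password",
    "Our office is closed on public holidays",
    "Password reset instructions for new users",
]
scores = util.cos_sim(model.encode(query), model.encode(candidates))
for sent, score in zip(candidates, scores[0].tolist()):
    print(f"{score:.3f}  {sent}")
```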
  • Kortesalmi, Ville (2024)
    Improving employee well-being is a key part of pension agency Keva's mission statement. Recently, Keva launched a tool for conducting repeated small-scale employee well-being surveys called ”Pulssi”. With the number of responses reaching the thousands, Keva has identified the processing and organizing of this data as a part of the process that could be improved using machine learning methods. In this thesis, we conducted a comprehensive investigation into using language models and sentiment classification as a solution. We tested three different methodologies for this purpose: traditional machine learning with learned embeddings, generative language models, and fine-tuned BERT models. To our knowledge, this is the first study evaluating the use of language models on the Finnish sentiment analysis task. Additionally, we evaluated the feasibility of implementing these methods based on their operating costs and the time it took to create classifications. We found that the traditional machine learning models trained on learned embeddings performed surprisingly well, achieving an accuracy of 91%. These models offer a fast and cost-effective alternative to the more cumbersome language models. Our fine-tuned BERT model, ”KevaBERT”, achieved an impressive accuracy of 93.6% when trained on GPT-4-generated predictions, suggesting a potential pathway for training data creation. Overall, our best performance was achieved by the ”GPT-4 few-shot with context” model at 93.9% accuracy. Our accuracies rival or even surpass the state-of-the-art accuracies achieved on other datasets. Despite the near human-level performance, this model was slow and expensive to operate. Based on these findings, we recommend the use of our ”KevaBERT” model for sentiment classification and a separate GPT-4 based model for text summarization.
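    A sketch of fine-tuning a Finnish BERT for sentiment classification along the lines of ”KevaBERT”; the base model, label set, and toy examples are assumptions, since the survey data is not public.

```python
# Sketch: fine-tuning a Finnish BERT for 3-class sentiment classification.
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)
from datasets import Dataset

tok = AutoTokenizer.from_pretrained("TurkuNLP/bert-base-finnish-cased-v1")
model = AutoModelForSequenceClassification.from_pretrained(
    "TurkuNLP/bert-base-finnish-cased-v1", num_labels=3)  # neg/neutral/pos

data = Dataset.from_dict({
    "text": ["Työilmapiiri on parantunut.", "Kiirettä on aivan liikaa."],
    "label": [2, 0],  # toy examples; labels could come from GPT-4 as above
}).map(lambda x: tok(x["text"], truncation=True, padding="max_length",
                     max_length=128), batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="kevabert", num_train_epochs=1,
                           per_device_train_batch_size=8),
    train_dataset=data,
)
trainer.train()
```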
  • Kinnunen, Anniina (2023)
    Acoustic levitation refers to the levitation of particles using sound waves. It can be performed on a phased array of transducers (a levitator), where the transducers create the sound waves. The levitator device is controlled by altering the values of the control parameters of the transducers. In this thesis, we present an automatic approach for finding the control parameter values using a branch of machine learning called reinforcement learning. The main goal is to make specifying the control parameter values for complex levitation tasks easier. We first build a simulation environment for the learning task, and then perform several experiments in the environment, comparing two model-based reinforcement learning algorithms: the Covariance Matrix Adaptation Evolution Strategy and a baseline strategy based on random actions. The experiments concern optimizing the hyperparameters of reinforcement learning, testing the algorithm under the limitations that a real levitator would impose, and solving different levitation tasks. The results of the experiments show that the simulation environment enables controlling levitators with model-based algorithms. Furthermore, both algorithms were able to solve various control problems, such as lifting a particle and moving a particle in circles.
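    A sketch of optimizing transducer control parameters with CMA-ES against a stand-in simulator; the objective and placeholder physics are assumptions, not the thesis's simulation environment.

```python
# Sketch: CMA-ES searching transducer phases that move a particle to a
# target position in a placeholder simulator.
import numpy as np
import cma

TARGET = np.array([0.0, 0.0, 0.05])  # desired particle position (m)

def simulate_particle(phases: np.ndarray) -> np.ndarray:
    """Placeholder physics: returns the trapped particle's position."""
    return TARGET + 0.01 * np.tanh(phases[:3])  # stand-in dynamics

def objective(phases: np.ndarray) -> float:
    return float(np.linalg.norm(simulate_particle(phases) - TARGET))

# 64 transducer phases, initialized to zero.
es = cma.CMAEvolutionStrategy(np.zeros(64), 0.5)
es.optimize(objective, iterations=50)
print("Best distance to target:", es.result.fbest)
```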
  • Räty, Matti (2020)
    SQL is among the recommended subjects in computer science. It is an efficient way to store data regardless of context. However, SQL is a difficult subject for students to learn, and for this reason teaching software is used alongside SQL instruction. Teaching software lets students practice SQL hands-on, compensates for the large number of students relative to teachers, and collects data on student performance. The data collected by the learning software on student performance offers an opportunity to predict students' performance on a course with machine learning methods. This thesis trains well-established machine learning algorithms on data from an SQL teaching system to produce models that predict whether a student will get an SQL exercise right on their next attempt. The aim is not to build a model that grades SQL exercises; rather, the machine learning algorithms observe other statistics collected from students to assess the correctness of an attempt without seeing the student's actual solution. The thesis finds that several machine learning models work well for this goal. Such models can be used to find students who have difficulties completing the exercises. This information is valuable, for example, for teaching software that aims to give SQL exercise takers hints at a useful moment.
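    A minimal sketch of the prediction task described above: estimate whether a student's next attempt at an SQL exercise will be correct from process statistics alone, without the submitted SQL. The features are hypothetical.

```python
# Sketch: predicting next-attempt correctness from process statistics.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(5)
# Stand-in features: [previous attempts, seconds since last attempt,
#                     exercises solved so far, errors on this exercise]
X = rng.poisson([3, 120, 10, 2], size=(800, 4)).astype(float)
# Stand-in label: 1 = next attempt is correct
y = (X[:, 2] - X[:, 3] + rng.normal(0, 2, 800) > 7).astype(int)

clf = RandomForestClassifier(n_estimators=100, random_state=5)
print("CV accuracy:", cross_val_score(clf, X, y, cv=5).mean().round(3))
```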
  • Lode, Lauri (2019)
    Hamiltonian Monte Carlo is a powerful Markov chain Monte Carlo algorithm, which is able to traverse complex posterior distributions accurately. One of the method's disadvantages is its reliance on gradient evaluations over the full data, which quickly becomes computationally costly as data sets grow large. By mini-batching the data set for stochastic gradient approximations, we can speed up the algorithm, albeit with reduced posterior accuracy. We illustrate with a toy example that the stochastic version of the method is unable to explore the exact posterior, and we show how an added friction term greatly alleviates this when the term is adjusted carefully. We then use the added stochastic error to our advantage by making the results differentially private. The randomness in the results masks the appearance of any single data point in the used data set, enabling more secure handling of sensitive data. In the case of stochastic gradient Hamiltonian Monte Carlo, we are able to achieve reasonable privacy bounds with little to no decrease in optimization performance, although finding a good differentially private approximation of the target posterior becomes harder. In addition, we compare the previously considered privacy accounting methods for assessing the privacy bounds to a new privacy loss distribution method, which is able to determine a tighter privacy profile than, for example, the moments accountant method.
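    For reference, a standard form of the stochastic gradient HMC update with the friction term the abstract refers to (following Chen et al., 2014; the notation is assumed, as the abstract gives no equations):

```latex
% SGHMC update with friction matrix C (Chen et al., 2014)
\begin{aligned}
\theta_{t+1} &= \theta_t + \epsilon\, M^{-1} r_t,\\
r_{t+1}      &= r_t - \epsilon\, \nabla \tilde{U}(\theta_t)
               - \epsilon\, C M^{-1} r_t
               + \mathcal{N}\!\bigl(0,\; 2\epsilon\,(C - \hat{B})\bigr),
\end{aligned}
```

    Here $\tilde{U}$ is the mini-batch estimate of the negative log posterior, $M$ the mass matrix, $C$ the friction, and $\hat{B}$ an estimate of the stochastic-gradient noise covariance; when $C$ dominates $\hat{B}$, the injected noise compensates for the mini-batching error that the toy example illustrates.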
  • Zhao, Linzh (2024)
    As privacy concerns gain consensus in the field of machine learning, numerous algorithms, such as differentially private stochastic gradient descent (DPSGD), have emerged to ensure privacy guarantees. Concurrently, fairness is garnering increasing attention, prompting research aimed at achieving fairness within the constraints of differential privacy. This thesis delves into algorithms designed to enhance fairness in differentially private deep learning and explores their mechanisms. It examines the role of normalization, a technique applied to these algorithms in practice, to elucidate its impact on fairness. Additionally, the thesis formalizes a hyperparameter tuning protocol to accurately assess the performance of these algorithms. Experiments across various datasets and neural network architectures were conducted to test our hypotheses under this tuning protocol. Decoupling hyperparameters, allowing each to independently control a specific property of the algorithm, has proven to enhance performance. However, certain mechanisms, such as discarding samples with large gradient norms and allowing unbounded hyperparameter adaptation, may significantly compromise fairness. Our experiments also confirm the critical role of hyperparameter values in influencing fairness, emphasizing the necessity of precise tuning to ensure equitable outcomes. Additionally, we observed differing convergence rates across algorithms, which affect the number of trials needed to identify optimal hyperparameter settings. This thesis aims to offer a detailed perspective on understanding fairness in differentially private deep learning and provides insights into designing algorithms that can more effectively enhance fairness.
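    A minimal sketch of the DPSGD step whose clipping and noise mechanisms the thesis analyzes for fairness: per-example gradient clipping followed by Gaussian noise. Hyperparameters are illustrative.

```python
# Sketch of one DPSGD step: clip each per-example gradient to norm <= clip,
# sum, add Gaussian noise scaled by sigma * clip, then average and step.
import torch

def dpsgd_step(model, loss_fn, xs, ys, lr=0.1, clip=1.0, sigma=1.0):
    grads = [torch.zeros_like(p) for p in model.parameters()]
    for x, y in zip(xs, ys):                      # per-example gradients
        model.zero_grad()
        loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0)).backward()
        norm = torch.sqrt(sum((p.grad ** 2).sum() for p in model.parameters()))
        scale = min(1.0, clip / (norm + 1e-12))   # clip to norm <= clip
        for g, p in zip(grads, model.parameters()):
            g += p.grad * scale
    with torch.no_grad():
        for g, p in zip(grads, model.parameters()):
            noise = torch.normal(0.0, sigma * clip, size=g.shape)
            p -= lr * (g + noise) / len(xs)       # noisy average gradient
    return model
```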