Browsing by Subject "Machine learning"

  • Laakso, Jarno (2021)
    Halide perovskites are a promising materials class for solar energy production. Their photovoltaic efficiency is remarkable, but their toxicity and instability have prevented commercialization. These problems could be addressed through compositional engineering in the halide perovskite materials space, but the number of different materials that would need to be considered is too large for conventional experimental and computational methods. Machine learning can accelerate computations to the level required for this task. In this thesis I present a machine learning approach for compositional exploration and apply it to the composite halide perovskite CsPb(Cl, Br)3. I used data from density functional theory (DFT) calculations to train a machine learning model based on kernel ridge regression with the many-body tensor representation for the atomic structure. The trained model was then applied to predict the decomposition energies of CsPb(Cl, Br)3 materials from their atomic structure. The main part of my work was to derive and implement gradients for the machine learning model to facilitate efficient structure optimization. I tested the machine learning model by comparing its decomposition energy predictions to DFT calculations. The prediction error was under 0.12 meV per atom and prediction was five orders of magnitude faster than DFT. I also used the model to optimize CsPb(Cl, Br)3 structures. Reasonable structures were obtained, but the accuracy was only qualitative. Analysis of the structural optimization results exposed shortcomings in the approach, providing important insight for future improvements. Overall, this project makes a successful step towards the discovery of novel perovskite materials with designer properties for future solar cell applications.
  • Haatanen, Henri (2022)
    In the modern era, personalization when reaching out to potential or current customers is essential for businesses to stay competitive in their field. With large customer bases, this personalization becomes more difficult, so segmenting the entire customer base into smaller groups helps businesses focus on personalization and targeted business decisions. These groups can be straightforward, like segmenting solely based on age, or more complex, like taking into account geographic, demographic, behavioral, and psychographic differences among the customers. In the latter case, customer segmentation should be performed with machine learning, which can help find hidden patterns within the data. Often, the number of features in the customer data set is so large that some form of dimensionality reduction is needed. That is also the case in this thesis, where the data contain 12802 unique article tags that should be included in the segmentation. A form of dimensionality reduction called feature hashing is selected for hashing the tags because it can accommodate new tags introduced in the future. Using hashed features in customer segmentation is a balancing act. With more hashed features, the evaluation metrics might give better results and the hashed features resemble the unhashed article tag data more closely, but with fewer hashed features the clustering process is faster and more memory-efficient, and the resulting clusters are more interpretable to the business. Three clustering algorithms, K-means, DBSCAN, and BIRCH, are each tested with eight feature hashing bin sizes, with promising results for K-means and BIRCH.
  • Rantanen, Robert (2023)
    Manual creation of game content is often the most expensive and time-consuming part of game development. Procedural content generation offers an alternative solution, automatically generating game content with the help of algorithms. This can decrease the cost and effort of content creation in addition to offering other benefits such as increasing the game’s replayability. This thesis investigates the current state of procedural content generation and how it is utilized in game development. A major part of the thesis investigates state-of-the-art open-source software that can be used for automatic generation of game content. We evaluate the usefulness and practicality of utilizing these tools in game development.
  • Jälkö, Joonas (2017)
    This thesis focuses on privacy-preserving statistical inference. We use a probabilistic notion of privacy called differential privacy (DP). Differential privacy ensures that replacing one individual in the dataset with another individual does not affect the results drastically. There are different versions of differential privacy. This thesis considers ε-differential privacy, also known as pure differential privacy, and a relaxation known as (ε, δ)-differential privacy. We state several important definitions and theorems of DP. The proofs for most of the theorems are given in this thesis. Our goal is to build a general framework for privacy-preserving posterior inference. To achieve this we use an approximate approach for posterior inference called variational Bayesian (VB) methods. We build up the basic concepts of variational inference in some detail and show examples of how to apply variational inference. After giving the prerequisites on both DP and VB we state our main result, the differentially private variational inference (DPVI) method. We combine a recently proposed doubly stochastic variational inference (DSVI) method with the Gaussian mechanism to build a privacy-preserving method for posterior inference. We give the algorithm definition and explain its parameters. The DPVI method is compared against the state-of-the-art method for DP posterior inference, differentially private stochastic gradient Langevin dynamics (DP-SGLD). We compare the performance on two different models, the logistic regression model and the Gaussian mixture model. The DPVI method outperforms DP-SGLD in both tasks.
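
The Gaussian mechanism mentioned in the last abstract above can be illustrated with a minimal Python sketch. It is not the DPVI algorithm from the thesis: it only shows one (ε, δ)-differentially private release of a sum of clipped per-example gradients, using the classic noise calibration σ = Δ·sqrt(2 ln(1.25/δ))/ε, which holds for 0 < ε < 1. The function names, the clipping threshold `max_norm`, and the use of gradient clipping to bound sensitivity are assumptions made for illustration. Privacy accounting over many iterations is not covered.

```python
import math
import random


def gaussian_sigma(epsilon, delta, sensitivity):
    """Noise scale of the classic Gaussian mechanism (assumes 0 < epsilon < 1)."""
    if not (0 < epsilon < 1) or not (0 < delta < 1):
        raise ValueError("this calibration assumes 0 < epsilon < 1 and 0 < delta < 1")
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


def clip(vec, max_norm):
    """Scale vec down so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in vec]


def private_gradient_sum(per_example_grads, max_norm, epsilon, delta, rng):
    """Release a noisy sum of clipped per-example gradients (hypothetical helper)."""
    clipped = [clip(g, max_norm) for g in per_example_grads]
    total = [sum(column) for column in zip(*clipped)]
    # Replacing one individual's gradient with another changes the sum
    # by at most 2 * max_norm in L2 norm, which bounds the sensitivity.
    sigma = gaussian_sigma(epsilon, delta, 2.0 * max_norm)
    return [t + rng.gauss(0.0, sigma) for t in total]


if __name__ == "__main__":
    rng = random.Random(0)
    grads = [[0.3, -1.2], [2.0, 0.5], [-0.7, 0.1]]  # made-up example values
    print(private_gradient_sum(grads, max_norm=1.0, epsilon=0.5, delta=1e-5, rng=rng))
```

A smaller ε or δ gives a larger σ, so stronger privacy is paid for with noisier gradient estimates.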