
Browsing by Subject "Bayesian inference"


  • Huggins, Robert (2023)
    In this thesis, we develop a Bayesian approach to the inverse problem of inferring the shape of an asteroid from time-series measurements of its brightness. We define a probabilistic model over possibly non-convex asteroid shapes, choosing parameters carefully to avoid potential identifiability issues. Applying this probabilistic model to synthetic observations and sampling from the posterior via Markov Chain Monte Carlo, we show that the model is able to recover the asteroid shape well in the limit of many well-separated observations, and is able to capture posterior uncertainty in the case of limited observations. We greatly accelerate the computation of the forward problem (predicting the measured light curve given the asteroid’s shape parameters) by using a bounding volume hierarchy and by exploiting data parallelism on a graphics processing unit.
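The abstract's core loop, drawing posterior samples of shape parameters via Markov chain Monte Carlo given a light-curve forward model, can be illustrated with a minimal random-walk Metropolis sketch. The `forward_model` below is a stand-in placeholder, not the thesis's ray-traced, GPU-accelerated model, and all parameter names are hypothetical.

```python
# Hedged sketch (not the thesis code): random-walk Metropolis over "shape"
# parameters given a placeholder light-curve forward model.
import numpy as np

rng = np.random.default_rng(0)

def forward_model(params, times):
    # Placeholder: predicted brightness as a smooth function of the parameters;
    # the real forward model ray-traces a non-convex asteroid shape.
    return params[0] + params[1] * np.sin(times) + params[2] * np.cos(times)

def log_posterior(params, times, obs, noise_sd=0.1):
    # Gaussian likelihood around the predicted light curve, standard normal prior.
    resid = obs - forward_model(params, times)
    log_lik = -0.5 * np.sum((resid / noise_sd) ** 2)
    log_prior = -0.5 * np.sum(params ** 2)
    return log_lik + log_prior

times = np.linspace(0, 4 * np.pi, 200)
obs = forward_model(np.array([1.0, 0.5, -0.3]), times) + rng.normal(0, 0.1, times.size)

params = np.zeros(3)
samples = []
for _ in range(5000):
    proposal = params + rng.normal(0, 0.05, size=params.size)
    if np.log(rng.uniform()) < log_posterior(proposal, times, obs) - log_posterior(params, times, obs):
        params = proposal
    samples.append(params.copy())

samples = np.array(samples)
print("posterior mean:", samples[1000:].mean(axis=0))  # discard burn-in
```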
  • Rehn, Aki (2022)
    The application of Gaussian processes (GPs) is limited by the rather slow process of optimizing the hyperparameters of a GP kernel, which causes problems especially in applications, such as Bayesian optimization, that involve repeated optimization of the kernel hyperparameters. Recently, the issue was addressed by a method that "amortizes" the inference of the hyperparameters using a hierarchical neural network architecture to predict the GP hyperparameters from data; the model is trained on a synthetic GP dataset and in general does not require retraining for unseen data. We asked whether we could understand the method well enough to replicate it with a squared exponential kernel with automatic relevance determination (SE-ARD). We also asked whether it is feasible to extend the system to predict posterior approximations instead of point estimates to support fully Bayesian GPs. We introduce the theory behind Bayesian inference; gradient-based optimization; Gaussian process regression; variational inference; neural networks and the transformer architecture; the method that predicts point estimates of the hyperparameters; and finally our proposed architecture to extend the method to a variational inference framework. We were able to successfully replicate the method from scratch with an SE-ARD kernel. In our experiments, we show that our replicated version of the method works and gives good results. We also implemented the proposed extension of the method to a variational inference framework. In our experiments, we do not find concrete reasons that would prevent the model from functioning, but observe that the model is very difficult to train. The final model that we were able to train predicted good means for (Gaussian) posterior approximations, but the variances that the model predicted were abnormally large. We analyze possible causes and suggest future work.
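For reference, here is a minimal NumPy sketch of an SE-ARD kernel, whose hyperparameters (per-dimension lengthscales and a signal variance) are what the amortized network predicts; the function and variable names are illustrative and not taken from the thesis code.

```python
# Hedged sketch: squared exponential kernel with automatic relevance
# determination (SE-ARD), k(x, x') = s2 * exp(-0.5 * sum_d ((x_d - x'_d)/l_d)^2).
import numpy as np

def se_ard_kernel(X1, X2, lengthscales, signal_var):
    X1s = X1 / lengthscales   # scale each input dimension by its own lengthscale
    X2s = X2 / lengthscales
    sq_dists = (
        np.sum(X1s ** 2, axis=1)[:, None]
        + np.sum(X2s ** 2, axis=1)[None, :]
        - 2.0 * X1s @ X2s.T
    )
    return signal_var * np.exp(-0.5 * np.clip(sq_dists, 0.0, None))

X = np.random.default_rng(1).normal(size=(5, 3))
K = se_ard_kernel(X, X, lengthscales=np.array([1.0, 0.5, 2.0]), signal_var=1.5)
print(K.shape)  # (5, 5) covariance matrix, symmetric and positive semi-definite
```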
  • Länsman, Olá-Mihkku (2020)
    Demand forecasts are required for optimizing multiple challenges in the retail industry, and they can be used to reduce spoilage and excess inventory sizes. The classical forecasting methods provide point forecasts and do not quantify the uncertainty of the process. We evaluate multiple predictive posterior approximation methods with a Bayesian generalized linear model that captures weekly and yearly seasonality, changing trends and promotional effects. The model uses the negative binomial as the sampling distribution because of its ability to scale the variance as a quadratic function of the mean. The forecasting methods provide highest posterior density intervals at credible levels ranging from 50% to 95%. They are evaluated with a proper scoring function and by calculating hit rates. We also measure the duration of the calculations as an important result due to the scalability requirements of the retail industry. The forecasting methods are Laplace approximation, the Markov chain Monte Carlo method, Automatic Differentiation Variational Inference, and maximum a posteriori inference. Our results show that the Markov chain Monte Carlo method is too slow for practical use, while the rest of the approximation methods can be considered for practical use. We found that Laplace approximation and Automatic Differentiation Variational Inference have results closer to the method with the best analytical guarantees, the Markov chain Monte Carlo method, suggesting that they were better approximations of the model. The model faced difficulties with highly promotional, slow-selling, and intermittent data. The best fit was obtained with high-selling SKUs, for which the model provided intervals with hit rates that matched the levels of the credible intervals.
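A minimal sketch of the evaluation machinery described above, on synthetic placeholder data: highest posterior density intervals computed from posterior predictive samples of a negative binomial (variance mu + mu^2/phi, i.e. quadratic in the mean) and the corresponding hit rates. This is an illustration of the concepts, not the thesis code.

```python
# Hedged sketch: HPD intervals from posterior predictive samples and hit rates.
import numpy as np

def hpd_interval(samples, level):
    """Narrowest interval containing `level` of the sorted samples."""
    sorted_s = np.sort(samples)
    n_in = int(np.ceil(level * len(sorted_s)))
    widths = sorted_s[n_in - 1:] - sorted_s[:len(sorted_s) - n_in + 1]
    start = int(np.argmin(widths))
    return sorted_s[start], sorted_s[start + n_in - 1]

rng = np.random.default_rng(0)
mu, phi = 20.0, 5.0
# Negative binomial with mean mu and variance mu + mu**2 / phi.
pred = rng.negative_binomial(n=phi, p=phi / (phi + mu), size=4000)      # predictive samples
actuals = rng.negative_binomial(n=phi, p=phi / (phi + mu), size=52)     # e.g. a year of weekly demand

for level in (0.5, 0.8, 0.95):
    lo, hi = hpd_interval(pred, level)
    hit_rate = np.mean((actuals >= lo) & (actuals <= hi))
    print(f"{int(level * 100)}% HPD: [{lo}, {hi}], hit rate {hit_rate:.2f}")
```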
  • Hellsten, Kirsi (2023)
    Triglycerides are a type of lipid that enters our body with fatty food. High triglyceride levels are often caused by an unhealthy diet, poor lifestyle, poorly treated diseases such as diabetes, and too little exercise. Other risk factors found in various studies are HIV, menopause, an inherited lipid metabolism disorder, and South Asian ancestry. Complications of high triglycerides include pancreatitis, carotid artery disease, coronary artery disease, metabolic syndrome, peripheral artery disease, and strokes. Migration has made Singapore diverse, and it contains several subpopulations. One third of the population has genetic ancestry in China. The second largest group has genetic ancestry in Malaysia, and the third largest has genetic ancestry in India. Even though Singapore has one of the highest life expectancies in the world, unhealthy lifestyles such as poor diet, lack of exercise and smoking are still visible in everyday life. The purpose of this thesis was to introduce GWAS analysis for quantitative traits and apply it to real data, to see whether there are associations between some variants and triglycerides in the three main subpopulations in Singapore, and to compare the results to previous studies. The research questions that this thesis answered are: what is GWAS analysis and what is it used for, how can GWAS be applied to data containing quantitative traits, and are there associations between some SNPs and triglycerides in the three main populations in Singapore. GWAS stands for genome-wide association studies, which are designed to identify statistical associations between genetic variants and phenotypes or traits. One reason for developing GWAS was to learn to identify different genetic factors which have an impact on significant phenotypes, for instance susceptibility to certain diseases. Such information can eventually be used to predict the phenotypes of individuals. GWAS have been used globally in, for example, anthropology, biomedicine, biotechnology, and forensics. The studies enhance the understanding of human evolution and natural selection and help advance many areas of biology. The study used several quality control methods, linear models, and Bayesian inference to study associations. The research results were examined, among other things, with the help of various visual methods. The dataset used in this thesis was an open dataset used by Saw, W., Tantoso, E., Begum, H. et al. in their previous study. This study showed that there are associations between 6 different variants and triglycerides in the three main subpopulations in Singapore. The study results were compared with the results of two previous studies, which differed from the results of this study, suggesting that the results are significant. In addition, the thesis reviewed the ethics of GWAS and the limitations and benefits of GWAS. Most studies like this have been done in Europe, so more research is needed in different parts of the world. This research can also be continued with different methods and variables.
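As an illustration of the per-variant linear models underlying a GWAS for a quantitative trait, here is a minimal single-SNP association test (trait ~ genotype dosage + covariate) on simulated placeholder data, not the Singapore dataset analysed in the thesis.

```python
# Hedged sketch: one SNP's association test with a quantitative trait.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 1000
genotype = rng.binomial(2, 0.3, size=n).astype(float)    # 0/1/2 allele dosage
age = rng.normal(50, 10, size=n)                          # example covariate
triglycerides = 1.2 + 0.15 * genotype + 0.01 * age + rng.normal(0, 0.5, size=n)

X = np.column_stack([np.ones(n), genotype, age])          # intercept, SNP, covariate
beta, _, _, _ = np.linalg.lstsq(X, triglycerides, rcond=None)
resid = triglycerides - X @ beta
dof = n - X.shape[1]
sigma2 = resid @ resid / dof
se = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])       # standard error of the SNP effect
t_stat = beta[1] / se
p_value = 2 * stats.t.sf(abs(t_stat), dof)
print(f"SNP effect {beta[1]:.3f}, p = {p_value:.2e}")
```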
  • Kolehmainen, Ilmari (2022)
    This thesis analyses the colonization success of lowland herbs in open tundra using Bayesian inference methods. This was done with four different models that analyse the effects of different treatments, grazing levels and environmental covariates on the probability of a seed growing into a seedling. The thesis starts traditionally with an introduction chapter. The second chapter goes through the data: where and how it was collected, the different treatments used and other relevant information. The third chapter goes through all the methods needed to understand the analysis of this thesis, which are the basics of Bayesian inference, generalized linear models, generalized linear mixed models, model comparison and model assessment. The actual analysis starts in the fourth chapter, which introduces the four models used in this thesis. All of the models are binomial generalized linear mixed models with different variables. The first model only has the different treatments and grazing levels as variables. The second model also includes interactions between these treatment and grazing variables. The third and fourth models are otherwise the same as the first and the second, but they also have some environmental covariates as additional variables. Every model also has the block in which the seeds were sown as a random effect. The fifth chapter goes through the results of the models. First it shows the comparison of the predictive accuracy of all models. Then the estimated fixed effects, random effects and draws from the posterior predictive distribution are presented for each model separately. The thesis ends with the sixth chapter, the conclusions.
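A minimal sketch of what the simplest of these binomial generalized linear mixed models could look like, written in PyMC with hypothetical treatment, grazing and block variables and made-up data; it is meant only to illustrate the model family (binomial likelihood, fixed treatment/grazing effects, random block intercept), not to reproduce the thesis's exact model, priors, or data.

```python
# Hedged sketch: binomial GLMM with a random intercept per sowing block.
import numpy as np
import pymc as pm

rng = np.random.default_rng(0)
n_obs, n_blocks = 120, 10
treatment = rng.integers(0, 3, n_obs)        # 3 hypothetical treatments
grazing = rng.integers(0, 2, n_obs)          # 2 hypothetical grazing levels
block = rng.integers(0, n_blocks, n_obs)     # sowing block (random effect)
sown = np.full(n_obs, 20)                    # seeds sown per plot
germinated = rng.binomial(sown, 0.2)         # placeholder outcome

with pm.Model() as model:
    intercept = pm.Normal("intercept", 0.0, 2.0)
    b_treat = pm.Normal("b_treat", 0.0, 1.0, shape=3)
    b_graze = pm.Normal("b_graze", 0.0, 1.0, shape=2)
    sigma_block = pm.HalfNormal("sigma_block", 1.0)
    u_block = pm.Normal("u_block", 0.0, sigma_block, shape=n_blocks)

    eta = intercept + b_treat[treatment] + b_graze[grazing] + u_block[block]
    p = pm.Deterministic("p", pm.math.invlogit(eta))
    pm.Binomial("germinated", n=sown, p=p, observed=germinated)

    idata = pm.sample(1000, tune=1000, chains=2, random_seed=0)
```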
  • Li, Yinong (2024)
    The thesis is about developing a new neural network-based simulation-based inference (SBI) method for performing flexible point estimation; we call this method Neural Amortization of Bayesian Point Estimation (NBPE). Firstly, using neural networks, we can achieve amortized inference so that most of the computation cost is spent on training the neural network, while performing inference only costs a few milliseconds. In this thesis, we utilize an encoder-decoder architecture; we use an encoder as a summary network to extract informative features from raw data and then feed them to a decoder as an inference network to output point estimates. Moreover, with a novel training method, the use of a variable $\alpha$ in the loss function $|\theta_i - \theta_{\text{pred}}|^\alpha$ enables the prediction of different statistics (mean, median, mode) of the posterior distribution. Thus, with our method, at inference time we can get a fast point estimate, and if we want different statistics of the posterior, we only have to specify the value of the power of the loss, $\alpha$. When $\alpha = 2$, the result will be the mean; when $\alpha = 1$, the result will be the median; and as $\alpha$ gets closer to 0, the result will approach the mode. We conducted comprehensive experiments on both toy and simulator models to demonstrate these features. In the first part of the analysis, we focused on testing the accuracy and efficiency of our method, NBPE. We compared it to the established method called Neural Posterior Estimation (NPE) in the BayesFlow SBI software. NBPE performs with competitive accuracy compared to NPE and can perform faster inference than NPE. In the second part of the analysis, we concentrated on the flexible point estimation capabilities of NBPE. We conducted experiments on three conjugate models, since for most of these models the posterior mean, median, and mode have analytical expressions, which leads to a more straightforward analysis. The results show that at inference time the different choices of $\alpha$ influence the output as intended, and the results align with our expectations. In summary, in this thesis we propose a new neural SBI method, NBPE, that can perform fast, accurate, and flexible point estimation, broadening the application of SBI in downstream tasks of Bayesian inference.
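The role of the loss exponent can be checked numerically: minimizing the empirical loss mean(|theta_i - c|^alpha) over a candidate c by grid search recovers the mean for alpha = 2, the median for alpha = 1, and drifts toward the mode as alpha shrinks. The following is a stand-alone illustration of that principle on a skewed sample, not the NBPE training code.

```python
# Hedged sketch: which statistic the exponent alpha selects.
import numpy as np

rng = np.random.default_rng(0)
theta = rng.gamma(shape=3.0, scale=1.0, size=20000)   # skewed: mode 2 < median < mean 3
grid = np.linspace(0.0, 8.0, 801)

def argmin_loss(alpha):
    # Empirical loss mean(|theta_i - c| ** alpha) minimized over the grid of candidates c.
    losses = [np.mean(np.abs(theta - c) ** alpha) for c in grid]
    return grid[int(np.argmin(losses))]

print("alpha = 2   ->", round(argmin_loss(2.0), 2), "(sample mean:", round(float(theta.mean()), 2), ")")
print("alpha = 1   ->", round(argmin_loss(1.0), 2), "(sample median:", round(float(np.median(theta)), 2), ")")
print("alpha = 0.3 ->", round(argmin_loss(0.3), 2), "(drifts toward the mode, about 2)")
```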
  • Jälkö, Joonas (2017)
    This thesis focuses on privacy-preserving statistical inference. We use a probabilistic point of view of privacy called differential privacy (DP). Differential privacy ensures that replacing one individual in the dataset with another individual does not affect the results drastically. There are different versions of differential privacy. This thesis considers ε-differential privacy, also known as pure differential privacy, as well as a relaxation known as (ε, δ)-differential privacy. We state several important definitions and theorems of DP. The proofs for most of the theorems are given in this thesis. Our goal is to build a general framework for privacy-preserving posterior inference. To achieve this we use an approximate approach to posterior inference called variational Bayesian (VB) methods. We build up the basic concepts of variational inference in some detail and show examples of how to apply variational inference. After giving the prerequisites on both DP and VB, we state our main result, the differentially private variational inference (DPVI) method. We use a recently proposed doubly stochastic variational inference (DSVI) combined with the Gaussian mechanism to build a privacy-preserving method for posterior inference. We give the algorithm definition and explain its parameters. The DPVI method is compared against the state-of-the-art method for DP posterior inference, the differentially private stochastic gradient Langevin dynamics (DP-SGLD). We compare the performance on two different models, the logistic regression model and the Gaussian mixture model. The DPVI method outperforms DP-SGLD in both tasks.
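A minimal sketch of the privatization step such a method relies on: per-example gradients are clipped in L2 norm and the Gaussian mechanism adds calibrated noise to their sum before the update. The gradient function and the parameter values below are placeholders for illustration, not the DPVI algorithm itself.

```python
# Hedged sketch: Gaussian mechanism on a clipped sum of per-example gradients.
import numpy as np

rng = np.random.default_rng(0)

def per_example_grads(params, batch):
    # Placeholder per-example gradients of a variational objective.
    return batch[:, None] * params[None, :]

def private_gradient(params, batch, clip_norm=1.0, noise_multiplier=1.1):
    grads = per_example_grads(params, batch)                                   # shape (B, D)
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    grads = grads * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))      # clip each example
    noisy_sum = grads.sum(axis=0) + rng.normal(
        0.0, noise_multiplier * clip_norm, size=params.shape                   # Gaussian mechanism noise
    )
    return noisy_sum / len(batch)

params = np.array([0.5, -0.2])
batch = rng.normal(size=32)
params = params - 0.01 * private_gradient(params, batch)   # one noisy stochastic update
print(params)
```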
  • Rautavirta, Juhana (2022)
    Comparison of amphetamine profiles is a task in forensic chemistry whose goal is to decide whether two samples of amphetamine originate from the same source or not. These decisions help identify and prosecute the suppliers of amphetamine, which is an illicit drug in Finland. The traditional approach to comparing amphetamine samples involves computing the Pearson correlation coefficient between two real-valued sample vectors obtained by gas chromatography-mass spectrometry analysis. A two-sample problem, such as the problem of comparing drug samples, can also be tackled with methods such as a t-test or Bayes factors. Recently, a newer method called predictive agreement (PA) has been applied in the comparison of amphetamine profiles, comparing the posterior predictive distributions induced by two samples. In this thesis, we carried out a statistical validation of the use of this newer method in amphetamine profile comparison. We compared the performance of the predictive agreement method to the traditional method involving computation of the Pearson correlation coefficient. Techniques such as simulation and cross-validation were used in the validation. In the simulation part, we simulated enough data to compute 10 000 PA and correlation values between sample pairs. Cross-validation was used in a case study, where a repeated 5-fold group cross-validation was used to study the effect of changes in the data used to train the model. In the cross-validation, the performance of the models was measured with area under curve (AUC) values of receiver operating characteristic (ROC) and precision-recall (PR) curves. For the validation, two separate datasets collected by the National Bureau of Investigation of Finland (NBI) were available. One of the datasets was a larger collection of amphetamine samples, whereas the other was a more curated group of samples for which we also know which samples are linked to each other. On top of these datasets, we simulated data representing amphetamine samples that were either from different sources or from the same source. The results showed that with the simulated data, predictive agreement outperformed the traditional method in distinguishing sample pairs consisting of samples from different sources from sample pairs consisting of samples from the same source. The case study showed that changes in the training data have only a marginal effect on the performance of the predictive agreement method, and also that with real-world data the PA method outperformed the traditional method in terms of AUC-ROC and AUC-PR values. Additionally, we concluded that the PA method has the benefit of interpretability: the PA value between two samples can be interpreted as the probability of these samples originating from the same source.
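A minimal sketch of the traditional comparison score, the Pearson correlation between two profile vectors, together with an AUC-ROC evaluation of such scores against known same-source/different-source labels. The profile simulator and all numbers below are made up for illustration, not NBI data or the thesis's validation pipeline.

```python
# Hedged sketch: correlation-based profile comparison and its ROC-AUC.
import numpy as np
from scipy.stats import pearsonr
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

def profile_pair(same_source, n_peaks=30):
    # Simulated impurity profiles: same-source pairs are noisy copies of one profile.
    base = rng.lognormal(size=n_peaks)
    other = base + rng.normal(0, 0.2, n_peaks) if same_source else rng.lognormal(size=n_peaks)
    return base, other

labels, scores = [], []
for _ in range(500):
    same = rng.uniform() < 0.5
    a, b = profile_pair(same)
    r, _ = pearsonr(a, b)            # traditional comparison score
    labels.append(int(same))
    scores.append(r)

print("AUC-ROC of the correlation score:", roc_auc_score(labels, scores))
```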