
Amortized Bayesian inference of Gaussian process hyperparameters


Title: Amortized Bayesian inference of Gaussian process hyperparameters
Author(s): Rehn, Aki
Contributor: University of Helsinki, Faculty of Science
Degree program: Master's Programme in Data Science
Specialisation: no specialization
Language: English
Acceptance year: 2022
Abstract: The application of Gaussian processes (GPs) is limited by the rather slow optimization of the hyperparameters of a GP kernel, which causes problems especially in applications, such as Bayesian optimization, that involve repeated optimization of the kernel hyperparameters. Recently, the issue was addressed by a method that "amortizes" the inference of the hyperparameters, using a hierarchical neural network architecture to predict the GP hyperparameters directly from data; the model is trained on a synthetic GP dataset and in general does not require retraining for unseen data.

We asked whether we could understand the method well enough to replicate it with a squared exponential kernel with automatic relevance determination (SE-ARD). We also asked whether it is feasible to extend the system to predict posterior approximations instead of point estimates, to support fully Bayesian GPs. We introduce the theory behind Bayesian inference; gradient-based optimization; Gaussian process regression; variational inference; neural networks and the transformer architecture; the method that predicts point estimates of the hyperparameters; and finally our proposed architecture for extending the method to a variational inference framework.

We successfully replicated the method from scratch with an SE-ARD kernel, and in our experiments we show that the replicated version works and gives good results. We also implemented the proposed extension of the method to a variational inference framework. In our experiments, we do not find concrete reasons that would prevent the model from functioning, but we observe that the model is very difficult to train. The final model that we were able to train predicted good means for (Gaussian) posterior approximations, but the variances it predicted were abnormally large. We analyze possible causes and suggest future work.
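The SE-ARD kernel mentioned in the abstract assigns each input dimension its own lengthscale, so the GP can effectively switch off irrelevant dimensions by learning large lengthscales for them. As a minimal sketch (the function name and signature below are illustrative, not taken from the thesis), the kernel can be computed as:

```python
import numpy as np

def se_ard_kernel(X1, X2, lengthscales, signal_var):
    """Squared exponential kernel with automatic relevance determination.

    X1: (n, d) array, X2: (m, d) array,
    lengthscales: (d,) array of per-dimension lengthscales,
    signal_var: scalar signal variance.
    Returns the (n, m) kernel matrix.
    """
    # Scale each input dimension by its own lengthscale; a large
    # lengthscale shrinks that dimension's contribution toward zero.
    Z1 = X1 / lengthscales
    Z2 = X2 / lengthscales
    # Pairwise squared Euclidean distances in the scaled space.
    sq_dists = (
        np.sum(Z1**2, axis=1)[:, None]
        + np.sum(Z2**2, axis=1)[None, :]
        - 2.0 * Z1 @ Z2.T
    )
    return signal_var * np.exp(-0.5 * sq_dists)
```

The per-dimension lengthscales and the signal variance are exactly the hyperparameters that the amortization network predicts from data, instead of optimizing them anew for each dataset.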
Keyword(s): Bayesian inference; Gaussian process; variational inference; neural network; transformer

Files in this item

File: Rehn_Aki_thesis_2022.pdf | Size: 2.349Mb | Format: PDF
