
Browsing by Subject "neural network"


  • Rehn, Aki (2022)
    The application of Gaussian processes (GPs) is limited by the rather slow optimization of the hyperparameters of a GP kernel, which causes problems especially in applications, such as Bayesian optimization, that involve repeated optimization of the kernel hyperparameters. Recently, the issue was addressed by a method that "amortizes" the inference of the hyperparameters: a hierarchical neural network architecture predicts the GP hyperparameters directly from data; the model is trained on a synthetic GP dataset and in general does not require retraining for unseen data. We asked whether we could understand the method well enough to replicate it with a squared exponential kernel with automatic relevance determination (SE-ARD), and whether it is feasible to extend the system to predict posterior approximations instead of point estimates, in order to support fully Bayesian GPs. We introduce the theory behind Bayesian inference; gradient-based optimization; Gaussian process regression; variational inference; neural networks and the transformer architecture; the method that predicts point estimates of the hyperparameters; and finally our proposed architecture that extends the method to a variational inference framework. We successfully replicated the method from scratch with an SE-ARD kernel, and our experiments show that the replicated version works and gives good results. We also implemented the proposed extension to a variational inference framework. In our experiments, we did not find concrete reasons that would prevent the model from functioning, but observed that the model is very difficult to train. The final model that we were able to train predicted good means for (Gaussian) posterior approximations, but the variances it predicted were abnormally large. We analyze possible causes and suggest future work.
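The SE-ARD kernel discussed in this abstract has a simple closed form: each input dimension gets its own lengthscale, so irrelevant dimensions can effectively be switched off. A minimal sketch (our own illustration, not the thesis's implementation; function and parameter names are ours):

```python
import numpy as np

def se_ard_kernel(X1, X2, lengthscales, variance):
    """Squared exponential kernel with automatic relevance determination (ARD):
    k(x, x') = variance * exp(-0.5 * sum_d ((x_d - x'_d) / l_d)^2),
    where each input dimension d has its own lengthscale l_d."""
    # Rescale each dimension by its own lengthscale.
    A = X1 / lengthscales
    B = X2 / lengthscales
    # Pairwise squared Euclidean distances in the rescaled space.
    sq_dists = (np.sum(A**2, axis=1)[:, None]
                + np.sum(B**2, axis=1)[None, :]
                - 2.0 * A @ B.T)
    return variance * np.exp(-0.5 * sq_dists)
```

The hyperparameters that the amortization network predicts are exactly the per-dimension lengthscales and the signal variance of such a kernel.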
  • Hommy, Antwan (2024)
    Machine learning (ML) is becoming increasingly important in the telecommunications industry. The purpose of machine learning models in telecommunications is to outperform a classical receiver by fine-tuning parameters. Since ML models have the advantage of being more concise, their performance is easier to evaluate than that of a classical receiver, which consists of multiple blocks, each with its own small errors. Evaluating these models, however, is challenging, and identifying the correct hyperparameters is not trivial. To address this issue, a coherent and reliable hyperparameter optimization method needs to be introduced. This thesis investigates how a hyperparameter optimization method can be implemented and which one is best suited for the problem. It looks into the value such a method provides, the metrics displayed for each hyperparameter set during training and inference, and the challenges of realising such a system, in addition to various other qualities needed for an efficient training stage. The framework aims to provide valuable insight into model accuracy, validation loss, computing cost, signal-to-noise ratio improvement, and available resources when using hyperparameter tuning. The framework uses grid search optimization, Bayesian optimization, and genetic algorithm optimization, compares their results, and determines which performs best; grid search acts as a reference baseline for the performance of the other two algorithms. The thesis is split into two parts: Phase One implements the system in a sandbox-like manner, essentially acting as a testing platform to assess implementation compatibility; Phase Two inspects a more realistic scenario suited to a 5G physical layer environment. The proposed framework uses modern, widely used orchestration and development tools, including ResNet, PyTorch, and sklearn.
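Of the three methods, the grid search baseline is the simplest to sketch: it exhaustively evaluates every combination of hyperparameter values and keeps the best. A hedged illustration with a toy objective (the parameter names and objective are hypothetical, not taken from the thesis):

```python
import itertools

def grid_search(objective, grid):
    """Exhaustive grid search: evaluate the objective at every combination
    of hyperparameter values and return the combination with the lowest loss."""
    best_params, best_loss = None, float("inf")
    keys = list(grid)
    for values in itertools.product(*(grid[k] for k in keys)):
        params = dict(zip(keys, values))
        loss = objective(params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss

# Toy objective, minimized at lr=0.1, depth=4 (hypothetical hyperparameters).
obj = lambda p: (p["lr"] - 0.1) ** 2 + (p["depth"] - 4) ** 2
best, loss = grid_search(obj, {"lr": [0.01, 0.1, 1.0], "depth": [2, 4, 8]})
```

Bayesian and genetic optimization replace the exhaustive loop with an informed search over the same parameter space, which is what makes grid search a natural reference baseline.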
  • Li, Yinong (2024)
    This thesis develops a new neural network-based simulation-based inference (SBI) method for flexible point estimation, which we call Neural Amortization of Bayesian Point Estimation (NBPE). Firstly, using neural networks, we achieve amortized inference: most of the computation cost is spent on training the neural network, while performing inference costs only a few milliseconds. We utilize an encoder-decoder architecture: an encoder serves as a summary network that extracts informative features from raw data and feeds them to a decoder, which serves as an inference network that outputs point estimates. Moreover, with a novel training method, a variable exponent $\alpha$ in the loss function $|\theta_i - \theta_{\text{pred}}|^\alpha$ enables the prediction of different statistics (mean, median, mode) of the posterior distribution. Thus, at inference time our method gives a fast point estimate, and the desired statistic of the posterior is selected by specifying the power $\alpha$ of the loss: when $\alpha = 2$, the result is the mean; when $\alpha = 1$, the result is the median; and as $\alpha$ approaches 0, the result approaches the mode. We conducted comprehensive experiments on both toy and simulator models to demonstrate these features. In the first part of the analysis, we tested the accuracy and efficiency of NBPE against the established Neural Posterior Estimation (NPE) method in the BayesFlow SBI software; NBPE achieves competitive accuracy and faster inference than NPE. In the second part, we concentrated on the flexible point estimation capabilities of NBPE. We conducted experiments on three conjugate models, since the posterior mean, median, and mode of most of these models have analytical expressions, which allows a more straightforward analysis. The results show that at inference time the choice of $\alpha$ influences the output exactly as expected. In summary, this thesis proposes a new neural SBI method, NBPE, that performs fast, accurate, and flexible point estimation, broadening the application of SBI in downstream tasks of Bayesian inference.
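The effect of the loss exponent $\alpha$ described in this abstract can be checked numerically on a sample: minimizing $\sum_i |s_i - c|^\alpha$ over candidate values $c$ yields the sample mean for $\alpha = 2$ and the sample median for $\alpha = 1$. A small sketch of that property (our own illustration, not code from the thesis):

```python
import numpy as np

def alpha_point_estimate(samples, alpha, grid):
    """Return the grid value c minimizing sum_i |s_i - c|**alpha.
    alpha=2 recovers the mean, alpha=1 the median, and alpha -> 0
    approaches the mode."""
    losses = np.array([np.sum(np.abs(samples - c) ** alpha) for c in grid])
    return grid[int(np.argmin(losses))]

samples = np.array([0.0, 0.0, 0.0, 1.0, 5.0])  # skewed toy sample
grid = np.linspace(-1.0, 6.0, 1401)            # step 0.005; contains 0.0 and 1.2
est_mean = alpha_point_estimate(samples, 2.0, grid)    # sample mean is 1.2
est_median = alpha_point_estimate(samples, 1.0, grid)  # sample median is 0.0
```

NBPE trains the network under this loss directly, so the minimization over $c$ is performed implicitly by the inference network rather than by a grid scan.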