Causal-aware feature selection for domain adaptation

Title: Causal-aware feature selection for domain adaptation
Author(s): Porna, Ilkka
Contributor: University of Helsinki, Faculty of Science
Degree program: Master's Programme in Data Science
Specialisation: no specialization
Language: English
Acceptance year: 2022
Abstract:
Despite progress in many areas of machine learning in recent decades, a shift in the data source between the domain in which a model is trained and the domain in which the same model is used for prediction remains a fundamental and common problem. In domain adaptation, this situation has been studied by incorporating causal knowledge about the information flow between features into the feature selection for the model. That work has shown promising results toward so-called invariant causal prediction, in which prediction performance is unaffected by the degree of change between domains. Within these approaches, identifying the Markov blanket of the target variable has served as the principal workhorse for finding an optimal starting point. In this thesis, we take a closer look at the invariance of prediction performance within Markov blankets of the target variable. We also include scenarios in which the Markov blanket contains latent parents, to understand how the covariates related to a latent parent affect the invariant prediction properties. Before the experiments, we cover the concepts of Markov blankets, structural causal models, causal feature selection, covariate shift, and target shift. We also look at ways to measure bias between changing domains by introducing transfer bias and incomplete information bias; these biases play an important role in feature selection, which often involves a trade-off between them. In the experiments, simulated data sets are generated from structural causal models to build test scenarios with the changing conditions of interest. Across the scenarios, we investigate changes in the features of Markov blankets between the training and prediction domains. Some scenarios involve changes in latent covariates as well. As a result, we show that parent features are generally stable predictors that enable invariant prediction. An exception is a changing target, which requires more information about the changes in other, earlier domains to enable invariant prediction. In addition, when latent parents are present, it is important to have some true direct causes in the feature set to achieve invariant prediction performance.
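The kind of simulation the abstract describes can be illustrated with a minimal sketch. It is not taken from the thesis: the three-variable structural causal model (parent `x1`, target `y`, child `x2`), the coefficients, and the helper names `simulate`, `fit_ols`, and `mse` are all assumptions chosen for illustration. The sketch changes only the mechanism of the child between domains (a soft intervention) and compares a model trained on the parent alone with one trained on the full Markov blanket.

```python
import numpy as np

def simulate(n, child_coef, rng):
    """Sample from a hypothetical linear SCM: x1 -> y -> x2."""
    x1 = rng.normal(size=n)                    # parent of the target
    y = 2.0 * x1 + rng.normal(size=n)          # target mechanism, same in both domains
    x2 = child_coef * y + rng.normal(size=n)   # child; its coefficient differs by domain
    return np.column_stack([x1, x2]), y

def fit_ols(X, y):
    """Ordinary least squares with an intercept column appended."""
    A = np.column_stack([X, np.ones(len(X))])
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w

def mse(w, X, y):
    """Mean squared error of the linear predictor w on (X, y)."""
    A = np.column_stack([X, np.ones(len(X))])
    return float(np.mean((A @ w - y) ** 2))

rng = np.random.default_rng(0)
X_train, y_train = simulate(5000, child_coef=1.0, rng=rng)   # training domain
X_test, y_test = simulate(5000, child_coef=-1.0, rng=rng)    # shifted prediction domain

feature_sets = {"parents": [0], "blanket": [0, 1]}
results = {}
for name, cols in feature_sets.items():
    w = fit_ols(X_train[:, cols], y_train)
    results[name] = (
        mse(w, X_train[:, cols], y_train),   # error in the training domain
        mse(w, X_test[:, cols], y_test),     # error in the shifted domain
    )

for name, (train_err, test_err) in results.items():
    print(f"{name}: train MSE {train_err:.2f}, shifted-domain MSE {test_err:.2f}")
```

In this toy setting the full Markov blanket gives the lower error in the training domain, but its error grows sharply once the child's mechanism changes, while the parent-only model keeps roughly the same error in both domains. This is only one hand-picked scenario and says nothing about the changing-target or latent-parent cases the thesis examines.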
Keyword(s): causality; domain adaptation; feature selection; Markov blankets; latent variables; soft interventions


Files in this item

Files Size Format View
Porna_Ilkka_thesis_2022.pdf 1.583Mb PDF
