Skip to main content
Login | Suomeksi | På svenska | In English

Browsing by Subject "syväoppiminen"

Sort by: Order: Results:

  • Berg, Anton (2022)
    This master's thesis seeks to conceptually replicate psychologist Michael Kosinski's study, published in 2021 in Nature Scientific Reports, in which he trained a cross-validated logistic regression model to predict political orientations from facial images. Kosinski reported that his model achieved an accuracy of 72\%, which is significantly higher than the 55\% accuracy measured in humans for the same task. Kosinski's research attracted a huge amount of attention and also accusations of pseudoscience. Where Kosinski trained his model with facial features containing information for example about head position and emotions, in this thesis I use a deep learning convolutional neural network for the same task. Also, I train my model with Finnish data, consisting of photos of the faces of Finnish left- and right-wing candidates gathered from the 2021 municipal elections. I research whether a convolutional neural network can learn to predict from candidates' faces whether a member of a Finnish party belongs to either the right-wing Coalition Party (Coalition) or the left-wing Left Alliance (Left Alliance) with better than 55\% accuracy, and what is the possible role of color information on the classification accuracy of the model. On this basis, I also consider the wider ethical issues surrounding these types of models and the technological advances they bring. There has been a recent ethical debate on the widespread use of facial recognition technology in relation to issues such as human autonomy, privacy, and civil liberties. In the context of previous scientific findings, there has also been debate about the potential ability of facial recognition technologies to reveal information about our most personal traits, such as sexual orientation, personality, and emotional states. Thus, facial recognition technologies are also closely related to privacy issues. In his original article, Michael Kosinski did not underestimate the many problematic ethical issues that the use of facial recognition technology can raise. He did, however, underline the role of science in trying to determine the function, capability, and accuracy of these technologies. Only through research can we gain insights into these technologies, which can then potentially be used to inform societal decision-making. This research approach is also the aim of this Master's thesis.
  • Kylliäinen, Ilmari (2022)
    Automatic question answering and question generation are two closely related natural language processing tasks. They both have been studied for decades, and both have a wide range of uses. While systems that can answer questions formed in natural language can help with all kinds of information needs, automatic question generation can be used, for example, to automatically create reading comprehension tasks and improve the interactivity of virtual assistants. These days, the best results in both question answering and question generation are obtained by utilizing pre-trained neural language models based on the transformer architecture. Such models are typically first pre-trained with raw language data and then fine-tuned for various tasks using task-specific annotated datasets. So far, no models that can answer or generate questions purely in Finnish have been reported. In order to create them using modern transformer-based methods, both a pre-trained language model and a sufficiently big dataset suitable for question answering or question generation fine-tuning are required. Although some suitable models that have been pre-trained with Finnish or multilingual data are already available, a big bottleneck is the lack of annotated data needed for fine-tuning the models. In this thesis, I create the first transformer-based neural network models for Finnish question answering and question generation. I present a method for creating a dataset for fine-tuning pre-trained models for the two tasks. The dataset creation is based on automatic translation of an existing dataset (SQuAD) and automatic normalization of the translated data. Using the created dataset, I fine-tune several pre-trained models to answer and generate questions in Finnish and evaluate their performance. I use monolingual BERT and GPT-2 models as well as a multilingual BERT model. The results show that the transformer architecture is well suited also for Finnish question answering and question generation. They also indicate that the synthetically generated dataset can be a useful fine-tuning resource for these tasks. The best results in both tasks are obtained by fine-tuned BERT models which have been pre-trained with only Finnish data. The fine-tuned multilingual BERT models come in close, whereas fine-tuned GPT-2 models are generally found to underperform. The data developed for this thesis will be released to the research community to support future research on question answering and generation, and the models will be released as benchmarks.