Browsing by Author "Vainionpää, Matti"

  • Vainionpää, Matti (2022)
    This dissertation aims to construct a model for personalised item recommendations in an online setting using a reinforcement learning approach, specifically Thompson sampling, which belongs to the family of multi-armed bandit algorithms. The setting is an online shopfront where each arriving customer is shown a recommended item and decides whether to purchase it. The recommendations are made by the multi-armed bandit algorithm, which "plays" different arms, represented by the items, while learning, exploring and exploiting the underlying distributions of the data obtained. Thompson sampling and the theory behind it are introduced thoroughly, and it is compared against other bandit algorithms as well as a multinomial logistic regression model, both on real-life data collected over time from an online environment and on a dummy data set. The experiments focus on the applicability of bandits in this setting, the challenges a bandit algorithm may face, and the strengths bandits have over more traditional and well-known models such as logistic regression.
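
To make the "playing arms" idea concrete, here is a minimal sketch of Thompson sampling for item recommendation. It is not the dissertation's implementation. It assumes a binary reward (purchase or no purchase), a uniform Beta(1, 1) prior on each item's purchase probability, and no customer features, so it leaves out the personalisation the abstract mentions. The names `BernoulliThompsonSampler`, `simulate` and `true_rates` are hypothetical, and the purchase rates are made-up simulation inputs, not results.

```python
import random


class BernoulliThompsonSampler:
    """Thompson sampling with a Beta(1, 1) prior per item (illustrative assumption)."""

    def __init__(self, n_items, rng=None):
        self.successes = [0] * n_items  # purchases observed per item
        self.failures = [0] * n_items   # non-purchases observed per item
        self.rng = rng or random.Random()

    def recommend(self):
        # Draw one plausible purchase rate per item from its Beta posterior,
        # then recommend the item whose draw is largest.
        draws = [
            self.rng.betavariate(s + 1, f + 1)
            for s, f in zip(self.successes, self.failures)
        ]
        return max(range(len(draws)), key=draws.__getitem__)

    def update(self, item, purchased):
        # Record the customer's decision for the item that was shown.
        if purchased:
            self.successes[item] += 1
        else:
            self.failures[item] += 1


def simulate(true_rates, n_customers, seed=0):
    """Simulate customers arriving one at a time; return how often each item was shown."""
    rng = random.Random(seed)
    sampler = BernoulliThompsonSampler(len(true_rates), rng)
    shown = [0] * len(true_rates)
    for _ in range(n_customers):
        item = sampler.recommend()
        # Hypothetical customer: buys with the item's (unknown to the sampler) true rate.
        purchased = rng.random() < true_rates[item]
        sampler.update(item, purchased)
        shown[item] += 1
    return shown


if __name__ == "__main__":
    # Made-up purchase rates for three items.
    print(simulate([0.02, 0.05, 0.20], n_customers=5000))
```

Items with few observations have wide posteriors, so they occasionally produce the largest draw and get shown (exploration). As evidence accumulates, the posteriors narrow and the item with the highest purchase rate is recommended most of the time (exploitation).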