Browsing by Author "Airamo, Niko"


  • Airamo, Niko (2022)
    Regret is the value lost by playing an action on the current round of an iterative game. The idea of regret matching is to generate strategies that minimize regret, since by a folk theorem regret minimization is guaranteed to converge to a Nash equilibrium in a two-player zero-sum game. Storing cumulative regrets after each iteration enables the use of regret matching, an algorithm that chooses the next iteration's strategy based on the cumulative regrets of the actions. This procedure by itself converges to a Nash equilibrium in a normal-form zero-sum game. For an extensive-form game, storing this data becomes too resource-demanding even when the game is only moderately sized. However, it is possible to minimize regret on each information set separately using counterfactual regret. Counterfactual regret is calculated from counterfactual values. A counterfactual value is the expected value given that the player tries to reach an information set and plays a certain action in it. The difference between the value of the action and the expected value of the information set is the counterfactual regret. Minimizing these regrets similarly converges toward a Nash equilibrium. Using the counterfactual regret minimization (CFR) framework, I create an iterative self-play algorithm to solve two-player zero-sum imperfect-information games. In this thesis I work with the last betting round of limit Hold'Em. CFR+ uses improved strategy averaging, does not store negative regrets, and alternates which player's strategy is updated. I did eventually manage to construct the CFR+ algorithm, and it seems extremely effective compared to earlier versions of the algorithm.
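
    The regret matching procedure described in the abstract can be illustrated on a small normal-form game. The following is a minimal sketch, not the thesis implementation: the function names `regret_matching` and `solve_matrix_game`, the 2x2 payoff matrix, and the iteration count are all hypothetical choices made for illustration. With `plus=True` the sketch mimics two of the CFR+ ingredients named above (negative regrets are not stored, and the average strategy is weighted). The linear weighting by iteration number is an assumption about what "improved strategy averaging" means, and the alternating player updates are omitted for brevity.

```python
def regret_matching(cum_regret):
    """Return a strategy proportional to the positive cumulative regrets."""
    positive = [max(r, 0.0) for r in cum_regret]
    total = sum(positive)
    if total > 0.0:
        return [p / total for p in positive]
    # No action has positive regret: fall back to the uniform strategy.
    return [1.0 / len(cum_regret)] * len(cum_regret)


def solve_matrix_game(payoff, iterations=50000, plus=False):
    """Self-play on a zero-sum matrix game (hypothetical helper).

    payoff[i][j] is the row player's payoff; the column player receives
    its negation. Returns the average strategies (row, column).
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    row_regret, col_regret = [0.0] * n_rows, [0.0] * n_cols
    row_sum, col_sum = [0.0] * n_rows, [0.0] * n_cols

    for t in range(1, iterations + 1):
        row = regret_matching(row_regret)
        col = regret_matching(col_regret)

        # Expected value of each pure action against the opponent's strategy.
        row_values = [sum(payoff[i][j] * col[j] for j in range(n_cols))
                      for i in range(n_rows)]
        col_values = [-sum(payoff[i][j] * row[i] for i in range(n_rows))
                      for j in range(n_cols)]
        row_ev = sum(p * v for p, v in zip(row, row_values))
        col_ev = sum(p * v for p, v in zip(col, col_values))

        # Regret of an action: its value minus the value of the current strategy.
        for i in range(n_rows):
            row_regret[i] += row_values[i] - row_ev
        for j in range(n_cols):
            col_regret[j] += col_values[j] - col_ev

        if plus:
            # Assumption: "does not store negative regrets" means clipping at zero.
            row_regret = [max(r, 0.0) for r in row_regret]
            col_regret = [max(r, 0.0) for r in col_regret]

        # Assumption: linear weighting stands in for the improved averaging.
        weight = t if plus else 1
        for i in range(n_rows):
            row_sum[i] += weight * row[i]
        for j in range(n_cols):
            col_sum[j] += weight * col[j]

    row_total, col_total = sum(row_sum), sum(col_sum)
    return ([s / row_total for s in row_sum],
            [s / col_total for s in col_sum])


if __name__ == "__main__":
    game = [[2.0, -1.0], [-1.0, 1.0]]  # illustrative payoffs only
    print(solve_matrix_game(game))
    print(solve_matrix_game(game, plus=True))
```

    For this illustrative matrix the unique equilibrium has both players choosing their first action with probability 0.4, so the returned averages should approach that value. It is the average strategy, not the strategy of the final iteration, that converges. A counterfactual regret minimizer applies the same update separately at every information set of an extensive-form game, which this sketch does not attempt.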