Applying Temporal Difference Learning on the Board Game Risk

Kasper Emil Dueholm Freiman, Johan Mogens Følsgaard & Amir Zoet

Student thesis: Master's project

Abstract

The goal of this report was to attempt to successfully train an AI to play a digital version
of the board game Risk, using a neural network and temporal difference learning. Due
in part to a lack of computational power and in part to encourage more aggressive behaviour,
some limitations were introduced to the rules of the game. Using constant values for
the learning rate parameter, α = 0.9, and the decay parameter, λ = 0.5, with 127 input nodes
and 20 hidden nodes, the AI was trained through 30000 sessions of Risk against itself.
Afterwards, the trained AI played out 5000 games against an AI with random weights,
resulting in 3009 wins for the trained AI. The trained AI was shown to have a win rate
of 58% to 62% against a random opponent with 99% certainty. Using the same values
for α, λ, and the number of hidden nodes, another instance of the AI was trained for
25000 games against 3 copies of itself using 211 input nodes. After 5000 games against 3
random players, the trained AI could not be shown to have a win rate above 25%,
meaning no training effect was evident. The results thus show that an AI can at least be
successfully trained to play Risk, with some limitations, against a single random opponent.
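The reported 99% interval can be checked against the raw win count with a normal-approximation (Wald) binomial confidence interval. The sketch below assumes that standard approximation; the report's exact method is not stated:

```python
import math

# Check the 99% confidence interval for the trained AI's win rate,
# using the figures stated in the abstract: 3009 wins out of 5000 games.
wins, games = 3009, 5000
p_hat = wins / games  # observed win rate, about 0.602

z = 2.576  # z-score for a two-sided 99% interval
margin = z * math.sqrt(p_hat * (1 - p_hat) / games)

lower, upper = p_hat - margin, p_hat + margin
print(f"win rate: {p_hat:.1%}, 99% CI: [{lower:.1%}, {upper:.1%}]")
```

Under this approximation the interval comes out to roughly 58% to 62%, consistent with the range quoted above.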

Degree programmes: Computer Science (Bachelor/Master programme), Master
Language: English
Publication date: 10 Jun 2016
Number of pages: 38
Supervisors: Keld Helsgaun