Making it easier to play my Tic-Tac-Toe agent

Background #

As part of my senior project in undergrad, I built a Tic-Tac-Toe-playing RL agent. It uses the simple temporal difference (TD) update rule from Chapter 1 of Sutton and Barto’s RL book. In fact, that chapter contains every detail needed to build the agent: it works through Tic-Tac-Toe as a case study and covers everything from what the environment has to provide to the agent’s update rule. It’s definitely worth a read, especially if you want to implement one yourself.
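For readers who haven’t seen it, the TD rule from that chapter nudges the value of a state toward the value of the state that follows it: V(s) ← V(s) + α [V(s') − V(s)]. Here is a minimal sketch of that rule in Python. It is not the code from my repository; the function name `td_update`, the dictionary-based value table, the default value of 0.5 for unseen states, and the step size are all assumptions made for illustration.

```python
def td_update(values, state, next_state, alpha=0.1, default=0.5):
    """Move V(state) a fraction alpha toward V(next_state).

    values     -- dict mapping a hashable board state to its estimated value
    alpha      -- step size (hypothetical default)
    default    -- value assumed for states not yet in the table (hypothetical)
    """
    v_s = values.get(state, default)
    v_next = values.get(next_state, default)
    values[state] = v_s + alpha * (v_next - v_s)
    return values[state]


# Toy example: a winning state is worth 1.0, and "s" leads directly to it.
values = {"win": 1.0}
new_value = td_update(values, "s", "win", alpha=0.5)
print(new_value)  # 0.5 + 0.5 * (1.0 - 0.5) = 0.75
```

Repeating this update over many games is what lets the estimates for early board positions gradually reflect how often they lead to a win.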

The Update #

A friend asked me how he could play against my trained agent, so I wrote a simple script that makes it easy for anyone (with a tiny bit of terminal knowledge) to do so. Here’s how:

Head over to my GitHub repository and clone it:

git clone https://github.com/jfpettit/senior-practicum.git

Once you’ve cloned it, cd into the repository and then into the Tic-Tac-Toe folder:

cd senior-practicum/TD_tictactoe/

Finally, run the game with:

python tictactoe_runner.py

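Here’s a sample game I played against it, so you know what kind of output to expect in your terminal. Moves are entered as row,column with zero-based indices, so 1,1 is the center square and 2,0 is the bottom-left corner: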

Jacobs-MacBook-Pro:TD_TicTacToe jacobpettit$ python tictactoe_runner.py 
Select piece to play as: input X or O:x
[['-' '-' '-']
 ['-' '-' '-']
 ['-' '-' '-']]
Input your move coordinates, separated by a comma: 1,1
[['-' '-' 'O']
 ['-' 'X' '-']
 ['-' '-' '-']]
Input your move coordinates, separated by a comma: 2,0
[['-' '-' 'O']
 ['-' 'X' '-']
 ['X' '-' 'O']]
Input your move coordinates, separated by a comma: 1,2
[['-' '-' 'O']
 ['O' 'X' 'X']
 ['X' '-' 'O']]
Input your move coordinates, separated by a comma: 2,1
[['-' 'O' 'O']
 ['O' 'X' 'X']
 ['X' 'X' 'O']]
Input your move coordinates, separated by a comma: 0,0
[['X' 'O' 'O']
 ['O' 'X' 'X']
 ['X' 'X' 'O']]

So, in this case, nobody won and the game ended in a draw. The code doesn’t print the winner of the game, so don’t expect any output after that last move.
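If you’d like to confirm a result yourself, a small helper along these lines would do it. This is only a sketch and not part of the repository: `find_winner` is a hypothetical function, and it assumes the board is a 3x3 grid of 'X', 'O', and '-' like the one printed above (written here as plain Python lists).

```python
def find_winner(board):
    """Return 'X' or 'O' if a line is complete, 'draw' if the board is full, else None."""
    lines = [list(row) for row in board]                           # three rows
    lines += [[board[r][c] for r in range(3)] for c in range(3)]   # three columns
    lines.append([board[i][i] for i in range(3)])                  # main diagonal
    lines.append([board[i][2 - i] for i in range(3)])              # anti-diagonal

    for line in lines:
        if line[0] != "-" and line[0] == line[1] == line[2]:
            return line[0]

    # No completed line: a full board is a draw, otherwise the game is still going.
    if all(cell != "-" for row in board for cell in row):
        return "draw"
    return None


# The final board from the sample game above.
final_board = [
    ["X", "O", "O"],
    ["O", "X", "X"],
    ["X", "X", "O"],
]
print(find_winner(final_board))  # draw
```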

 