The Minimax Algorithm in Tic-Tac-Toe: When graphs, game theory and algorithms come together
The graphs are very fun to explore and study! … until they get too big. Consider, for the example, the game of Tic-Tac-Toe. If we make each possible state of the board a node and connect two nodes with an edge if there is a difference of one move between them, the graph we will get is going to consist of 362,880 different nodes! By eliminating symmetry and recognizing when different sequences of moves lead to the same position, Gary Fredericks got this number down to 765 distinct board positions. Here is the graph visualization he came up with:
Source of the picture: https://gfredericks.com/blog/76
To truly become unbeatable at the game and always make the move with the greatest payoff, one would have to be able to model every possible outcome for every possible state of the field. This sounds like something nearly impossible to do for a human, and that is when the Minimax algorithm (and the game theory) come for the rescue.
The name of the algorithm is a product of two words: minimize and maximize. That is, minimize the payoff for the opponent, maximize the payoff for yourself. If a sequence of moves leads to a victory, its payoff is evaluated at +10 points. If it leads to a loss, the payoff is -10. If a sequence ultimately results in a draw, the payoff it brings to the table is 0.
As an example, let’s look at the near end of a game, when it’s turn for Player X to make a move:
Source of the picture: https://www.neverstopbuilding.com/blog/minimax
Out of the three available nodes the initial state node is connected to, the one on the left yields the greatest payoff (+10 points), so this is the move to make.
Similarly, player O is trying to maximize their own payoff. Player X knows this and evaluates their opponent’s moves as well. The following picture serves as a good example of this process:
Source of the picture: https://www.neverstopbuilding.com/blog/minimax
Both top nodes are states of the board (move sequences) which bring a payoff of -10. Even though there is a two-step path to a node with a payoff of +10, there is an obvious choice Player O will make (the algorithm assumes them to be a perfect minimizer) that will yield a good outcome for them (and a bad one for Player X).
The Minimax algorithm evaluates the payoff of each available move by taking turns for the minimizer and the maximizer. The following picture is a perfect example of the evaluation process:
Source of the picture: https://www.geeksforgeeks.org/minimax-algorithm-in-game-theory-set-3-tic-tac-toe-ai-finding-optimal-move/
Here, the initial state is node 1, so X has three possible moves. The payoff for node 2 is -10 (I explained why in the previous example). The payoff for node 3 is 0, because Player O is assumed to be the perfect minimizer, and therefore will not allow X to get +10. The payoff for node 4 is +10, so this is the move to make.
The Minimax Tic-Tac-Toe algorithm is impossible to beat, and when two Minimaxes play against each other, every move they make is the best response to what the opponent could possibly do (Nash equilibrium), resulting in 100% chance of a draw. Want to try it yourself? Check out this website developed by Jason Fox of Never Stop Building LLC: http://perfecttictactoe.herokuapp.com/ .
Sources:
Aradhya, A. L. (2022, April 29). Minimax algorithm in Game theory: Set 3 (tic-tac-toe ai – finding optimal move). GeeksforGeeks. Retrieved September 13, 2022, from https://www.geeksforgeeks.org/minimax-algorithm-in-game-theory-set-3-tic-tac-toe-ai-finding-optimal-move/
Fox, J. (2021, November 10). Tic Tac Toe: Understanding the minimax algorithm. Never Stop Building – Crafting Wood with Japanese Techniques. Retrieved September 13, 2022, from https://www.neverstopbuilding.com/blog/minimax
Fredericks, G. (2010, September 18). Visualizing Tic-tac-Toe. gfredericks.com. Retrieved September 13, 2022, from https://gfredericks.com/blog/76
[…] Запрос на удаление Посмотреть полный ответ на blogs.cornell.edu […]