Nash Equilibrium and Chess Engines
Article: Giraffe: Using Deep Reinforcement Learning to Play Chess
In class, we were introduced to Game Theory, specifically analyzing strategies in a game and finding Nash equilibria. While the examples covered in lecture and the textbook are simple enough that students can compute the equilibria by hand, many more complex games have Nash equilibria that are hard to find. If a Nash equilibrium tells us the best strategy for a player in a given game, then it makes sense that we could build Artificial Intelligence (AI) bots that play the game optimally. However, for a complex game such as chess, even though we know a Nash equilibrium exists (chess is a finite game, so Nash's theorem guarantees one), actually computing it is intractable, so we need to employ other strategies to create strong chess engines.
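To see what "easy to compute by hand" looks like in code, the short Python sketch below checks every strategy pair of a small 2x2 game and keeps the pairs from which neither player can profitably deviate. The payoff numbers are made up for illustration (a Prisoner's Dilemma-style table), not taken from the article.

```python
from itertools import product

# Hypothetical 2x2 game: payoffs[(row_strategy, col_strategy)] = (row_payoff, col_payoff).
# Illustrative numbers only, in the style of the Prisoner's Dilemma (0 = cooperate, 1 = defect).
payoffs = {
    (0, 0): (3, 3), (0, 1): (0, 5),
    (1, 0): (5, 0), (1, 1): (1, 1),
}

def is_nash(r, c):
    """A strategy pair is a pure Nash equilibrium if neither player can
    gain by unilaterally switching to their other strategy."""
    row_ok = payoffs[(r, c)][0] >= max(payoffs[(rr, c)][0] for rr in (0, 1))
    col_ok = payoffs[(r, c)][1] >= max(payoffs[(r, cc)][1] for cc in (0, 1))
    return row_ok and col_ok

equilibria = [(r, c) for r, c in product((0, 1), repeat=2) if is_nash(r, c)]
print(equilibria)  # [(1, 1)] -- mutual defection is the only pure equilibrium here
```

Brute-force checking like this works because a 2x2 game has only four strategy pairs to test; the whole point of the chess discussion below is that the same exhaustive approach blows up once the game tree becomes astronomically large.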
A lot of research has gone into developing chess engines, with considerable success, and the main strategy is to pick the move that is most likely to lead to a winning game. Limits on processing power make it impossible to enumerate every possible move and, for each one, all of the subsequent moves through to the end of the game. Therefore, chess engines can only approximate a Nash equilibrium while placing constraints on the search to keep the computation feasible, and this is where machine learning comes in. Neural networks and reinforcement learning help chess engines learn common patterns and strategies, both to narrow the space that must be searched for the next best move and to assign scores to candidate moves, so the engine can judge how effective each one is. Hence, designing a chess engine also involves designing the machine learning models, choosing which heuristics to prioritize, and deciding how best to score different board positions and configurations. The development of chess engines continues to change and evolve, incorporating approaches such as deep learning, self-play, and other variants of machine learning, but regardless of the implementation, the end goal is still to approximate a Nash equilibrium: an optimal strategy in response to any strategy the opponent plays.
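To make the "limited search plus an evaluation function" idea concrete, here is a minimal sketch (not Giraffe's actual code) of depth-limited minimax with alpha-beta pruning: the search only looks a few plies ahead, pruning skips branches that cannot change the result, and a simple material count stands in for the learned neural-network evaluation that an engine like Giraffe trains through reinforcement learning. The use of the python-chess library, the depth chosen, and all function names here are assumptions made for illustration.

```python
import chess  # assumes the python-chess package is installed

PIECE_VALUES = {chess.PAWN: 1, chess.KNIGHT: 3, chess.BISHOP: 3,
                chess.ROOK: 5, chess.QUEEN: 9, chess.KING: 0}

def evaluate(board: chess.Board) -> float:
    """Score a position from White's point of view using material only.
    A learned evaluation (as in Giraffe) would replace this function."""
    score = 0.0
    for piece in board.piece_map().values():
        value = PIECE_VALUES[piece.piece_type]
        score += value if piece.color == chess.WHITE else -value
    return score

def minimax(board: chess.Board, depth: int, alpha: float, beta: float) -> float:
    """Depth-limited minimax with alpha-beta pruning."""
    if depth == 0 or board.is_game_over():
        return evaluate(board)
    if board.turn == chess.WHITE:   # maximizing player
        best = float("-inf")
        for move in board.legal_moves:
            board.push(move)
            best = max(best, minimax(board, depth - 1, alpha, beta))
            board.pop()
            alpha = max(alpha, best)
            if alpha >= beta:
                break               # prune: the opponent will avoid this branch
        return best
    else:                           # minimizing player
        best = float("inf")
        for move in board.legal_moves:
            board.push(move)
            best = min(best, minimax(board, depth - 1, alpha, beta))
            board.pop()
            beta = min(beta, best)
            if alpha >= beta:
                break
        return best

def best_move(board: chess.Board, depth: int = 2) -> chess.Move:
    """Pick the legal move whose subtree scores best for the side to move."""
    moves = list(board.legal_moves)
    scores = []
    for move in moves:
        board.push(move)
        scores.append(minimax(board, depth - 1, float("-inf"), float("inf")))
        board.pop()
    key = max if board.turn == chess.WHITE else min
    return moves[scores.index(key(scores))]

print(best_move(chess.Board()))     # prints some opening move for White
```

Under this framing, the machine learning contribution is exactly the part this sketch fakes: replacing the hand-written material count with a trained evaluation of board positions, and using learned patterns to decide which branches deserve deeper search.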