


Considering a Dominant Strategy for Codenames

Codenames is a fun family game which can be analyzed extensively using graph theory and game theory. The game is set up with a grid of words. The words are divided into red, blue, white, and black, but the players cannot tell which word is which color. There are two teams (red and blue), and each team has a codemaster who knows the color of every word. To play, the codemasters take turns giving hints to their teammates, who then try to guess which words belong to their team. Hints are given in the form of a single word (which cannot be one of the words on the board) followed by a number; the team may keep guessing up to one more than that number, as long as every word they guess is correct. The game ends when all of one team's words have been guessed. Finally, if either team guesses the one black word, they lose immediately.

Example of a Codenames game in progress.

Since Codenames is an actual game, we can take a game theory approach and consider the payoffs of different strategies to try to find ones that are dominant. There are two roles in Codenames: the codemaster and the guesser. First, let's consider the strategies available to each player. A codemaster can say any word not present on the board along with a number from zero to nine. If we take the English language to have around 170,000 words, that gives 1.7 million word-number combinations. The guesser has far fewer strategies: at any time they can only guess one of the words remaining on the board, which gives at most 25 options (at the start of the game). This large difference makes the guesser's dominant strategy simpler than the codemaster's, but since the guesser and the codemaster are on the same team, their payoffs are linked, and the payoff of any guesser strategy depends on the strategy of the codemaster.
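As a quick back-of-the-envelope check on those numbers (using the same rough 170,000-word estimate from above, not an exact count):

```python
english_words = 170_000            # rough estimate of English vocabulary size
clue_numbers = 10                  # a clue number from zero to nine
codemaster_strategies = english_words * clue_numbers
print(codemaster_strategies)       # 1,700,000 word-number combinations

guesser_strategies = 25            # one guess per word on a full 25-word board
print(guesser_strategies)
```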

From my experience playing Codenames, there is one dominant behavior that both teams employ. The codemaster picks a subset of words on the board, all of their own team's color, thinks of a word which is closely linked to every word in that subset, and then says this word followed by the number of words in the subset. When this strategy is used, the winning team is usually the one whose hints are abstract enough to apply to many words yet specific enough that the guesser doesn't guess the other team's words by mistake. In line with this, the guesser's dominant strategy is to guess the number of words given by the codemaster, starting with the word most related to the hint and finishing with the least related.

To implement this strategy, we need to define what it means for words to be "linked" or "related." In principle, any definition of relatedness is acceptable as long as both the codemaster and the guesser agree on it. As an example of an unusual definition, one could say that two words are more related if they share more letters. Another definition is that words are related if they have similar semantic value. While semantic relatedness is difficult to quantify mathematically, human minds do this task very well. Natural Language Processing research has attacked this problem, and researchers have been able to represent words as vectors in n-dimensional space such that similar words have similar directions and magnitudes [1]. With a little computational work, we can translate this vector space into a relatedness graph. In this graph, each word is a node, and there exist weighted edges between words such that the weight of an edge represents the semantic relatedness of the nodes it connects, on some fixed scale (say, 0 to 1). For example, the edge between "cat" and "dog" might have a value of 0.75, while the edge between "cat" and "baseball" might have a value of 0.1. To calculate the weight of an edge between two words, we could project one word's vector onto the other's and measure how similar the projection is in magnitude to the original vector (there are surely other ways to measure this similarity as well). Additionally, if we were actually doing this work on a computer, we could greatly reduce the memory requirements of this graph by picking a threshold and only adding an edge when the relatedness value exceeds that threshold.
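To make this concrete, here is a minimal sketch of how such a graph could be built, using cosine similarity (a normalized projection) as one way to realize the projection idea above. The embedding values below are made up for illustration; in practice they would come from a pretrained model such as word2vec or GloVe.

```python
from itertools import combinations
import numpy as np

# Toy embedding table; the numbers are invented purely for illustration.
vectors = {
    "cat":      np.array([0.90, 0.80, 0.10]),
    "dog":      np.array([0.85, 0.75, 0.20]),
    "baseball": np.array([0.10, 0.20, 0.95]),
    "bat":      np.array([0.50, 0.40, 0.70]),
}

def relatedness(u, v):
    """Cosine similarity: the normalized projection of one vector onto the other."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def build_graph(vectors, threshold=0.5):
    """Weighted relatedness graph as a dict of dicts; edges below the threshold are dropped."""
    graph = {w: {} for w in vectors}
    for a, b in combinations(vectors, 2):
        weight = relatedness(vectors[a], vectors[b])
        if weight >= threshold:
            graph[a][b] = weight
            graph[b][a] = weight
    return graph

graph = build_graph(vectors, threshold=0.5)
# graph["cat"]["dog"] is close to 1, while the "cat"-"baseball" pair
# falls below the threshold and gets no edge, saving memory.
```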

If we assume that both the guesser and the codemaster have access to this graph, we can define their best strategies in terms of it. First, the guesser: the guesser receives a word and a number. They should look at the edges between the given word and their options (the unpicked words on the board) and guess the word connected by the highest-weight edge. If this word turns out to be correct, they should then pick the word connected by the next-highest-weight edge, and so on, stopping once they have guessed the number of words given by the codemaster. The codemaster, in turn, should look at all words in the graph which are connected to one or more of the target words. For each of these potential clues, the codemaster should order the words on the board by relatedness to the clue, and let n_clue be the number of target words (words their team needs to guess) that appear in this ordering before the first non-target word. The codemaster should give the clue which maximizes n_clue, together with the number n_clue.
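Here is a rough sketch of both procedures, assuming the dict-of-dicts `graph` from the previous snippet; the function names are mine, and a real guesser would of course stop early after a wrong guess.

```python
def guess_order(graph, clue, board_words, number):
    """Guesser: rank the remaining board words by the weight of their edge to the
    clue and plan to guess the top `number` of them, most related first."""
    ranked = sorted(board_words,
                    key=lambda w: graph.get(clue, {}).get(w, 0.0),
                    reverse=True)
    return ranked[:number]

def best_clue(graph, board_words, target_words):
    """Codemaster: for each candidate clue, count how many target words lead the
    relatedness ranking before the first non-target word (n_clue), and return
    the clue that maximizes this count, along with the count itself."""
    best_word, best_n = None, 0
    for clue in graph:
        if clue in board_words:
            continue  # a clue cannot be a word on the board
        ranked = sorted(board_words,
                        key=lambda w: graph[clue].get(w, 0.0),
                        reverse=True)
        n_clue = 0
        for w in ranked:
            if w in target_words and graph[clue].get(w, 0.0) > 0:
                n_clue += 1
            else:
                break
        if n_clue > best_n:
            best_word, best_n = clue, n_clue
    return best_word, best_n

# Hypothetical usage with the toy graph above:
# clue, n = best_clue(graph, board_words={"cat", "baseball"}, target_words={"cat"})
# guesses = guess_order(graph, clue, board_words={"cat", "baseball"}, number=n)
```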

 

 

  1. K. Erk and S. Padó. 2008. A structured vector space model for word meaning in context. In Proceedings of EMNLP-08, Hawaii.
