GANs as Games
[Figure: computer-generated face images [4]]
Take a look at the images above. While they may look like random photographs of people's faces, they were in fact created by a computer. Generating images this realistic was an incredibly hard, if not impossible, task for computers until the development of Generative Adversarial Networks (GANs) in 2014 [1]. At a high level, the purpose of a GAN is to build a model that generates data as close to a given distribution of data as possible (for example, the GAN that generated the photos above was given a dataset of human headshots and was quite successful at creating data to match that distribution). GANs do this by using two separate neural networks: a generator and a discriminator (the specifics of how neural networks work are unimportant here; we can think of them as black boxes that try their best to approximate a given transformation). The generator takes random noise as input and transforms it into data that resembles the given distribution. The discriminator tries to detect whether samples of data were drawn from the original distribution or created by the generator. In a sense, the two networks are competing with each other: the generator is trying to create realistic-looking data while the discriminator is trying to differentiate between the real and generated data. This interaction is what gives GANs their "adversarial" nature.
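To make the generator/discriminator roles concrete, here is a minimal sketch in NumPy. The "real" data, the one-parameter linear "networks," and all of their weights are hypothetical toy stand-ins (real GANs use deep networks and image data), but the forward passes show each player's job: the generator maps noise to candidate samples, and the discriminator assigns each sample a probability of being real.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "real" distribution: a 1-D Gaussian stands in for the
# dataset (e.g. headshots) that the GAN is asked to imitate.
real = rng.normal(loc=3.0, scale=1.0, size=64)

def generator(z, w, b):
    """Transform random noise z into candidate data (a linear stand-in
    for a neural network)."""
    return w * z + b

def discriminator(x, a, c):
    """Return the probability that each sample in x came from the real
    distribution (a logistic stand-in for a neural network)."""
    return 1.0 / (1.0 + np.exp(-(a * x + c)))

z = rng.normal(size=64)             # random noise input
fake = generator(z, w=1.0, b=0.0)   # untrained generator: noise passes through

p_real = discriminator(real, a=1.0, c=-3.0)
p_fake = discriminator(fake, a=1.0, c=-3.0)

# An untrained generator's samples are easy to spot: the discriminator
# assigns a higher average "real" probability to the true data.
print(p_real.mean(), p_fake.mean())
```

Training would alternate updates to the two sets of weights, with the generator shifting `(w, b)` until `fake` becomes hard to distinguish from `real`.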
This adversarial process can be represented as a zero-sum game between the generator and discriminator [2]. A game requires players, strategies from which the players can choose, and payoffs for the players given a pair of chosen strategies. Each of these components can be mapped onto GANs. The players are the generator and the discriminator, denoted g and d respectively. The strategies are the weights of the neural networks (these are what the networks use to approximate their assigned transformations); let the weights of the generator and discriminator be x_g and x_d respectively. The payoff for each pair of strategies is derived from a loss function J_d(x_g, x_d) that measures how well the discriminator is performing. The payoff for the discriminator is J_d(x_g, x_d), while the payoff for the generator is -J_d(x_g, x_d). The fact that the payoffs are opposites is what makes the game zero sum. Having set up GANs as a game, we can look for a Nash equilibrium: a pair of strategies at which neither the generator nor the discriminator is incentivized to change. Unfortunately, finding Nash equilibria for GANs is exceedingly difficult (in class we only looked at games with three or fewer strategies per player, but in the case of GANs each player has an infinite number of strategies). In fact, the problem has proven so difficult that some researchers hypothesize that "GANs may have no Nash equilibria" [3].
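The zero-sum setup can be illustrated by discretizing each player down to a handful of strategies, which is a drastic simplification of the infinite weight spaces x_g and x_d but makes the equilibrium condition checkable by brute force. The payoff table below is hypothetical; in a zero-sum game a pure Nash equilibrium is exactly a saddle point of the discriminator's payoff J_d.

```python
import numpy as np

# Hypothetical discretized payoffs J_d(x_g, x_d): rows index generator
# strategies x_g, columns index discriminator strategies x_d. The
# discriminator receives J_d and the generator receives -J_d (zero sum).
J = np.array([[ 1.0, -1.0],
              [-1.0,  1.0]])  # a matching-pennies-style payoff table

def pure_nash_equilibria(J):
    """Find all (i, j) where neither player gains by deviating alone:
    j maximizes the discriminator's payoff within row i, and i
    maximizes the generator's payoff -J (i.e. minimizes J) within
    column j -- a saddle point of J."""
    eq = []
    for i in range(J.shape[0]):
        for j in range(J.shape[1]):
            if J[i, j] == J[i].max() and J[i, j] == J[:, j].min():
                eq.append((i, j))
    return eq

print(pure_nash_equilibria(J))  # [] -- no pure-strategy equilibrium here
```

For this payoff table every cell leaves one player wanting to switch, so no pure-strategy equilibrium exists, echoing (in a much simpler setting) why equilibria can be elusive for GANs. Swapping in a table with a saddle point, such as `[[2, 1], [3, 4]]`, makes the function return `[(0, 0)]`.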
[1] https://arxiv.org/abs/1406.2661
[2] https://arxiv.org/pdf/2003.13637
[3] https://arxiv.org/abs/2002.09124
[4] https://github.com/NVlabs/stylegan2