Equilibria for Generative Adversarial Networks
http://proceedings.mlr.press/v119/farnia20a/farnia20a.pdf
Generative Adversarial Networks (GANs) are a common approach to AI image generation. A GAN consists of two neural network models (large functions whose behavior is determined by optimizable parameters) trained by playing a minimax (zero-sum) game against each other. One of these models is the generator, which transforms random noise into an image according to its parameters. The other is the discriminator, which predicts the probability that a given image is real. Each unique combination of parameter values is a strategy that a model can use. The goal of training a GAN is for the models to reach a pure Nash equilibrium; the assumption is that a generator at equilibrium produces images indistinguishable from the set of real images given to the discriminator.
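To make the minimax setup concrete, here is a minimal numerical sketch (an illustration, not from the paper) of the value function V(G, D) for a toy 1-D "image" distribution. The linear generator and logistic discriminator here are hypothetical stand-ins for real networks:

```python
import numpy as np

rng = np.random.default_rng(0)

def generator(z, theta):
    # Hypothetical linear "generator": G(z) = scale * z + shift,
    # with theta = (scale, shift) as its parameters.
    return theta[0] * z + theta[1]

def discriminator(x, phi):
    # Hypothetical logistic "discriminator": D(x) = sigmoid(w * x + b),
    # the predicted probability that x is real, with phi = (w, b).
    return 1.0 / (1.0 + np.exp(-(phi[0] * x + phi[1])))

def value(theta, phi, x_real, z):
    # Standard GAN value function:
    # V(G, D) = E[log D(x_real)] + E[log(1 - D(G(z)))]
    d_real = discriminator(x_real, phi)
    d_fake = discriminator(generator(z, theta), phi)
    return np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake))

x_real = rng.normal(2.0, 1.0, size=1000)  # "real" data: samples from N(2, 1)
z = rng.normal(0.0, 1.0, size=1000)       # generator input noise: N(0, 1)

v_match = value((1.0, 2.0), (1.0, -2.0), x_real, z)     # G matches the data
v_mismatch = value((1.0, 0.0), (1.0, -2.0), x_real, z)  # G does not
print(v_match, v_mismatch)
```

Against this fixed discriminator, the mismatched generator yields a larger V, since the discriminator can separate its samples from the real data; the matched generator's samples are indistinguishable and V is lower.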
In practice, reaching an equilibrium solution is more complicated. The outcome of every round of the game is determined by the value function V(G, D), which measures how accurately the discriminator D classifies real images and images produced by the generator G. Thus, the discriminator's objective is to maximize V, while the generator's objective is to minimize it. A model can only change strategies locally, taking gradient steps toward nearby strategies that it estimates will improve its outcome. However, this game is unstable. Experiments have shown that after training both models for many iterations (so that the generator produces realistic images), continuing to train only the generator causes V to decrease but the images to become less realistic. This implies that the solution that had previously been reached was not an equilibrium.
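The instability of local, simultaneous updates can be seen even on a toy bilinear game V(g, d) = g·d with one scalar parameter per player, whose unique equilibrium is (0, 0). This sketch (an illustration, not the paper's experiment) shows simultaneous gradient descent-ascent spiraling away from the equilibrium:

```python
def simultaneous_gda(g, d, lr=0.1, steps=200):
    # Toy game V(g, d) = g * d, so dV/dg = d and dV/dd = g.
    # The generator descends and the discriminator ascends at the
    # same time, each seeing only the local gradient.
    for _ in range(steps):
        g, d = g - lr * d, d + lr * g
    return g, d

g_final, d_final = simultaneous_gda(0.5, 0.5)
print(g_final, d_final)  # the iterates have moved away from (0, 0)
```

Each step multiplies the squared distance from the equilibrium by exactly (1 + lr²), so the iterates diverge for any positive learning rate.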
A pair of strategies (G*, D*) is a pure Nash equilibrium if and only if, for all possible G and D, V(G*, D) ≤ V(G*, D*) ≤ V(G, D*). This means that G* is a best response to D* and D* is a best response to G*. It can be proven that many GAN formulations have no pure Nash equilibrium at all. A more flexible definition of equilibrium is the Stackelberg equilibrium. A pair of strategies (G*, D*) is a Stackelberg equilibrium solution if and only if no other G has a better outcome than G* given that the discriminator chooses the best response to whichever G is used, and D* is the best response to G*.
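Both definitions can be checked by brute force on a small finite zero-sum game. The sketch below (an illustration using a matching-pennies-style payoff matrix, not from the paper) shows a game with no pure Nash equilibrium that nonetheless has a Stackelberg solution:

```python
import numpy as np

# Rows are generator strategies (the minimizer), columns are discriminator
# strategies (the maximizer).  This matching-pennies-style payoff matrix
# has no pure Nash equilibrium.
V = np.array([[ 1.0, -1.0],
              [-1.0,  1.0]])

def pure_nash(V):
    # (i, j) is a pure Nash equilibrium iff V[i, j] is simultaneously the
    # maximum of its row (D cannot improve) and the minimum of its column
    # (G cannot improve) -- i.e. a saddle point of the matrix.
    for i in range(V.shape[0]):
        for j in range(V.shape[1]):
            if V[i, j] == V[i, :].max() and V[i, j] == V[:, j].min():
                return (i, j)
    return None

def stackelberg(V):
    # The generator leads: it picks the row whose best discriminator
    # response (the row maximum) is smallest; the discriminator then
    # best-responds to that row.
    i_star = int(np.argmin(V.max(axis=1)))
    j_star = int(np.argmax(V[i_star, :]))
    return (i_star, j_star)

print(pure_nash(V))    # None -- no saddle point exists
print(stackelberg(V))  # a Stackelberg solution still exists
```

This is why Stackelberg equilibria are the more forgiving notion: in a finite game the leader's minimax choice always exists, while a saddle point may not.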
These two definitions can be combined by defining a proximal value function V-prox(G, D) = max(D' ∈ D)[V(G, D') − λ||D' − D||²], where λ ≥ 0 is a constant hyperparameter. This can be interpreted as the value achieved by the best penalized response D' to G, where the penalty is λ times the sum of squared differences between the parameters of D' and D. A pair of strategies (G*, D*) is a proximal equilibrium solution if and only if, for all G and D, V-prox(G*, D) ≤ V-prox(G*, D*) ≤ V-prox(G, D*). This can be simplified to V(G*, D) ≤ V(G*, D*) ≤ max(D' ∈ D)[V(G, D') − λ||D' − D*||²]. As λ → ∞, the penalty forces D' to equal D, so this becomes the definition of Nash equilibrium. For λ = 0, the penalty vanishes and this becomes the definition of Stackelberg equilibrium. For any other λ value, the proximal value function restricts how far the discriminator can move from its current strategy D. Additionally, for any 0 ≤ λ1 ≤ λ2, the set of λ2-proximal equilibrium solutions is a subset of the set of λ1-proximal equilibrium solutions. By choosing a small enough λ, a proximal equilibrium solution can be guaranteed to exist even when no Nash equilibrium does. More intuitively, (G*, D*) is a proximal equilibrium when G* is the best response to any of the strategies within an allowed distance of D*, and D* is the best response to G*.
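The proximal value function can be sketched numerically on a toy game. Here V(g, d) = g·d with one scalar parameter per player, and the inner penalized maximization is approximated by a grid search; for this particular V it also has the closed form V-prox(g, d) = g·d + g²/(4λ). All of this is illustrative, not from the paper:

```python
import numpy as np

def v(g, d):
    # Toy bilinear value function with one scalar parameter per player.
    return g * d

def v_prox(g, d, lam, grid=np.linspace(-10, 10, 100001)):
    # V-prox(g, d) = max over d' of [ v(g, d') - lam * (d' - d)^2 ],
    # approximated here by a grid search over candidate d' values.
    return np.max(v(g, grid) - lam * (grid - d) ** 2)

g, d = 0.5, 0.3
for lam in (0.5, 5.0, 50.0):
    # For this v the inner maximum is v(g, d) + g**2 / (4 * lam),
    # so larger lam brings v_prox(g, d) closer to the plain v(g, d).
    print(lam, v_prox(g, d, lam), v(g, d))
```

The printed values show the interpolation directly: V-prox always upper-bounds V (taking D' = D is allowed), shrinks toward V as λ grows, and loosens toward the unconstrained best response as λ shrinks.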
Using a proximal value function when training GANs has been found to be more stable. Repeating the previous experiment with the proximal value function gave values and images that changed little after training the generator for more iterations. This suggests that equilibrium solutions of the proximal value function (with a properly chosen λ) exist and are easier to find than equilibrium solutions of the standard value function V.
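A hedged sketch of this stability difference on the toy bilinear game V(g, d) = g·d, whose unique equilibrium is (0, 0): plain simultaneous gradient descent-ascent spirals outward, while a proximal variant, in which the discriminator first plays the exact maximizer of V(g, d') − λ(d' − d)² (which is d' = d + g/(2λ) for this V) before the generator steps, stays bounded. This illustrates the idea only; it is not the paper's training procedure:

```python
def plain_gda(g, d, lr=0.1, steps=500):
    # Simultaneous gradient descent-ascent on V(g, d) = g * d:
    # generator descends, discriminator ascends, both at once.
    for _ in range(steps):
        g, d = g - lr * d, d + lr * g
    return g, d

def proximal_gda(g, d, lr=0.1, lam=1.0, steps=500):
    # Proximal variant: the discriminator first moves to the exact
    # maximizer of V(g, d') - lam * (d' - d)**2, then the generator
    # takes its gradient step against that proximal response.
    for _ in range(steps):
        d = d + g / (2 * lam)  # inner proximal best response
        g = g - lr * d         # generator step against the proximal D'
    return g, d

print(plain_gda(0.5, 0.5))     # has drifted far from (0, 0)
print(proximal_gda(0.5, 0.5))  # remains bounded near (0, 0)
```

In the plain scheme the distance from the equilibrium grows by a fixed factor every step; in the proximal scheme the iterates oscillate on a bounded orbit, mirroring the stabilizing effect described above.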
