Outsmarting the Prisoner’s Dilemma with Punishments: Preventing the Tragedy of the Commons
https://academics.hamilton.edu/economics/cgeorges/game-theory-files/repeated.pdf
https://online.hbs.edu/blog/post/tragedy-of-the-commons-impact-on-sustainability-issues
The Tragedy of the Commons is a term that describes the overexploitation of public resources, usually environmental resources, caused by human self-interest. For instance, a group of people who let their animals graze on a piece of land without investing in measures to maintain its health will quickly deplete the land. In the long run, every individual is worse off because a valuable resource is lost; even so, many people remain unwilling to put in the effort to protect the public resources they use.
The Tragedy of the Commons is similar to the Prisoner’s Dilemma, except that confess/not confess is replaced with protect public resources/don’t protect public resources. In a single play of such a game, each individual will therefore choose the dominant strategy of ‘don’t protect public resources’ and has no incentive to deviate from it. However, there is a possible method of incentivizing players to deviate from this dominant strategy. This blog post sets out to show, using game theory, that the Tragedy of the Commons can be prevented in some situations, and then looks at some limitations of the method.
The decision to use a public resource is rarely a one-off; for many individuals it is made again and again (e.g., dumping pollution into a river every month, or letting herds graze every year). Instead of viewing these decisions as a one-off game, one can view them as the same game repeated many times.
Suppose Player A and Player B play k rounds of the same game, shown below:
Player A \ Player B | Protect Resource | Don’t Protect Resource
Protect Resource | A1, B1 | A2, B2
Don’t Protect Resource | A3, B3 | A4, B4
(With similar outcomes to a Prisoner’s Dilemma)
In every set time period, Players A and B play one round of the game shown above, each choosing either Protect Resource (PR) or Don’t Protect Resource (DPR). For instance, Players A and B could be two companies deciding each month whether to dump more pollution into a river. If Players A and B play only k=1 round, both choose Don’t Protect Resource, since A3>A1 and A4>A2 for Player A, and B2>B1 and B4>B3 for Player B. Each player therefore ends up with payoffs A4 and B4 respectively. However, if they play more than one round and add up their total payoffs, both can see that k(A4) and k(B4) are lower than k(A1) and k(B1), so by cooperating they could both end up with higher totals. While neither player has an incentive to cooperate in a one-round game, when the game is repeated they may have an incentive to cooperate if failing to cooperate is punished.
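The one-round reasoning above can be checked mechanically. The following is a minimal sketch, not a definitive implementation: the dictionary layout and function names are assumptions made for illustration, and the numbers are the ones used in the coral reef example later in this post.

```python
# Moves: PR = Protect Resource, DPR = Don't Protect Resource.
PR, DPR = "PR", "DPR"

# Keys are (Player A's move, Player B's move);
# values are (payoff to A, payoff to B).
# Illustrative numbers taken from the coral reef example in this post.
payoffs = {
    (PR, PR): (3, 3),    # A1, B1
    (PR, DPR): (0, 5),   # A2, B2
    (DPR, PR): (5, 0),   # A3, B3
    (DPR, DPR): (2, 2),  # A4, B4
}


def dpr_is_dominant(payoffs):
    """Return True if DPR strictly beats PR for both players,
    whatever the other player does."""
    a_dominant = all(
        payoffs[(DPR, b_move)][0] > payoffs[(PR, b_move)][0]
        for b_move in (PR, DPR)
    )
    b_dominant = all(
        payoffs[(a_move, DPR)][1] > payoffs[(a_move, PR)][1]
        for a_move in (PR, DPR)
    )
    return a_dominant and b_dominant


def mutual_cooperation_is_better(payoffs):
    """Return True if both players prefer (PR, PR) to (DPR, DPR)."""
    both_pr = payoffs[(PR, PR)]
    both_dpr = payoffs[(DPR, DPR)]
    return both_pr[0] > both_dpr[0] and both_pr[1] > both_dpr[1]


print(dpr_is_dominant(payoffs))               # True
print(mutual_cooperation_is_better(payoffs))  # True
```

Both checks returning True is what makes the game a Prisoner’s Dilemma: defecting is individually rational in one round, yet mutual cooperation pays each player more.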
Suppose Player A and Player B agree to play only Protect Resource for all k rounds, with a punishment for breaking the agreement; each then has less of an incentive to deviate. For instance, if in some round Player B deviates from Protect Resource and wins the high payoff B2 while Player A receives only A2, Player A can punish Player B by playing Don’t Protect Resource in all remaining rounds. Player B could have received k(B1) over all rounds, but instead receives only (x-1)B1 + B2 + (k-x)B4, with x being the round at which Player B broke the cooperation: x-1 rounds of cooperation, one round of deviation, and k-x rounds of punishment.
However, for the punishment to actually work, the payoffs must satisfy:
k(A1) > (x-1)A1 + A3 + (k-x)A4
k(B1) > (x-1)B1 + B2 + (k-x)B4
(In other words, the total payoff from cooperating is higher than the total payoff from deviating.)
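As a rough sketch, the two conditions can be written as a single check. It assumes the deviator cooperates for x-1 rounds, defects once in round x, and then receives the mutual Don’t Protect Resource payoff for the remaining k-x rounds. The function name and the dictionaries `A` and `B` (mapping the cell index 1 to 4 to the payoffs A1..A4 and B1..B4) are hypothetical, and the sample numbers are illustrative.

```python
def punishment_deters(k, x, a, b):
    """Return True if neither player gains by breaking cooperation at round x.

    a[i] and b[i] hold the payoffs A_i and B_i from the payoff table.
    A's deviation payoff is A3 (A plays DPR while B plays PR);
    B's deviation payoff is B2 (B plays DPR while A plays PR).
    """
    a_cooperate = k * a[1]
    a_deviate = (x - 1) * a[1] + a[3] + (k - x) * a[4]

    b_cooperate = k * b[1]
    b_deviate = (x - 1) * b[1] + b[2] + (k - x) * b[4]

    return a_cooperate > a_deviate and b_cooperate > b_deviate


# Illustrative payoffs (assumed for this sketch).
A = {1: 3, 2: 0, 3: 5, 4: 2}
B = {1: 3, 2: 5, 3: 0, 4: 2}

print(punishment_deters(k=10, x=4, a=A, b=B))  # True: 30 > 26
print(punishment_deters(k=2, x=1, a=A, b=B))   # False: 6 < 7
```

The second call shows that the same payoffs can fail the condition when too few rounds remain for the punishment to bite.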
However, for some payoffs, or for some numbers of rounds k, punishments are not effective in keeping all players cooperating.
For instance, take the scenario of two tourism companies taking tourists on weekly scuba diving experiences around an area of fragile coral reefs. They can either choose to simultaneously invest money on protective programs for the corals and continue their business, or they could simply neglect the long-term health of the corals.
Player A \ Player B | Protect Resource | Don’t Protect Resource
Protect Resource | 3, 3 | 0, 5
Don’t Protect Resource | 5, 0 | 2, 2
(The numbers represent the expected revenue of each company, in millions. If both players choose Don’t Protect Resource, both payoffs are lower than if they both choose Protect Resource, due to the degradation of the corals and hence fewer tourists.)
With only k=2 rounds, both Player A and Player B would be incentivized to deviate from cooperation in the first round:
(If Player A deviates:)
Round | Payoff for Player A | Payoff for Player B
1 | 5 | 0
2 | 2 | 2
Total payoff of A: 7
Total payoff of B: 2
(If Player B deviates:)
Round | Payoff for Player A | Payoff for Player B
1 | 0 | 5
2 | 2 | 2
Total payoff of A: 2
Total payoff of B: 7
(If neither deviates:)
Round | Payoff for Player A | Payoff for Player B
1 | 3 | 3
2 | 3 | 3
Total payoff of A: 6
Total payoff of B: 6
Since 7 > 6, deviating in the first round pays more than cooperating in both rounds.
Similarly, if the payoff for choosing Don’t Protect Resource while the other player continues to cooperate is very high, players may also be tempted to deviate. Therefore, the effectiveness of punishments in this repeated game depends on both the number of rounds played and the values of the payoffs.
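To see how the number of rounds and the payoffs interact, here is a minimal sketch that searches for the smallest k at which deviating in the very first round stops paying. It only considers a first-round deviation, as in the two-round example above, and assumes the deviator is punished in every later round. The function name, the `max_k` cap, and the second set of numbers are assumptions for illustration.

```python
def min_rounds_to_deter_first_round_deviation(
    reward, temptation, punishment, max_k=1000
):
    """Smallest k for which cooperating in all k rounds strictly beats
    deviating in round 1 and being punished in the other k-1 rounds.

    reward:     per-round payoff when both protect the resource
    temptation: payoff for not protecting while the other protects
    punishment: per-round payoff when neither protects
    Returns None if no k up to max_k works (max_k is an arbitrary cap).
    """
    for k in range(1, max_k + 1):
        cooperate_total = k * reward
        deviate_total = temptation + (k - 1) * punishment
        if cooperate_total > deviate_total:
            return k
    return None


# Coral reef payoffs from this post: 3 (both protect), 5, 2 (neither).
print(min_rounds_to_deter_first_round_deviation(3, 5, 2))   # 4

# Hypothetical higher temptation payoff of 10: more rounds are needed.
print(min_rounds_to_deter_first_round_deviation(3, 10, 2))  # 9
```

With the reef payoffs, two rounds are too few and three rounds leave the player indifferent (9 versus 9); raising the temptation payoff pushes the threshold further out.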
However, if the game is repeated for many more rounds, say k=100, then neither player is incentivized to deviate until the last couple of rounds. Say, for instance, that k=100 rounds of the game above are played:
Payoff of Player A if they cooperate: 300
Payoff of Player A if they deviate at round x: 3+3+3+…+5+2+2+2+… = 3(x-1) + 5 + 2(100-x) = x + 202, which is below 300 for any x < 98
Payoff of Player B if they cooperate: 300
Payoff of Player B if they deviate at round x: 3+3+3+…+5+2+2+2+… = 3(x-1) + 5 + 2(100-x) = x + 202, which is below 300 for any x < 98
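The claim about "the last couple of rounds" can be checked by scanning every possible deviation round. This is a small sketch under the same assumptions as the example: a deviator earns 3 per round before deviating, 5 in the round of deviation, and 2 in every round afterwards. The function names are hypothetical.

```python
def deviation_total(k, x, reward=3, temptation=5, punishment=2):
    """Total payoff for a player who first deviates at round x of k."""
    return (x - 1) * reward + temptation + (k - x) * punishment


def profitable_deviation_rounds(k, reward=3, temptation=5, punishment=2):
    """Rounds x at which deviating beats cooperating in all k rounds."""
    cooperate_total = k * reward
    return [
        x
        for x in range(1, k + 1)
        if deviation_total(k, x, reward, temptation, punishment)
        > cooperate_total
    ]


print(profitable_deviation_rounds(100))  # [99, 100]
print(profitable_deviation_rounds(2))    # [1, 2]
```

For k=100, only the final two rounds offer a gain from deviating (301 and 302 against 300), whereas for k=2 every round does, which matches the two-round example earlier in the post.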
Therefore, the Tragedy of the Commons could potentially be avoided by these two companies in this scenario.
In conclusion, the fact that players can cooperate and create higher payoffs for themselves while utilizing punishments to enforce said cooperation shows that the tragedy of the commons can, in fact, be resolved in certain situations.