Game theory is a framework for analysing the outcome of strategic interactions between decision makers [1]. Its fundamental concept is the Nash equilibrium, in which no player can improve her payoff by a unilateral change of strategy. Typically, the Nash equilibrium is considered to be the optimal outcome of a game. In social dilemmas, however, the individually optimal outcome is at odds with the collectively optimal outcome [2]. This means that one player can improve her payoff at the expense of the other by unilaterally deviating, but if both deviate, they end up with lower payoffs. In this type of game, the mutually beneficial but non-Nash-equilibrium strategy is called cooperation. In this context, however, cooperation should not be interpreted as an interest in the welfare of others, as players only aim to secure a high payoff for themselves.
In this framework, payoff maximisation is considered to be rational, but when such rational players seize every opportunity to gain at the opponent’s expense, they may counterintuitively both end up with low payoffs. A game that clearly exhibits this contradiction is the Traveler’s Dilemma. Since its formulation in 1994 by the economist Kaushik Basu [3], the game has become one of the most studied in the economics literature. It has also been discussed in theoretical biology in the context of evolutionary game theory.
In general, the dilemma relies on the individuals’ incentive to undercut the opponent. More specifically, players are motivated to claim a lower value than their opponent in order to reach a higher payoff at the opponent’s expense. This incentive leads players to systematic mutual undercutting until the lowest possible claim is reached, which is the unique Nash equilibrium. It seems paradoxical that players defined as rational in a game-theoretical sense end up with such a poor outcome. The question that naturally arises is therefore how this poor outcome can be prevented and how cooperation can be achieved.
Addressing these questions can help to better understand price wars, which consist of the mutual undercutting of prices to gain market share. It can also provide information about human behaviour, because economic experiments have shown that individuals prefer to choose the cooperative high-payoff action instead of the Nash equilibrium [4].
Our analysis focuses on showing that the Traveler’s Dilemma can be decomposed into a local and a global game. If the payoff optimisation is constrained to the local game, then players will inevitably end up in the Nash equilibrium. However, if players escape the local maximisation and optimise their payoff for the global game, they can reach the cooperative high payoff equilibrium.
Here, we show that the cooperative equilibrium can be reached in a game like the Traveler’s Dilemma due to diversity, which we define as the presence of suboptimal strategies. The appearance of strategies far from those of the residents allows the local maximisation process to be escaped, so that optimisation takes place at a global level. Overall, this can lead to cooperation because “suboptimal strategies” that play against each other can reach higher payoffs, both collectively and individually.
Game
The Traveler’s Dilemma is a two-player game. Player \(i\) has to choose a claim, \(n_i\), from the action space, which consists of all integers on the interval \([L, U]\), where \(0 \le L < U\). The payoffs are determined as follows:
If both players, \(i\) and \(j\), choose the same value (\(n_i = n_j\)), both get paid that value.
There is a reward parameter \(R>1\), such that if \(n_i < n_j\), then \(i\) receives \(n_i + R\) and \(j\) gets \(n_i - R\).
Thus, the payoff of player i playing against player j is
$$\begin{aligned} \pi_{ij} = \left\{ \begin{array}{ll} n_i & \text{if } n_i = n_j \\ n_i + R & \text{if } n_i < n_j \\ n_j - R & \text{if } n_i > n_j \end{array}\right. \end{aligned} \qquad (1)$$
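As a minimal sketch, the payoff rule in Eq. (1) can be written directly in Python. The names `payoff` and `payoff_matrix` are illustrative choices and not part of the original formulation. The default parameters are the values \(L=2\), \(U=100\) and \(R=2\) used in this paper.

```python
def payoff(n_i: int, n_j: int, R: int = 2) -> int:
    """Payoff of player i claiming n_i against player j claiming n_j (Eq. 1)."""
    if n_i == n_j:
        return n_i        # equal claims: both players are paid the claim
    if n_i < n_j:
        return n_i + R    # i undercuts j and earns the reward R
    return n_j - R        # i is undercut by j and pays the penalty R


def payoff_matrix(L: int = 2, U: int = 100, R: int = 2) -> list[list[int]]:
    """Row player's payoff for every pair of integer claims in [L, U].

    Illustrative helper: rows index the player's own claim,
    columns index the opponent's claim.
    """
    claims = range(L, U + 1)
    return [[payoff(a, b, R) for b in claims] for a in claims]
```

With these parameters, a player claiming 99 against an opponent claiming 100 receives 101, while the opponent receives 97.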
Thus, a player is better off choosing a slightly lower value than the opponent: when \(j\) plays \(n_j > L\), it is best for \(i\) to play \(n_j-1\). Iterating this reasoning, which we will call the stairway to hell, leads to the only Nash equilibrium of the game, \(\{L,L\}\), where both players choose the lowest possible claim. The classical game-theoretic method used to arrive at this equilibrium is called iterative elimination of dominated strategies [5].
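The stairway to hell can be illustrated by repeatedly applying the best response to the opponent's last claim. Note that this is a best-response iteration, a simpler stand-in for the iterative elimination of dominated strategies. The function names below are hypothetical.

```python
def payoff(n_i: int, n_j: int, R: int = 2) -> int:
    """Payoff of player i claiming n_i against player j claiming n_j (Eq. 1)."""
    if n_i == n_j:
        return n_i
    if n_i < n_j:
        return n_i + R
    return n_j - R


def best_response(n_j: int, L: int = 2, U: int = 100, R: int = 2) -> int:
    """Claim in [L, U] that maximises the payoff against a fixed claim n_j."""
    return max(range(L, U + 1), key=lambda n_i: payoff(n_i, n_j, R))


def stairway_to_hell(start: int, L: int = 2, U: int = 100, R: int = 2) -> list[int]:
    """Sequence of claims obtained by iterating the best response from `start`.

    The iteration stops at a claim that is a best response to itself,
    i.e. at a symmetric Nash equilibrium.
    """
    path = [start]
    while True:
        nxt = best_response(path[-1], L, U, R)
        if nxt == path[-1]:
            return path
        path.append(nxt)
```

Starting from the highest claim, `stairway_to_hell(100)` descends one step at a time and stops at the lowest claim \(L=2\).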
The game can be visualised through its payoff matrix (Fig. 1). For simplicity, we use the values from the original formulation: \(L=2\), \(U=100\) and \(R=2\). The payoff matrix shows that the Traveler’s Dilemma can be decomposed into a local and a global game. Let us begin with the local game. When the action space is reduced to two adjacent actions \(n\) and \(n+1\) (black boxes in Fig. 1), the Traveler’s Dilemma with \(R=2\) is equivalent to the Prisoner’s Dilemma [6]. In general, for any value of \(R\), the Traveler’s Dilemma becomes a Prisoner’s Dilemma for any pair of actions \(n\) and \(n+s\), where \(1 \le s \le R-1\). For example, for \(R=4\) the pairs of actions \(n\) and \(n+1\), \(n\) and \(n+2\), and \(n\) and \(n+3\) all have the same structure as the Prisoner’s Dilemma. The Traveler’s Dilemma therefore consists of many embedded Prisoner’s Dilemmas; that is, at a local level the game is a Prisoner’s Dilemma.
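The local structure can be checked numerically by restricting the game to the two actions \(n\) and \(n+s\) and testing the standard Prisoner’s Dilemma ordering: temptation > mutual cooperation > mutual defection > sucker’s payoff. The following sketch assumes that ordering as the definition of a Prisoner’s Dilemma. The function name `is_prisoners_dilemma` is illustrative.

```python
def payoff(n_i: int, n_j: int, R: int = 2) -> int:
    """Payoff of player i claiming n_i against player j claiming n_j (Eq. 1)."""
    if n_i == n_j:
        return n_i
    if n_i < n_j:
        return n_i + R
    return n_j - R


def is_prisoners_dilemma(n: int, s: int, R: int) -> bool:
    """Check whether the 2x2 game on the actions n and n + s is a Prisoner's Dilemma.

    The high claim n + s plays the role of cooperation and the
    low claim n the role of defection.
    """
    low, high = n, n + s
    temptation = payoff(low, high, R)    # undercut a high-claiming opponent
    reward = payoff(high, high, R)       # both claim high
    punishment = payoff(low, low, R)     # both claim low
    sucker = payoff(high, low, R)        # claim high and get undercut
    return temptation > reward > punishment > sucker
```

For \(R=4\), the check holds for \(s=1,2,3\) and fails for \(s=4\), in line with the condition \(1 \le s \le R-1\).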
If we now consider actions that are distant from each other in the action space, e.g. 2 and 100, we can observe a coordination game structure (gray boxes in Fig. 1), where \(\{100,100\}\) is payoff and risk dominant [7,8]. In general, any pair of actions \(n\) and \(n+s\), where \(R \le s \le U-n\), forms a coordination game. As a result, the Traveler’s Dilemma becomes a coordination game at a global level, with different equilibria from those of the local game.
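The global structure can be checked in the same way. The sketch below treats the restricted game on \(n\) and \(n+s\) as a coordination game when both symmetric action pairs are (possibly weak) Nash equilibria. It tests risk dominance by comparing the losses from unilaterally deviating from each equilibrium, which is the Harsanyi–Selten criterion specialised to a symmetric 2x2 game. Both function names are hypothetical.

```python
def payoff(n_i: int, n_j: int, R: int = 2) -> int:
    """Payoff of player i claiming n_i against player j claiming n_j (Eq. 1)."""
    if n_i == n_j:
        return n_i
    if n_i < n_j:
        return n_i + R
    return n_j - R


def is_coordination_game(n: int, s: int, R: int) -> bool:
    """Check whether both (n, n) and (n + s, n + s) are Nash equilibria
    of the 2x2 game restricted to the actions n and n + s."""
    low, high = n, n + s
    high_is_equilibrium = payoff(high, high, R) >= payoff(low, high, R)
    low_is_equilibrium = payoff(low, low, R) >= payoff(high, low, R)
    return high_is_equilibrium and low_is_equilibrium


def high_claim_dominance(n: int, s: int, R: int) -> tuple[bool, bool]:
    """Return (payoff_dominant, risk_dominant) for the high-claim equilibrium.

    Assumes the restricted game is a coordination game, so that both
    deviation losses are non-negative.
    """
    low, high = n, n + s
    payoff_dominant = payoff(high, high, R) > payoff(low, low, R)
    loss_leaving_high = payoff(high, high, R) - payoff(low, high, R)
    loss_leaving_low = payoff(low, low, R) - payoff(high, low, R)
    risk_dominant = loss_leaving_high > loss_leaving_low
    return payoff_dominant, risk_dominant
```

For the actions 2 and 100 with \(R=2\), `is_coordination_game(2, 98, 2)` holds and `high_claim_dominance(2, 98, 2)` reports the high claim as both payoff and risk dominant.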
Paradox
Social dilemmas appear paradoxical in the sense that self-interested competing players, when rationally playing the Nash equilibrium, end up with a payoff that clearly goes against their self-interest. With the Traveler’s Dilemma, however, the paradox goes further, as suggested in its original formulation [3]. Classical game theory proposes \(\{L,L\}\) as the Nash equilibrium of the game. Yet it seems implausible that individuals would play the Nash equilibrium when \(R\) is moderately low, say \(R=2\). This has been confirmed in economic experiments, in which individuals instead choose values close to the upper bound of the interval. Such experiments have also shown that the chosen value depends on the reward parameter \(R\): increasing \(R\) shifts players’ decisions towards \(\{L,L\}\) [4]. Nonetheless, classical game theory states that the Nash equilibrium of the game is independent of \(R\).
Consequently, the aim of this paper is to seek and explore simple mechanisms through which the apparently non-rational cooperative behaviour can come about. We also examine the effect of the reward parameter on the game’s outcome. Given that the Traveler’s Dilemma paradox emerges in the classical game theory framework, we analyse the game using evolutionary game theory tools [5,9,10]. This dynamical approach allows us to explore adaptive behaviour outside of the stationary classical game theory framework. More precisely, in this approach individuals dynamically adjust their actions according to their payoffs.
The key point, of course, is to understand how the system can converge to high claims. We show that this behaviour is possible because the Traveler’s Dilemma can be decomposed into a local and a global game. If the payoff maximisation is constrained to the local level, then the stairway to hell leads the system to the Nash equilibrium, given that locally the game is a Prisoner’s Dilemma. At a global level, on the other hand, the game follows a coordination game structure, where the high claim actions are payoff dominant. Thus, for the system to reach a high claim equilibrium, the maximisation process needs to jump from the local to the global level.
Our analysis led us to identify the mechanism of diversity as responsible for enabling this jump and preventing players from going down the stairway to hell. This mechanism works on the idea that to reach a high claim equilibrium, players have to benefit from playing a high claim. For a population setting, it means that players need to have the chance to encounter opponents also playing high. From a learning model point of view, it refers to the belief that the opponent will also play high, at least with a certain probability. If the belief is shared by both players, they should both play high and reach the cooperative equilibrium. Here, we explore these two types of models to unveil the mechanism leading to cooperation.
Population-based models unveil diversity as the cooperative mechanism via the effect of mutations on the game’s outcome. This is shown for the replicator-mutator equation and the Wright–Fisher model. Similarly, a two-player learning model approach, more in line with human reasoning, shows that if players are free to adopt a higher payoff action from a diverse action set during their introspection process, they can reach the cooperative equilibrium. This result is obtained using introspection dynamics.
Finally, we explain how diversity is the underlying mechanism that enables the convergence to high claims in previously proposed models. To be more precise, we show that diversity is required because it allows for the maximisation process to jump from the local to the global level.
Source: Ecology - nature.com