
Pairwise interact-and-imitate dynamics

The model

Consider a unit-mass population of agents who repeatedly interact in pairs to play a symmetric stage game. The set of strategies available to each agent is finite and denoted by (S equiv {1, ldots , n}). A population state is a vector (x in X equiv {x in {mathbb{R}}^n_+: sum _{i in S} x_i = 1}), with (x_i) the fraction of the population playing strategy (i in S). Payoffs are described by a function (F: S times S rightarrow {mathbb{R}}), where F(i,j) is the payoff received by an agent playing strategy i when the opponent plays strategy j. As a shorthand, we refer to an undirected pair of individuals, one playing i and the other playing j, as an ij pair. The set of all possible undirected pairs is denoted by (mathscr {P}).

The interaction structure is modeled as a function (p : X times mathscr {P} rightarrow left[ 0, 1/2 right] ) subject to (sum _{ij in mathscr {P}} p_{ij}(x)=1/2) (since the mass of pairs is half the mass of agents), with (p_{ij}(x)) indicating the mass of ij pairs formed in state x. Note that the mass of ij pairs can never exceed (min {x_i,x_j}), that is, (p_{ij}(x) le min {x_i,x_j}) for all x. We assume that p is continuous in X, and that (p_{ij}(x) > 0) if and only if (x_i > 0) and (x_j > 0), meaning that ij pairs form in strictly positive mass if and only if strategies i and j are each played by someone. In the case of uniform random matching, (p_{ii} = x_i^2/2) and (p_{ij} = x_i x_j) for any i and (j ne i).
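For concreteness, the uniform random matching rule above can be sketched as follows; the function name and the example state are ours, not the paper's.

```python
from itertools import combinations_with_replacement

def pair_masses(x):
    """Mass of each undirected ij pair under uniform random matching:
    p_ii = x_i^2 / 2 and p_ij = x_i * x_j for j != i."""
    p = {}
    for i, j in combinations_with_replacement(range(len(x)), 2):
        p[(i, j)] = x[i] ** 2 / 2 if i == j else x[i] * x[j]
    return p

p = pair_masses([0.5, 0.3, 0.2])
# The total mass of pairs is half the mass of agents.
assert abs(sum(p.values()) - 0.5) < 1e-12
```

The closing assertion checks the constraint stated in the text: summing the diagonal masses (x_i^2/2) and the off-diagonal masses (x_i x_j) over all undirected pairs always gives 1/2.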

The revision protocol is modeled as a function (phi : X times S times S rightarrow [-1,1]), where (phi _{ij}(x) in [-1,1]) is the probability that an ij pair will turn into an ii pair minus the probability that it will turn into a jj pair, conditional on the population state being x and an ij pair being formed. We assume that (phi ) is continuous in X. We note that by construction (phi _{ij}=-phi _{ji}) for all (i,j in S), and hence (phi _{ii}=0) for all (i in S). Our main assumption on the revision protocol is the following, which is met, among others, by pairwise proportional imitative and imitate-if-better rules22.

Assumption 1

For every (x in X), (phi _{ij}(x) > 0) if (F(i,j) > F(j,i)).

In what follows we consider a dynamical system in continuous time with state space X, characterized by the following equation of motion.

Definition 1

(Pairwise interact-and-imitate dynamics—PIID) For every (x in X) and every (i in S):

$$\begin{aligned} \dot{x}_i = \sum _{j \in S} p_{ij}(x)\, \phi _{ij}(x). \end{aligned}$$

(1)
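A minimal computational sketch of the vector field in (1), under uniform random matching and a proportional-imitation rule with (phi _{ij} = F(i,j) - F(j,i)); the function name and the payoff matrix are illustrative assumptions, not from the paper.

```python
def piid_rhs(x, F):
    """Return xdot with xdot_i = sum_j p_ij(x) * phi_ij(x), assuming
    uniform random matching and phi_ij = F(i,j) - F(j,i)."""
    n = len(x)
    xdot = []
    for i in range(n):
        total = 0.0
        for j in range(n):
            if j != i:  # phi_ii = 0, so the diagonal never contributes
                total += x[i] * x[j] * (F[i][j] - F[j][i])
        xdot.append(total)
    return xdot

# Hypothetical Prisoner's Dilemma payoffs (0 = cooperate, 1 = defect):
F = [[3, 0], [4, 1]]
assert piid_rhs([0.5, 0.5], F) == [-1.0, 1.0]  # defection grows, cooperation shrinks
```

Because (phi _{ij} = -phi _{ji}), the components of the field always sum to zero, so the simplex X is invariant.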

Main findings

Global asymptotic convergence

In any purely imitative dynamics, if (x_i(t)=0), then (x_i(t^{prime})=0) for every (t^{prime} > t). This implies that we cannot hope for global asymptotic convergence in a strict sense. Thus, to assess convergence towards a given state x in a meaningful way, we restrict our attention to states in which every strategy played with positive frequency in x also has positive frequency. We denote by (X_x) the set of states whose support contains the support of x.

Definition 2

(Supremacy) Strategy (iin S) is supreme if (F(i,j)>F(j,i)) for every (j in S setminus {i}).

We note that under PIID, the concept of supremacy is closely related to that of asymmetry33,34, in that (F(i,j) > F(j,i)) implies that agents can only switch from strategy j to strategy i.

Proposition 1

If (i in S) is a supreme strategy, then state (x^* equiv left{ x in X : x_i = 1 right} ) is globally asymptotically stable for the dynamical system with state space (X_{x^*}) and PIID as equation of motion.
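Proposition 1 can be illustrated numerically. The sketch below integrates PIID by forward Euler, under uniform random matching and the proportional rule; the three-strategy payoff matrix, initial state, and step size are our own assumptions.

```python
def simulate(x, F, dt=0.01, steps=20000):
    """Forward-Euler integration of x_i' = x_i * sum_j x_j * (F[i][j] - F[j][i])."""
    n = len(x)
    for _ in range(steps):
        xdot = [x[i] * sum(x[j] * (F[i][j] - F[j][i]) for j in range(n))
                for i in range(n)]
        x = [x[i] + dt * xdot[i] for i in range(n)]
    return x

# Strategy 2 is supreme: F(2,0) > F(0,2) and F(2,1) > F(1,2).
F = [[2, 1, 0],
     [3, 2, 0],
     [4, 5, 1]]
x_final = simulate([0.6, 0.3, 0.1], F)
assert x_final[2] > 0.99  # nearly everyone ends up playing the supreme strategy
```

Forward Euler is a crude integrator, but since the field vanishes at the monomorphic state and points towards it, the trajectory settles on (x^*) for any interior initial condition.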

Relation to replicator dynamics

To further characterize the dynamics induced by the pairwise interact-and-imitate protocol, we make two additional assumptions. First, matching is uniformly random, meaning that everyone in the population has the same probability of interacting with everyone else; formally, (p_{ii} = x_i^2/2) and (p_{ij} = x_i x_j) for all i and (j ne i). Second, the probability that an agent imitates the opponent is proportional to the difference in their payoffs if the opponent’s payoff exceeds her own, and is zero otherwise. As a consequence, (phi _{ij} = F(i,j) - F(j,i)) up to a proportionality factor. Let

  • (F left( i, x right) :=sum _j x_j F left( i, j right) ),

  • (F left( x, i right) :=sum _j x_j F left( j, i right) ), and

  • ( F left( x, x right) :=sum _i sum _j x_i x_j F left( i, j right) ).

Under these assumptions, at any point in time, the motion of (x_i) is described by:

$$\begin{aligned} \dot{x}_i&= \sum _{j \ne i} x_j x_i \left[ F\left( i, j\right) - F\left( j, i\right) \right] = x_i \sum _{j} x_j \left[ F\left( i, j\right) - F\left( j, i\right) \right] \nonumber \\&= x_i \left[ F\left( i, x\right) - F\left( x, i\right) \right], \end{aligned}$$

(2)

which is a modified replicator equation. According to (2), for every strategy i chosen by one or more agents in the population, the rate of growth of the fraction of i-players, (dot{x}_i / x_i), equals the difference between the expected payoff from playing i in state x and the average payoff received by those who are matched against an agent playing i. In contrast, under standard replicator dynamics35, the fraction of agents playing i varies depending on the excess payoff of i with respect to the current average payoff in the whole population, i.e., (dot{x}_i = x_i left[ F left( i, x right) - F left( x, x right) right] ).
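The two vector fields can be computed side by side. In the sketch below the 2x2 payoff matrix is an assumed example (not from the paper) in which strategy 0 is strictly dominant while strategy 1 is supreme, so the two dynamics disagree.

```python
def fields(x, F):
    """Return (piid, repl): the modified replicator field of Eq. (2) and the
    standard replicator field, at state x for payoff matrix F."""
    n = range(len(x))
    def F_i_x(i):  # expected payoff of i in state x, F(i, x)
        return sum(x[j] * F[i][j] for j in n)
    def F_x_i(i):  # average payoff of those matched against i, F(x, i)
        return sum(x[j] * F[j][i] for j in n)
    F_x_x = sum(x[i] * F_i_x(i) for i in n)  # population average, F(x, x)
    piid = [x[i] * (F_i_x(i) - F_x_i(i)) for i in n]
    repl = [x[i] * (F_i_x(i) - F_x_x) for i in n]
    return piid, repl

F = [[4, 2], [3, 1]]  # strategy 0 strictly dominant, strategy 1 supreme
piid, repl = fields([0.5, 0.5], F)
assert piid[1] > 0 > repl[1]  # the two dynamics push x_1 in opposite directions
```

At the mixed state the standard replicator shrinks the supreme strategy (it earns less in expectation), while PIID expands it, anticipating the failure of payoff monotonicity discussed next.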

A noteworthy feature of replicator dynamics is that they are always payoff monotone: for any (i,j in S), the proportions of agents playing i and j grow at rates that are ordered in the same way as the expected payoffs from the two strategies36. In the case of PIID, this result fails.

Proposition 2

Pairwise-Interact-and-Imitate dynamics need not satisfy payoff monotonicity.

To verify this, it is sufficient to consider any symmetric (2 times 2) game where (F left( i, j right) > F left( j, i right) ) but (F left( j, x right) > F left( i, x right) ) for some (x in X), meaning that i is the supreme strategy but j yields a higher expected payoff in state x. See Fig. 1 for an example where, in the case of uniform random matching, the above inequalities hold for any x; if strategies are updated according to the interact-and-imitate protocol, then this game only admits switches from j to i, therefore violating payoff monotonicity. Proposition 2 can have important consequences, including the survival of strictly dominated pure strategies.
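The monotonicity failure can be checked directly on hypothetical 2x2 payoffs of this kind (our own values, not those of Fig. 1): A is strictly dominant, yet B is supreme because F(B,A) > F(A,B).

```python
# Hypothetical payoffs: A strictly dominant, B supreme.
F = {("A", "A"): 4, ("A", "B"): 2, ("B", "A"): 3, ("B", "B"): 1}

def expected(i, xA):
    """Expected payoff of strategy i when a fraction xA of the population plays A."""
    return xA * F[(i, "A")] + (1 - xA) * F[(i, "B")]

for xA in (0.1, 0.5, 0.9):
    assert expected("A", xA) > expected("B", xA)  # A earns more at every state
assert F[("B", "A")] > F[("A", "B")]              # yet only A -> B switches occur
```

The better-performing strategy in expectation shrinks at every interior state, which is exactly the violation of payoff monotonicity asserted by Proposition 2.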

Survival of strictly dominated strategies

A recurring topic in evolutionary game theory is the extent to which support exists for the idea that strictly dominated strategies will not be played. It has been shown that if strategy i does not survive the iterated elimination of pure strategies strictly dominated by other pure strategies, then the fraction of the population playing i will converge to zero in all payoff monotone dynamics37,38. This result does not hold in our case, as PIID is not payoff monotone.

More precisely, under PIID, a strictly dominated strategy may be supreme and, therefore, not only survive but even end up being adopted by the whole population. This suggests that from an evolutionary perspective, support for the elimination of dominated strategies may be weaker than is often thought. Our result contributes to the literature on the conditions under which evolutionary dynamics fail to eliminate strictly dominated strategies in some games, examining a case which has not yet been studied39.

To see that a strictly dominated strategy may be supreme, consider the simple example shown in Fig. 1. Here each agent has a strictly dominant strategy to play A; however, since the payoff from playing B against A exceeds that from playing A against B, strategy B is supreme. Thus, by Proposition 1, the population state in which all agents choose B is globally asymptotically stable.

Figure 1

A game where the supreme strategy is strictly dominated.


Figure 1 can also be used to comment on the relation between a supreme strategy and an evolutionarily stable strategy, which is a widely used concept in evolutionary game theory40,41. Indeed, while B is the supreme strategy, A is the unique evolutionarily stable strategy because it is strictly dominant. However, if F(B,A) were reduced below 2, holding everything else constant, then B would become both supreme and evolutionarily stable. We therefore conclude that no particular relation holds between evolutionary stability and supremacy: neither property implies the other, nor are they incompatible.

Applications

Having obtained general results for the class of finite symmetric games, we now restrict the discussion to the evolution of behavior in social dilemmas. We show that if the conditions of Proposition 1 are met, then inefficient conventions emerge in the Prisoner’s Dilemma, Stag Hunt, Minimum Effort, and Hawk–Dove games. Furthermore, this result holds both with and without the assumption that agents interact assortatively.

Ineffectiveness of assortment

Consider the (2 times 2) game represented in Fig. 2. If (c> a> d > b), then mutual cooperation is Pareto superior to mutual defection but agents have a dominant strategy to defect. The resulting stage game is the Prisoner’s Dilemma, whose unique Nash equilibrium is (BB). Moreover, since (F (B,A) > F(A,B)), B is the supreme strategy and the population state in which all agents defect is globally asymptotically stable.

We stress that defection emerges in the long run for every matching rule satisfying our assumptions, and therefore also in the case of assortative interactions. Assortment reflects the tendency of similar people to clump together, and can play an important role in the evolution of cooperation42,43,44,45. Intuitively, when agents meet assortatively, the risk of cooperating in a social dilemma may be offset by a higher probability of playing against other cooperators. However, under PIID, this is not the case: the decision whether to adopt a strategy or not is independent of expected payoffs, and like-with-like interactions have no effect except to reduce the frequency of switches from A to B.
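This point can be sketched quantitatively. Below we assume an assortative matching rule with index alpha, under which (p_{AB} = (1-alpha) x_A x_B); the index and the Prisoner's Dilemma payoffs are our own illustrative choices.

```python
def xB_dot(xB, alpha, F):
    """Growth of the defector share under PIID with assortment index alpha:
    only A -> B switches occur, at rate (1 - alpha) * x_A * x_B * (F_BA - F_AB)."""
    xA = 1 - xB
    return (1 - alpha) * xA * xB * (F["BA"] - F["AB"])

F = {"AA": 3, "AB": 0, "BA": 4, "BB": 1}  # c > a > d > b
for alpha in (0.0, 0.5, 0.9):
    assert xB_dot(0.5, alpha, F) > 0  # defection spreads at every assortment level
```

Raising alpha shrinks the rate but never reverses its sign, which is the sense in which assortment only reduces the frequency of switches from A to B.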

Figure 2

A (2 times 2) stage game.


Emergence of the maximin convention

If (a> c > b), (a > d) and (d > b), then the game in Fig. 2 becomes a Stag Hunt game, which contrasts risky cooperation and safe individualism. The payoffs are such that both (left( A, Aright) ) and (left( B, Bright) ) are strict Nash equilibria, that (left( A, Aright) ) is Pareto superior to (left( B, Bright) ), and that B is the maximin strategy, i.e., the strategy which maximizes the minimum payoff an agent could possibly receive. We also assume that (a + b ne c + d), so that one of A and B is risk dominant46. If (a + b > c + d), then A (Stag) is both payoff and risk dominant. When the opposite inequality holds, the risk dominant strategy is B (Hare).

Since (F (B,A) > F(A,B)), B is supreme independently of whether or not it is risk dominant to cooperate. This can result in large inefficiencies because, in the long run, the process will converge to the state in which all agents play the riskless strategy regardless of how rewarding social coordination is. As in the case of the Prisoner’s Dilemma, this holds for all matching rules satisfying our assumptions.
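As a numerical illustration (the payoff values are our own, not from the text), one can check a Stag Hunt in which Stag is both payoff and risk dominant while Hare remains supreme.

```python
# Hypothetical Stag Hunt payoffs: F(A,A)=a, F(A,B)=b, F(B,A)=c, F(B,B)=d.
a, b, c, d = 6, 1, 4, 2
assert a > c > b and a > d > b  # Stag Hunt ordering from the text
assert a + b > c + d            # A (Stag) is payoff and risk dominant
assert c > b                    # yet B (Hare) is supreme: F(B,A) > F(A,B)

# Under the proportional rule, the Hare share grows at every interior state:
for xA in (0.99, 0.5, 0.01):
    assert xA * (1 - xA) * (c - b) > 0
```

Even when Stag is selected by both payoff and risk dominance, any Hare minority keeps growing under PIID, so the process converges to the inefficient convention.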

Evolution of effort exertion

In a minimum effort game, agents simultaneously choose a strategy i, usually interpreted as a costly effort level, from a finite subset S of ({mathbb{R}}). An agent’s payoff depends on her own effort and on the minimum effort in the pair:

$$\begin{aligned} F\left( i, j\right) = \alpha \min \left\{ i, j\right\} - \beta i, \end{aligned}$$

where (beta > 0) and (alpha > beta ) are the cost and benefit of effort, respectively. From a strategic viewpoint, this game can be seen as an extension of the Stag Hunt to cases where there are more than two actions. The best response to a choice of j by the opponent is to choose j as well, and coordinating on any common effort level gives a Nash equilibrium. Nash outcomes can be Pareto-ranked, with the highest-effort equilibrium being the best possible outcome for all agents. Thus, choosing a high i is rationalizable and potentially rewarding but may also result in a waste of effort.

Under PIID, any (i > j) implies (phi _{ij} < 0) by Assumption 1, meaning that agents will tend to imitate the opponent when the opponent’s effort is lower than their own. The supreme strategy is therefore to exert as little effort as possible, and the population state in which all agents choose the minimum effort level is the unique globally asymptotically stable state.
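A quick check of this argument, with alpha, beta, and the effort grid as our own assumptions: for any pair of efforts (i > j), (F(i,j) < F(j,i)), so the lower effort always wins the within-pair comparison.

```python
alpha, beta = 2.0, 1.0  # benefit and cost of effort, alpha > beta > 0
efforts = [1, 2, 3, 4]

def payoff(i, j):
    """F(i, j) = alpha * min(i, j) - beta * i."""
    return alpha * min(i, j) - beta * i

for i in efforts:
    for j in efforts:
        if i > j:
            # F(i,j) = alpha*j - beta*i < alpha*j - beta*j = F(j,i)
            assert payoff(i, j) < payoff(j, i)  # high effort is imitated away
```

The inequality holds for any alpha and beta with beta > 0, since the payoff gap within a mismatched pair is exactly -beta * (i - j).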

Emergence of aggressive behavior

Consider again the payoff matrix shown in Fig. 2. If (c> a> b > d), then the stage game is a Hawk–Dove game, which is often used to model the evolution of aggressive and sharing behaviors. Interactions can be framed as disputes over a contested resource. When two Doves (who play A) meet, they share the resource equally, whereas two Hawks (who play B) engage in a fight and suffer a cost. Moreover, when a Dove meets a Hawk, the latter takes the entire prize. Again we have that (F (A,B) < F(B,A)), implying that B is the supreme strategy and that the state where all agents play Hawk is the sole asymptotically stable state.

The inefficiency that characterizes the (BB) equilibrium in the Hawk–Dove game arises from the cost that Hawks impose on one another. This can be viewed as stemming from the fact that neither agent owns the resource prior to the interaction or cares about property. A way to overcome this problem may be to introduce a strategy associated with respect for ownership rights, the Bourgeois, who behaves as a Dove or Hawk depending on whether or not the opponent owns the resource41. If we make the standard assumption that each member of a pair is the owner with probability 1/2, then in every interaction involving a Bourgeois there is a 50 percent chance that she will behave hawkishly (i.e., fight for control over the resource) and a 50 percent chance that she will act as a Dove.

Let R and C denote the agents chosen as row and column player, and let (omega _R) and (omega _C) be the states of the world in which R and C, respectively, own the resource. The payoffs of the resulting Hawk–Dove–Bourgeois game are shown in Fig. 3. If agents behave as expected payoff maximizers, then All Bourgeois can be singled out as the unique asymptotically stable state. Under PIID, this is not so; depending on who owns the resource, an agent playing C against an opponent playing B may either fight or avoid conflict and let the opponent have the prize. It is easy to see that (F left( C, B mid omega _R right) = F left( B,C mid omega _C right) = d), meaning that the payoff from playing C against B, conditional on owning the resource, equals the payoff from playing B against C conditional on not being an owner. In contrast, the payoff from playing C against B, conditional on not owning the resource, is always worse than that of the opponent, i.e., (F left( C, B mid omega _C right) = b < c = F left( B, C mid omega _R right) ). Thus, in every state of the world, B (Hawk) yields a payoff that is greater than or equal to that from C (Bourgeois). Moreover, since (F left( B,A right) > F left( A, B right) ) in both states of the world, strategy B is weakly supreme by Definition 4, and play unfolds as an escalation of hawkishness and fights.
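The state-by-state comparison above can be spelled out with hypothetical Hawk–Dove payoffs satisfying (c> a> b > d); the values and dictionary keys below are our own illustration, with keys recording who owns the contested resource in a Bourgeois-versus-Hawk match.

```python
a, b, c, d = 3, 1, 4, 0  # hypothetical values with c > a > b > d
# Payoffs of Bourgeois against Hawk: fight as owner, yield otherwise.
C_vs_B = {"C_owns": d, "B_owns": b}
# Payoffs of Hawk against Bourgeois: the Hawk always fights.
B_vs_C = {"C_owns": d, "B_owns": c}

# Hawk does weakly better than Bourgeois in every state of the world.
for state in ("C_owns", "B_owns"):
    assert B_vs_C[state] >= C_vs_B[state]
```

The payoffs tie (both equal to d) when the Bourgeois owns the resource and both fight, and the Hawk strictly wins (c versus b) when she does not, which is what makes B weakly supreme.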

Figure 3

The Hawk–Dove–Bourgeois game.


