The Learning Point‎ > ‎Mathematics‎ > ‎

Introduction to Game Theory- With Problems- Normal Form, Nash Equilibrium, Prisoner's Dilemma, Zero Sum and Mixed Strategies

   Our Game Theory Tutorials- at a glance

Game Theory

 An Introduction to Game Theory

Extensive Games

Bayesian Games : Games with Incomplete Information

Repeated Games

           An Introduction to Game Theory

Game Theory

Game Theory : An Introduction

Game Theory helps us understand situations in which decision-makers interact. A game in the everyday sense—“a competitive activity . . . in which players contend with each other according to a set of rules. It encompasses a wide range of applications some of which are listed below:
  • firms competing for business
  • Arms race between the countries
  • political candidates competing for votes
  • animals fighting over prey
  • bidders competing in an auction
  • penalty kicks in a football game
And the list goes on….. Like other sciences, game theory consists of a collection of models. These models provide insights when interactions affect incentives.

Theory of Rational Choice

The theory of rational choice is a component of many models in game theory. By assuming a decision maker to be rational, according to this theory a decision-maker chooses the best action among all the actions available to her.

Basic Ingredients of a Model

are the ones who make the decisions in a game/model.

A set A consisting of all the actions that, under some circumstances, are available to the decision-maker, and a specification of the decision-maker’s preferences. In any given situation the decision-maker is faced with a subset of A, from which she must choose a single element.

If the decision-maker prefers the action a to the action b, and the action b to the action c, then she prefers the action a to the action c. This preference is decided on the basis of “payoffs/utilities”. Payoff function associates a number with
each action in such a way that actions with higher numbers are preferred.
Let “u” be a payoff function, “a” & “b” be two actions in set A, then action “a” is preferred over “b” if (and only if) u(a) >u(b)
If u(a)=u(b) then the player is said to be indifferent.

A person is faced with the choice of three vacation packages, to Havana, Paris, and Venice. She prefers the package to Havana to the other two, which she regards as equivalent. Her preferences between the three packages are represented by any payoff function that assigns the same number to both Paris and Venice and a higher number to Havana. For example, we can set u(Havana) = 1 and u(Paris) = u(Venice) = 0, or u(Havana) = 10 and u(Paris) = u(Venice) = 1, or u(Havana) = 0 and u(Paris) = u(Venice) = −2.

  • A high payoff doesn’t convey in any way the intensity of preference.
  • It is a relative measure and just tells the order of preferences of actions in a set.

Apart from the above there exists other ingredients (information, strategies) depending upon the model being followed.

Different Forms of Games  

Normal Form
Extensive Form
Repeated Games
Bayesian Games
In this tutorial we will cover the Normal Form.

Normal Form

Consists of:

• a set of players
• for each player, a set of actions
• for each player, preferences over the set of action profiles.

Example 1: The Prisoner’s Dilemma

Two suspects in a major crime are held in separate cells. There is enough evidence to convict each of them of a minor offense, but not enough evidence to convict either of them of the major crime unless one of them acts as an informer against the other (defects). If they both stay quiet, each will be convicted of the minor offense and spend one year in prison. If one and only one of them defects, she will be freed and used as a witness against the other, who will spend four years in prison. If they both defect, each will spend three years in prison.

Players: The two suspects.
Actions: Each player’s set of actions is {Quiet, Defect}.
Preferences Suspect 1’s ordering of the action profiles, from best to worst, is
(Defect, Quiet) (he defects and suspect 2 remains quiet, so he is freed), (Quiet,
Quiet) (he gets one year in prison), (Defect, Defect) (he gets three years in prison),
(Quiet, Defect) (he gets four years in prison). Suspect 2’s ordering is (Quiet, Defect),
(Quiet, Quiet), (Defect, Defect), (Defect, Quiet).
All of the above data can be represented in tabular form.

Suspect1/ Suspect 2QuietDefect

Rows enlist the actions of Suspect 1 while columns contain that of player 2. Each cell in the table tells about the payoffs under a particular action. First number denotes the payoff of the first player while the latter that of the second player.
For Example the actions {Quiet, Defect} of suspect 1 & 2 respectively will give them the payoffs of 0 & 3 respectively.
Another famous game/situation is the Battle of sexes often abbreviated as BoS.

Example 2: Battle of Sexes

There is a couple who want to go on a date, so they decide a place to meet (either a movie or football game in our case). The husband prefers to go to a football match while the wife prefers movie. Problem is on the day of date both of them forget the place they have to meet. Now both of them would like to meet at the same place even though it might be the venue they prefer less.

Row player is wife while the column player is Husband.

Suppose both of them go to a different venue they get a payoff of zero as their purpose of date is not fulfilled irrespective of the fact whether the venue they have gone to is their preferred one or not.
If both go to the Movie then they meet each other & wife gets a higher payoff because she prefers going to movie over football while the opposite holds for the husband. In the same way husband gets a higher payoff when both go to the football match.

Mathematically Normal form games consist of:
  • N= {1,...,n} the set of players
  • Ai the set of actions of player i
  • A= A1 x ∙∙∙ x An profiles of actions
  • ui: A → R utility function of player i

Taking Bos (Battle of Sexes) as an example
N= {Wife, Husband}
A1= {Movie, Football}, A2= {Movie Football}
u1 (Movie, Football) =0…….

Two Interesting Families of Games 

  • Team games: Ultimate cooperation
  • Zero-Sum games: Ultimate competition

Team Games

  • For any two players i and j and any action profile a, it is the
           case that ui(a) = uj(a)
  • Players have identical preference over strategy profiles. That is, for any two players i and j and any two strategy profiles s1 and s2, it is the case that
           ui(s1) ≥ ui(s2) iff uj(s1) ≥ uj(s2)

Example: Consider a hypothetical game with two players each having a choice of two similar actions {L,R}
In this game it is beneficial for both the players to choose the same actions i.e. they get higher payoffs if they cooperate.


Zero Sum Games

Consider a game which all of us must have played during our childhood: Stone (Rock), Paper & Scissor  (R,P,S). Its payoff can be represented as follows


The above game is type of Zero Sum Game whose general features are as follows:

  • Limited to two-person games
  • u1(a1,a2) + u2(a1,a2) = 0
  • “Zero” is not critical; generalize to “constant-sum” games: u1(a1,a2) + u2(a1,a2 )=k


  • A strategy always leads to at least as high (higher) payoff than
  • Then no matter what a player believes about other players,
            one strategy leads to a higher payoff than the other.
  • A strategy ai is a (weakly) dominant strategy for a player i
           ui (ai, a-i ) ≥ ui (ai’, a-i ) for all ai’ and a-i
  • A strategy ai is a strictly dominant strategy for a player i
           ui (ai, a-i ) > ui (ai’, a-i ) for all ai’ and a-i

(ai , a−i) denotes the action profile in which every player j except i chooses her action aj as specified by a, whereas player i chooses a’i. (The −i subscript on a stands for “except i”.)

Example: A Predator and a prey

Consider a prey and a predator. Now predator wants to hunt a prey, it has two choices: Either stay active (search for prey) or passive (wait for the prey). Similarly prey wants to avoid the predator and has same choice as the predator (Active and Passive). So if a predator is active then prey would also like to be active(so as to avoid predator) and vice-versa. Following table captures one such scenario
Active2.5, -0.94,-1

Predators have a strictly dominant strategy here that is to stay Active.

Going back to Prisoner’s Dilemma it is easy to observe that both the prisoners have a strictly dominant strategy i.e. both of them defect.
Analysis: Look at the payoffs table, suppose player1 stays quiet then player2 can either remain quiet or defect getting payoffs of 2 & 3 respectively. So he chooses to defect. Suppose player1 chooses to defect then again player2 gets a payoff 1 by defecting and payoff 0 by remaining quiet. So we conclude player 2 will defect irrespective of what action player1 chooses. Similar analysis can be done for player 1 and result would be the same i.e player 1 will defect irrespective of player2’s actions.

Nash Equilibrium

Nash Equilibrium gives us the stable points (actions) in a game i.e. once those actions have been taken no one would like to deviate from them.

Best Responses

  • Consider a game: (N, (Ai)i , (ui)i)
  • ai is a best response to a-i (strategies of others) if
           ui (ai, a-i ) ≥ ui (ai’, a-i ) for all ai
To put it in simple words ai is the best response to a particular action profile of other players if by playing that action ai fetches the maximum payoff to the player.

Consider the Predator-Prey example.
If the predator is active then best response of prey will be to stay active.
If the predator is passive then best response of prey will be to stay passive.

Pure Strategy Equilibrium

  • A profile of strategies (a1,...,an) is a (pure strategy) Nash
           equilibrium if ai is a best response to a-i for all i.
  • So: ui (ai, a-i ) ≥ ui (ai’, a-i ) for all i and ai.’

So for the Prisoner’s Dilemma Nash Equilibrium would be {Defect, Defect}.
As we have seen in the analysis each player has a dominant strategy of defecting which also happens to be the best response for each player, so nash equilibrium would be both players defecting.

Difference between Dominant Strategy & Nash Equilibrium

  • ai is a (weakly) dominant strategy for i:
           ui (ai, a-i ) ≥ ui (ai’, a-i ) for all ai’ and a-i
  • ai is a best response for i:
           ui (ai, a-i ) ≥ ui (ai’, a-i ) for all ai’.
So if we look at the Bos there doesn’t exist any dominant strategy but there exist two Nash Equilibriums {Movie, Movie}, {Football, Football}.

Mixed Strategy Equilibrium

Consider the following scenario of the Predator-Prey game.


There exists no pure strategy Nash Equilibrium.

  • Si is the set of all probability distributions over Ai
  • A mixed strategy for player i is an si in Si
  • si(ai) is the probability that player i plays ai in Ai
  • A pure strategy ai is a degenerate mixed strategy where
           si(ai)=1 and si(ai’)=0 for all other pure strategies ai’.

  • The expected payoff of player i when she chooses a pure
           action ai and her opponents play mixed strategies s-i is:
           Σa-i s-i(a-i) ui(ai,a-i)
  • The expected payoff of player i when she chooses si and
           her opponents play s-i is:
           ui(si, s-i) = Σai si(ai) [Σa-i s-i(a-i) ui(ai,a-i)]
  • A profile of mixed strategies s is a mixed strategy Nash
           equilibrium if si is a best response to s-i for all i:
  • So: ui (si, s-i ) ≥ ui (si’, s-i ) for all i and si’ in Si

Example: Coming back to the Predator-Prey example discussed in this section assume:
s2(active) is the probability Prey is active =p
Probability predator is active= q
Now for the predator to be indifferent towards being active or passive its expected payoffs need to be equal in both the cases.
Expected Payoff when active = 2p + 6(1-p) … eq 1
Expected Payoff when passive = 3p - (1-p) ….eq 2
As both equations are equal , we solve and find that
similarly for the prey to be indifferent we solve for q
-7q -6(1-q) = -8q + 0(1-q)
So p=7/8; q=6/7 is the mixed strategy Nash Equilibrium.
In a game there can exist pure strategy as well as mixed strategy Nash equilibriums.
Example: Consider again BoS game
p= probability husband goes to movie
q= probability wife goes to movie
then for wife to be indifferent:
2p= 1-p
So there exist three Nash equilibriums : Two pure and One mixed.

Minimax Theorem

  • (s1, s2) is an equilibrium of a zero-sum game iff:
           S1 = arg max min u1(s1’, s2) [1’s “maximin”]
                      s1’ϵ S1   s2 ϵ  S2
           S2 =  argmin  max u1(s1, s2’) [2 “minmaxes” 1]
                    s2’ ϵ S2    s1ϵ S1

  • Implication: u1(s1, s2) = max min u1(s1, s2) = min max u1(s1, s2)

Example: Consider penalty kicks in a soccer match with payoffs represented by following table
Left1.2, 0.81.6, 0.4
Right1.8, 0.21.4, 0.6

Now kicker tries to maximise his minimum

S1(L) = Probability kicker kicks left
S2(L) = Probability goalie jumps left

max min[s1(L)s2(L)*1.2 + s1(L)s2(R)*1.6 + s1(R)s2(L)*1.8 + s1(R)s2(R)*1.4]
s1        s2

His minimum :

min [s1(L)s2(L)*1.2 + s1(L)(1-s2(L))*1.6 + (1-s1(L))s2(L)*1.8 + (1-s1(L))(1-s2(L))*1.4]
Now taking s2(L) as a variable and taking derivative to find the minimum we obtain

s1(L) = 1/2, s1(R)= ½

Now goalie minimises Kicker’s maximum

min max[s1(L)s2(L)*1.2 + s1(L)s2(R)*1.6 + s1(R)s2(L)*1.8 + s1(R)s2(R)*1.4]
s2       s1

His Maximum:
max [s1(L)s2(L)*1.2 + s1(L)(1-s2(L))*1.6 + (1-s1(L))s2(L)*1.8 + (1-s1(L))(1-s2(L))*1.4]

Now taking s1(L) as a variable and equating derivative to 0   we obtain

s2(L)=1/4  s2(R)= ¾

Hence Equilibrium :
s1(L) = 1/2, s1(R)= ½
s2(L)=1/4  s2(R)= ¾

Existence of Nash Equilibrium

Theorem: Every game in which each player has a finite number of pure strategies has at least one equilibrium (possibly in mixed strategies).

Exercises : Examples and Problems

  1. A decision-maker’s preferences over the set A = {a, b, c} are represented by the payoff function u for which u(a) = 0, u(b) = 1, and u(c) = 4. Are they also represented by the function v for which v(a) = −1, v(b) = 0, and v(c) = 2?
Solution: The answer is yes. As discussed in theory payoffs give us the order of preference and it is easy to see the decision maker prefers “c” the most, then “b” and “a” the least so we assign any payoff to these action provided payoff of “c” is the highest, then that of “b” and lastly “c”.

2.       You are working with a friend on a joint project. Each of you can either work hard or goof         off. If your friend works hard then you prefer to goof off (the outcome of the project would be better if you worked hard too, but the increment in its value to you is not worth the extra effort). You prefer the outcome of your both working hard to the outcome of your both goofing off (in which case nothing gets accomplished), and the worst outcome for you is that you work hard and your friend goofs off (you hate to be “exploited”). If your friend has the same preferences then tabulate the suitable payoffs which can be assigned.

Solution: The situation here is analogous to the prisoner’s dilemma. Working hard is similar to staying quiet while goofing off is same as defecting

3. Consider the following normal form:
• N={1, 2}
• Ai={Left, Right}
• Player 1 prefers to choose the same action as player 2:
u1 (a1=a2)=1, u1(a1≠a2)=--‐1
• Player 2 prefers to choose the opposite action from player 1:
u2(a1=a2)=‐1, u2(a1≠a2)=1
Make an appropriate payoff table based on given data.


4. Consider the following zero sum game. Fill in the blanks(?).

Solution: In a zero--‐sum game, u1(a1,a2) + u2(a1,a2) = 0, for all possible (a1,a2).
                So u2(A,A)=1, u2(B,B)= -2  u1(C,C)=4

5. Consider a different version of BoS game given below

Specify the dominant strategy (weak/strict) for player 1 and player2.
Solution: Looking at the definition of strictly and weakly dominant strategy it is clear that player 1 has a strictly dominant strategy of going to movie
When 2 plays Movie, 1 gets 2 from Movie and 0 from Home; When 2 plays Home, 1 gets 1 from Movie and 0 from Home.
while player 2 has a weakly dominant strategy of staying at home
When 1 plays Movie, 2 gets 1 from either Movie or Home (so is indifferent);
When 1 plays Home, 2 gets 1 from Movie and 2 from Home.

6. Consider the collective--‐action game

When player 1 plays “B”, for player 2 what is the best response. Specify any dominant strategy if it exists.
Solution: When player 1 plays “A” player 1 can either play “A” or “B” with -1 and 0 as respective payoffs. So the best response is to choose “B”
It is easy to notice that there doesn’t exist any dominant strategy.

7. Consider the project game of example 2. List all the pure strategy Nash Equilibrium.
Solution: (Goofs, Goofs) is the pure strategy Nash equilibrium:
– Neither player would be strictly better by deviating from prescribed pairs of actions, presuming the other plays the prescribed action:
If the other goofs, then a player is indifferent and also willing to Goof.

8. Consider the modified pred/prey game with a mixed strategy:


Let p=probab. prey is active
q=probab. Predator is active
Find all the mixed strategy equilibrium
Solution: payoff of the pred when Playing active is
When playing passiveis
Payoffs should be equal since the pred should be indifferent. Hence solving for p we get
Solving in a similar way we obtain
Mixed strategy Nash equilibrium is p=10/11; q=5/7.

9. Consider a bargaining game:

Find all pure strategy Nash equilibrium:
Solution: Suppose 1 chooses “low” then best response of 2 will be to choose “yes”. Now consider the other way round if 2 chooses “yes” then 1’s best response will be “low”. So neither of two would want to deviate. Applying the same logic to other points we deduce {low, yes} is the only Nash Equilibrium.

10. Consider the matching pennies game

What is the maxmin strategy for row player?
Solution: From theory S1= argmax min u1(s1’,s2)
p= probab. 1 plays L
If p>1/2, s2=R leads 1 to earn 1-2p<0;
– If p<1/2, s2=L leads 1 to earn 2p-1<0;
– If p=1/2,then regardless of2’s strategy 1 earns 0.
– Thus p=1/2 is the maximin strategy

11. Consider the joint project game from Ex. 2 .Formulate a strategic game that models a situation in which two people work on a joint project in the case that their preferences are the same as those in the game in except that each person prefers to work hard than to goof off when the other person works hard. Present your game in a table.

12. Two people enter a bus. Two adjacent cramped seats are free. Each person must decide whether to sit or stand. Sitting alone is more comfortable than sitting next to the other person, which is more comfortable than standing.
a. Suppose that each person cares only about her own comfort. Model the situation as a strategic game. Find its Nash equilibrium (equilibria?).

Above table holds for the given ques. There exist only one pure strategy nash equilibrium (stand, stand) and the argument is same as that of Prisoner’s Dilemma discussed in theory.

13. Consider the matching-pennies game:

Does the game have a pure-strategy equilibrium? Construct a mixed-strategy equilibrium:

Solution: It’s easy to see there doesn’t exist any pure strategy Nash Equilibrium
Let p= probab. Player 2 plays left
q= probab. Player 1 plays left
For player1 to be indifferent:
p-(1-p)= -p+(1-p)
Similarly we obtain q=1/2.

14. Consider the following Game:

1\2Stay Leave
Stay-Z, -2 1,0

In the resulting mixed strategy equilibrium, how does the probability of staying change for row and column player as Z is increased?
Solution: p= probab. Of column player staying
q= probab. Of row player staying
For row player to be indifferent:
Now as we increase “z” p decreases
If we write the expression for column player we will notice that “q” is independent of “z”.

15. True or False: Every game in which each player has a finite number of pure strategies has at least one pure strategy equilibrium.
Solution: False. Nash’s theorem only says that Every game in which each player has a finite number of pure strategies has at least one equilibrium (possibly in mixed strategies).

16. Suppose there are n people sitting in a room. Every one of them is asked to guess an integer between 1 and 100 and write it on a paper. Then the average of all the numbers written on paper is taken and the person whose guess is closest to 2/3 of the average is the winner. Suggest the best strategy available to each player and what number should they guess.
Solution: Game can be formally represented as follows:
  • N={1,…., n} where n>2 is the number of
  • Ai = {1,2,…,100}
  • Let m(a) = Σi ai /n be the average action
  • ui(a)=1 if |ai – 2m(a)/3| < |aj – 2m(a)/3| for all j≠i
  • ui(a)=0 if |ai – 2m(a)/3| > |aj – 2m(a)/3| for some j≠i
  • ui(a)=1/K if i is among K players all closest to 2m(a)/3
  • Best reply of any player is below the mean of others’ actions if that mean is above 1.Everybody wanting to announce a number below the average, leads all to announce 1.

17. N people guess an integer between 1 and 100, and the winner is the player whose guess is closest to 2 times the mean of the guesses
What is the equilibrium in this case?
Solution: Similar logic will be used here as the previous question.
Each player’s best response is to announce a number closest to twice the average, subject to the Constraint of the 100.
So, each person wants to name a number above average, and so nothing is stable except all saying 100.

18. Guess an even integer between 1 and 100 that is closest to 1/2 of the mean of the guesses, what will be the equilibrium in that case?
Solution: This question is similar to ques. 17 but with an added constraint. Now one can choose only an even integer. But still our aim would be to guess a smaller number, so guessing number 2 is the equilibrium.


Drivers going from point 1 to point 2.
f = fraction of drivers taking route R. Payoffs is the negative of total time taken. Find the equilibrium.
Solution: ui(L, f)= - ( 1.2 + 1 - f)
ui(R, f)= - ( f + 1.2 )
Best response: L iff 1.2+1-f ≤ 1.2+f
or 1/2 ≤ f
R iff f ≤ ½.
Hence the equilibrium point is f=1/2.

19. If in the previous question a new route is added as shown above. what will be the equilibrium now?
Solution: Time R=R1+R2: fR1 + 1.2
Time L=L1+L2: 1.2 + fL2
Time R1+L2: fR1 +.1+ fL2
Always best to take R1 to L2
So fR1 = fL2 = 1 total travel time = 2.1 hours! Which is more than what was the time taken without this new route!!!
This is known as Braess Paradox.

21. In question 19 Change commute time for L1 from 1.2hr to 1.25hr. What will be the new Equilibrium?
Solution: 1.25 + f = (1-f) +1.2,
Then all players are indifferent and f=0.475 is an equilibrium.

22. Suppose there is a pond with fishes and “n” number of fishermen living around it. Let ai is time spent fishing per day by player i. Thus total time per day is a1 + ... + an. The number of fish available at any time is given by: (2000- Σi ai ). The number of fish caught by a fisherman “j” is: aj (2000- Σi ai ). Then what would be the best response of fisherman “i” to maximise his total?
Solution: Fish caught by “i”: ai (2000- (ai + Σj≠i aj )). Now we have to maximise this expression. Therefore differentiating w.r.t ai and equating to 0
ai = 1000 - Σj≠i aj /2
Now each player will try to maximise his catch and assuming each one using the same strategy ai = a * to achieve equilibrium
a* = 1000 - (n-1) a* /2
a* = 2000 / (n+1)
So catch by each player=[2000 / (n+1)]2

23. Suppose in the previous question all fishermen coordinate and try to maximise the total catch by all of them. Supposing number of fishermen=2000 Will they be better off?
Solution: In this question we have to maximise this expression:
Σi ai (2000 - Σi ai )
Differentiating w.r.t Σi ai and solving
We get Σi ai = 1000. Assuming symmetric play
ai = 1000/n
Total catch per player:
[1000/n] (2000 - 1000) = 10002 /n
Putting n=2000 it is easy to see that they would be better off in this case!!

24. Consider the same problem with 2 fishermen:
Stock of fish over time S = (300- a1 -a2).
Fisherman i catches aiS =ai(300- a1-a2), and a* is the symmetric pure Strategy Nash equilibrium production. Consider the two approaches discussed in the previous two questions. What will be their individual catch on the basis of these?
Solution: a*=300/(n+1) where n=2.
Thus a*=100.
When maximizing atotal(300– atotal), the solution
a*total = 300/2 =150.
Thus aop = a*total /2 =75.