Strategies and Best Responses in the Iterated Prisoner's Dilemma

PhD Thesis Proposal Defence


Title: "Strategies and Best Responses in the Iterated Prisoner's Dilemma"

by

Mr. Shiheng WANG


Abstract:

The Iterated Prisoner's Dilemma (IPD) is a well-known benchmark for
studying the long-term behaviors of rational agents. Researchers from
diverse disciplines have used the IPD to study the emergence of
cooperation among unrelated agents. In 1981, Robert Axelrod was the first
to run some computer tournaments on iterated prisoner's dilemma.
Remarkably the simple Tit-For-Tat (TFT) strategy was the winner. In 2012
(Press and Dyson 2012) dramatically changed people's understanding of this
game by deriving what they called zero determinant (ZD) strategies. Among
them, of particular interests are what they called extortionate strategies
that can enforce an extortionate linear relation between the players'
scores. Later Stewart and Plotkin ran a Axelrod-style tournaments that
include a few extortionate strategies. One of the notable results of
Stewart and Plotkin's tournament is that the extortionate strategy named
Extort-2 won the second most head-to-head matches.

We aim to give mathematical explanation of experimental tournament
results. The repeated games are modeled into Markov decision processes and
Markov chains. We consider what we call invincible strategies. These are
ones that will never lose against any other strategy in terms of average
payoff in the limit. We provide a simple characterization of this class of
strategies and show that invincible strategies can also be nice. We
discuss its relationship with some important strategies and generalize our
results to some typical repeated 2x2 games.

In addition, we compute best responses to k-memory mixed strategies.
Theoretical analysis shows that there always exists a k-memory pure
strategy best response. Due to the complexity of computation, theorems are
discovered with the help of symbolic solvers. The evolutionary behavior
left over by Press and Dyson is also explained. Our framework can be
generalized to any memory-bounded strategies in any repeated
games.


Date:                   Friday, 28 February 2020

Time:                   3:00pm - 5:00pm

Zoom Meeing:            https://hkust.zoom.com.cn/j/509959030

Committee Members:      Prof. Fangzhen Lin (Supervisor)
                        Prof. Ke Yi (Chairperson)
                        Dr. Sunil Arya
                        Prof. Nevin Zhang


**** ALL are Welcome ****