More about HKUST
Strategies and Best Responses in the Iterated Prisoner's Dilemma
PhD Thesis Proposal Defence Title: "Strategies and Best Responses in the Iterated Prisoner's Dilemma" by Mr. Shiheng WANG Abstract: The Iterated Prisoner's Dilemma (IPD) is a well-known benchmark for studying the long-term behaviors of rational agents. Researchers from diverse disciplines have used the IPD to study the emergence of cooperation among unrelated agents. In 1981, Robert Axelrod was the first to run some computer tournaments on iterated prisoner's dilemma. Remarkably the simple Tit-For-Tat (TFT) strategy was the winner. In 2012 (Press and Dyson 2012) dramatically changed people's understanding of this game by deriving what they called zero determinant (ZD) strategies. Among them, of particular interests are what they called extortionate strategies that can enforce an extortionate linear relation between the players' scores. Later Stewart and Plotkin ran a Axelrod-style tournaments that include a few extortionate strategies. One of the notable results of Stewart and Plotkin's tournament is that the extortionate strategy named Extort-2 won the second most head-to-head matches. We aim to give mathematical explanation of experimental tournament results. The repeated games are modeled into Markov decision processes and Markov chains. We consider what we call invincible strategies. These are ones that will never lose against any other strategy in terms of average payoff in the limit. We provide a simple characterization of this class of strategies and show that invincible strategies can also be nice. We discuss its relationship with some important strategies and generalize our results to some typical repeated 2x2 games. In addition, we compute best responses to k-memory mixed strategies. Theoretical analysis shows that there always exists a k-memory pure strategy best response. Due to the complexity of computation, theorems are discovered with the help of symbolic solvers. The evolutionary behavior left over by Press and Dyson is also explained. Our framework can be generalized to any memory-bounded strategies in any repeated games. Date: Friday, 28 February 2020 Time: 3:00pm - 5:00pm Zoom Meeing: https://hkust.zoom.com.cn/j/509959030 Committee Members: Prof. Fangzhen Lin (Supervisor) Prof. Ke Yi (Chairperson) Dr. Sunil Arya Prof. Nevin Zhang **** ALL are Welcome ****