PokerEval: Investigating the Decision Optimality and Theory-of-Mind Capabilities of Large Language Models as Poker Agents

The Hong Kong University of Science and Technology
Department of Computer Science and Engineering

Final Year Thesis Oral Defense

Title: "PokerEval: Investigating the Decision Optimality and Theory-of-Mind 
Capabilities of Large Language Models as Poker Agents"

by

ZHENG Tianshi

Abstract:

We present PokerEval, a comprehensive evaluation framework investigating the 
performance and capabilities of Large Language Models (LLMs) as poker agents in 
heads-up limit Texas Hold'em. PokerEval focuses on two primary aspects: 
evaluating the decision optimality of LLMs and examining their multi-agent 
gameplay dynamics. For the optimality experiment, we generate a benchmark of 
gameplay scenarios with corresponding Game Theory Optimal (GTO) solver 
decisions. In the multi-agent experiment, we conduct a round-robin tournament
among LLMs, computing win rates and evaluating their theory-of-mind reasoning
about opponents' gameplay intentions. To establish a baseline for LLM
performance, we organize a human evaluation in which amateur poker players
compete against LLM-based agents.
Our study aims to provide insights into the potential of LLMs as poker agents, 
their adherence to optimal strategies, and their ability to model and reason 
about the intentions of other players in a complex, partially observable
environment such as poker.


Date            : 26 April 2024 (Friday)

Time            : 13:45 - 14:25

Venue           : Room 5560 (near lifts 27/28), HKUST

Advisor         : Prof. ZHANG Nevin Lianwen

2nd Reader      : Dr. HE Junxian