IntentionQA: A Benchmark for Evaluating Purchase Intention Understanding Abilities of Large Language Models in E-commerce

The Hong Kong University of Science and Technology
Department of Computer Science and Engineering

Final Year Thesis Oral Defense

Title: "IntentionQA: A Benchmark for Evaluating Purchase Intention 
Understanding Abilities of Large Language Models in E-commerce"

by

DING Wenxuan

Abstract:

Enabling large language models (LLMs) to comprehend purchase intentions in 
e-commerce scenarios is essential for their assistance in various downstream 
tasks. However, previous approaches that distill intentions from LLMs often 
fail to generate meaningful and human-centric intentions applicable in 
real-world e-commerce contexts. This raises doubts about whether LLMs truly 
understand and can make use of purchase intentions. In this work, 
we present IntentionQA, a double-task Multiple-Choice Question Answering (MCQA) 
benchmark to evaluate LLMs' comprehension of purchase intentions in e-commerce. 
Specifically, LLMs are tasked with inferring intentions from purchased 
products and then using those intentions to predict additional purchases. 
IntentionQA contains 4,375 problems and is divided into three difficulty 
levels. Human evaluations demonstrate the high quality and low false-negative 
rate of our benchmark. Extensive experiments across 30 language models of 
varying sizes and methods show that they still struggle in certain 
scenarios, such as identifying complementary products and understanding 
specific intention types. Our code and data are publicly available at 
https://github.com/HKUST-KnowComp/IntentionQA.


Date            : 3 May 2024 (Friday)

Time            : 09:00 - 09:40

Venue           : Room 4504 (near lifts 25/26), HKUST

Advisor         : Dr. SONG Yangqiu

2nd Reader      : Dr. HE Junxian