Towards Efficient Multi-objective Alignment of Large Language Models
The Hong Kong University of Science and Technology
Department of Computer Science and Engineering

MPhil Thesis Defence

Title: "Towards Efficient Multi-objective Alignment of Large Language Models"

By

Mr. Rui YANG

Abstract:

This study addresses the challenge of aligning foundation models, particularly Large Language Models (LLMs), with multiple human preferences, a crucial step towards developing helpful and harmless AI systems. Fine-tuning large foundation models using reinforcement learning (RL) is often costly and unstable, and the multi-dimensional, heterogeneous, and conflicting nature of human preferences further complicates alignment. In this thesis, we introduce Rewards-in-Context (RiC), a novel approach that conditions the response of a foundation model on multiple rewards within its prompt context and employs supervised fine-tuning for alignment. RiC is characterized by its simplicity and adaptability: it requires only supervised fine-tuning of a single foundation model and allows dynamic adjustment of user preferences during inference. Inspired by the analytical solution of an abstracted convex optimization problem, our dynamic inference-time adjustment method approximates the Pareto-optimal solution across multiple objectives. Empirical evidence demonstrates the efficacy of our method in aligning LLMs to accommodate diverse rewards with only approximately 10% of the GPU hours required by multi-objective RL baselines.

Date: Thursday, 8 August 2024
Time: 10:00am - 12:00noon
Venue: Room 4475 (Lifts 25/26)

Chairman: Dr. Ling PAN
Committee Members: Dr. Junxian HE (Supervisor)
                   Dr. Dongdong SHE
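To illustrate the idea described in the abstract, the sketch below shows one way reward conditioning in the prompt context might look. The prompt template, reward names, and the linear mapping from user preference weights to target reward values are all illustrative assumptions for this sketch, not the exact formulation used in the thesis.

```python
# Hypothetical sketch of reward-conditioned prompting in the spirit of RiC.
# The tag format and reward dimensions below are assumptions, not the
# thesis's actual prompt template.

def reward_conditioned_prompt(query, desired_rewards):
    """Prepend desired reward values to the query so a supervised
    fine-tuned model can condition its response on them."""
    tags = " ".join(f"<{name}: {value:.1f}>"
                    for name, value in desired_rewards.items())
    return f"{tags} {query}"

def preference_to_rewards(weights, reward_ranges):
    """Map per-objective user preference weights in [0, 1] to target
    reward values by linear interpolation over each reward's range."""
    return {
        name: lo + w * (hi - lo)
        for (name, (lo, hi)), w in zip(reward_ranges.items(), weights)
    }

# Example: weight helpfulness at 0.9 and harmlessness at 0.4 at inference time.
targets = preference_to_rewards(
    weights=[0.9, 0.4],
    reward_ranges={"helpfulness": (0.0, 1.0), "harmlessness": (0.0, 1.0)},
)
prompt = reward_conditioned_prompt("How do I secure my home network?", targets)
print(prompt)
# → <helpfulness: 0.9> <harmlessness: 0.4> How do I secure my home network?
```

Because the conditioning lives entirely in the prompt, a single supervised fine-tuned model can serve arbitrary preference trade-offs at inference time without retraining, which is the source of the efficiency gain the abstract cites.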