The Hong Kong University of Science and Technology
Department of Computer Science and Engineering
MPhil Thesis Defence
Title: "Towards Efficient Multi-objective Alignment of Large Language Models"
By
Mr. Rui YANG
Abstract:
This study addresses the challenge of multi-objective alignment of foundation
models, particularly Large Language Models (LLMs), with human preferences--a
crucial step towards developing helpful and harmless AI systems. Fine-tuning
large foundation models using reinforcement learning (RL) is often costly and
unstable. Additionally, the multi-dimensionality, heterogeneity, and
conflicting nature of human preferences further complicate the alignment
process. In this thesis, we introduce Rewards-in-Context (RiC), a novel approach
that conditions the response of a foundation model on multiple rewards within
its prompt context and employs supervised fine-tuning for alignment. RiC is
characterized by its simplicity and adaptability, requiring only supervised
fine-tuning of a single foundation model and allowing for dynamic adjustment of
user preferences during inference. Inspired by the analytical solution of an
abstracted convex optimization problem, our dynamic inference-time adjustment
method approximates the Pareto-optimal solution for multiple objectives.
Empirical evidence demonstrates the efficacy of our method in aligning LLMs to
accommodate diverse rewards with only approximately 10% of the GPU hours
required by multi-objective RL baselines.
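To illustrate the core idea of conditioning a model's response on multiple rewards within the prompt context, the following is a minimal sketch. The tag names (`helpfulness`, `harmlessness`), score format, and the helper `build_ric_prompt` are illustrative assumptions, not the exact implementation from the thesis.

```python
# Sketch of reward-conditioned prompting in the spirit of RiC.
# During supervised fine-tuning, training examples would carry their
# measured reward scores in-context; at inference time, a user supplies
# the desired scores to steer the response along multiple objectives.

def build_ric_prompt(user_prompt: str, rewards: dict) -> str:
    """Prepend desired reward scores to the prompt so a supervised
    fine-tuned model can condition its response on them.

    NOTE: the tag format below is a hypothetical convention."""
    reward_context = " ".join(
        f"<{name}> {score:.1f} </{name}>" for name, score in rewards.items()
    )
    return f"{reward_context}\n{user_prompt}"


prompt = build_ric_prompt(
    "Explain how vaccines work.",
    {"helpfulness": 0.9, "harmlessness": 1.0},
)
print(prompt)
```

Because preferences are expressed purely through the prompt, adjusting the trade-off between objectives at inference time requires no retraining, which is what makes this much cheaper than multi-objective RL fine-tuning.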
Date: Thursday, 8 August 2024
Time: 10:00 am - 12:00 noon
Venue: Room 4475
Lifts 25/26
Chairman: Dr. Ling PAN
Committee Members: Dr. Junxian HE (Supervisor)
Dr. Dongdong SHE