PhD Qualifying Examination
Title: "A survey of Mixture of Experts in Multi-Modal Large Language Models"
by
Mr. Zhili LIU
Abstract:
Recent advancements in Multi-Modal Large Language Models (MLLMs) have brought
us closer to developing general-purpose assistants capable of following
complex vision-and-language instructions. A key challenge in this development
is alignment, which ensures that MLLMs accurately interpret and act on human
intent across a range of real-world tasks. Concurrently, Mixture of Experts
(MoE) models have gained significant attention for their success in large
language models (LLMs), and many MoE strategies are now being integrated
into MLLM alignment. However, despite the growing use of MoE in MLLMs, a
systematic and comprehensive review of the literature is lacking. In this
survey, we first introduce the MoE paradigm and the three key stages of
MLLM development: Vision Encoder Training, MLLM Alignment, and MLLM
Inference. We then provide an overview of MoE's role in these stages and
highlight potential directions for future research.
Date: Thursday, 19 December 2024
Time: 4:00pm - 6:00pm
Venue: Room 2128A (Lift 19)
Committee Members: Prof. James Kwok (Supervisor)
Dr. Dan Xu (Chairperson)
Dr. Long Chen
Dr. May Fung