The Application of Foundation Models in Robotics

PhD Qualifying Examination


Title: "The Application of Foundation Models in Robotics"

by

Mr. Siyuan ZHOU


Abstract:

Foundation models that are pre-trained on diverse data at scale have 
demonstrated substantial potential across numerous tasks in both vision and 
language domains. Yet, developing foundation models for general-purpose robots 
remains a significant challenge. General-purpose robots must be capable of 
performing seamlessly across any task in any environment. However, current 
robotic systems are constrained to specific tasks and specific environments, 
resulting in generalization issues -- models struggle to perform effectively 
when encountering unseen tasks or new environments.

Motivated by the remarkable open-set performance and common-sense reasoning of 
foundation models such as Large Language Models (LLMs) and Vision-Language 
Models (VLMs), we devote this survey to exploring how these foundation models 
can be integrated into robotic systems. We start with a thorough overview of both 
robotic systems and foundation models in the vision and language fields. Next, 
we discuss current work on leveraging existing foundation models for robotic 
tasks. Finally, we consider several promising directions for future research.


Date:                   Thursday, 23 May 2024

Time:                   10:00am - 12:00noon

Venue:                  Room 3494
                        Lifts 25/26

Committee Members:      Prof. Dit-Yan Yeung (Supervisor)
                        Prof. Fangzhen Lin (Chairperson)
                        Dr. Long Chen
                        Dr. Qifeng Chen