A Survey on Efficient Transformers: from Training to Inference

PhD Qualifying Examination


Title: "A Survey on Efficient Transformers: from Training to Inference"

by

Mr. Shih-yang LIU


Abstract:

Transformer-based models have shown remarkable improvements over earlier 
techniques and ushered in significant progress in many tasks, especially in 
the domains of vision and language. However, as their size and complexity 
continue to increase, latency and model size have become noticeable issues
in both training and deployment. To tackle these challenges, researchers have 
begun investigating methods to reduce training overhead—often referred to as 
Parameter-Efficient Fine-Tuning (PEFT)—which adjust only a subset of the
model parameters, as well as compression strategies that help shrink models 
for quicker inference. Yet many existing studies regard these two 
strategies as independent, missing the opportunity to leverage the advantages 
of both lines of work to achieve simultaneous gains in training and inference 
efficiency. In this context, this summary provides a comprehensive review of 
both PEFT and model compression, showcasing various examples and examining 
how their integration can enhance the efficiency of transformer 
architectures, particularly in the realm of Large Language Models (LLMs). 
Notably, this work is among the first broad surveys in the efficient 
transformers domain to encompass both PEFT and model compression, 
potentially inspiring readers to further investigate the synergy between 
these research directions.


Date:                   Tuesday, 1 April 2025

Time:                   2:30pm - 4:30pm

Venue:                  Room 3494
                        Lifts 25/26

Committee Members:      Prof. Tim Cheng (Supervisor)
                        Dr. Yangqiu Song (Chairperson)
                        Dr. Dan Xu