The Hong Kong University of Science and Technology
Department of Computer Science and Engineering

Final Year Thesis Oral Defense

Title: "Exploring optimization methods in deep learning"

by

LIU Yuzhi

Abstract:

Common optimization algorithms such as Stochastic Gradient Descent (SGD)
are widely adopted for large-scale training of deep learning models. Popular
momentum variants of SGD are the Heavy Ball (HB) method and Nesterov's
Accelerated Gradient (NAG) method. Although momentum-based algorithms
outperform plain SGD in practice, their effectiveness cannot be fully
explained by existing theories. This work summarizes recent work on SGD
with momentum, including theoretical bounds and newly proposed
momentum-based algorithms. We also provide theoretical insights on SGD
with HB for quadratic objectives.
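
For reference, the standard update rules of the two momentum variants named
above take the following forms (a sketch in common notation, with step size
\eta and momentum parameter \beta; these symbols are assumed here, not taken
from the thesis):

    HB:   x_{t+1} = x_t - \eta \nabla f(x_t) + \beta (x_t - x_{t-1})

    NAG:  y_t     = x_t + \beta (x_t - x_{t-1})
          x_{t+1} = y_t - \eta \nabla f(y_t)

In the stochastic setting, the full gradient \nabla f is replaced by a
mini-batch estimate.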


Date            : 2 May 2023 (Tuesday)

Time            : 11:30 - 12:10

Venue           : Room 4472 (near lifts 25/26), HKUST

Advisor         : Prof. ZHANG Tong

2nd Reader      : Prof. ZHANG Nevin Lianwen