An Exploration of Three Approaches to Automating Data Augmentation for Machine Learning

The Hong Kong University of Science and Technology
Department of Computer Science and Engineering


PhD Thesis Defence


Title: "An Exploration of Three Approaches to Automating Data Augmentation 
for Machine Learning"

By

Mr. Tsz Him CHEUNG


Abstract:

Data augmentation is an effective way to improve the generalization of deep 
learning models. However, current augmentation methods rely on manual 
operations, such as flipping and cropping for image data. These methods are 
often designed based on human expertise or trial and error. In contrast, 
Automated Data Augmentation (AutoDA) is a promising research direction that 
treats data augmentation as a learning task and discovers the most effective 
ways to augment the data. This thesis examines recent AutoDA research and 
introduces four new methods to enhance AutoDA using composition-based, 
mixing-based, and generation-based approaches.

Existing composition-based AutoDA methods apply a fixed policy to the entire 
dataset and mainly focus on image classification tasks. In response, we 
propose AdaAug to learn class- and instance-adaptive augmentation policies 
and MODALS to learn modality-agnostic latent space augmentation strategies. 
For mixing-based AutoDA, we introduce TransformMix to learn transformation 
and mixing augmentation strategies from data. Compared to previous 
mixing-based methods, TransformMix considers the saliency information of 
input images and produces more compelling mixed images that contain accurate 
and important information for the target tasks. Regarding generation-based 
AutoDA, we present AutoGenDA to address imbalanced classification tasks. 
Specifically, AutoGenDA identifies and transfers label-invariant changes 
across data classes through image captions and text-guided generative models. 
We also propose an automated search strategy to adapt the AutoGenDA 
augmentation to each data class, leading to better generalization.


Date:                   Thursday, 16 January 2025

Time:                   10:00am - 12:00noon

Venue:                  Room 5506
                        Lifts 25/26

Chairman:               Dr. Jin QI (IEDA)

Committee Members:      Prof. Dit-Yan YEUNG (Supervisor)
                        Prof. Albert CHUNG
                        Prof. Raymond WONG
                        Dr. Can YANG (MATH)
                        Prof. Sinno Jialin PAN (CUHK)