Toward Robust Transfer Learning: From Supervised Fine-Tuning to Dynamic Test-Time Adaptation
PhD Thesis Proposal Defence

Title: "Toward Robust Transfer Learning: From Supervised Fine-Tuning to Dynamic Test-Time Adaptation"

by

Mr. Xingzhi ZHOU

Abstract:

Transfer learning applies pre-trained knowledge to new tasks and falls into two categories: inductive and transductive. Inductive transfer learning involves different source and target tasks, with only limited labeled data in the target domain; transductive transfer learning involves the same task but no labels in the target domain. This thesis examines both inductive transfer learning and transductive transfer learning at inference time, referred to as test-time adaptation. Although promising, inductive methods often generalize poorly because labeled samples are scarce, while transductive methods at inference face instability caused by continual domain shifts, temporal correlation, and noisy data. These challenges significantly undermine the practical reliability of transfer learning. This dissertation advances the robustness of transfer learning through methodological innovations in supervised fine-tuning (inductive transfer learning) and test-time adaptation (transductive transfer learning during inference). The contributions are organized into three studies.

The first study addresses inductive transfer learning for traditional Chinese medicine (TCM) prescription prediction, a task challenged by limited labeled data and complex symptom-herb relationships. We propose TCM-FTP, a fine-tuning framework that adapts large language models (LLMs) using DigestDS, a curated dataset derived from clinical documentation. TCM-FTP integrates low-rank adaptation (LoRA) to reduce computational demands and employs a herb-order randomization strategy for data augmentation. Experimental results show that TCM-FTP significantly improves herb identification and dosage estimation, highlighting the efficacy of specialized fine-tuning in low-resource medical domains.

The second study explores test-time adaptation under continual domain shifts and temporal correlation. We introduce ResiTTA, a resilient test-time adaptation framework that mitigates overfitting by softly regularizing batch normalization statistics between the source and target domains. To stabilize adaptation, ResiTTA maintains a low-entropy, class-balanced memory bank, enabling teacher-student self-training under an approximate i.i.d. assumption. Rigorous benchmarks demonstrate that ResiTTA consistently outperforms state-of-the-art methods in dynamic environments.

The third study extends transductive adaptation to noisy test-time settings. We propose MoTTA, a pruning-based framework that identifies noisy samples via output difference under pruning (ODP), offering greater robustness than prediction-based filtering. Additionally, MoTTA introduces flatness-aware entropy minimization (FlatEM), which guides optimization toward flatter loss landscapes through zeroth- and first-order constraints. Empirical evaluations confirm that MoTTA achieves superior robustness and adaptation performance under noisy conditions where existing methods degrade.

(Illustrative code sketches of the core techniques appear after the session details below.)

Date: Thursday, 29 May 2025
Time: 4:00pm - 6:00pm
Venue: Room 2128B (Lift 19)

Committee Members:
Prof. Nevin Zhang (Supervisor)
Prof. Dit-Yan Yeung (Chairperson)
Dr. Brian Mak
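
The first study's two core mechanisms can be made concrete with a short sketch. Below is a minimal PyTorch illustration, not the authors' implementation: a LoRA wrapper that adds a trainable low-rank update to a frozen pre-trained linear layer, and a herb-order randomization function that treats a prescription as an unordered set of herbs, so shuffling the sequence yields a new, equally valid training target. All names (LoRALinear, randomize_herb_order) and hyperparameters (rank, alpha) are illustrative assumptions.

```python
# Minimal sketch of LoRA fine-tuning and herb-order randomization.
# Illustrative only; not the TCM-FTP codebase.
import random
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update: W x + (alpha/r) * B A x."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # pre-trained weights stay frozen
        # A is Gaussian, B is zero, so the update starts at exactly zero
        self.lora_A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.lora_A.T @ self.lora_B.T)

def randomize_herb_order(herbs: list[str]) -> list[str]:
    """Data augmentation: a prescription is a set of herbs, so any
    permutation of the target sequence is an equally valid label."""
    shuffled = herbs.copy()
    random.shuffle(shuffled)
    return shuffled
```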
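ResiTTA's stabilizers can likewise be sketched. The snippet below is a sketch under assumed details rather than the paper's exact formulation: it (a) softly pulls each BatchNorm layer's statistics back toward a stored source-domain snapshot instead of letting raw target-batch statistics overwrite them, and (b) admits only low-entropy samples into a class-balanced memory bank so that later self-training batches are approximately i.i.d. over classes. The anchoring coefficient, entropy threshold, and per-class capacity are illustrative.

```python
# Sketch of soft BN-statistics regularization and a class-balanced memory bank.
import torch
import torch.nn as nn

@torch.no_grad()
def snapshot_source_stats(model: nn.Module) -> dict:
    """Freeze a copy of the source-domain BN statistics before adaptation starts."""
    return {name: (m.running_mean.clone(), m.running_var.clone())
            for name, m in model.named_modules() if isinstance(m, nn.BatchNorm2d)}

@torch.no_grad()
def soft_update_bn(model: nn.Module, source_stats: dict, anchor: float = 0.9) -> None:
    """After each test batch, interpolate BN statistics back toward the source
    snapshot so continual shifts cannot drag them arbitrarily far."""
    for name, m in model.named_modules():
        if isinstance(m, nn.BatchNorm2d):
            src_mean, src_var = source_stats[name]
            m.running_mean.mul_(1 - anchor).add_(anchor * src_mean)
            m.running_var.mul_(1 - anchor).add_(anchor * src_var)

class ClassBalancedMemory:
    """Keeps at most `per_class` low-entropy samples per predicted class."""
    def __init__(self, num_classes: int, per_class: int = 8, max_entropy: float = 0.4):
        self.banks = [[] for _ in range(num_classes)]
        self.per_class, self.max_entropy = per_class, max_entropy

    def maybe_add(self, x: torch.Tensor, logits: torch.Tensor) -> None:
        probs = logits.softmax(-1)
        entropy = -(probs * probs.clamp_min(1e-8).log()).sum(-1)
        for i in range(x.shape[0]):
            if entropy[i] < self.max_entropy:  # confident samples only
                bank = self.banks[int(probs[i].argmax())]
                bank.append(x[i].detach().cpu())
                if len(bank) > self.per_class:
                    bank.pop(0)  # FIFO: drop the oldest sample
```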
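For MoTTA, the noisy-sample filter can be sketched as follows: prune a copy of the model, then score each test sample by how much its prediction changes under pruning, with a high score flagging a likely noisy sample. The magnitude-pruning rule, the symmetric-KL difference measure, and the 30% ratio are assumptions for illustration, not the paper's exact design.

```python
# Sketch of output-difference-under-pruning (ODP) style noisy-sample scoring.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

@torch.no_grad()
def prune_by_magnitude(model: nn.Module, ratio: float = 0.3) -> nn.Module:
    """Return a copy of the model with the smallest-magnitude linear weights zeroed."""
    pruned = copy.deepcopy(model)
    weights = torch.cat([m.weight.abs().flatten()
                         for m in pruned.modules() if isinstance(m, nn.Linear)])
    threshold = weights.quantile(ratio)
    for m in pruned.modules():
        if isinstance(m, nn.Linear):
            m.weight.mul_((m.weight.abs() >= threshold).float())
    return pruned

@torch.no_grad()
def odp_score(model: nn.Module, pruned: nn.Module, x: torch.Tensor) -> torch.Tensor:
    """Per-sample symmetric KL between the full and pruned models' predictions."""
    p = F.log_softmax(model(x), dim=-1)
    q = F.log_softmax(pruned(x), dim=-1)
    return 0.5 * (F.kl_div(q, p, log_target=True, reduction="none").sum(-1)
                  + F.kl_div(p, q, log_target=True, reduction="none").sum(-1))
```

A test batch would then be filtered with something like `keep = odp_score(model, pruned, x) < tau` before any entropy-based update, `tau` being a tunable threshold.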
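Finally, the flatness-aware objective can be illustrated with a SAM-style step: compute the entropy gradient, perturb the weights toward the locally worst-case direction, and update using the gradient taken at the perturbed point, which steers optimization toward flat regions of the loss surface. This covers only a zeroth-order flatness term; the first-order constraint the abstract mentions is omitted here, and `rho` and `lr` are illustrative values.

```python
# SAM-style sketch of flatness-aware entropy minimization (zeroth-order term only).
import torch
import torch.nn as nn

def entropy_loss(logits: torch.Tensor) -> torch.Tensor:
    probs = logits.softmax(-1)
    return -(probs * probs.clamp_min(1e-8).log()).sum(-1).mean()

def flat_entropy_step(model: nn.Module, x: torch.Tensor,
                      rho: float = 0.05, lr: float = 1e-3) -> None:
    """One sharpness-aware entropy-minimization step."""
    model.zero_grad()
    entropy_loss(model(x)).backward()            # gradient at the current weights
    with torch.no_grad():
        params = [p for p in model.parameters() if p.grad is not None]
        grad_norm = torch.norm(torch.stack([p.grad.norm() for p in params]))
        eps = [rho * p.grad / (grad_norm + 1e-12) for p in params]
        for p, e in zip(params, eps):
            p.add_(e)                            # ascend to a nearby worst-case point
    model.zero_grad()
    entropy_loss(model(x)).backward()            # gradient at the perturbed point
    with torch.no_grad():
        for p, e in zip(params, eps):
            p.sub_(e)                            # restore the original weights
            p.sub_(lr * p.grad)                  # descend with the flatness-aware gradient
    model.zero_grad()
```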