A Survey on Self-supervised Visual Representation Learning with Vision Transformer
PhD Qualifying Examination

Title: "A Survey on Self-supervised Visual Representation Learning with Vision Transformer"

by Mr. Kai CHEN

Abstract:

Self-supervised visual representation learning aims to pre-train a representation backbone network using pseudo labels automatically generated from unlabeled images, without depending on human annotations such as semantic class labels and image captions. Methods in the CNN era were dominated by instance discrimination, whereas with the development of Vision Transformer, novel pretext tasks, represented by masked image modeling, have demonstrated the potential for superior transfer performance. In this survey, we provide a comprehensive review of self-supervised visual representation learning methods with Vision Transformer. Specifically, we first formulate self-supervised learning with a unified objective covering both instance discrimination and masked image modeling, and provide a brief introduction to Vision Transformer. We then conduct a thorough review of the two mainstream self-supervised pretext tasks, with an in-depth analysis of their challenges and differences. Finally, we conclude by discussing several potential research directions.

Date: Tuesday, 9 August 2022
Time: 2:00pm - 4:00pm
Zoom Meeting: https://hkust.zoom.us/j/93921451190?pwd=SW1zYWlseFBXWW5VRHBqbGFFRHJmdz09

Committee Members:
Prof. Dit-Yan Yeung (Supervisor)
Prof. Raymond Wong (Chairperson)
Dr. Dan Xu
Dr. Zhiqiang Shen

**** ALL are Welcome ****