Vision-Language Pre-training for Medical Imaging

PhD Qualifying Examination


Title: "Vision-Language Pre-training for Medical Imaging"

by

Mr. Xiaoyu ZHENG


Abstract:

Data plays a crucial role in advancing current Artificial Intelligence (AI)
technologies. Various data modalities, such as images, text, video and audio,
are collected and used to train AI systems for better performance. In the
medical domain, doctors likewise take multi-modal factors (e.g. blood test
results, medical images, previous drug history) into consideration for
diagnosis and treatment.

Among them, medical imaging examination results are one of the most decisive
factors. Many early AI systems therefore aim to predict labels for medical
images in classification, segmentation and object detection tasks by feeding
the images and their annotations from doctors to a deep learning model.
However, human annotation in the medical domain often requires trained
experts, which makes it hard to produce very large-scale labelled data for
supervised model training.

In recent years, self-supervised learning has been considered a promising
approach to boosting AI model performance in the medical domain.
Self-supervised learning aims to learn robust representations from the data
itself and does not require data labelling during the training stage. In the
medical imaging domain, one advantage for self-supervised learning is that
images are often paired with reports consisting of the image description and
the diagnosis result, which makes vision-language pre-training feasible. By
aligning text representations and image representations in the same space,
pre-trained models can even outperform supervised methods, and some models
also gain zero-shot capability. In this survey, we present the mainstream
vision-language pre-training frameworks and datasets for 2D, 3D and video
medical imaging. We also provide a summary of downstream datasets and tasks,
and discuss the challenges and future directions.


Date:                   Tuesday, 30 July 2024

Time:                   4:00pm - 6:00pm

Venue:                  Room 5508
                        Lifts 25/26

Committee Members:      Dr. Hao Chen (Supervisor)
                        Dr. Qifeng Chen (Chairperson)
                        Dr. Junxian He
                        Dr. Dan Xu