Towards Medical Image Understanding and Interpretation: From Anatomy to Clinical Insights
The Hong Kong University of Science and Technology
Department of Computer Science and Engineering
PhD Thesis Defence
Title: "Towards Medical Image Understanding and Interpretation: From Anatomy
to Clinical Insights"
By
Mr. Haibo JIN
Abstract:
Medical image interpretation is a cornerstone of modern healthcare, enabling
accurate diagnosis and precise clinical decision-making. However, this
process remains time-consuming, prone to variability, and constrained by
the growing demand for radiological expertise. Artificial intelligence (AI)
offers a promising solution, yet existing approaches often address medical
image analysis tasks in isolation, limiting their clinical applicability.
This thesis presents a unified framework for automated medical image
interpretation, advancing from anatomical understanding to diagnostic
reasoning through deep learning innovations.
First, we address anatomical structure understanding by developing
semi-supervised landmark detection methods that reduce reliance on labeled
data. Our approach leverages self-training with domain adaptation and a
task-level curriculum to refine pseudo-labels, improving scalability and
generalization across datasets.
Next, we tackle automated report generation with PromptMRG, a novel
framework that enhances diagnostic accuracy by using disease classification
outputs as prompts for the text decoder. Cross-modal feature retrieval and
an adaptive loss function further improve performance, integrating prior
clinical knowledge and addressing the class imbalance problem, respectively.
To enable fine-grained clinical reasoning, we introduce Chain of Diagnosis
(CoD), a framework that integrates large language models (LLMs) for accurate
and explainable report generation. By simulating radiologist workflows
through diagnostic question-answering (QA) pairs, our method ensures
clinically accurate descriptions of diagnosed diseases and lesion attributes.
Moreover, a diagnosis grounding module aligns generated text with evidence
while a lesion grounding module localizes abnormalities for improved
workflow efficiency.
Collectively, this work bridges the gap between AI research and clinical
needs, delivering scalable, interpretable, and robust solutions for medical
image analysis. We conclude by outlining future directions, including a
generalist diagnostic model, multi-agent collaborative refinement, and a
universal multi-organ model, to further advance the field.
Date: Tuesday, 20 May 2025
Time: 2:00pm - 4:00pm
Venue: Room 2128B (Lift 19)
Chairman: Prof. David Chuen Chun LAM (MAE)
Committee Members: Dr. Hao CHEN (Supervisor)
Dr. Long CHEN
Dr. Xiaomin OUYANG
Dr. Terence Tsz Wai WONG (CBE)
Prof. Jin QIN (PolyU)