The Hong Kong University of Science and Technology
Department of Computer Science and Engineering


MPhil Thesis Defence


Title: "Enabling Explainable AI with Transformer Models: Opportunities and
Limitations in Visual and Textual Concept Generation"

By

Mr. Ao SUN


Abstract:

Recent advances in neural networks have driven remarkable progress in computer
vision (CV), yet their opaque decision-making has fueled growing interest in
Explainable AI (XAI). Conventional explanation methods often depend on linear
segmentation and manual annotation, which limits both understandability and
scalability. Meanwhile, breakthroughs in transformer-based Large Language
Models (LLMs) and Vision Language Models (VLMs) offer new opportunities to
produce automated, high-quality concept explanations. In this thesis, we
investigate how such models can serve as enablers of XAI to enhance the
generation of visual and textual concepts, and we further assess their
limitations in fulfilling this role effectively.

For visual concepts, we propose the Explain Any Concept (EAC) framework,
which leverages the Segment Anything Model (SAM) to faithfully identify the
human-understandable image regions that influence a target model's
predictions.
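
As a rough illustration of this visual-concept idea (a minimal sketch, not
the EAC implementation itself, which attributes importance more carefully),
the code below scores SAM-style region masks by simple occlusion; the names
model, image, masks, and fill are hypothetical placeholders.

    import numpy as np

    def concept_importance(model, image, masks, target, fill=0.0):
        """Score each proposed region by how much ablating it changes the
        target-class probability.
        model: callable mapping an (H, W, 3) array to class probabilities;
        masks: boolean (H, W) arrays, e.g. SAM's automatic mask proposals."""
        p_full = model(image)[target]       # confidence on the intact image
        scores = []
        for m in masks:
            occluded = image.copy()
            occluded[m] = fill              # ablate one candidate concept
            scores.append(p_full - model(occluded)[target])
        return np.array(scores)             # larger drop = more influential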

For textual concepts, we introduce the Hierarchical-Concept Bottleneck Model
(Hi-CBM), which leverages LLMs to automatically generate conceptual
annotations that are both richly informative and well structured. The
richness of these concepts allows any CV model to reason more faithfully over
an image before making a final decision, while their structured organization
filters out redundant information and mitigates the information-leakage
issues of traditional CBMs.
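
The bottleneck structure can be pictured as follows. This is a generic
concept-bottleneck forward pass under assumed names (backbone, groups), not
Hi-CBM's actual architecture; the per-group gating merely illustrates how a
hierarchy can filter redundant concepts.

    import torch
    import torch.nn as nn

    class ConceptBottleneck(nn.Module):
        """The classifier sees only named concept activations, so every
        prediction can be read off the concepts it used."""
        def __init__(self, backbone, n_concepts, n_classes, groups):
            super().__init__()
            self.backbone = backbone             # image -> feature vector
            self.to_concepts = nn.LazyLinear(n_concepts)
            self.classifier = nn.Linear(n_concepts, n_classes)
            self.groups = groups  # lists of concept indices, assumed to
                                  # partition the concepts by parent node

        def forward(self, x):
            c = torch.sigmoid(self.to_concepts(self.backbone(x)))
            gated = torch.zeros_like(c)
            # keep only the strongest concept per group, suppressing the
            # redundant activations that let labels leak through the
            # bottleneck
            for g in self.groups:
                sub = c[:, g]
                top = sub.argmax(dim=1, keepdim=True)
                gated[:, g] = sub * torch.zeros_like(sub).scatter_(1, top, 1.0)
            return self.classifier(gated), gated  # logits + readable concepts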

To assess the limitations of transformer-based models in generating reliable
explanations, we propose the Fast and Slow Effect (FSE) framework. FSE
assesses a single model's ability to generate effective concepts by comparing
its classification performance in two modes: a fast mode, which simulates a
black box making direct predictions without rationales, and a slow mode,
which simulates an interpretable expert that reasons over conceptual
evidence. On specialized datasets, slow mode underperforms fast mode by about
30%, whereas on general-purpose datasets it outperforms fast mode by roughly
10%.
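
The protocol amounts to a small evaluation harness of the following shape
(the predict_fast and predict_slow callables and the dataset format are
placeholders; the thesis's actual prompting and datasets are not shown):

    def accuracy(predict, dataset):
        """dataset: iterable of (input, label) pairs; predict: input -> label."""
        pairs = list(dataset)
        return sum(predict(x) == y for x, y in pairs) / len(pairs)

    def fse_compare(predict_fast, predict_slow, dataset):
        """Query the same model in two modes and report the accuracy gap:
        fast = direct black-box answers, slow = answers that must first
        pass through generated concepts."""
        fast = accuracy(predict_fast, dataset)
        slow = accuracy(predict_slow, dataset)
        return fast, slow, slow - fast  # negative gap: concepts hurt here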

Overall, this thesis lays a strong foundation for transformer models to serve
as enablers of XAI, and highlights their ability to generate understandable
concepts for specialized tasks as a critical frontier for future research.


Date:                   Wednesday, 8 October 2025

Time:                   2:00pm - 4:00pm

Venue:                  Room 5501
                        Lifts 25-26

Chairman:               Dr. Dan XU

Committee Members:      Dr. Shuai WANG (Supervisor)
                        Prof. Long QUAN