On Faithful Citations in Retrieval Augmented Generation

The Hong Kong University of Science and Technology
Department of Computer Science and Engineering


MPhil Thesis Defence


Title: "On Faithful Citations in Retrieval Augmented Generation"

By

Mr. Yue LIU


Abstract:

Retrieval-augmented generation (RAG) is a prevalent strategy for mitigating
hallucinations in LLMs. Attributable RAG (RAGQ) generates quotes alongside
its answers. The quotes indicate which input contexts support each answer,
enhancing the answers' verifiability and trustworthiness. However, existing
RAGQ systems degrade significantly on questions that require multi-hop
reasoning and multi-modal understanding, suffering from over-citation,
failure to identify implicit entities, and poor generalization.

In this thesis, we propose a novel RAGQ framework, namely QDRAG. QDRAG
breaks the input question down into atomic subquestions to identify
implicit entities. A reranker then prunes context distractors to eliminate
downstream over-citation. To facilitate query decomposition (QD), we
propose two zero-shot approaches, QD-C and QD-R, which guide the QD
multimodal LLM (MLLM) to decompose the question based on context knowledge
and retrieval rewards, respectively. One interesting finding is that
finetuning on the QD task generalizes better than finetuning directly on
the downstream RAGQ task. Experiments on four multi-modal QA benchmarks
demonstrate QDRAG's efficacy in grounding answers and generating faithful
citations. The framework significantly outperforms all baselines on both
in-domain and out-of-domain tests, even surpassing Gemini-Pro.


Date:                   Tuesday, 16 December 2025

Time:                   2:00pm - 4:00pm

Venue:                  Room 2128B
                        Lift 22

Chairman:               Dr. Dan XU

Committee Members:      Prof. Xiaofang ZHOU (Supervisor)
                        Dr. May FUNG