From Brain to Speech: Analyzing the Trade-off Between Decoding Accuracy and Speech-Related Data

The Hong Kong University of Science and Technology
Department of Computer Science and Engineering

Final Year Thesis Oral Defense

Title: "From Brain to Speech: Analyzing the Trade-off Between Decoding 
Accuracy and Speech-Related Data"

by

YANG Dongjie

Abstract:

Brain-computer interfaces (BCIs) hold significant potential for restoring 
communication in individuals with aphasia, particularly in tonal languages 
like Mandarin Chinese. However, decoding such languages from neural signals 
remains challenging because of their phonetic complexity and reliance on pitch. 
This project investigates the trade-off between decoding accuracy and the 
incorporation of speech-related data in Chinese speech decoding. Three 
architectures were developed: the Brain Architecture (solely neural data), 
Brain-Articulation Architecture (partial speech features), and Brain-Audio 
Architecture (full audio integration). Results show that speech-related 
data improves decoding performance: accuracy on syllable initials and finals 
rose by 4-8% and 30-33%, respectively, when moving from the brain-only model 
to architectures that incorporate articulation or audio features. Tone decoding, 
however, achieved 44-52% accuracy even without speech data, suggesting robust 
neural encoding of pitch. Despite these gains, overall accuracy remains too 
low for practical use, and the results indicate that speech-derived features 
are needed to decode finer phonetic elements. Challenges such as limited datasets and 
overfitting underscore the need for future innovations in model design and 
data augmentation. This work advances understanding of neural speech decoding 
trade-offs and informs the development of tailored solutions for 
Chinese-speaking populations with aphasia.


Date            : 8 May 2025 (Thursday)

Time            : 10:00 - 10:40

Venue           : Room 2408 (near lifts 17/18), HKUST

Advisor         : Dr. MA Xiaojuan

2nd Reader      : Dr. CHAN Ki Cecia