Bilingual Category Induction with Recursive Neural Network
The Hong Kong University of Science and Technology
Department of Computer Science and Engineering

PhD Thesis Defence

Title: "Bilingual Category Induction with Recursive Neural Network"

By

Mr. Yuchen YAN

Abstract:

Inversion Transduction Grammar (ITG) is capable of symbolically representing a transduction relationship with far more explanatory power than mainstream neural network approaches. However, inducing an ITG from a parallel corpus remains a challenging problem, whose main bottleneck is category induction. The chief obstacle is that the space of possible ITG rules grows exponentially with the number of categories, making symbolic search infeasible. Later, Transductive Recursive Auto-Associative Memory (TRAAM) showed that useful categories can be hinted with the help of a recursive neural network, but the quality of the resulting categories is far from ideal. We observe that TRAAM suffers from the vanishing gradient problem, and that its self-reconstruction training objective does not align with contextually appropriate bilingual categorization. For the vanishing gradient problem, mainstream remedies such as LSTM, GRU, skip connections, and Gated Linear Units either require sequential input topologies or have an unbounded output range, and thus cannot be applied to biparse trees. We therefore design our own recursive network architecture. For the training objective, we argue that context-aware objectives suit the task better than self-reconstruction. Among common context-aware objectives, sibling token prediction generalizes to biparse trees more readily than the autoregressive objective used in GPT or the next sentence prediction objective used in BERT. Even with our improved method for hinting bilingual categories, inducing a categorized ITG remains challenging: TRAAM-based bilingual category hinting relies on high-quality biparse tree topologies, but high-quality biparse tree topologies rely on a high-quality categorized ITG.
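The architectural constraint mentioned above (bounded outputs over arbitrary tree topologies, rather than sequences) can be illustrated with a minimal sketch. This is a hypothetical toy cell for illustration only, not the architecture defended in the thesis: a tanh-squashed composition keeps every parent embedding in (-1, 1), so magnitudes cannot blow up no matter how deep the (bi)parse tree grows.

```python
import numpy as np

# Illustrative recursive composition cell (hypothetical sketch; the thesis's
# actual architecture is not specified in this announcement).
DIM = 8
rng = np.random.default_rng(0)
W_left = rng.normal(scale=0.1, size=(DIM, DIM))
W_right = rng.normal(scale=0.1, size=(DIM, DIM))
b = np.zeros(DIM)

def compose(left, right):
    """Combine two child embeddings into a parent embedding.
    tanh bounds every component in (-1, 1)."""
    return np.tanh(W_left @ left + W_right @ right + b)

def embed_tree(tree, leaf_vecs):
    """Bottom-up embedding of a binary parse tree.
    `tree` is either a leaf index (int) or a (left, right) pair of subtrees."""
    if isinstance(tree, int):
        return leaf_vecs[tree]
    left, right = tree
    return compose(embed_tree(left, leaf_vecs), embed_tree(right, leaf_vecs))

leaves = rng.normal(size=(4, DIM))
root = embed_tree(((0, 1), (2, 3)), leaves)
assert np.all(np.abs(root) < 1.0)  # bounded at every internal node
```

Because the cell consumes children by tree position rather than by sequence order, it applies to any binary tree topology, which is the property sequential architectures like LSTM lack.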
To solve this chicken-and-egg problem, we introduce what we call a feedback training pipeline, which co-trains the categorized ITG and our improved TRAAM network simultaneously. On the implementation side, mainstream deep learning frameworks such as TensorFlow and PyTorch lack the flexibility to run our feedback training pipeline, which requires integrating a symbolic ITG biparsing algorithm into the training loop. We therefore also develop our own deep learning framework in C++.

Date: Monday, 21 August 2023
Time: 2:00pm - 4:00pm
Venue: Room 3494
Lifts 25/26

Chairman: Prof. Amy DALTON (MARK)
Committee Members: Prof. Dekai WU (Supervisor)
Prof. Andrew HORNER
Prof. Nevin ZHANG
Prof. Tao LIU (PHYS)
Prof. Lei SHA (Beihang University)

**** ALL are Welcome ****