The Hong Kong University of Science and Technology
Department of Computer Science and Engineering
PhD Thesis Defence
Title: "Bilingual Category Induction with Recursive Neural Network"
By
Mr. Yuchen YAN
Abstract:
Inversion Transduction Grammars (ITGs) can symbolically represent a
transduction relationship with far more explanatory power than mainstream
neural network approaches. However, inducing an ITG from a parallel corpus
remains a challenging problem, and its main bottleneck is category induction:
the space of possible ITG rules grows exponentially with the number of
categories, making symbolic search infeasible.
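To make the scale concrete (a back-of-envelope count assuming rules in binary
normal form, not a figure from the thesis): with $K$ categories, a binary ITG
rule chooses a left-hand category and an ordered pair of right-hand categories
in either straight or inverted orientation, so
\[
\#\text{candidate rules} = 2K^{3}, \qquad
\#\text{candidate rule sets} = 2^{2K^{3}},
\]
and already at $K = 10$ a symbolic search over rule sets faces on the order of
$2^{2000}$ grammars.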
Later, Transductive Recursive Auto-Associative Memory (TRAAM) showed that a
recursive neural network can help hint at useful categories, but the quality
of the produced categories is far from ideal. We observe that TRAAM suffers
from the vanishing gradient problem, and that its self-reconstruction training
objective does not align with contextually appropriate bilingual
categorization. For the vanishing gradient problem, mainstream remedies such
as LSTMs, GRUs, skip connections, and Gated Linear Units either require
sequential input topologies or have an unbounded output range, and thus cannot
be used on biparse trees, so we design our own recursive network architecture
(a minimal sketch follows this paragraph). For the training objective, we
argue that context-aware objectives suit the task better than
self-reconstruction objectives; among common context-aware objectives, sibling
token prediction generalizes to biparse trees more easily than the
autoregressive objective used in GPT or the next-sentence prediction objective
used in BERT.
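As a minimal sketch of the bounded-output requirement above (the abstract does
not spell out the actual architecture, so the dimension and weight names below
are illustrative assumptions of ours): a bottom-up composer over binary
biparse-tree nodes whose tanh activation keeps every node embedding inside
(-1, 1), at any tree depth.

    #include <cmath>
    #include <memory>
    #include <vector>

    constexpr int D = 8;              // illustrative embedding dimension
    using Vec = std::vector<float>;   // length D
    using Mat = std::vector<Vec>;     // D x D

    // A biparse-tree node: leaves carry lexical embeddings; internal
    // nodes get their embedding by composing their two children.
    struct Node {
        std::unique_ptr<Node> left, right;   // both null for a leaf
        Vec emb = Vec(D, 0.0f);
    };

    static Vec matvec(const Mat& W, const Vec& x) {
        Vec y(D, 0.0f);
        for (int i = 0; i < D; ++i)
            for (int j = 0; j < D; ++j)
                y[i] += W[i][j] * x[j];
        return y;
    }

    // Bottom-up composition: tanh bounds every node embedding in (-1, 1),
    // so the output range stays fixed regardless of tree topology or depth.
    Vec compose(Node& n, const Mat& Wl, const Mat& Wr, const Vec& b) {
        if (!n.left) return n.emb;           // leaf: embedding from lexicon
        const Vec l = compose(*n.left,  Wl, Wr, b);
        const Vec r = compose(*n.right, Wl, Wr, b);
        const Vec zl = matvec(Wl, l), zr = matvec(Wr, r);
        for (int i = 0; i < D; ++i)
            n.emb[i] = std::tanh(zl[i] + zr[i] + b[i]);
        return n.emb;
    }

Because the composition is defined per node rather than per time step, the
same unit applies to any biparse-tree topology, which is exactly what the
sequential gating methods above cannot offer.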
Even with our improved method for hinting bilingual categories, inducing a
categorized ITG remains challenging: TRAAM-based category hinting relies on
high-quality biparse tree topologies, while high-quality biparse tree
topologies in turn rely on a high-quality categorized ITG. To resolve this
chicken-and-egg problem, we introduce a feedback training pipeline that
co-trains the categorized ITG and our improved TRAAM network simultaneously
(sketched below).
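A skeleton of how such a feedback loop could be organized (every type and
function name here is a placeholder of ours, not the thesis's actual API):

    #include <vector>

    struct Sentence {};        // a source/target sentence pair
    struct BiparseTree {};     // biparse-tree topology over a sentence pair
    struct CategorizedITG {};  // symbolic ITG with induced categories
    struct TraamNet {};        // the improved TRAAM-style recursive network

    // Stubs standing in for substantial components.
    BiparseTree biparse(const CategorizedITG&, const Sentence&) { return {}; }
    void train_step(TraamNet&, const BiparseTree&) {}  // e.g. sibling tokens
    CategorizedITG reestimate(const TraamNet&,
                              const std::vector<BiparseTree>&) { return {}; }

    // One co-training round per epoch: the grammar yields tree topologies,
    // the network trains on them, and the network's hinted categories feed
    // back into the grammar, breaking the chicken-and-egg dependency.
    void feedback_train(CategorizedITG& itg, TraamNet& net,
                        const std::vector<Sentence>& corpus, int epochs) {
        for (int e = 0; e < epochs; ++e) {
            std::vector<BiparseTree> trees;
            for (const Sentence& s : corpus)
                trees.push_back(biparse(itg, s));   // symbolic biparsing step
            for (const BiparseTree& t : trees)
                train_step(net, t);                 // neural training step
            itg = reestimate(net, trees);           // category feedback step
        }
    }

The symbolic biparse call inside the training loop is what the next paragraph
argues mainstream frameworks cannot accommodate flexibly.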
On the implementation side, mainstream deep learning frameworks such as
TensorFlow and PyTorch lack the flexibility to run our feedback training
pipeline, which requires integrating a symbolic ITG biparsing algorithm into
the training loop, so we also develop our own deep learning framework in C++.
Date: Monday, 21 August 2023
Time: 2:00pm - 4:00pm
Venue: Room 3494
Lifts 25/26
Chairman: Prof. Amy DALTON (MARK)
Committee Members: Prof. Dekai WU (Supervisor)
Prof. Andrew HORNER
Prof. Nevin ZHANG
Prof. Tao LIU (PHYS)
Prof. Lei SHA (Beihang University)
**** ALL are Welcome ****