PhD Thesis Proposal Defence
Title: "IMPROVISING HIP HOP LYRICS VIA TRANSDUCTION GRAMMAR INDUCTION"
by
Mr. Karteek ADDANKI
Abstract:
Among the many genres of language that have been studied in computational
linguistics and spoken language processing, there has been a dearth of work on
lyrics in music, despite the major impact that this form of language has across
almost all human cultures. In this proposal, we present theoretically motivated
symbolic and deep learning models for improvising lyrics in music, choosing the
genre of hip hop lyrics as our domain. Unlike most other approaches, all
our models are completely unsupervised and do not make use of any linguistic or
phonetic information. Through our work, we model the issues in song lyric
improvisation using modern statistical language technologies and attempt to
bridge the gap between language and music from a computational perspective.
We improvise hip hop lyrics by generating responses to challenges similar to a
freestyle rap battle, modeling it as a "transduction" problem in which each
challenge must be translated into a response. We propose a novel hidden Markov
model (HMM) based rhyme scheme detection module that identifies the rhyme
scheme within a stanza in a completely unsupervised fashion, and we use it to
select training data for our models so as to generate fluent and rhyming
responses. We choose the framework of transduction grammars, in particular
inversion transduction grammars (ITGs), as our transduction model given their
representational capacity and empirical performance across a spectrum of NLP
tasks. We propose two symbolic models for improvisation based on (1) a
token-based bracketing inversion transduction grammar, and (2) an interpolated
grammar induced using bottom-up token-based rule induction and top-down rule
segmentation strategies. We demonstrate that the interpolated grammar generates
responses that are more fluent and rhyme better with the challenges according
to human evaluators. We also compare the performance of our models against the
widely used off-the-shelf phrase-based SMT (PBSMT) model on the same task and
show that both our models outperform the PBSMT baseline. We also present
similar results on Maghrebi French hip hop lyrics, demonstrating the language
independence of our models.
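To give a flavor of unsupervised rhyme scheme detection, the toy sketch below
labels the lines of a stanza with a rhyme scheme by exhaustively scoring all
candidate labellings, rewarding high line-ending similarity between
same-labelled lines. This is only an illustration of the task, not the thesis's
HMM model: the brute-force search, the two-label alphabet, and the crude
shared-suffix similarity measure (a stand-in for phonetic rhyme) are all
simplifying assumptions made here.

```python
from itertools import product

def ending_similarity(a, b):
    """Crude rhyme proxy: length of the shared orthographic suffix of
    the final words of two lines (a stand-in for phonetic similarity)."""
    a, b = a.split()[-1].lower(), b.split()[-1].lower()
    n = 0
    while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n

def detect_scheme(lines, labels="AB", tau=1):
    """Brute-force search over rhyme-scheme labellings of a stanza.
    Same-labelled line pairs are rewarded by similarity (minus a small
    penalty tau so non-rhyming lines prefer distinct labels); pairs with
    different labels are penalised for rhyming."""
    best, best_score = None, float("-inf")
    for scheme in product(labels, repeat=len(lines)):
        score = 0
        for i in range(len(lines)):
            for j in range(i + 1, len(lines)):
                sim = ending_similarity(lines[i], lines[j])
                score += (sim - tau) if scheme[i] == scheme[j] else -sim
        if score > best_score:
            best, best_score = "".join(scheme), score
    return best

stanza = [
    "I keep the flow tight",
    "rhymes burning bright",
    "never gonna stop",
    "straight to the top",
]
print(detect_scheme(stanza))  # AABB
```

The exponential enumeration is only feasible for short stanzas; casting the
labelling as a sequence model, as in the proposal, avoids that cost.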
We also present a novel deep learning improvisation model based on a fully
bilingual generalization of the monolingual Recursive Auto-Associative Memory
(RAAM) known as Transduction Recursive Auto-Associative Memory (TRAAM). TRAAM
models learn soft, context-sensitive generalizations over the structural
relationships between associated parts of challenge and response raps, while
avoiding the exponential complexity costs that symbolic models would require.
In TRAAM, feature vectors are learned simultaneously using context from both
the challenge and the response, such that challenge-response association
patterns with similar structure tend to have similar vectors. We argue that the
TRAAM models capture the context-sensitive generalizations that symbolic models
fail to capture and will therefore produce better quality responses.
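The compose-and-decompose skeleton shared by RAAM-style models can be sketched
as follows. This is a minimal structural illustration only, with untrained
random weights and an arbitrary vector dimension; the actual TRAAM training
procedure and bilingual constituent structure are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8  # dimensionality of every constituent vector (illustrative choice)

# Encoder/decoder weights; in a trained model these are learned so that
# decode(encode(l, r)) approximately reconstructs the children (l, r).
W_enc = rng.standard_normal((D, 2 * D)) * 0.1
W_dec = rng.standard_normal((2 * D, D)) * 0.1

def encode(left, right):
    """Compress two child constituent vectors (e.g. associated
    challenge-side and response-side parts) into one fixed-size
    parent vector of the same dimensionality."""
    return np.tanh(W_enc @ np.concatenate([left, right]))

def decode(parent):
    """Unfold a parent vector back into approximate child vectors."""
    out = np.tanh(W_dec @ parent)
    return out[:D], out[D:]

challenge_vec = rng.standard_normal(D)
response_vec = rng.standard_normal(D)
parent = encode(challenge_vec, response_vec)
left, right = decode(parent)
print(parent.shape, left.shape, right.shape)  # (8,) (8,) (8,)
```

Because the parent vector has the same size as each child, encoding can be
applied recursively up a tree, which is what lets structurally similar
challenge-response association patterns end up with similar vectors.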
Lastly, we discuss the challenges of evaluating performance on the hip hop
lyric improvisation task, as a first step toward designing robust evaluation
strategies for improvisation tasks, a relatively
neglected area to date. We discuss our observations regarding inter-evaluator
agreement on judging improvisation quality as a means to better understand the
high degree of subjectivity at play in improvisation tasks, thereby enabling
the design of more discriminative evaluation strategies to drive future model
development.
Date: Tuesday, 16 June 2015
Time: 2:00pm - 4:00pm
Venue: Room 3584 (lifts 27/28)
Committee Members: Prof. Dekai Wu (Supervisor)
Prof. Qiang Yang (Chairperson)
Prof. Dit-Yan Yeung
Prof. Pascale Fung (ECE)
**** ALL are Welcome ****