IMPROVISING HIP HOP LYRICS VIA TRANSDUCTION GRAMMAR INDUCTION
PhD Thesis Proposal Defence

Title: "IMPROVISING HIP HOP LYRICS VIA TRANSDUCTION GRAMMAR INDUCTION"

by

Mr. Karteek ADDANKI

Abstract:

Among the many genres of language that have been studied in computational linguistics and spoken language processing, there has been a dearth of work on lyrics in music, despite the major impact that this form of language has across almost all human cultures. In this proposal, we present theoretically motivated symbolic and deep learning models for improvising lyrics in music, choosing the genre of hip hop lyrics as our domain. Unlike most other approaches, all our models are completely unsupervised and do not make use of any linguistic or phonetic information. Through our work, we model the issues in song lyric improvisation using modern statistical language technologies and attempt to bridge the gap between language and music from a computational perspective.

We improvise hip hop lyrics by generating responses to challenges, similar to a freestyle rap battle, modeling it as a "transduction" problem in which the challenges must be translated into responses. We propose a novel hidden Markov model (HMM) based rhyme scheme detection module, which identifies the rhyme scheme within a stanza in a completely unsupervised fashion, and use it to select training data for our models so as to generate fluent and rhyming responses. We choose the framework of transduction grammars, in particular inversion transduction grammars (ITGs), as our transduction model, given their representational capacity and empirical performance across a spectrum of NLP tasks. We propose two symbolic models for improvisation based on (1) a token-based bracketing inversion transduction grammar, and (2) an interpolated grammar induced using bottom-up token-based rule induction and top-down rule segmentation strategies. We demonstrate that the interpolated grammar generates responses that are more fluent and rhyme better with the challenges according to human evaluators.
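The proposal's rhyme scheme detector is an unsupervised HMM; as a rough illustration of the underlying idea (labeling stanza lines so that similar-sounding endings share a letter), the following is a minimal greedy sketch, not the author's model. The `suffix_similarity` proxy and the `threshold` parameter are assumptions introduced purely for illustration; the actual module uses no such hand-set heuristic.

```python
def suffix_similarity(a, b, k=3):
    """Crude rhyme proxy (an illustrative assumption, not the thesis model):
    length of the common suffix within the last k characters, scaled to [0, 1].
    A real system would compare phonetic representations instead."""
    a, b = a.lower()[-k:], b.lower()[-k:]
    n = 0
    while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n / k

def label_rhyme_scheme(lines, threshold=0.6):
    """Greedy unsupervised labeling: each line receives the label of the
    first earlier line whose final word is similar enough, otherwise a
    fresh label. Returns a scheme string such as 'AABB'."""
    labels = []
    for i, line in enumerate(lines):
        last = line.split()[-1]
        for j in range(i):
            prev_last = lines[j].split()[-1]
            if suffix_similarity(last, prev_last) >= threshold:
                labels.append(labels[j])
                break
        else:
            labels.append(chr(ord('A') + len(set(labels))))
    return ''.join(labels)

stanza = [
    "I came to win the fight",
    "shining in the light",
    "money on my mind",
    "leave the rest behind",
]
print(label_rhyme_scheme(stanza))  # AABB
```

The HMM formulation in the proposal replaces this greedy pass with probabilistic inference over rhyme-label sequences, which lets evidence from the whole stanza disambiguate borderline line pairs.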
We also compare the performance of our models against the widely used off-the-shelf phrase-based SMT (PBSMT) model on the same task and show that both our models outperform the PBSMT baseline. We present similar results on Maghrebi French hip hop lyrics, demonstrating the language independence of our models.

We further present a novel deep learning improvisation model based on a fully bilingual generalization of the monolingual Recursive Auto-Associative Memory (RAAM), known as Transduction Recursive Auto-Associative Memory (TRAAM). TRAAM models learn soft, context-sensitive generalizations over the structural relationships between associated parts of challenge and response raps, while avoiding the exponential complexity costs that symbolic models would require. In TRAAM, feature vectors are learned simultaneously using context from both the challenge and the response, such that challenge-response association patterns with similar structure tend to have similar vectors. We argue that TRAAM models capture the context-sensitive generalizations that symbolic models fail to capture and will therefore produce better quality responses.

Lastly, we discuss the challenges in evaluating performance on the hip hop lyric improvisation task, as a first step toward designing robust evaluation strategies for improvisation tasks, a relatively neglected area to date. We discuss our observations regarding inter-evaluator agreement on judging improvisation quality as a means to better understand the high degree of subjectivity at play in improvisation tasks, thereby enabling the design of more discriminative evaluation strategies to drive future model development.

Date:  Tuesday, 16 June 2015
Time:  2:00pm - 4:00pm
Venue: Room 3584 (Lifts 27/28)

Committee Members: Prof. Dekai Wu (Supervisor)
                   Prof. Qiang Yang (Chairperson)
                   Prof. Dit-Yan Yeung
                   Prof. Pascale Fung (ECE)

**** ALL are Welcome ****