Corpus Pattern Analysis: New light on words and meanings
================================================================
                         Joint Seminar
================================================================
The Hong Kong University of Science & Technology

Department of Computer Science and Engineering
Department of Electronic and Computer Engineering
Human Language Technology Center
----------------------------------------------------------------

Speaker:        Professor Patrick HANKS
                University of Wolverhampton /
                Charles University in Prague

Title:          "Corpus Pattern Analysis: New light on words and meanings"

Date:           Monday, 1 November 2010
Time:           4:00pm - 5:00pm
Venue:          Lecture Theater F (near lifts 25/26), HKUST

Abstract:

This seminar presents a new technique for creating a new kind of lexical resource. The analytic technique is called Corpus Pattern Analysis (CPA) and the first product (work in progress; publicly available at http://nlp.fi.muni.cz/projects/cpa) is called The Pattern Dictionary of English Verbs. This work is associated with the development of a new theory of meaningful linguistic behaviour, called the Theory of Norms and Exploitations (TNE).

Traditionally, it was thought (by Leibniz and others) that the analysis of meaning ought to proceed word by word, like a child building a toy house with Lego bricks. This view of meaning is untenable. Corpus evidence shows clearly that words in isolation do not have much meaning - and what they have is not very precise. Words in isolation have only meaning potential. In context, the meaning of a word becomes much more precise.

How does this work? Each use of a word is generally associated with an underlying phraseological pattern. Meanings are associated with patterns, not with words (or rather, as well as with words). Words are highly ambiguous, but most patterns are unambiguous. Hence, it is necessary to decide what counts as a pattern.

CPA is a technique for identifying and storing meaningful patterns of word use, drawing on prototype theory (Rosch), valencies (Halliday; Herbst), and collocational analysis (Sinclair; Church and Hanks; Kilgarriff). Collocations are grouped together according to their semantic type (Pustejovsky). Mapping sentences in text onto CPA patterns usually yields an unambiguous interpretation. Patterns discovered through CPA are stored in a Pattern Dictionary. The lecture shows how a Pattern Dictionary can be compiled and discusses some of the linguistic issues that arise.

*******************

Biography:

Patrick Hanks is a lexicographer and corpus linguist. He was chief editor of current English dictionaries at Oxford University Press from 1990 to 2000. In the 1980s he was managing editor of the Cobuild project and chief editor of Collins English dictionaries. He is author and co-author of many papers on lexicography, lexical analysis, and similes and metaphor. He is currently compiling a 6-volume collection of papers on lexicology in the Routledge Creative Concepts series.

His main research interests are:
(a) precision and vagueness in language;
(b) mapping meaning onto use: corpus-based syntagmatic analysis of lexical regularities;
(c) similes and metaphors: creative and innovative use of language;
(d) personal names: origin and history, convention and creativity in naming.

From 2002 to 2006 he divided his time between Brandeis University in Waltham, Massachusetts, and the Berlin-Brandenburg Academy of Sciences in Berlin, Germany. He is currently Visiting Professor at the Research Institute in Information and Language Processing, Department of Computer Science, University of Wolverhampton, UK, as well as Visiting Professor at the Institute of Formal and Applied Linguistics, Charles University in Prague.