Corpus Pattern Analysis: New light on words and meanings

================================================================
                Joint Seminar
================================================================
The Hong Kong University of Science & Technology
Department of Computer Science and Engineering
Department of Electronic and Computer Engineering
Human Language Technology Center
----------------------------------------------------------------
Speaker:	Professor Patrick HANKS
		University of Wolverhampton /
		Charles University in Prague

Title:		"Corpus Pattern Analysis: New light on words
		 and meanings"

Date:		Monday, 1 November 2010

Time:		4:00pm - 5:00pm

Venue:		Lecture Theater F (near lifts 25/26), HKUST

Abstract:

This seminar presents a new technique for creating a new kind of lexical
resource. The analytic technique is called Corpus Pattern Analysis (CPA)
and the first product (work in progress; publicly available at
http://nlp.fi.muni.cz/projects/cpa) is called The Pattern Dictionary of
English Verbs. This work is associated with the development of a new
theory of meaningful linguistic behaviour, called the Theory of Norms and
Exploitations (TNE).

Traditionally, it was thought (by Leibniz and others) that the analysis of
meaning ought to proceed word by word, like a child building a toy house
with Lego bricks. This view of meaning is untenable. Corpus evidence shows
clearly that words in isolation do not have much meaning - and what they
have is not very precise. Words in isolation have only meaning potential.
In context, the meaning of a word becomes much more precise. How does this
work?

Each use of a word is generally associated with an underlying
phraseological pattern. Meanings are associated with patterns, not with
words (or rather, as well as with words).  Words are highly ambiguous, but
most patterns are unambiguous.  Hence, it is necessary to decide what
counts as a pattern.

CPA is a technique for identifying and storing meaningful patterns of word
use, drawing on prototype theory (Rosch), valencies (Halliday; Herbst),
and collocational analysis (Sinclair; Church and Hanks; Kilgarriff).
Collocations are grouped together according to their semantic type
(Pustejovsky). Mapping sentences in text onto CPA patterns usually yields
an unambiguous interpretation. Patterns discovered through CPA are stored
in a Pattern Dictionary. The lecture shows how a Pattern Dictionary can be
compiled and discusses some of the linguistic issues that arise.
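
As a rough illustration of the idea of mapping a clause onto a pattern, the
sketch below looks up a (subject, verb, object) triple in a toy pattern
dictionary. Everything in it is an assumption made for illustration: the
names SEMANTIC_TYPES, Pattern, PATTERN_DICTIONARY and interpret, the two
patterns for "fire", and the semantic type labels are invented here and are
not taken from the Pattern Dictionary of English Verbs.

```python
from dataclasses import dataclass

# Toy mapping from nouns to semantic types (hypothetical; a real resource
# would rely on a large ontology and corpus-derived collocate sets).
SEMANTIC_TYPES = {
    "manager": "Human",
    "soldier": "Human",
    "employee": "Human",
    "gun": "Firearm",
    "rifle": "Firearm",
}


@dataclass(frozen=True)
class Pattern:
    verb: str
    subject_type: str
    object_type: str
    implicature: str  # the meaning associated with this pattern


# Illustrative patterns for the verb "fire", invented for this sketch.
PATTERN_DICTIONARY = [
    Pattern("fire", "Human", "Firearm",
            "Human causes Firearm to discharge a projectile"),
    Pattern("fire", "Human", "Human",
            "Human dismisses Human from employment"),
]


def interpret(subject, verb, obj):
    """Return the implicatures of every pattern the triple fits."""
    subj_type = SEMANTIC_TYPES.get(subject)  # None if the noun is unknown
    obj_type = SEMANTIC_TYPES.get(obj)
    return [
        p.implicature
        for p in PATTERN_DICTIONARY
        if p.verb == verb
        and p.subject_type == subj_type
        and p.object_type == obj_type
    ]


print(interpret("soldier", "fire", "rifle"))
print(interpret("manager", "fire", "employee"))
```

In this toy setting the verb "fire" alone is ambiguous, but each triple fits
only one pattern, so each lookup returns a single interpretation. A noun
missing from the toy lexicon yields no match at all.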


*******************
Biography:

Patrick Hanks is a lexicographer and corpus linguist. He was chief editor
of current English dictionaries at Oxford University Press from 1990 to
2000. In the 1980s he was managing editor of the Cobuild project and chief
editor of Collins English dictionaries. He is author and co-author of many
papers on lexicography, lexical analysis, and similes and metaphor. He is
currently compiling a 6-volume collection of papers on lexicology in the
Routledge Critical Concepts series.

His main research interests are: (a) precision and vagueness in language;
(b) mapping meaning onto use: corpus-based syntagmatic analysis of lexical
regularities; (c) similes and metaphors: creative and innovative use of
language; (d) personal names: origin and history, convention and
creativity in naming.

From 2002 to 2006 he divided his time between Brandeis University in
Waltham, Massachusetts, and the Berlin-Brandenburg Academy of Sciences in
Berlin, Germany. He is currently Visiting Professor at the Research
Institute in Information and Language Processing, Department of Computer
Science, University of Wolverhampton, UK, as well as Visiting Professor at
the Institute of Formal and Applied Linguistics, Charles University in
Prague.