Seventh Workshop on Syntax, Semantics and Structure in Statistical Translation (SSST-7)

NAACL HLT 2013 / SIGMT / SIGLEX Workshop
13 June 2013, Atlanta, GA

*** [NEW] Full program below ***

The Seventh Workshop on Syntax, Semantics and Structure in Statistical Translation (SSST-7) seeks to build on the foundations established in the first six SSST workshops, which brought together a large number of researchers working on diverse aspects of structure, semantics and representation in relation to statistical machine translation. Its program each year has comprised high-quality papers discussing current work spanning topics including: new grammatical models of translation; new learning methods for syntax- and semantics-based models; formal properties of synchronous/transduction grammars (hereafter S/TGs); discriminative training of models incorporating linguistic features; using S/TGs for semantics and generation; and syntax- and semantics-based evaluation of machine translation.

The need for structural mappings between languages is widely recognized in the fields of statistical machine translation and spoken language translation, and there is a growing consensus that these mappings are appropriately represented using a family of formalisms that includes synchronous/transduction grammars and their tree-transducer equivalents. To date, flat-structured models, such as the word-based IBM models of the early 1990s or the more recent phrase-based models, remain widely used. But tree-structured mappings arguably offer a much greater potential for learning valid generalizations about relationships between languages.

Within this area of research there is a rich diversity of approaches. There is active research ranging from formal properties of S/TGs to large-scale end-to-end systems. There are approaches that make heavy use of linguistic theory, and approaches that use little or none. There is theoretical work characterizing the expressiveness and complexity of particular formalisms, as well as empirical work assessing their modeling accuracy and descriptive adequacy across various language pairs. There is work being done to invent better translation models, and work to design better algorithms. Recent years have seen significant progress on all these fronts. In particular, systems based on these formalisms are now top contenders in MT evaluations.

At the same time, SMT has seen a movement toward semantics over the past few years, which has been reflected at recent SSST workshops, including the last two editions, which had semantics for SMT as a special theme. The issues of deep syntax and shallow semantics are closely linked, and SSST-7 continues to encourage submissions on semantics for MT in a number of directions, including semantic role labeling and sense disambiguation for translation and evaluation.

We invite papers on:

syntax-based / semantics-based / tree-structured SMT
machine learning techniques for inducing structured translation models
algorithms for training, decoding, and scoring with semantic representation structure
empirical studies on adequacy and efficiency of formalisms
creation and usefulness of syntactic/semantic resources for MT
formal properties of synchronous/transduction grammars
learning semantic information from monolingual, parallel or comparable corpora
unsupervised and semi-supervised word sense induction and disambiguation methods for MT
lexical substitution, word sense induction and disambiguation, semantic role labeling, textual entailment, paraphrase and other semantic tasks for MT
semantic features for MT models (word alignment, translation lexicons, language models, etc.)
evaluation of syntactic/semantic components within MT (task-based evaluation)
scalability of structured translation methods to small or large data
applications of S/TGs to related areas including:
- speech translation
- formal semantics and semantic parsing
- paraphrases and textual entailment
- information retrieval and extraction
syntactically- and semantically-motivated evaluation of MT

Program

9:15–9:30	Opening Remarks
	Session 1
9:30–10:00	A Semantic Evaluation of Machine Translation Lexical Choice Marine Carpuat
10:00–10:30	Taste of Two Different Flavours: Which Manipuri Script works better for English-Manipuri Language pair SMT Systems? Thoudam Doren Singh
10:30–11:00	Break
	Session 2
11:00–11:30	Hierarchical Alignment Decomposition Labels for Hiero Grammar Rules Gideon Maillette de Buy Wenniger and Khalil Sima’an
11:30–12:00	A Performance Study of Cube Pruning for Large-Scale Hierarchical Machine Translation Matthias Huck, David Vilar, Markus Freitag and Hermann Ney
12:00–12:30	Combining Word Reordering Methods on different Linguistic Abstraction Levels for Statistical Machine Translation Teresa Herrmann, Jan Niehues and Alex Waibel
12:30–2:00	Lunch
	Session 3
2:00–3:00	Panel discussion: Meaning Representations for Machine Translation Jan Hajic (Charles University), Kevin Knight (University of Southern California / Information Sciences Institute), Martha Palmer (University of Colorado Boulder), Dekai Wu (Hong Kong University of Science and Technology)
3:00–3:30	Combining Top-down and Bottom-up Search for Unsupervised Induction of Transduction Grammars Markus Saers, Karteek Addanki and Dekai Wu
3:30–4:00	Break
	Session 4
4:00–4:30	A Formal Characterization of Parsing Word Alignments by Synchronous Grammars with Empirical Evidence to the ITG Hypothesis. Gideon Maillette de Buy Wenniger and Khalil Sima’an
4:30–5:00	Synchronous Linear Context-Free Rewriting Systems for Machine Translation Miriam Kaeshammer

Organizers

Marine CARPUAT, National Research Council (NRC), Canada
Lucia SPECIA, University of Sheffield, UK
Dekai WU, Hong Kong University of Science and Technology (HKUST), Hong Kong

Important Dates

Submission deadline: 15 Mar 2013
Notification to authors: 2 Apr 2013
Camera copy deadline: 12 Apr 2013

Submission

Papers will be accepted on or before 15 Mar 2013 in PDF or PostScript format via the START system at https://www.softconf.com/naacl2013/SSST-7/. Submissions should follow the NAACL HLT 2013 length and formatting requirements for long papers: eight (8) pages of content plus two (2) additional pages of references. The requirements are available at http://naacl2013.naacl.org/CFP.aspx.

Camera Copy

Camera-ready final versions will be accepted on or before 12 Apr 2013 in PDF or PostScript format via the START system at https://www.softconf.com/naacl2013/SSST-7/. Papers should follow the NAACL HLT 2013 camera-ready length and formatting requirements for long papers: eight (8) pages of content plus two (2) additional pages of references. The requirements are available at http://naacl2013.naacl.org/CFP.aspx.

Contact

Please send inquiries to ssst@cs.ust.hk.

Last updated: 2013.05.09