Fourth Workshop on Syntax and Structure in Statistical Translation (SSST-4)

COLING 2010 / SIGMT Workshop
28 August 2010, Beijing

The Fourth Workshop on Syntax and Structure in Statistical Translation (SSST-4) seeks to build on the foundations established in the first three SSST workshops, which brought together a large number of researchers working on diverse aspects of synchronous/transduction grammars (hereafter, S/TGs) in relation to statistical machine translation. Its program each year has comprised high-quality papers discussing current work spanning topics including: new grammatical models of translation; new learning methods for syntax-based models; using S/TGs for semantics and generation; syntax-based evaluation of machine translation; and formal properties of S/TGs. Presentations have led to productive and thought-provoking discussions, comparing and contrasting different approaches, and identifying the questions that are most pressing for future progress in this topic.

The need for structural mappings between languages is widely recognized in the fields of statistical machine translation and spoken language translation, and there is a growing consensus that these mappings are appropriately represented using a family of formalisms that includes synchronous/transduction grammars and their tree-transducer equivalents. To date, flat-structured models, such as the word-based IBM models of the early 1990s or the more recent phrase-based models, remain widely used. But tree-structured mappings arguably offer a much greater potential for learning valid generalizations about relationships between languages.

Within this area of research there is a rich diversity of approaches. There is active research ranging from formal properties of S/TGs to large-scale end-to-end systems. There are approaches that make heavy use of linguistic theory, and approaches that use little or none. There is theoretical work characterizing the expressiveness and complexity of particular formalisms, as well as empirical work assessing their modeling accuracy and descriptive adequacy across various language pairs. There is work being done to invent better translation models, and work to design better algorithms. Recent years have seen significant progress on all these fronts. In particular, systems based on these formalisms are now top contenders in MT evaluations.

We invite papers on:

syntax-based / tree-structured statistical translation models and language models
machine learning techniques for inducing structured translation models
algorithms for training, decoding, and scoring with S/TGs
empirical studies on adequacy and efficiency of formalisms
studies on the usefulness of syntactic resources for translation
formal properties of S/TGs
scalability of structured translation methods to small or large data
applications of S/TGs to related areas including:
- speech translation
- formal semantics and semantic parsing
- paraphrases and textual entailment
- information retrieval and extraction

For more information: http://www.cs.ust.hk/~dekai/ssst/

Invited Keynote

SSST-4 Keynote
Martin Kay

Open Tutorial

This year's SSST will begin with a FREE public tutorial introducing the fundamental concepts needed to understand current research on tree-structured and syntactic SMT.
ALL ARE WELCOME

Program

	New Parameterizations and Features for PSCFG-Based Machine Translation
		Andreas ZOLLMANN and Stephan VOGEL
	Source-side Syntactic Reordering Patterns with Functional Words for Improved Phrase-based SMT
		Jie JIANG, Jinhua DU and Andy WAY
	A Systematic Comparison between Inversion Transduction Grammar and Linear Transduction Grammar for Word Alignment
		Markus SAERS and Dekai WU
	Manipuri-English Bidirectional Statistical Machine Translation Systems using Morphology and Dependency Relations
		Thoudam Doren SINGH and Sivaji BANDYOPADHYAY
	HMM Word-to-Phrase Alignment with Dependency Constraints
		Yanjun MA and Andy WAY
	Semantic vs. Syntactic vs. N-gram Structure for Machine Translation Evaluation
		Chi-kiu LO and Dekai WU
	A Discriminative Approach for Dependency Based Statistical Machine Translation
		Sriram VENKATAPATHY, Rajeev SANGAL, Aravind JOSHI and Karthik GALI
	A Discriminative Syntactic Model for Source Permutation via Tree Transduction
		Maxim KHALILOV and Khalil SIMA'AN
	Deep Syntax Language Models and Statistical Machine Translation
		Yvette GRAHAM and Josef VAN GENABITH
	Syntactic Constraints on Phrase Extraction for Phrase-Based Machine Translation
		Hailong CAO, Andrew FINCH and Eiichiro SUMITA
	Intersecting Hierarchical and Phrase-Based Models of Translation: Formal Aspects and Algorithms
		Marc DYMETMAN and Nicola CANCEDDA
	Phrase Based Decoding using a Discriminative Model
		Prasanth KOLACHINA, Sriram VENKATAPATHY, Srinivas BANGALORE, Sudheer KOLACHINA and Avinesh PVS
	Improved Language Modeling for English-Persian Statistical Machine Translation
		Mahsa MOHAGHEGH, Abdolhossein SARRAFZADEH and Tom MOIR
	Seeding Statistical Machine Translation with Translation Memory Output through Tree-Based Structural Alignment
		Ventsislav ZHECHEV and Josef VAN GENABITH
	Arabic Morpho-syntactic feature disambiguation in translation context
		Ines Turki KHEMAKHEM, Salma JAMOUSSI and Abdelmajid BEN HAMADOU

Organizer

Dekai WU (Hong Kong University of Science and Technology)

Program Committee

Srinivas BANGALORE (AT&T Research)
Marine CARPUAT (Hong Kong University of Science and Technology)
David CHIANG (USC Information Sciences Institute)
Pascale FUNG (Hong Kong University of Science and Technology)
Daniel GILDEA (University of Rochester)
Dan KLEIN (University of California at Berkeley)
Kevin KNIGHT (USC Information Sciences Institute)
Jonas KUHN (Potsdam)
Yang LIU (ICT)
Yanjun MA (Dublin City University)
Daniel MARCU (USC Information Sciences Institute)
Yuji MATSUMOTO (Nara Institute of Science and Technology)
Hermann NEY (RWTH Aachen)
Owen RAMBOW (Columbia University)
Philip RESNIK (University of Maryland)
Stefan RIEZLER (Google)
Libin SHEN (BBN)
Christoph TILLMANN (IBM)
Stephan VOGEL (Carnegie Mellon University)
Taro WATANABE (NTT)
Andy WAY (Dublin City University)
Yuk-Wah WONG (Google)
Richard ZENS (Google)
Chengqing ZONG (CAS Institute of Automation)

Important Dates

Submission deadline: 2 Jul 2010
Notification to authors: 12 Jul 2010
Camera copy deadline: 19 Jul 2010

Submission

Papers will be accepted on or before 2 Jul 2010 in PDF or PostScript format via the START system at https://www.softconf.com/coling2010/SSST2010/. Submissions should follow the COLING 2010 length and formatting requirements for full papers, which allow eight (8) pages of content plus one (1) extra page for references; the requirements are found at http://www.coling-2010.org/SubmissionGuideline.htm.

Camera Copy

Camera-ready final versions will be accepted on or before 18 Jul 2010 in PDF or PostScript format via the START system at https://www.softconf.com/coling2010/SSST2010/. Papers should still follow the COLING 2010 length and formatting requirements for full papers, found at http://www.coling-2010.org/SubmissionGuideline.htm.

Contact

Please send inquiries to ssst@cs.ust.hk.

Last updated: 2010.07.21