First Call for Participation

Fifth Workshop on Very Large Corpora (WVLC-5)

First session:
Tsinghua University
Beijing, China
August 18, 1997
(In conjunction with JSCL'97: the Fourth Joint Symposium of
Computational Linguistics of China)

Second session:
Hong Kong University of Science and Technology (HKUST)
Hong Kong, China
August 20, 1997

Note: The regular registration deadline is July 25, 1997.

All of the information below as well as additional information on the
workshop will be available soon at:

Contents:
 - Workshop Overview
 - Workshop Program
 - Program Committee and Sponsors
 - Travel and Accommodations Information
 - Sightseeing and Entertainment
 - Extended Travel in China with Reasonable Costs
 - Registration Form

--------------- Workshop Overview ---------------

This workshop, like the preceding ones in the series, will offer an
international forum for the presentation of new advances and
applications in the area of large-scale, corpus-based natural language
processing. The fifth workshop will focus on the theme of:

    Innovative and practical uses of large corpora
    in real-world applications

Gigabytes and terabytes of on-line unrestricted natural language text
have become commonplace today. How are these resources actually being
used in commercial as well as research applications? What robust and
efficient techniques exist for analyzing and organizing these
resources? The workshop encourages contributions that demonstrate
innovative applications of corpus-based NLP to problems of practical
commercial importance. The theme will provide an organizing structure
to the workshop and offer a focus for discussion and debate between
academic researchers and industrial practitioners.

The program committee has selected a diverse set of technical papers
in the areas of statistical and corpus-based NLP, including (but not
limited to):

Text Analysis Techniques
 - part of speech tagging
 - term and name identification
 - morphological analysis
 - robust parsing
 - alignment of parallel texts and bilingual terminology
 - sense disambiguation
 - anaphora resolution
 - event categorization
 - discourse structure

Applications
 - information retrieval/extraction
 - text categorization and summarization
 - lexicography
 - machine translation
 - spelling and grammar correction
 - recognition: speech, OCR, handwriting, etc.

Each session of the workshop will feature two invited talks and a
panel discussion (the topic will be announced soon). For the Beijing
session, the invited speakers are:

    Mitch Marcus, ACL President and Chairman of the Department of
    Computer and Information Science, University of Pennsylvania

    John Rausch, Chief Technologist, LEXIS-NEXIS, a Division of
    Reed Elsevier

The invited speakers for Hong Kong will be announced shortly.

---------------- Workshop Program -----------------

(The final program is currently under preparation. Below is a list of
the papers to be presented at both sites. Please note that the paper
presentations will be divided between the two sites, Beijing and Hong
Kong. The poster presentations will be held in Beijing only.)

PAPER PRESENTATIONS (19):

Domain-Specific Semantic Class Disambiguation using WordNet
    Li Shiuan Peh and Hwee Tou Ng

Acquiring German Prepositional Subcategorization Frames from Corpora
    Erika F. de Lima

A Natural Language Correction Model for Continuous Speech Recognition.
    Tomek Strzalkowski and Ron Brandow

A Self-Organizing Japanese Word Segmenter using Heuristic Word
Identification and Re-estimation
    Masaaki Nagata

Using Word Frequency Lists to Measure Homogeneity and Similarity
Between Corpora
    Adam Kilgarriff

Knowledge Acquisition : Classification of terms in a thesaurus from a
corpus
    Sta Jean-David

Corpus Based Statistical Generalization Tree in Rule Optimization
    Joyce Yue Chai and Alan W. Biermann

Collocation Lattices and Maximum Entropy Models
    Andrei Mikheev

Data Reliability and its Effects on Automatic Abstracting
    Tadashi Nomoto and Yuji Matsumoto

Probabilistic Parsing of Unrestricted English Text, With A
Highly-Detailed Grammar
    E. Black, S. Eubank and H. Kashioka

The effects of corpus size and homogeneity on language model quality
    T. Rose, N. Haddock, R. Tucker

Statistical Acquisition of Terminology Dictionary
    Xuan-jing Huang, Li-de Wu and Wen-xin Wang

Finding Terminology Translations from Non-Parallel Corpora
    Pascale Fung and Kathy McKeown

A Statistics-based Chinese Parser
    Qiang Zhou

Automatic identification of zero pronouns and their antecedents within
aligned sentence pairs
    Hiromi Nakaiwa

Grammar Acquisition based on Clustering Analysis and its Application
to Statistical parsing
    Thanaruk Theeramunkong and Manabu Okumura

Clustering Co-occurrence Graph Based on Transitivity
    Kumiko TANAKA-Ishii and Hideya IWASAKI

Corpus-based PP Attachment Ambiguity Resolution with a Semantic
Dictionary
    Jiri Stetina and Makoto Nagao

Reestimation and Best First Parsing Algorithms for Probabilistic
Dependency Grammar
    Seungmi Lee and Key-Sun Choi

POSTER PRESENTATIONS (5): (in Beijing only)

Maximum Entropy Model Learning of Subcategorization Preference
    Takehito Utsuro, Takashi Miyata and Yuji Matsumoto

Identifying Unknown Lexical Items using Morphological and Syntactic
Information using the TIMIT Corpus.
    Scott M. Thede and Mary Harper

LG-based Approach to Recognizing proper names in Korean
    Jee-sun Nam and Key-sun Choi

A Statistical Approach to Thai Morphological Analyzer
    Asanee Kawtrakul, Chalatip Thumkanon

Probabilistic Word Classification Based on Context-Sensitive Binary
Tree Method
    Jun Gao and Xi-Xian Chen

--------------- Program Committee and Sponsors ---------------

PROGRAM CHAIRS:

    Huang Changning - Tsinghua University (Beijing, China)
    Kenneth Church - AT&T Laboratories (Murray Hill, NJ, USA)
    Joe Zhou - LEXIS-NEXIS (Dayton, OH, USA)

LOCAL ARRANGEMENTS:

    For the Beijing session:
    Jai Peifa, the State Key Laboratory of Intelligent Technology and
    Systems, China

    For the Hong Kong session:
    Dekai Wu, Hong Kong University of Science and Technology

PROGRAM COMMITTEE:

    Susan Armstrong, ISSO, University of Geneva, Switzerland
    Key-sun Choi, KAIST, Korea
    Ido Dagan, Bar Ilan University, Israel
    Pernilla Danielsson, University of Gothenburg, Sweden
    Marti Hearst, Xerox Research Park, USA
    Chu-ren Huang, Academia Sinica, Taiwan
    Claudia Leacock, Princeton University, USA
    Sun Maosong, Tsinghua University, China
    Masaaki Nagata, NTT Information and Communication Systems Labs, Japan
    Daniel Pliske, LEXIS-NEXIS, USA
    Benjamin Tsou, City University of Hong Kong
    Paul Wu, Institute of Systems Science, Singapore

SPONSORS:

    The Association for Computational Linguistics (ACL)
    LEXIS-NEXIS, a division of Reed Elsevier Inc.
    AT&T Laboratories - Research
    National Natural Science Foundation of China
    State Key Laboratory of Intelligent Technology and Systems, China
    Hong Kong University of Science and Technology
    City University of Hong Kong

------ Travel and Accommodations Information -------

Visa Application:

For those who will participate in WVLC-5 from outside of China,
written invitations will be sent to you by post once we receive your
registration form. The invitation will allow you to apply for a visa
at your local Chinese embassy.

Travel:

In general, travel to both Mainland China and Hong Kong is heavy
during the summer. It will be more so this August due to the change of
sovereignty taking place in Hong Kong on July 1. To ensure flights,
reserve airline tickets as early as you can! For those who are
planning to attend both sessions, i.e., August 18 in Beijing and
August 20 in Hong Kong, make sure that you also reserve the connecting
flight from Beijing to Hong Kong on August 19.

Accommodations:

On-campus housing will be provided in both Beijing and Hong Kong.

Housing in Beijing:

A block of rooms has been reserved for WVLC-5 at the Guest House
Building A (or "jia(3)suo(3)" in Chinese) at Tsinghua University.
Suites, suitable for couples or families, are available at RMB 700
Yuan (approximately $90) per night. Double rooms are also available,
which can be occupied by one person for RMB 300 Yuan (approximately
$40) per night or shared by two people for RMB 150 Yuan (approximately
$20) per night per person. All reservations must be made before
July 15, 1997 with a deposit equal to the cost of one night. See the
housing reservation form for Beijing below.

Directions to Tsinghua University from Beijing Airport:

You can take a taxi directly to the Tsinghua campus. The cost is about
RMB 120 Yuan ($15) before 22:00, and double that after 22:00. Shuttle
buses are also available from the airport to Zhong-Guan-Cun before
17:00. The bus costs RMB 12 Yuan ($1.50). From there, a taxi to
Tsinghua costs another RMB 12 Yuan ($1.50).

Housing in Hong Kong:

Fifty double rooms have been reserved at Hong Kong University of
Science and Technology at a cost of $40 per night. For more
information, please contact our local organizer, Prof. Dekai Wu.
A Web page is being prepared which will provide details on housing and
directions.

------- Sightseeing and Entertainment -------

In Beijing, a FREE trip to the Great Wall will be arranged by our
local organizers on Sunday, August 17, 1997. PLEASE ARRIVE IN TIME FOR
THIS SPECIAL TREAT!

In Hong Kong, our local organizer is organizing an evening boat cruise
right after the workshop on August 20. The cruise will take you around
the Harbor, with food from "floating caterers" or at waterfront
eateries, and then drop people off on Hong Kong Island, in Kowloon, or
back at the campus. The cost is $25/person if at least 25 people sign
up. For more information on this spectacular event, contact Prof.
Dekai Wu.

------- Extended Travel in China with Reasonable Costs -------

"No longer is China dull - the sleeping dragon is waking up, making
now an exciting time to visit."

Since you will be stopping in China, you may want to explore more of
it, especially the innermost areas of the country. The WVLC-5 program
committee has been contacted by a local travel agency that will help
make your extended travel in China possible at reasonable costs.
Though travel to any part of China can be arranged, the following
unique options are recommended.

OPTION 1: Two-day escorted tour in Guilin. You will fly in from either
Beijing or Hong Kong after the workshop and tour in this number one
scenic spot in China ...

OPTION 2: Three-day escorted tour in Chengdu. You will fly in from
either Beijing or Hong Kong after the workshop and tour in this
historic city and its surrounding areas, such as Leshan, where you
will visit the Temple of the Great Buddha and see the colossal statue
of the Buddha ...

OPTION 3: Three-day escorted tour in Lhasa. You will fly in (via
Chengdu) from either Beijing or Hong Kong after the workshop and tour
in the capital city of Tibet, "roof of the world" ...

OPTION 4: Combining OPTIONS 1 and 2 for a 5-day tour.

OPTION 5: Combining OPTIONS 2 and 3 for a 6-day tour.

OPTION 6: Combining all OPTIONS above for an 8-day tour.

Note: Costs vary depending on the size of the touring group. In
general, a 10-person group is cheaper than a 6-to-9-person group,
while an escorted tour for one person is more expensive.
If you are interested, please contact for more detailed information on
each or all of the options. Travel arrangements must be made as early
as possible in order to ensure timely flights.

---------------- Registration form -------------------

Registration rates:
===================

Registration includes proceedings, lunch, and refreshment breaks for
each day of the workshop.

Regular registration (received by July 25):
------------------
    At both sites (Beijing and Hong Kong)  : $50
    At either site (Beijing or Hong Kong)  : $30

On-site registration at the workshop:
--------------------
    At both sites (Beijing and Hong Kong)  : $70
    At either site (Beijing or Hong Kong)  : $50

FREE REGISTRATION WILL BE PROVIDED TO PARTICIPANTS WHO ARE FULL-TIME
STUDENTS.

Method of Payment:
-----------------

Payment in U.S. dollars must be included with this form. Send either
an international money order or a check drawn on a U.S. bank, payable
to **WVLC-5/ACL (Association for Computational Linguistics)**.
We DO NOT accept credit card payments.

Registration may be submitted by Email, by letter, or by fax using the
form and addresses below.

#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%

Registration Address:

Email:

Regular mail:  ATTN: Joe Zhou
               Floor 3 / Building 5
               LEXIS-NEXIS, a Division of Reed Elsevier
               9555 Springboro Pike
               Dayton, OH 45342 USA

Fax:           +1 (937) 865-1655

Registration form (print or type all information):

#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%

Name:
Company or Institution:
Address (for visa invitation):
Phone:
Fax:
Email Address:
URL:
Vegetarian (yes or no):
Full-time Student (yes or no):
Date of Registration:
Total registration fee: $

******************************************************************

ON CAMPUS HOUSING RESERVATIONS IN BEIJING

Choice: a suite ( ) or a double room ( )
Date of Arrival:
Date of Departure:
Number of nights:
Amount of Deposit ($90 for a suite and $40 for a double room):

NOTE: Your reservations will be cancelled if your deposit is not
received within two weeks of your Email registration.

#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%#%

Notes:
------

- To qualify for free registration for students, attendees must
  provide proof of student status in hard copy form (e.g., a photocopy
  of a valid student ID) at the time of registration. Students who
  register via Email must send their proof of status by regular mail
  or fax to the registration address given above. Email registration
  will not be final until proof of status is received. However, the
  date of registration will be considered the date that the Email is
  received, provided the proof of status is received within two weeks.

- When submitting registrations, keep in mind the possibility of
  postal delays. Email registration avoids these delays, but it will
  not be final until the payment (an international money order or
  check drawn on a U.S. bank) is received, which must happen within
  two weeks.

- All registrants will receive a confirmation by Email.

- Registration fees are not refundable.

- In the event of cancellation, WVLC-5 is liable only for the
  registration fees paid.

For questions about registration procedures, please contact

CONTACTS:

Joe Zhou                                   Ken Church
LEXIS-NEXIS, a Division of Reed Elsevier   Room 2B-421
9555 Springboro Pike                       AT&T Laboratories
Dayton, OH 45342 USA                       Murray Hill, NJ 07974 USA
email:                                     e-mail: