Monolingual and crosslingual plagiarism detection
===================================================================
                          Joint Seminar
===================================================================
The Hong Kong University of Science & Technology

Human Language Technology Center
Department of Computer Science and Engineering
-------------------------------------------------------------------
Speaker:  Professor Paolo Rosso
          Technical University of Valencia, Spain

Title:    "Monolingual and crosslingual plagiarism detection"

Date:     Thursday, 28 July 2011
Time:     4:00pm - 5:00pm
Venue:    Rm2578 (Annex, via lift 29/30), HKUST

Abstract:

Due to the amount of information available on the WWW and its ease of access, cases of plagiarism have increased in recent years. One countermeasure to this phenomenon has been the development of plagiarism detection tools. Unfortunately, state-of-the-art plagiarism detection systems cannot easily detect plagiarism that involves high-level paraphrasing or translation. The detection of translated plagiarism is still in its infancy, and only a few crosslingual plagiarism detection approaches have been investigated so far. The similarity of two texts written in different languages can be estimated on the basis of a comparable data set such as Wikipedia (cross-language explicit semantic analysis) or through a statistical machine translation approach (cross-language alignment-based similarity analysis), in order to determine the likelihood that two text fragments are valid translations of each other.

This talk will give an overview of plagiarism detection techniques, with special emphasis on crosslingual plagiarism detection. These techniques could potentially be adapted for English-Chinese plagiarism detection.

*****************************

Biography:

Paolo Rosso (http://www.dsic.upv.es/~prosso/) received the Ph.D. degree in computer science from Trinity College Dublin, University of Ireland, in 1999.
He is currently an associate professor at the Technical University of Valencia, Spain, where he leads the Natural Language Engineering Lab of the ELiRF research group. He is co-author of over 200 papers published in international conferences and journals. His main research interests include topics related to natural language processing and information retrieval: plagiarism detection, opinion mining and irony detection, toponym disambiguation, and text categorisation, among others. He has actively participated in 17 national and international research projects (in 6 as PI). He has co-organised tracks at CLEF on Question Answering on Speech Transcripts and Plagiarism Detection (sponsored by Yahoo! Research): http://pan.webis.de/