Knowledge Base Enhancement and Data Cleaning methods via crowdsourcing
MPhil Thesis Defence

Title: "Knowledge Base Enhancement and Data Cleaning methods via crowdsourcing"

By

Mr. Linnan JIANG

Abstract

Knowledge base systems such as Freebase and YAGO have been designed and widely applied, yet most knowledge bases are far from being of high quality. According to recent research, the low quality is mainly caused by missing and inaccurate RDF triples, which are the main components of knowledge base systems. It is therefore important to propose methods that enhance the RDF triples in knowledge bases, since this is significant for providing a good information retrieval service.

On the other hand, the low-accuracy data in a knowledge base comes from different data sources, so it is important not only to improve the existing knowledge triples in the knowledge base, but also to address the erroneous data in the data sources. We therefore also explore data cleaning methods that filter and correct erroneous data in order to achieve higher-quality knowledge bases. Overall, both the low-accuracy knowledge triples in knowledge bases and the erroneous data in the data sources are dirty data that need to be repaired.

In this thesis, we formulate the problem of revising low-accuracy knowledge triples and erroneous data as a data fusion problem, and solve it in two directions. First, we discuss how to discover and correct low-accuracy knowledge triples in existing knowledge bases. Then we turn to a more general problem: how to discover and correct erroneous data in the data sources, as they are the information providers. In solving both problems, we apply crowdsourcing, which is a powerful and widely applied method in the data field.

Date: Friday, 13 August 2021

Time: 3:00pm - 5:00pm

Zoom meeting: https://us05web.zoom.us/j/4728829752?pwd=OC8yU214L0hjcnRYRlMxeXdNYUp2QT09

Committee Members: Prof. Lei Chen (Supervisor)
                   Dr. Wei Wang (Chairperson)
                   Prof. Qiong Luo

**** ALL are Welcome ****