Knowledge Base Enhancement and Data Cleaning methods via crowdsourcing

MPhil Thesis Defence


Title: "Knowledge Base Enhancement and Data Cleaning methods via crowdsourcing"

By

Mr. Linnan JIANG


Abstract

Knowledge base systems such as Freebase and YAGO have been designed and widely
applied, yet most knowledge bases are far from high quality. According to
recent research, the low quality is mainly caused by missing and inaccurate
RDF triples, which are the main components of knowledge base systems. It is
therefore important to propose methods that enhance the RDF triples in
knowledge bases, which is essential for providing good information retrieval
services. Moreover, the inaccurate data in a knowledge base originates from
its various data sources, so it is important not only to improve the existing
knowledge triples but also to address the erroneous data in those sources.
We therefore also explore data cleaning methods that filter and correct
erroneous data in order to achieve higher-quality knowledge bases. Overall,
both the low-accuracy knowledge triples in knowledge bases and the erroneous
data in the data sources are dirty data that need to be repaired.

In this thesis, we formulate the problem of revising low-accuracy knowledge
triples and erroneous data as a data fusion problem and address it in two
directions. First, we discuss how to discover and correct the low-accuracy
knowledge triples in existing knowledge bases. Then we turn to a more general
problem: how to discover and correct the erroneous data in the data sources,
since they are the information providers. In solving both problems, we apply
crowdsourcing, a powerful and widely used method in the data field.


Date:  			Friday, 13 August 2021

Time:			3:00pm - 5:00pm

Zoom meeting:
https://us05web.zoom.us/j/4728829752?pwd=OC8yU214L0hjcnRYRlMxeXdNYUp2QT09

Committee Members:	Prof. Lei Chen (Supervisor)
 			Dr. Wei Wang (Chairperson)
 			Prof. Qiong Luo


**** ALL are Welcome ****