A Probabilistic Framework on Machine-Crowd Collaboration and its Applications on Data Integration
PhD Thesis Proposal Defence
Title: "A Probabilistic Framework on Machine-Crowd Collaboration and its
Applications on Data Integration"
by
Mr. Chen ZHANG
Abstract:
Recently, the popularity of crowdsourcing has brought a new opportunity
for engaging human intelligence in the process of data analysis.
Existing works on crowdsourcing have developed sophisticated methods by
utilizing the crowd as a new kind of processor, a.k.a. HPU. One of the
drawbacks of these works is that they treat the crowd as the sole
information source for the human-intrinsic queries. However, in many
applications, such human-intrinsic queries can also be answered by
machine-only systems (i.e., CPUs). On the one hand, the latency of using
HPUs to answer queries is much longer than that of CPUs, and the monetary
cost of HPUs is often high (e.g., crowdsourcing on Amazon Mechanical Turk),
but on the other hand, the answers obtained from CPUs often have high
uncertainty due to their inability to recognize human-intrinsic
semantics. Therefore, it is natural to ask why we cannot combine the power
of CPUs and the wisdom of HPUs to answer human-intrinsic queries
accurately and fast, which is exactly the motivation of this work.
To summarize, our study covers the following aspects:
1) We propose three new human-machine hybrid systems, one for each of three
different application backgrounds, to improve data quality;
2) We design a novel crowd-machine hybrid system for uncertain data
cleaning;
3) We study the classic problem of schema mapping from the new crowdsourcing
perspective.
We validate our solutions through extensive experiments and discuss
several interesting research directions for CPU-HPU hybrid systems in
data integration.
Date: Monday, 4 May 2015
Time: 5:30pm - 7:30pm
Venue: Room 3501 (lifts 25/26)
Committee Members: Dr. Lei Chen (Supervisor)
Dr. Pan Hui (Chairperson)
Dr. Raymond Wong
Dr. Ke Yi
**** ALL are Welcome ****