Incomplete Data Analysis in Smart City Applications
PhD Thesis Proposal Defence
Title: "Incomplete Data Analysis in Smart City Applications"
by
Mr. Siyuan Liu
ABSTRACT:
In my work, incomplete data is defined as data with extremely few observed
samples, which poses major challenges for data mining. Such extremely limited
samples can introduce severe bias and lead to inaccurate results.
Given only one ten-thousandth of all the vehicles in a city, how can we
still recover the vehicle distribution and detect the hot spots (crowded areas)
in the city? Traditional density-based clustering methods do not work well
because the vehicle density/location information is very limited and error-prone.
Hence we need new algorithms that handle such incomplete data with both
accuracy and scalability. Moreover, vehicle traces are typical
spatio-temporal data, which require efficient approaches. In this work, we
observe that vehicle speed can indicate the crowdedness of a given area:
if an area is very crowded, vehicle speeds in that area are low, whereas
if it is not crowded, vehicle speeds tend to be high. As such, the
mobility of samples is naturally incorporated, and a novel non-density-based
clustering method, called mobility-based clustering, is developed. Several key
factors beyond vehicle crowdedness have been identified, and techniques to
compensate for their effects are proposed. We evaluate the performance of
mobility-based clustering on real traffic situations. Experimental
results show that, using only 0.3% of vehicles as samples, mobility-based
clustering can accurately identify hot spots that can hardly be obtained by
UMicro, the latest representative algorithm.
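
As a rough illustration of the observation that low vehicle speed can signal
crowdedness, the following minimal sketch bins sampled vehicles into grid cells
and flags cells whose mean speed falls below a threshold. It is not the
mobility-based clustering algorithm of the thesis: the function name
crowded_cells, the grid binning, and the parameters cell_size, speed_threshold
and min_samples are all assumptions made for this example.

```python
from collections import defaultdict


def crowded_cells(samples, cell_size=1.0, speed_threshold=20.0, min_samples=2):
    """Flag grid cells whose mean sampled speed is below speed_threshold.

    samples: iterable of (x, y, speed) tuples for the sampled vehicles.
    All names and default values here are hypothetical, chosen only to
    illustrate the speed-as-crowdedness idea.
    """
    speeds_by_cell = defaultdict(list)
    for x, y, speed in samples:
        # Map each sample to the grid cell that contains its location.
        cell = (int(x // cell_size), int(y // cell_size))
        speeds_by_cell[cell].append(speed)

    crowded = {}
    for cell, speeds in speeds_by_cell.items():
        # Skip cells with too few samples to say anything about them.
        if len(speeds) < min_samples:
            continue
        mean_speed = sum(speeds) / len(speeds)
        # Low mean speed is taken as a sign of a crowded cell.
        if mean_speed < speed_threshold:
            crowded[cell] = mean_speed
    return crowded


# Hypothetical samples: two slow vehicles near the origin, two fast ones elsewhere.
samples = [(0.2, 0.3, 8.0), (0.7, 0.9, 12.0), (3.1, 4.5, 55.0), (3.8, 4.2, 60.0)]
hot_spots = crowded_cells(samples)
```

Here the cell at the origin would be flagged because its two samples are slow,
while the cell with fast-moving samples is not. A real method would also need
to compensate for factors other than crowdedness, as the abstract notes.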
Date: Tuesday, 28 June 2011
Time: 3:30pm - 5:30pm
Venue: Room 4483
lifts 25/26
Committee Members: Prof. Lionel Ni (Supervisor)
Prof. Shing-Chi Cheung (Chairperson)
Dr. Qiong Luo
Dr. Raymond Wong
**** ALL are Welcome ****