Visual analytics of temporal event data
PhD Thesis Proposal Defence
Title: "Visual analytics of temporal event data"
by
Mr. Yuanzhe CHEN
Abstract:
Temporal event sequences are becoming increasingly important in many
application domains such as website click streams, user interaction logs,
electronic health records and car service records. However, a real-world
dataset with a large number of event sequences of varying lengths is
complex and difficult to analyze. Visual analytics has proven to be an
effective approach to understanding such large amounts of data. For
example, by visually highlighting the common behaviors of website click
streams, usability issues and user behavior patterns can be identified to
inform better designs of the interface. In this thesis, we build on
research in event sequence visualization and report three works that
develop visual analytics techniques for temporal event data from various
application domains.
In the first work, we propose a novel visualization technique based on the
minimum description length (MDL) principle to construct a coarse-level
overview of event sequence data while balancing conciseness against
information loss. The method addresses a fundamental trade-off in visualization design:
reducing visual clutter vs. increasing the information content in a
visualization. The method enables simultaneous sequence clustering and
pattern extraction and is highly tolerant of noise such as missing or
additional events in the data. Based on this approach, we propose a visual
analytics framework with multiple levels-of-detail to facilitate
interactive data exploration. We demonstrate the usability and
effectiveness of our approach through case studies with two real-world
datasets. One dataset showcases a new application domain for event
sequence visualization, i.e., fault development path analysis in vehicles
for predictive maintenance. We also discuss the strengths and limitations
of the proposed method based on user feedback.
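The trade-off described above can be illustrated with a minimal sketch. It
is not the method proposed in the thesis: the cost function, the
insert/delete edit distance, the weight alpha and all function names are
assumptions chosen only for illustration. The idea is that a summary is
scored by the length of its patterns plus the number of corrections needed
to recover each sequence from its closest pattern, and the summary with
the smallest total is preferred.

```python
from typing import List, Sequence


def edit_distance(a: Sequence, b: Sequence) -> int:
    """Insert/delete-only edit distance, computed from the LCS length."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    lcs = prev[-1]
    return len(a) + len(b) - 2 * lcs


def description_length(sequences: List[Sequence],
                       patterns: List[Sequence],
                       alpha: float = 1.0) -> float:
    """Hypothetical two-part cost: pattern lengths plus weighted corrections."""
    # Cost of the summary itself: longer or more patterns mean more clutter.
    model_cost = sum(len(p) for p in patterns)
    # Cost of information loss: edits needed to turn the closest pattern
    # into each original sequence.
    correction_cost = sum(
        min(edit_distance(s, p) for p in patterns) for s in sequences
    )
    return model_cost + alpha * correction_cost


# Toy event sequences (one character per event), with noisy variants.
sequences = ["ABCD", "ABD", "ABCD", "XYZ", "XYZW"]

one_pattern = description_length(sequences, ["ABCD"])
two_patterns = description_length(sequences, ["ABCD", "XYZ"])
all_distinct = description_length(sequences, ["ABCD", "ABD", "XYZ", "XYZW"])
```

In this toy example, a single pattern is compact but loses a lot of
information, one pattern per distinct sequence loses nothing but is
cluttered, and the two-pattern summary gives the lowest total cost.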
In the second work, we study the temporal event data related to a specific
application domain, i.e., the web click streams in Massive Open Online
Courses (MOOCs). To be more specific, we try to understand the dropout
behavior in such data. To tackle this problem, we introduce a
comprehensive visual analytics system which not only helps instructors and
education experts understand the reasons for dropout, but also allows
researchers to identify crucial features that can further improve the
performance of dropout prediction models. Both the heterogeneous data extracted from
three different kinds of learner activity logs (i.e., clickstream, forum
posts and homework records) and the predicted results are visualized in
the proposed system.
The third work focuses on stages, that is, frequently occurring
subsequences in the dataset. We introduce a novel visualization technique
to summarize event sequence data into a set of stage progression patterns.
The resulting overview is more concise than event-level
summarization and supports level-of-detail exploration. We further present
a visual analytics system with four linked views (stage view, overview,
tree view and sequences view) to help users explore the data.
Date: Wednesday, 1 August 2018
Time: 10:30am - 12:30pm
Venue: Room 3494
(lifts 25/26)
Committee Members: Prof. Cunsheng Ding (Supervisor)
Prof. Huamin Qu (Chairperson)
Dr. Xiaojuan Ma
Dr. Raymond Wong
**** ALL are Welcome ****