Visual analytics of temporal event data
PhD Thesis Proposal Defence

Title: "Visual analytics of temporal event data"

by Mr. Yuanzhe CHEN

Abstract:

Temporal event sequences are becoming increasingly important in many application domains, such as website click streams, user interaction logs, electronic health records and car service records. However, a real-world dataset with a large number of event sequences of varying lengths is complex and difficult to analyze. Visual analytics has proven to be an effective approach to understanding such large amounts of data. For example, by visually highlighting the common behaviors in website click streams, usability issues and user behavior patterns can be identified to inform better interface designs. In this thesis, we follow the research in the area of event sequence visualization and present three works that develop visual analytics techniques for temporal event data from various application domains.

In the first work, we propose a novel visualization technique based on the minimum description length (MDL) principle to construct a coarse-level overview of event sequence data while balancing the information loss in that overview. The method addresses a fundamental trade-off in visualization design: reducing visual clutter vs. increasing the information content of a visualization. The method enables simultaneous sequence clustering and pattern extraction and is highly tolerant to noise such as missing or additional events in the data. Based on this approach, we propose a visual analytics framework with multiple levels of detail to facilitate interactive data exploration. We demonstrate the usability and effectiveness of our approach through case studies with two real-world datasets. One dataset showcases a new application domain for event sequence visualization, i.e., fault development path analysis in vehicles for predictive maintenance. We also discuss the strengths and limitations of the proposed method based on user feedback.
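
To make the trade-off in the first work concrete, the following is a minimal sketch of a two-part MDL-style cost, not the method proposed in the thesis. It assumes a summary is a set of patterns, that each sequence is assigned to one pattern, and that the cost is the total pattern length (visual clutter) plus the number of insertions and deletions needed to recover each sequence from its pattern (information loss). The function names, the weight `alpha` and the toy sequences are all hypothetical.

```python
from typing import List, Sequence


def lcs_length(a: Sequence[str], b: Sequence[str]) -> int:
    """Length of the longest common subsequence of a and b."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def edit_count(seq: Sequence[str], pattern: Sequence[str]) -> int:
    """Insertions plus deletions needed to turn pattern into seq."""
    return len(seq) + len(pattern) - 2 * lcs_length(seq, pattern)


def description_length(
    sequences: List[List[str]],
    patterns: List[List[str]],
    assignment: List[int],
    alpha: float = 1.0,
) -> float:
    """Two-part cost: size of the patterns plus weighted corrections."""
    # Model cost: every event shown in the overview adds clutter.
    model_cost = sum(len(p) for p in patterns)
    # Correction cost: events the overview hides or adds for each sequence.
    correction_cost = sum(
        edit_count(s, patterns[k]) for s, k in zip(sequences, assignment)
    )
    return model_cost + alpha * correction_cost


# Toy data (hypothetical): three similar event sequences.
sequences = [["A", "B", "C"], ["A", "B", "C", "D"], ["A", "C"]]

# Option 1: one shared pattern summarizes all three sequences.
shared = description_length(sequences, [["A", "B", "C"]], [0, 0, 0])

# Option 2: every sequence is its own pattern (no information loss).
verbatim = description_length(sequences, sequences, [0, 1, 2])

print(shared, verbatim)
```

Under these assumptions the shared pattern has the lower total cost, because the few corrections it needs are cheaper than showing all three sequences in full. Raising `alpha` penalizes information loss more heavily and shifts the balance toward more detailed summaries.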

In the second work, we study temporal event data from a specific application domain, i.e., web click streams in Massive Open Online Courses (MOOCs). More specifically, we try to understand the dropout behavior in such data. To tackle this problem, we introduce a comprehensive visual analytics system that not only helps instructors and education experts understand the reasons for dropout, but also allows researchers to identify crucial features that can further improve the performance of the prediction models. Both the heterogeneous data extracted from three different kinds of learner activity logs (i.e., clickstream, forum posts and homework records) and the predicted results are visualized in the proposed system.

The third work focuses on the stage, that is, a frequently occurring subsequence in the dataset. We introduce a novel visualization technique to summarize event sequence data into a set of stage progression patterns. The resulting overview is more concise than event-level summarization and supports level-of-detail exploration. We further present a visual analytics system with four linked views (stage view, overview, tree view and sequences view) to help users explore the data.

Date: Wednesday, 1 August 2018

Time: 10:30am - 12:30pm

Venue: Room 3494 (lifts 25/26)

Committee Members:
Prof. Cunsheng Ding (Supervisor)
Prof. Huamin Qu (Chairperson)
Dr. Xiaojuan Ma
Dr. Raymond Wong

**** ALL are Welcome ****