Visual Analytics Approaches for Multimodal Temporal Data

The Hong Kong University of Science and Technology
Department of Computer Science and Engineering


PhD Thesis Defence


Title: "Visual Analytics Approaches for Multimodal Temporal Data"

By

Mr. Kam-Kwai WONG


Abstract:

Many real-world analytic tasks focus on time-series data that is often 
enriched by additional modalities, such as text, audio, images, or domain 
knowledge. While temporal data is essential, it rarely captures the full 
complexity of a phenomenon. Its interpretability can be substantially 
enhanced when paired with these complementary data sources. However, 
integrating such heterogeneous data sources for temporal analysis poses 
significant challenges. Differences in data format, scale, granularity, and 
semantics make it difficult for analysts to connect events across modalities 
and derive meaningful insights over time. This thesis addresses these 
challenges through visual analytics approaches that effectively combine 
multiple data modalities with temporal data to augment human reasoning about 
temporal phenomena.

I demonstrate the approach through the design and development of three novel 
visual analytics systems, each tackling a distinct domain problem. All three 
systems were created following a human-centered design study methodology in 
collaboration with domain experts. First, LandSAR is an immersive analytics 
system for enhancing situational awareness (SA) of landslide risks. It 
addresses the disembodied gap in geospatial data by synthesizing data 
physicalization (tangible 3D terrain models) with data visceralization 
(real-time, steerable simulations). This approach merges analytical 
visualization overlays with an embodied understanding of the dynamic process, 
enhancing all three levels of SA. Second, Anchorage is a visual analytics 
system for evaluating customer satisfaction in video-based customer service 
interactions. Anchorage summarizes multimodal behavioral features from 
service videos (e.g., facial expressions) and reveals abnormal service 
operations via intuitive visualizations. By structuring videos around key 
"anchor events," the system helps service providers quickly navigate long 
recordings and assess customer satisfaction at both overall service and finer 
operational levels. Third, Prismatic is a visual analytics system that 
integrates quantitative time-series performance data with qualitative 
business knowledge to analyze concept stocks in finance. Prismatic enables 
interactive clustering of related stocks through a coordinated multi-view 
interface, combining data-driven correlations with knowledge-driven 
similarities (e.g., industry relationships) for a cross-validated 
understanding of market trends.

This thesis contributes new visual analytics techniques and systems that 
integrate multimodal data for temporal reasoning. The findings illustrate how 
combining heterogeneous data sources, from video and audio to knowledge 
representations and physical models, can increase the information density and 
interpretability of time-series analysis, leading to deeper insights and more 
informed decision-making.


Date:                   Monday, 15 December 2025

Time:                   1:30pm - 3:30pm

Venue:                  Room 2128B
                        Lift 22

Chairman:               Dr. Becki Yi KUANG (CBE)

Committee Members:      Prof. Huamin QU (Supervisor)
                        Dr. Arpit NARECHANIA
                        Prof. Long QUAN
                        Dr. Wenhan LUO (AMC)
                        Prof. Baoquan CHEN (PKU)