INTERACTIVE VISUAL ANALYTICS FOR UNDERSTANDING AND FACILITATING HUMAN COMMUNICATION
The Hong Kong University of Science and Technology
Department of Computer Science and Engineering

PhD Thesis Defence

Title: "INTERACTIVE VISUAL ANALYTICS FOR UNDERSTANDING AND FACILITATING HUMAN COMMUNICATION"

By Mr. Xingbo WANG

Abstract

People often communicate with each other through multimodal verbal and non-verbal behavior, including voice, words, facial expressions, and body language. Interpreting human communication behavior has great value for many applications, such as business, healthcare, and education. For example, if students show signs of boredom or confusion during a course, teachers can adjust their teaching methods to improve student engagement. With the rapid development of digital technology and social media, a huge amount of multimodal human communication data (e.g., opinion videos) is generated and collected. To facilitate the analysis of such data, researchers adopt computational approaches to quantify human behavior with multimodal features. However, it remains demanding and inefficient to manually extract insights (e.g., the social meanings of the features) from the large and complex feature space. Furthermore, it is challenging to apply the knowledge distilled from these computational features to enhance human communication skills. Meanwhile, interactive visual analytics combines computational algorithms with human-centered visualization to effectively support information representation, knowledge discovery, and skill acquisition, and it demonstrates great potential to address the challenges above.

In this thesis, we design and build novel interactive visual analytics systems to 1) help users discover valuable behavioral patterns in multimodal human communication videos and 2) provide end users with visual feedback and guidance to improve their communication skills.

In the first work, we present DeHumor, a visual analytics system that visually decomposes humorous speeches into quantifiable multimodal features and enables humor researchers and communication coaches to systematically explore humorous verbal content and vocal delivery.

In the second work, we further characterize and investigate the intra- and inter-modal interactions among the visual, acoustic, and language modalities, including dominance, complement, and conflict. We then develop M2Lens, a visual analytics system that helps model developers and users conduct multi-level and multi-faceted exploration of the influences of individual modalities and their interplay on model predictions for multimodal sentiment analysis.

Beyond understanding multimodal human communication behavior, in the third work we present VoiceCoach, a visual analytics system that evaluates speakers' voice modulation skills in terms of volume, pitch, speed, and pause, and recommends good learning examples of voice modulation from TED Talks for speakers to follow. Moreover, during practice, the system provides immediate visual feedback to speakers for self-awareness and performance improvement.

Date: Tuesday, 16 August 2022
Time: 2:00pm - 4:00pm
Zoom Meeting: https://hkust.zoom.us/j/4210096111

Chairperson: Prof. Mengze SHI (MARK)

Committee Members:
Prof. Huamin QU (Supervisor)
Prof. Minhao CHENG
Prof. Cunsheng DING
Prof. Jimmy FUNG (ENVR)
Prof. Hongbo FU (CityU)

**** ALL are Welcome ****