Augmentation and Analytics of Human Action in Videos

PhD Thesis Proposal Defence


Title: "Augmentation and Analytics of Human Action in Videos"

by

Miss Jingyuan LIU


Abstract:

Analyzing human action in videos and augmenting human action videos with
visual effects are common tasks in video understanding and editing.
However, they are challenging in three main aspects. First, automatically
analyzing human action and augmenting videos with visual effects often
require programming or professional tools, and are thus tedious and
unfriendly to novice users. Second, videos suffer from the intrinsic
perspective foreshortening problem: both the observation and the
computation of human action attributes are affected by the camera
viewpoint. Third, the application-specific analytical features of a human
action also require programming to generalize to new instances, limiting
the support for customized analysis, especially for novices.

This thesis aims to address the above limitations in both the analytics
and the augmentation of human action videos. We first present PoseTween,
a tool that simplifies the authoring process of adding visual effects to
augment the subject’s actions in videos. We propose to model the visual
effects as tween animations of virtual objects driven by the video
subject’s movements, which achieves natural motion paths for the
augmented virtual objects while greatly simplifying the editing process.
We then study the problem of automatically transferring visual effects
between two videos containing the same action, to further reduce user
intervention. Specifically, to find the temporal alignment between two
videos of the same action, we propose a deep learning-based method that
normalizes the human poses and extracts features from the normalized
poses, such that the pose features are invariant to variations across
videos, such as camera viewpoint and subject anthropometry. The
normalized pose features can then be used to measure the similarity of
human poses and to determine the timing of visual effect transfer.
Beyond the global comparison of human poses, we further study the
analysis and visualization of local pose differences. We design
interactive annotation operations and visualization methods based on
common biomechanical attributes of human poses, such that novice users
can perform customizable analysis of human action videos without
explicit programming. These components result in VCoach, a prototype
video-based coaching tool for running.
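
As a rough illustration of the viewpoint-invariant pose comparison
described above, the Python sketch below substitutes a simple geometric
normalization (hip-centering and torso-length scaling) for the learned
normalization proposed in the thesis; the joint indices and function
names are hypothetical and are not part of the proposed method.

    import numpy as np

    # Hypothetical joint indices; real pose estimators (e.g. COCO-17)
    # use their own layouts.
    L_HIP, R_HIP, NECK = 11, 12, 1

    def normalize_pose(joints):
        """Map a (J, 2) array of 2D joint positions to a translation-
        and scale-invariant feature vector. A geometric stand-in for
        the learned normalization described in the abstract."""
        hip_center = (joints[L_HIP] + joints[R_HIP]) / 2.0
        centered = joints - hip_center               # remove translation
        torso_len = np.linalg.norm(joints[NECK] - hip_center)
        return (centered / max(torso_len, 1e-6)).ravel()  # remove scale

    def pose_similarity(a, b):
        """Cosine similarity between two normalized pose features;
        higher values indicate more similar poses."""
        fa, fb = normalize_pose(a), normalize_pose(b)
        denom = np.linalg.norm(fa) * np.linalg.norm(fb) + 1e-9
        return float(fa @ fb / denom)

    # To align two videos of the same action, each frame of one video
    # can be matched to the frame of the other that maximizes
    # pose_similarity, yielding candidate timings for effect transfer.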

We conducted extensive quantitative evaluations and user studies to
validate the effectiveness of the proposed methods. Our tools are
friendly to novice users and generalize to other usage scenarios, such
as action recognition and digital storytelling.


Date:			Wednesday, 18 May 2022

Time:                  	3:00pm - 5:00pm

Zoom Meeting:		https://hkust.zoom.us/j/9759430635

Committee Members:	Prof. Chiew-Lan Tai (Supervisor)
  			Dr. Qifeng Chen (Chairperson)
 			Dr. Xiaojuan Ma
 			Prof. Pedro Sander


**** ALL are Welcome ****