Vision-powered Mobile Sensing Design for the Vulnerable Population

MPhil Thesis Defence


Title: "Vision-powered Mobile Sensing Design for the Vulnerable Population"

By

Mr. Kecheng AN


Abstract

With the advancement of Computer Vision, many new mobile sensing applications 
have emerged that can satisfy the special needs of certain population groups 
and enhance their daily lives. However, some challenges remain unsolved. This 
thesis focuses on two population groups and presents our explorations in 
addressing some of their crucial needs.

For the deaf-mute, there are significant barriers to communicating with the 
hearing population. However, the complexity of data collection remains a 
challenge in realizing a practical smartwatch-based sign language recognition 
system. We describe our exploration in transforming online sign language 
videos into training data for smartwatch-based sign language recognition. We 
tested pipelines combining different image-based 3D human pose/mesh 
reconstruction methods with data processing techniques, in an attempt to 
eliminate the need to collect real smartwatch sensor data.
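
The abstract only sketches this video-to-sensor conversion at a high level. As an 
illustrative sketch (not the exact pipeline from the thesis), one common way to turn a 
reconstructed 3D wrist trajectory into watch-like accelerometer readings is to 
double-differentiate the position and add gravity; the function name, frame rate, and 
the omission of watch-frame rotation below are all simplifying assumptions.

import numpy as np

GRAVITY = np.array([0.0, -9.81, 0.0])  # gravitational acceleration, world frame (m/s^2)

def synthesize_wrist_accel(wrist_pos, fps):
    """Convert a 3D wrist trajectory (N x 3, metres, world frame) estimated
    from video into accelerometer-like readings at the video frame rate.

    Simplified sketch: double-differentiates position to get linear
    acceleration and accounts for gravity, but ignores the rotation of the
    watch frame, which a full pipeline would recover from the arm pose/mesh.
    """
    dt = 1.0 / fps
    vel = np.gradient(wrist_pos, dt, axis=0)   # numerical velocity
    acc = np.gradient(vel, dt, axis=0)         # numerical acceleration
    return acc - GRAVITY                       # accelerometers measure specific force (a - g)

# Example: a 2-second clip at 30 fps with a synthetic circular wrist motion.
t = np.linspace(0, 2, 60)
wrist = np.stack([0.2 * np.cos(2 * np.pi * t),
                  0.2 * np.sin(2 * np.pi * t),
                  np.zeros_like(t)], axis=1)
print(synthesize_wrist_accel(wrist, fps=30).shape)  # (60, 3)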

For many elderly people, dysphagia is a common condition that puts them at risk 
of choking if the viscosity of their liquid intake is not well controlled. We 
present ViscoCam, the first liquid viscosity classification system for 
dysphagia patients and caregivers that requires only a smartphone. It is easy 
to operate, widely deployable, and robust for daily use. ViscoCam classifies 
visually indistinguishable liquids of various viscosity levels by exploiting 
the fact that the sloshing motion of a viscous liquid decays faster than that 
of a thin liquid. To perform a measurement, the user shakes a cup of liquid 
together with the smartphone to induce liquid sloshing. ViscoCam then senses 
the cup's motion using the smartphone's built-in accelerometer or microphone 
and infers liquid viscosity from the fluid surface motion captured by the 
flashlight camera. To cope with changes in camera position, lighting 
conditions, and liquid sloshing motion, a 3D convolutional neural network is 
trained to extract reliable motion features for classification. We present an 
evaluation of ViscoCam's performance in classifying three viscosity levels 
defined in the IDDSI standard, the most up-to-date and internationally adopted 
standard for dysphagia patients.
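
The 3D CNN is described only at a high level in the abstract; below is a minimal 
sketch of what such a clip classifier could look like in PyTorch. The layer sizes, 
input resolution, and the mapping of three output classes to IDDSI levels are 
assumptions for illustration, not the network actually used in ViscoCam.

import torch
import torch.nn as nn

class SloshingNet(nn.Module):
    """Toy 3D CNN over short grayscale clips of the liquid surface.

    Input: (batch, 1, frames, height, width); output: logits over the
    viscosity classes. All layer sizes here are illustrative only.
    """
    def __init__(self, num_classes=3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(1, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool3d(2),                 # downsample in time and space
            nn.Conv3d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),         # global spatio-temporal pooling
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, clip):
        x = self.features(clip).flatten(1)
        return self.classifier(x)

# Example: classify a 16-frame, 64x64 clip into one of three viscosity levels.
model = SloshingNet(num_classes=3)
clip = torch.randn(1, 1, 16, 64, 64)
print(model(clip).shape)  # torch.Size([1, 3])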


Date:  			Tuesday, 10 August 2021

Time:			4:30pm - 6:30pm

Zoom meeting:
https://hkust.zoom.com.cn/j/96519101023?pwd=QUhka3ZOamw4SjRoWGZxL0ptQUZTQT09

Committee Members:	Prof. Qian Zhang (Supervisor)
 			Prof. Gary Chan (Chairperson)
 			Dr. Xiaojuan Ma


**** ALL are Welcome ****