Cross-modal Steganography: Hiding Video in Audio

MPhil Thesis Defence


Title: "Cross-modal Steganography: Hiding Video in Audio"

By

Mr. Hyukryul YANG


Abstract

Steganography has been widely studied in the computer vision community 
for digital watermarking and secret messaging. Recently, various deep 
learning-based approaches have been proposed to tackle the steganography 
task. However, existing methods lack the capacity to hide large data 
such as video content, which has a high bitrate. In this 
thesis, we push this limit by investigating methods for hiding video 
content inside audio files. To address this novel cross-modal 
steganography task, we first introduce several baseline models based on 
existing methods. However, our evaluation reveals clear limitations of 
these models. To mitigate these limitations, we devise a new 
optimization-based method named HvFO that improves both the quality of 
the reconstructed video and the fidelity of the audio that carries it.

HvFO builds on recent advances in flow-based generative models, which, 
with further optimization, enable an effective mapping from audio to 
latent codes in which nearby codes correspond to perceptually similar 
signals. We show that 
compressed video data can be concealed in the latent codes of audio 
sequences while preserving the fidelity of the hidden video and the 
original audio. We can embed 128x128 video inside same-duration audio, or 
higher-resolution video inside longer audio sequences. Quantitative 
experiments show that our approach outperforms relevant baselines in 
steganographic capacity and fidelity.


Date:  			Wednesday, 12 August 2020

Time:			2:00pm - 4:00pm

Zoom meeting:           https://hkust.zoom.us/j/91260314537

Committee Members:	Dr. Qifeng Chen (Supervisor)
   			Prof. Raymond Wong (Chairperson)
  			Prof. Chiew-Lan Tai


**** ALL are Welcome ****