Advancing Compression-Driven Approaches in 2D and 3D Vision

The Hong Kong University of Science and Technology
Department of Computer Science and Engineering


PhD Thesis Defence


Title: "Advancing Compression-Driven Approaches in 2D and 3D Vision"

By

Mr. Ka Leong CHENG


Abstract:

The exponential growth of visual data, including 2D images and 3D modalities 
such as videos and multi-view images, has placed a premium on efficient 
representation and processing techniques that extend beyond traditional data 
compression. This thesis explores compression-driven approaches for 2D and 3D
vision, treating compression not merely as a means of data reduction but as a
tool for improving efficiency, versatility, and quality across vision tasks.
Building on the principles of compression, it addresses a range of
challenges, including reversible image transformations, learned lossy
compression, joint task optimization with compression, and efficient 3D
editing with compact yet expressive representations.

To this end, we first develop frameworks that utilize invertible neural 
networks to encode and recover visual information with high fidelity, 
enabling tasks such as reversible image conversion and lossy image 
compression. By exploring the inherent invertibility of neural networks, we
demonstrate that compression can serve as a reversible conduit both for
hiding and reconstructing multiple images within a single embedding and for
improving the quality and efficiency of learned image compression codecs.

We then extend image compression methods beyond data reduction by 
investigating their synergy with other tasks. Specifically, we propose a 
joint learning framework for image compression and denoising, leveraging the 
innate denoising capabilities of compression models. This approach achieves 
superior results in both domains while revealing the latent connections 
between data reduction and noise suppression.

Finally, we expand the scope of compression to 3D vision, introducing a novel 
editing-friendly framework that encapsulates the appearance of 3D scenes 
into compact 2D canonical images. By treating the canonical image as a 
compressive representation of the 3D scene, this approach enables efficient 
3D editing through standard 2D tools, eliminating the need for costly 
re-optimization while maintaining fidelity to the original scene.


Date:                   Thursday, 6 February 2025

Time:                   4:00pm - 6:00pm

Venue:                  Room 4472
                        Lifts 25/26

Chairman:               Dr. Man Hoi WONG (ECE)

Committee Members:      Dr. Qifeng CHEN (Supervisor)
                        Prof. Pedro SANDER
                        Prof. Raymond WONG
                        Dr. Shenghui SONG (ISD)
                        Dr. Rynson LAU (CityU)