The Hong Kong University of Science and Technology
Department of Computer Science and Engineering


PhD Thesis Defence


Title: "Exploring Neural Stylization and Rendering Across 2D and 3D Visual 
Modalities"

By

Miss Yingshu CHEN


Abstract:

The rapid advancement of generative machine learning has shown great promise for 
digital content generation. Neural network-driven techniques, including neural 
rendering, neural representations, and neural stylization, have empowered 
artists, creators, and even non-professionals to blend artistic elements with 
technical intelligence, revolutionizing the way we generate, manipulate, and 
stylize visual content across diverse modalities.

This thesis begins with a comprehensive review of the fundamental theories and 
developments in neural stylization, covering key concepts such as neural style 
transfer, neural rendering, and the latest advances in generative AI. It then 
establishes a systematic taxonomy that categorizes neural stylization methods 
for various digital content types, from 2D imagery to 3D assets. Building on 
these fundamentals, the thesis explores the challenges of photorealistic 
stylization, focusing on outdoor scenarios in both 2D and 3D domains. For 
outdoor cityscapes, we aim to preserve the geometry and structure of the 
foreground while applying dynamic color and texture styles to the sky 
background. The TimeOfDay framework addresses architectural style transfer on 
photographs using high-frequency-aware image-to-image translation models. 
Moving to 3D space, the StyleCity system supervises compact neural texture 
optimization with multi-view features extracted from large-scale pre-trained 
vision-language models in a progressive, scale-adaptive manner, tackling the 
additional challenges of 3D urban scenes, namely their large spatial extent and 
the need for scale-adaptive 3D consistency. To obtain a holistic style 
representation in 3D scenarios, we propose SC-OmniGS, a novel omnidirectional 
Gaussian splatting framework with camera calibration, which further facilitates 
omnidirectional stylization in 3D scenes with participating media, such as 
subsea environments.

By seamlessly integrating neural rendering, generative models, and 
vision-language models, neural stylization has demonstrated its potential to 
revolutionize how we create and interact with digital content across 
modalities. The insights and innovations in this thesis point to a promising 
future in which reality, technology, and art are interlinked.


Date:                   Monday, 26 August 2024

Time:                   10:00am - 12:00 noon

Venue:                  Room 5510
                        Lifts 25/26

Chairman:               Prof. Chik Patrick YUE (ECE)

Committee Members:      Prof. Sai-Kit YEUNG (Supervisor)
                        Prof. Ajay JONEJA (Supervisor)
                        Dr. Tristan BRAUD
                        Prof. Huamin QU
                        Prof. Hongbo FU (EMIA)
                        Dr. Jing LIAO (CityU)