PhD Thesis Proposal Defence


Title: "Exploring Neural Stylization and Rendering Across 2D and 3D Visual 
Modalities"

by

Miss Yingshu CHEN


Abstract:

The rapid advancement of deep learning and generative machine learning has 
shown great promise for digital content generation. Neural network-driven 
techniques, including neural rendering, neural representations, and neural 
stylization, have empowered artists, creators, and even non-professionals to 
blend artistic elements with technical intelligence, revolutionizing the way 
we generate, manipulate, and stylize visual content across diverse 
modalities.

This thesis begins with a comprehensive review of the fundamental theories 
and state-of-the-art developments in neural stylization, covering key 
concepts such as neural style transfer, neural rendering, and the latest 
advances in generative AI. It then establishes a systematic taxonomy that 
categorizes neural stylization techniques for various digital content types, 
from 2D imagery to 3D assets. Building on these fundamentals, the thesis 
explores the challenges of photorealistic stylization, focusing on 
real-world outdoor scenarios in 2D and city-level scenarios in 3D. For 
outdoor cityscapes, we aim to preserve the geometry and structure of the 
foreground while blending dynamic color and texture styles into the sky 
background. The TimeOfDay framework addresses architectural style transfer 
on photographs using high-frequency-aware image-to-image translation models. 
Moving to 3D space, the StyleCity system supervises compact neural texture 
optimization with multi-view features extracted from large-scale pre-trained 
vision-language models in a progressive, scale-adaptive manner, tackling the 
additional challenges of 3D urban scenes, including their vast spatial 
extent and the need for scale-adaptive 3D consistency. In addition, 
StyleCity synthesizes an omnidirectional sky whose style harmonizes with the 
foreground elements using a state-of-the-art generative diffusion model.

By seamlessly integrating neural rendering, generative models, and 
vision-language models, neural stylization has demonstrated its potential to 
revolutionize how we create and interact with digital content across 
modalities. The insights and innovations in this thesis point toward a 
promising future in which reality, technology, and art are interlinked.


Date:                   Tuesday, 23 July 2024

Time:                   2:00pm - 4:00pm

Venue:                  Room 3494
                        Lifts 25/26

Committee Members:      Prof. Sai-Kit Yeung (Supervisor)
                        Prof. Ajay Joneja (Supervisor)
                        Prof. Pedro Sander (Chairperson)
                        Dr. Dan Xu