PhD Thesis Proposal Defence
Title: "Exploring Neural Stylization and Rendering Across 2D and 3D Visual
Modalities"
by
Miss Yingshu CHEN
Abstract:
The rapid advancement of deep learning and generative machine learning has
shown promise for digital content generation. Neural network-driven techniques,
including neural rendering, neural representations, and neural stylization, have
empowered artists, creators, and even non-professionals to blend artistic
elements with technical intelligence, revolutionizing the way we generate,
manipulate, and stylize visual content across diverse modalities.
This thesis begins by providing a comprehensive review of the fundamental
theories and state-of-the-art developments in neural stylization, covering key
concepts such as neural style transfer, neural rendering, and the latest
advancements in generative AI. It then establishes a systematic taxonomy that
categorizes neural stylization methods across digital content types, from 2D
imagery to 3D assets. Building on these fundamentals, the thesis explores the
challenges of photorealistic stylization, focusing on real-world outdoor
scenarios in 2D and city-level scenarios in 3D. In the outdoor cityscape
setting, we aim to preserve the foreground's geometry and structure while
blending dynamic color and texture styles into the sky background. The
TimeOfDay framework tackles architectural style transfer on photographs using
high-frequency-aware image-to-image translation models. Moving to 3D space,
the StyleCity system supervises compact neural texture optimization with
multi-view features extracted from large-scale pre-trained vision-language
models, applied in a progressive and scale-adaptive manner to handle the
additional challenges of 3D urban scenes, including their vast spatial extent
and the need for scale-adaptive 3D consistency. In addition, StyleCity
synthesizes an omnidirectional sky whose style harmonizes with the foreground
elements using a state-of-the-art generative diffusion model.
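As an illustration of the kind of supervision described above, the following is
a minimal, hypothetical sketch (not the thesis implementation) of optimizing a
learnable scene texture so that rendered views match a target style embedding
from a frozen pre-trained vision-language model. The render_view and
vlm_encode_image callables and all parameter values are placeholder
assumptions; the progressive, scale-adaptive scheduling and sky synthesis of
StyleCity are omitted.

import torch
import torch.nn.functional as F

def stylize_texture(render_view, vlm_encode_image, style_embedding,
                    texture, cameras, steps=500, lr=1e-2):
    # Optimize `texture` so each rendered view moves toward `style_embedding`.
    texture = texture.clone().requires_grad_(True)
    optimizer = torch.optim.Adam([texture], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        loss = 0.0
        for cam in cameras:                     # multi-view supervision
            image = render_view(texture, cam)   # differentiable render, (3, H, W)
            feat = vlm_encode_image(image)      # frozen VLM image feature, (D,)
            # Style loss: pull this view's embedding toward the style embedding.
            loss = loss + (1.0 - F.cosine_similarity(
                feat.unsqueeze(0), style_embedding.unsqueeze(0)).mean())
        loss = loss / len(cameras)
        loss.backward()
        optimizer.step()
    return texture.detach()

In practice, the style embedding would typically be obtained by encoding a
reference style image or text prompt with the same frozen encoder.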
By seamlessly integrating neural rendering, generative models, and
vision-language models, neural stylization has demonstrated its potential to
revolutionize how we create and interact with digital content across modalities. The
insights and innovations in this thesis showcase a promising future where
reality, technology, and art are interlinked.
Date: Tuesday, 23 July 2024
Time: 2:00pm - 4:00pm
Venue: Room 3494
Lifts 25/26
Committee Members: Prof. Sai-Kit Yeung (Supervisor)
Prof. Ajay Joneja (Supervisor)
Prof. Pedro Sander (Chairperson)
Dr. Dan Xu