Exploring Neural Stylization and Rendering Across 2D and 3D Visual Modalities
The Hong Kong University of Science and Technology
Department of Computer Science and Engineering

PhD Thesis Defence

Title: "Exploring Neural Stylization and Rendering Across 2D and 3D Visual Modalities"

By

Miss Yingshu CHEN

Abstract:

The rapid advancement of generative machine learning has shown promise for digital content generation. Neural network-driven techniques, including neural rendering, neural representations, and neural stylization, have empowered artists, creators, and even non-professionals to blend artistic elements with technical intelligence, revolutionizing the way we generate, manipulate, and stylize visual content across diverse modalities.

This thesis begins with a comprehensive review of the fundamental theories and developments in neural stylization, covering key concepts such as neural style transfer, neural rendering, and the latest advances in generative AI. It then establishes a systematic taxonomy that categorizes neural stylization techniques across various digital content types, from 2D imagery to 3D assets.

Building on these fundamentals, the thesis explores the difficulties of photorealistic stylization, focusing on outdoor scenarios in the 2D and 3D realms. In the outdoor cityscape setting, we aim to preserve the foreground's geometry and structure while blending dynamic color and texture styles into the sky background. The TimeOfDay framework addresses architectural style transfer on photographs using high-frequency-aware image-to-image translation models. Moving to 3D space, the StyleCity system supervises compact neural texture optimization with multi-view features extracted from large-scale pre-trained vision-language models in a progressive, scale-adaptive manner, handling the additional challenges of 3D urban scenes, including their large spatial extent and the need for scale-adaptive 3D consistency. To achieve a holistic style representation in 3D scenarios, we propose SC-OmniGS, a novel omnidirectional Gaussian splatting method with camera calibration, which further facilitates omnidirectional stylization of 3D scenes with media such as subsea contexts.

By seamlessly integrating neural rendering, generative models, and vision-language models, neural stylization has demonstrated its potential to revolutionize the creation of, and interaction with, digital modalities. The insights and innovations in this thesis showcase a promising future where reality, technology, and art are interlinked.

Date: Monday, 26 August 2024
Time: 10:00am - 12:00noon
Venue: Room 5510 (Lifts 25/26)

Chairman: Prof. Chik Patrick YUE (ECE)

Committee Members:
Prof. Sai-Kit YEUNG (Supervisor)
Prof. Ajay JONEJA (Supervisor)
Dr. Tristan BRAUD
Prof. Huamin QU
Prof. Hongbo FU (EMIA)
Dr. Jing LIAO (CityU)