Exploring Neural Stylization and Rendering Across 2D and 3D Visual Modalities
PhD Thesis Proposal Defence

Title: "Exploring Neural Stylization and Rendering Across 2D and 3D Visual Modalities"

by

Miss Yingshu CHEN

Abstract:

The rapid advancement of deep learning and generative machine learning has shown great promise for digital content generation. Neural network-driven techniques, including neural rendering, neural representations, and neural stylization, have empowered artists, creators, and even non-professionals to blend artistic elements with technical intelligence, revolutionizing the way we generate, manipulate, and stylize visual content across diverse modalities.

This thesis begins with a comprehensive review of the fundamental theories and state-of-the-art developments in neural stylization, covering key concepts such as neural style transfer, neural rendering, and the latest advancements in generative AI. It then establishes a systematic taxonomy that categorizes neural stylization methods across digital content types, from 2D imagery to 3D assets.

Building on these fundamentals, the thesis explores the challenges of photorealistic stylization, focusing on real-world outdoor scenarios in 2D and city-level scenarios in 3D. For outdoor cityscapes, the goal is to preserve the foreground's geometry and structure while seamlessly blending dynamic color and texture styles into the sky background. The TimeOfDay framework addresses architectural style transfer on photographs using high-frequency-aware image-to-image translation models. Moving to 3D space, the StyleCity system supervises compact neural texture optimization with multi-view features extracted from large-scale pre-trained vision-language models in a progressive, scale-adaptive manner, handling the additional challenges of 3D urban scenes, namely their large spatial extent and the need for scale-adaptive 3D consistency.
In addition, StyleCity synthesizes an omnidirectional sky in a style harmonious with the foreground elements using the latest generative diffusion models. By seamlessly integrating neural rendering, generative models, and vision-language models, neural stylization has demonstrated its potential to revolutionize how we create and interact with digital modalities. The insights and innovations in this thesis point to a promising future where reality, technology, and art are interlinked.

Date: Tuesday, 23 July 2024
Time: 2:00pm - 4:00pm
Venue: Room 3494 (Lifts 25/26)

Committee Members:
Prof. Sai-Kit Yeung (Supervisor)
Prof. Ajay Joneja (Supervisor)
Prof. Pedro Sander (Chairperson)
Dr. Dan Xu