Re-creating 3D World from 2D Captures with 3D Generative AI
Speaker: Dr. Yinghao XU
         Stanford University

Title: "Re-creating 3D World from 2D Captures with 3D Generative AI"

Date: Wednesday, 8 January 2025

Time: 4:00pm - 5:00pm

Venue: Room 4502 (via lift 25/26), HKUST

Abstract:

Generative Artificial Intelligence (AI), with its capacity to synthesize, simulate, and predict data, has made significant strides in understanding and generating one-dimensional natural language and two-dimensional images and videos. However, human intelligence is fundamentally grounded in the 3D world. While we capture 2D visual information with our eyes, our brain transforms it into rich, 3D mental representations, enabling us to perceive, understand, and interact with our 3D environment. This makes the development of 3D generative AI a crucial step toward achieving more human-like artificial intelligence.

In this talk, I will present my work on developing 3D generative AI systems to generate three-dimensional content from 2D observations. These systems demonstrate robust capabilities in perceiving and generating rich 3D representations of environments and leveraging them to understand, interact with, and reason about the 3D world. I will begin with differentiable scene representation, discussing how we learn generic scene priors from multi-view imagery using the inverse graphics pipeline. These representations and scene priors have inspired me to explore novel 3D generative paradigms as alternatives to conventional graphics workflows, enabling us to build 3D generative models from 2D images without direct supervision from 3D data. Finally, I will discuss how to leverage domain-specific knowledge from 3D modeling to facilitate 3D control over generated videos, advancing into 4D content generation.

Through this research, I aim to pave the way for more advanced spatial reasoning and decision-making in AI agents, bringing us closer to achieving human-like intelligence in 3D understanding and interaction.

***************

Biography:

Yinghao Xu is a postdoctoral researcher at the Stanford Computational Imaging Lab, Stanford University, advised by Prof. Gordon Wetzstein. Previously, he was a Ph.D. student at The Chinese University of Hong Kong. He has a deep interest in the intersection of Computer Graphics and Computer Vision. His current research focuses on generative models and neural rendering, particularly in the area of 3D generative models. One of his papers was nominated as a best paper candidate at CVPR 2020. He was also awarded the WAIC Rising Star 2024 and was nominated for the Snap Fellowship 2022.