Multimodal Commonsense Reasoning
Speaker:
Dr. Zhecan (James) Wang
UCLA
Title: Multimodal Commonsense Reasoning
Date: Monday, 24 March 2025
Time: 2:00pm - 3:00pm
Venue: Room 4475 (via lift 25/26), HKUST
Abstract:
My previous work has focused on enabling AI models to achieve human-level commonsense reasoning through two complementary avenues. The first enhances reasoning capabilities by extracting and integrating fine-grained, multimodal knowledge, emphasizing the acquisition of contextual information and its incorporation into complex reasoning processes. The second addresses model reliability from three perspectives: prediction consistency, transparent (explainable) reasoning steps, and faithful performance in biased or ambiguous scenarios. By leveraging such detailed, multimodal knowledge, AI models can improve their reasoning, robustness, and interpretability, thereby strengthening human trust and understanding in human-AI interactions. Building on these foundations, my future research will continue to advance more generalized and human-centered AI, exploring areas such as real-world learning, multimodal mathematical reasoning, security in reasoning, agent-based learning, embodied learning, interactive learning with human feedback, and AI for science, social good, and beyond.
Biography:
Zhecan (James) Wang is a Postdoctoral Research Fellow in UCLA's NLP group, working with Prof. Kai-Wei Chang and Prof. Nanyun Peng. He earned his Ph.D. in Computer Science from Columbia University under Prof. Shih-Fu Chang. His research spans Natural Language Processing, Vision-Language Understanding, Multimodal Reasoning, Neural-Symbolic Learning, and Trustworthy, Explainable, and Human-Centered AI. He has made significant contributions to DARPA's Machine Commonsense (MCS) and ECOLE programs, achieved state-of-the-art results on benchmarks such as VCR, VQA v2, and OKVQA, and won first place in the Microsoft Global MS-Celeb-1M Challenge and on the DARPA MCS Benchmark Leaderboard. His industry research experience includes Google DeepMind, Microsoft Research, the MIT Media Lab, Xpeng Motors, the NUS LV Lab, and Panasonic AI Lab. His contributions are reflected in 17 top-tier conference papers, 7 workshop papers, 8 AI-related patents, over 1,200 Google Scholar citations, and collaborations with 17 professors across 12 institutions, and his work has been featured by PaperWeekly, AI2, DARPA, 新智源, and 量子位.