Human-Centered Physical AI: Contextual Intelligence for Real-World Activities

Speaker: Mingzhe LI
Carnegie Mellon University

Title: Human-Centered Physical AI: Contextual Intelligence for Real-World Activities

Date: Monday, 26 January 2026

Time: 3:00pm - 4:00pm

Venue: Lecture Theater F (Leung Yat Sing Lecture Theater), near lift 25/26, HKUST

Abstract:

AI systems excel at digital tasks such as language generation and content creation, yet they often struggle in everyday physical activities that demand perception, temporal reasoning, and real-time interaction. Tasks like cooking, applying makeup, or navigating a museum require a contextual grammar of physical activity—capturing evolving object states, procedural structure, safety constraints, multimodal cues, and human goals. In this talk, I present a research pipeline for building this contextual grammar as a foundation for human-centered physical AI, motivated by accessibility and grounded in empirical studies with people with disabilities. Accessibility serves as a lens for revealing fundamental breakdowns in current AI systems and for identifying the representations and interactions needed for reliable physical assistance. I introduce object-status ontologies, spatial schemas, and temporal–causal models, and demonstrate them through systems such as OSCAR and AROMA, which combine multimodal perception, real-time inference, and assistive feedback for non-visual cooking and creative expression. Together, my research shows how accessibility-driven design can produce more robust, context-aware AI agents that benefit everyone operating in the complex, dynamic realities of everyday life.


Biography:

Franklin Mingzhe Li is a final-year Ph.D. candidate in the Human-Computer Interaction Institute at Carnegie Mellon University, advised by Prof. Patrick Carrington. He is an NSERC Postgraduate Fellow and a Stuart K. Card Fellow. His research bridges HCI, accessibility, and ubiquitous computing to close the cyber–physical gap in AI systems. Franklin studies how people engage in everyday physical activities, develops design frameworks and contextual grammars, and builds contextual AI systems that integrate vision, modeling, and interaction to support real-world activities. He has authored over 30 full papers at top HCI venues, including CHI, UIST, CSCW, IMWUT, and ASSETS; his work has received the Best Artifact Award at ASSETS and has been featured by IEEE Spectrum, Communications of the ACM, New Scientist, and Yahoo News. Franklin has also helped secure over USD 1.5 million in competitive grants and fellowships spanning academia, industry, and government.