Measuring and Modulating the Social Impact of Generative AI
The Hong Kong University of Science and Technology
Department of Computer Science and Engineering

PhD Thesis Defence

Title: "Measuring and Modulating the Social Impact of Generative AI"

By

Miss Yueqi XIE

Abstract:

Generative AI, especially Large Language Models (LLMs), has seen widespread adoption, bringing with it far-reaching social impacts. Measuring and modulating these impacts are critical steps toward socially responsible AI development and informed policy-making. While the AI community and the social science community have developed well-established instruments within their respective traditional domains, integrated approaches for understanding and modulating the emerging social effects of generative AI remain lacking. This thesis approaches the problem from two complementary fronts: (1) evaluating and improving generative AI models, and (2) analyzing human-AI interactions, with a focus on domains of significant societal relevance.

In the first part, we focus on identifying and mitigating critical safety risks associated with state-of-the-art LLMs. We conduct a systematic evaluation of jailbreak attacks: adversarial prompts designed to bypass ethical safeguards and elicit harmful outputs. Inspired by the psychological concept of self-reminders, we propose a simple yet effective defense mechanism called System-Mode Self-Reminder, which helps LLMs maintain alignment with their intended behavior at negligible cost. To further understand the safety mechanisms inherent in LLMs and to provide more robust, multi-layered protection, we study the internal parameters of LLMs. We observe that unsafe prompts trigger distinctive patterns in safety-critical parameters. Leveraging this observation, we introduce GradSafe, a novel detection method that accurately and reliably identifies jailbreak attempts without requiring additional model fine-tuning.
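To illustrate the self-reminder idea at a high level: the defense wraps the user's query between system-mode reminders that encourage the model to respond responsibly. The sketch below is a toy illustration only (the function name and reminder wording are the writer's, not the thesis's exact prompts):

```python
def wrap_with_self_reminder(user_query: str) -> str:
    """Toy sketch of a self-reminder wrapper: the user query is enclosed
    between system-mode reminders before being sent to the LLM, nudging
    the model to stay aligned with responsible behavior."""
    prefix = ("You should be a responsible AI assistant and should not "
              "generate harmful or misleading content.")
    suffix = ("Remember, you should be a responsible AI assistant and "
              "should not generate harmful or misleading content!")
    return f"{prefix}\n{user_query}\n{suffix}"

# The wrapped prompt, rather than the raw query, is passed to the model.
prompt = wrap_with_self_reminder("Summarize this article for me.")
```

Because the defense operates purely at the prompt level, it requires no model fine-tuning, which is the sense in which its cost is negligible.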
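The GradSafe-style detection can be sketched as follows, under the assumption (the writer's simplification, not the thesis's exact estimator) that each prompt is reduced to a gradient vector over safety-critical parameters and compared, by cosine similarity, against a reference gradient derived from known unsafe prompts:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def flags_as_jailbreak(prompt_grad, unsafe_ref_grad, threshold=0.9):
    """Sketch of the detection step: a prompt whose gradient pattern on
    safety-critical parameters closely matches the reference pattern of
    known unsafe prompts is flagged as a likely jailbreak attempt.
    The threshold value here is illustrative."""
    return cosine_similarity(prompt_grad, unsafe_ref_grad) >= threshold

# Toy 3-dimensional "gradient slices" (real ones would come from the
# LLM's safety-critical parameters, not hand-written vectors):
unsafe_ref = [1.0, 0.5, -0.2]
print(flags_as_jailbreak([2.0, 1.0, -0.4], unsafe_ref))   # parallel pattern
print(flags_as_jailbreak([-0.5, 1.0, 0.0], unsafe_ref))   # dissimilar pattern
```

Since the comparison uses only gradients of the frozen model, detection requires no additional fine-tuning, matching the claim in the abstract.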
In the second part, we turn to the challenge of analyzing human-AI interactions as a foundation for understanding how generative AI shapes content creation as a whole. The widespread application of generative AI in content creation presents notable challenges for delineating the originality of AI-assisted content. We raise the research question of how to quantify human contribution in AI-assisted content generation, moving beyond the binary detection of AI-generated output. We propose an information-theoretic framework that quantifies human contribution and demonstrate its effectiveness across diverse domains.

Altogether, this thesis aims to contribute to this emerging field by identifying critical issues, developing measurement instruments, and proposing actionable strategies for socially responsible generative AI.

Date: Thursday, 31 July 2025
Time: 10:00am - 12:00noon
Venue: Room 3494 (Lifts 25/26)

Chairman: Prof. Ross MURCH (ECE)
Committee Members: Dr. Qifeng CHEN (Supervisor)
Dr. Shuai WANG
Dr. Binhang YUAN
Dr. Jun ZHANG (ECE)
Prof. Haibo HU (PolyU)
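One way to make the information-theoretic framing concrete: treat the human contribution as the information the human input carries about the AI output, as a fraction of the output's total information content. The sketch below is the writer's toy illustration of that ratio, not the thesis's exact estimator; the inputs are assumed to be log-probabilities of the same output, with and without conditioning on the human-provided input:

```python
def human_contribution(logp_output_given_input, logp_output_unconditional):
    """Toy sketch of an information-theoretic contribution score.

    logp_output_given_input: log-probability of the AI output conditioned
        on the human input (natural log, <= 0).
    logp_output_unconditional: log-probability of the same output with no
        human input (natural log, <= 0).

    The score is the information gained from the human input, normalized
    by the output's self-information, clipped to [0, 1]."""
    self_info = -logp_output_unconditional                      # total information in output
    info_from_human = logp_output_given_input - logp_output_unconditional
    return max(0.0, min(1.0, info_from_human / self_info))

# Example: conditioning on the human input raises the output's
# log-probability from -50 to -10, so 40 of the 50 nats are attributed
# to the human input, giving a contribution score of 0.8.
score = human_contribution(-10.0, -50.0)
```

A score near 1 indicates the output is largely determined by the human input; a score near 0 indicates the model would likely have produced it anyway, which is the sense in which this moves beyond binary AI-generated detection.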