Reinforcement Learning Simulation for Web Application Deception Strategies
The Hong Kong University of Science and Technology
Department of Computer Science and Engineering

MPhil Thesis Defence

Title: "Reinforcement Learning Simulation for Web Application Deception Strategies"

By

Mr. Andrei KVASOV

Abstract:

Web applications, as the public-facing components of information systems, are constantly under attack. One defense mechanism is deception, which introduces deceptive components into the application to divert attackers from viable attack paths and to alert the system to their behavior. Despite previous partial explorations of this idea, widespread adoption in industry remains limited, due in part to the absence of a comprehensive end-to-end solution. Moreover, assessing the effectiveness of a deception strategy is challenging, as it requires human experiments, which are both costly and impractical to run for every individual web application scenario. Our research therefore addresses this issue by proposing a methodology to simulate attacks on web applications that employ deceptive strategies. Leveraging deep reinforcement learning (RL), specifically the Double Deep Q-Network (DDQN) method, we train an attacker model with the objective of successfully infiltrating a simulated web application, allowing us to comprehensively evaluate the efficacy of defensive measures. Within this framework, known as CyberBattleSim, we construct web application components and conduct a series of experiments to address the following research questions:

(1) How does the presence of deceptive elements affect the attacker's detection time, that is, the frequency of triggers and the number of steps the attacker agent takes before any honeytoken is triggered?
(2) How does the effect of detection points on defensive capabilities change as the number of deception elements increases?
(3) Which types of honeytokens and detection points prove most effective in reducing detection time, and what factors contribute to their efficacy?

Our empirical findings reveal that even without incorporating explicit cyberattack concepts, the RL agent successfully attains its objective and learns the optimal attack vector. Moreover, more heavily protected, complex environments impede the agent's learning, revealing the different impact each honeytoken has on the success of the attack. With our confined but flexible high-level approach to replicating web application attacks, the simulator can generalize and scale to even more sophisticated, realistic examples.

Date: Wednesday, 23 August 2023
Time: 10:00am - 12:00noon
Venue: Room 3494 (lifts 25/26)

Committee Members:
Prof. Huamin Qu (Supervisor)
Prof. Nevin Zhang (Chairperson)
Dr. Shuai Wang

**** ALL are Welcome ****