AQX: Explaining Air Quality Forecast for Verifying Domain Knowledge using Feature Importance Visualization

MPhil Thesis Defence


Title: "AQX: Explaining Air Quality Forecast for Verifying Domain 
Knowledge using Feature Importance Visualization"

By

Miss Reshika PALANIYAPPAN VELUMANI


Abstract

Air pollution forecasting has become critical because of its direct impact on 
human health and the increased production of air pollutants caused by 
rapid industrialization. Machine learning (ML) solutions are being 
widely explored in this domain because of their potential to produce 
highly accurate results with access to historical data. However, experts 
in the environmental area are skeptical about adopting ML solutions in 
real-world applications and policy-making due to their black-box nature. 
In contrast, despite sometimes having low accuracy, the existing 
traditional simulation models (e.g., CMAQ) are widely used and follow 
well-defined and transparent equations. Therefore, presenting the 
knowledge learned by the ML model can make it transparent as well as 
comprehensible. In addition, validating the ML model's learning with the 
existing domain knowledge might aid in addressing experts' skepticism, 
building appropriate trust, and better utilizing ML models. In 
collaboration with three experts having an average of five years' research 
experience in the air pollution domain, we identified the contribution of 
features (e.g., meteorological features such as wind) towards the final 
forecast as the vital information to be verified with domain knowledge. In 
addition, a comparison of the ML model's performance with that of the 
traditional simulation model and a visualization of raw wind trajectories are essential 
for domain experts to validate the feature contribution information. We 
designed and developed AQX, a visual analytics system to help experts 
validate and verify the ML model's learning with their domain knowledge 
based on the identified information. The system includes coordinated 
multiple views to present the contributions of input features at different 
levels of aggregation in both temporal and spatial dimensions. It also 
provides a performance comparison of the ML and traditional models in terms 
of accuracy and spatial maps, along with an animation of raw wind 
trajectories for the input period. We further presented two case 
studies and conducted expert interviews with two domain experts to show 
the effectiveness and usefulness of AQX.


Date:  			Wednesday, 12 January 2022

Time:			10:30am - 12:30pm

Venue:			Room 3494
 			Lifts 25/26

Committee Members:	Prof. Huamin Qu (Supervisor)
 			Prof. Chiew-Lan Tai (Chairperson)
 			Dr. Xiaojuan Ma


**** ALL are Welcome ****