A SPATIAL-TEMPORAL MODEL FOR AIR QUALITY PREDICTION IN HONG KONG USING AN ATTENTION BASED ENCODER-DECODER AND CNN ARCHITECTURE WITH A CORRESPONDING VISUALIZATION EFFORT

MPhil Thesis Defence


Title: "A SPATIAL-TEMPORAL MODEL FOR AIR QUALITY PREDICTION IN HONG KONG USING 
AN ATTENTION BASED ENCODER-DECODER AND CNN ARCHITECTURE WITH A CORRESPONDING 
VISUALIZATION EFFORT"

By

Mr. Yihong Fanise ZHOU


Abstract

This work presents a deep learning model for predicting air pollution in Hong 
Kong, together with a corresponding visualization system. Rapid urbanization 
and increasing urban density have led to a serious air pollution problem, so 
air pollution prediction has become necessary to help people protect their 
health. In this study, we introduce a spatial-temporal deep learning model 
based on an attention-assisted encoder-decoder and a 1D CNN architecture. It 
predicts the concentrations of five target pollutants (O3, NO2, SO2, PM2.5, 
and PM10) over the next 12 hours, using the past 24 hours of data from the air 
quality monitoring stations as input.
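
The 24-hour input and 12-hour output framing can be illustrated with a minimal 
sketch of how an hourly series might be cut into training pairs. The function 
name make_windows and the sample readings are hypothetical and are not taken 
from the thesis.

```python
def make_windows(series, n_in=24, n_out=12):
    """Split an hourly series into (past n_in hours, next n_out hours) pairs."""
    pairs = []
    last_start = len(series) - n_in - n_out
    # If the series is shorter than n_in + n_out, the range is empty.
    for start in range(last_start + 1):
        past = series[start:start + n_in]
        future = series[start + n_in:start + n_in + n_out]
        pairs.append((past, future))
    return pairs


# Hypothetical hourly readings for one pollutant at one station (48 hours).
readings = [float(h) for h in range(48)]
windows = make_windows(readings)
```

Each pair holds 24 consecutive hourly values as the model input and the 
following 12 values as the prediction target.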

The temporal model is built on an encoder-decoder architecture, which is well 
suited to time-series data. Long Short-Term Memory (LSTM), an advanced 
Recurrent Neural Network (RNN), serves as the stacked unit in both the encoder 
and the decoder to extract temporal relations, and an attention mechanism 
strengthens the correlation between the encoder and decoder to improve 
prediction accuracy.
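
How an attention mechanism links a decoder step to the encoder outputs can be 
sketched as follows. The abstract does not say which attention variant is 
used, so this example assumes simple dot-product attention. The function name 
attention_context and the toy hidden states are hypothetical.

```python
import math


def attention_context(decoder_state, encoder_states):
    """Dot-product attention: weight encoder states by similarity to the decoder state."""
    scores = [sum(d * e for d, e in zip(decoder_state, h)) for h in encoder_states]
    # Numerically stable softmax over the encoder time steps.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [x / total for x in exps]
    # Context vector: attention-weighted sum of the encoder hidden states.
    dim = len(encoder_states[0])
    context = [sum(w * h[i] for w, h in zip(weights, encoder_states))
               for i in range(dim)]
    return weights, context


# Hypothetical 2-dimensional hidden states for three encoder time steps.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [2.0, 2.0]
weights, context = attention_context(decoder_state, encoder_states)
```

The encoder step most similar to the current decoder state receives the 
largest weight, so it contributes most to the context vector passed to the 
decoder.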

The spatial model is built on a 1D CNN, which is well suited to image-like 
data. It is composed of a convolutional layer, a pooling layer, and a fully 
connected layer. It extracts spatial features to infer fine-grained air 
quality from sparse data, namely the predictions of the temporal model and 
spatial information such as points of interest (POI).
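
The three layer types named above can be sketched in plain Python as a forward 
pass. This is only an illustration under stated assumptions: the feature 
vector, kernel, weights, and bias are hypothetical fixed values rather than 
trained parameters, and the thesis model may differ in layer sizes and 
activation.

```python
def conv1d(signal, kernel):
    """Valid 1D convolution (cross-correlation, as in most deep learning libraries)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def relu(values):
    """Element-wise rectified linear activation."""
    return [max(0.0, v) for v in values]


def max_pool1d(values, size=2):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]


def dense(values, weights, bias):
    """Fully connected layer with a single output unit."""
    return sum(v * w for v, w in zip(values, weights)) + bias


# Hypothetical feature vector for one grid cell (e.g. nearby station
# predictions followed by POI counts); the values are made up.
features = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0]
kernel = [0.5, 0.5]
hidden = max_pool1d(relu(conv1d(features, kernel)))
estimate = dense(hidden, weights=[1.0, -1.0], bias=0.5)
```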

The hyperparameters of the model were adjusted by gradient descent during 
training. The index of agreement (IOA) was used as the accuracy indicator to 
evaluate model performance. The combined spatial-temporal model takes hourly 
air quality data from 16 air quality monitoring stations in Hong Kong as 
input. Given the past 24 hours of data, the model predicts the five target 
pollutants for the next 12 hours. Prediction accuracy is good for every 
pollutant at an arbitrary point on a grid mesh with 1 km spacing, and the 
highest IOA reaches 0.98.
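
For readers unfamiliar with the IOA, the sketch below computes it in the form 
commonly attributed to Willmott: one minus the ratio of the squared error to 
the potential error around the observed mean. The abstract does not state 
which IOA variant is used, so this choice is an assumption, and the sample 
values are hypothetical.

```python
def index_of_agreement(observed, predicted):
    """Willmott-style index of agreement; 1.0 means perfect agreement."""
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    potential_error = sum((abs(p - mean_obs) + abs(o - mean_obs)) ** 2
                          for o, p in zip(observed, predicted))
    if potential_error == 0:
        # Every observation and prediction equals the observed mean,
        # so the prediction matches the observations exactly.
        return 1.0
    return 1.0 - squared_error / potential_error


# Hypothetical hourly concentrations, not data from the thesis.
observed = [10.0, 20.0, 30.0]
predicted = [12.0, 18.0, 33.0]
ioa = index_of_agreement(observed, predicted)
perfect = index_of_agreement(observed, observed)
```

The index lies between 0 and 1, and values close to 1 indicate that the 
predictions track the observations closely.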

A visualization system was also built to provide a user-friendly interface 
for both general users and domain experts. It generates a 2D map of the input 
parameters and output predictions, displays feature information with suitable 
circles and colors, and provides tools for comparison and analysis. General 
users can find the visualized information they are interested in, and domain 
experts can use the visualization tools for identification and analysis. A 
visual labeling system was also introduced so that domain experts in the 
environmental field can label errors easily through a user-friendly 
interface. The labeled data can be used selectively in the training process, 
which helps refine the processing and improve the predictions of the machine 
learning model.


Date:  			Monday, 21 March 2022

Time:			3:00pm - 5:00pm

Zoom Meeting:
https://hkust.zoom.us/j/96255779657?pwd=R2lHUm9NeitRb3JPdmd1R3ErQzNUUT09

Committee Members:	Prof. Huamin Qu (Supervisor)
 			Prof. Cunsheng Ding (Chairperson)
 			Prof. Ke Yi


**** ALL are Welcome ****