Visual Explanation of Black Box Algorithms

The Hong Kong University of Science and Technology
Department of Computer Science and Engineering

PhD Thesis Defence

Title: "Visual Explanation of Black Box Algorithms"


Miss Xun ZHAO


With the explosive growth of available multi-dimensional data, many 
machine learning and data mining algorithms have been developed to analyze 
and utilize these data. However, most of these algorithms are black boxes, 
which hinders users from understanding and trusting the decisions made by 
these algorithms. By taking advantage of humans' strong visual perception 
capability, visualization techniques can be utilized to facilitate the 
interpretation of these algorithms and their decisions. In this thesis, 
we propose several visualization techniques to tackle various black 
box algorithms.

In the first work, we focus on explaining skyline queries, which are widely 
applied to facilitate multi-criteria decision making. By automatically 
removing inferior candidates, skyline queries allow users to focus on a 
subset of superior data items (i.e., the skyline). However, users are still 
required to interpret and compare these superior items manually 
before making a final choice. We therefore propose SkyLens, a visual 
analytic system aiming at revealing the superiority of skyline points from 
different perspectives and at different scales to aid users in their 
decision making. Two usage scenarios and one user study are conducted to 
demonstrate the effectiveness of our system.
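
To make the notion of a skyline concrete, the following is a minimal 
sketch of a naive skyline computation. It is not taken from SkyLens; the 
function names, the "larger is better" convention, and the sample data are 
all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` is at least as good as `b` on every attribute
    and strictly better on at least one (assumption: larger is better)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def skyline(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point dominates (naive O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical candidates scored on two criteria, e.g. (quality, affordability).
candidates = [(9, 3), (7, 7), (3, 9), (6, 6), (2, 2)]
result = skyline(candidates)
print(result)
```

In this toy data, (6, 6) and (2, 2) are removed because (7, 7) is better 
on both criteria, while the three remaining points each trade one 
criterion against the other and so still have to be compared by the user.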

The second work studies the explanation of random forest algorithms. As an 
ensemble model that consists of many independent decision trees, random 
forests generate predictions by feeding the input to internal trees and 
summarizing their outputs. However, random forests suffer from poor 
model interpretability, which significantly hinders the model from being 
used in fields that require transparent and explainable predictions, such 
as medical diagnosis and financial fraud detection. To address this issue, 
we propose an interactive visualization system aiming at interpreting 
random forest models and predictions. We carried out two usage scenarios 
and one user study to evaluate the usefulness of the proposed technique.
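
The prediction step described above (feed the input to every tree, then 
summarize the outputs) can be sketched as a majority vote. This is an 
illustrative sketch only, not the system proposed in the thesis: the three 
hand-written "trees", their thresholds, and the feature layout are 
hypothetical, and the training of the trees is not shown.

```python
from collections import Counter
from typing import Callable, List, Sequence

# A "tree" here is any function mapping a feature vector to a class label.
Tree = Callable[[Sequence[float]], str]


def forest_predict(trees: List[Tree], x: Sequence[float]) -> str:
    """Feed `x` to every tree and return the most common label."""
    votes = Counter(tree(x) for tree in trees)
    return votes.most_common(1)[0][0]


# Hypothetical stand-ins for trained trees; x = (amount, transactions_per_hour).
trees: List[Tree] = [
    lambda x: "fraud" if x[0] > 1000 else "normal",
    lambda x: "fraud" if x[1] > 5 else "normal",
    lambda x: "fraud" if x[0] > 5000 else "normal",
]

print(forest_predict(trees, (2000, 8)))  # two of three trees vote "fraud"
print(forest_predict(trees, (200, 8)))   # two of three trees vote "normal"
```

Even in this small example the final label hides which trees voted which 
way and why, which hints at the interpretability problem the work 
addresses.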

The third work investigates the interpretation of outliers, the data 
instances that do not conform to normal patterns in a dataset. As 
different domains usually have different considerations about outliers, 
understanding the defining characteristics of outliers is essential for 
users to select and filter appropriate outliers based on their domain 
requirements. However, most existing work focuses on the efficiency and 
accuracy of outlier detection, while neglecting the importance of outlier 
interpretation. Hence, we propose a visual analytic system that helps 
users understand, interpret, and select the outliers detected by various 
algorithms. One usage scenario and one user study are carried out to 
evaluate the proposed solution.

Date:			Friday, 14 December 2018

Time:			2:00pm - 4:00pm

Venue:			Room 3598
 			Lifts 27/28

Chairman:		Prof. Sujata Visaria (ECON)

Committee Members:	Prof. Dik-Lun Lee (Supervisor)
 			Prof. Huamin Qu (Supervisor)
 			Prof. Xiaojuan Ma
 			Prof. Long Quan
 			Prof. Kai Tang (MAE)
 			Prof. Oliver Deussen (University of Konstanz)

**** ALL are Welcome ****