Automatic Spreadsheet Cell Clustering and Smell Detection using Strong and Weak Features
MPhil Thesis Defence

Title: "Automatic Spreadsheet Cell Clustering and Smell Detection using Strong and Weak Features"

By

Miss Wanjun CHEN

Abstract

Spreadsheets are error-prone. Although various techniques have been proposed to detect errors in the form of smells, they suffer from two issues. First, they cannot characterize and detect smells uniformly: each technique targets specific smell types and fails to leverage information derived by previous work to improve detection accuracy. Second, smells are often detected as violations of pre-defined rules, so these techniques fail to adapt to diverse user practices.

In this thesis, we propose to derive cell clusters automatically using a two-stage technique based on strong and weak features that capture different user practices. Smells can then be detected as outliers of these clusters in feature space. We implemented our technique and applied it to 70 spreadsheet files randomly sampled from the EUSES Corpus. Experimental results show that our technique is effective at clustering cells and capable of detecting multiple types of smells, with a precision of 0.73, recall of 0.61, and F-measure of 0.67, compared with 0.59, 0.51, and 0.55 respectively for existing work.

Date: Thursday, 23 July 2015

Time: 10:00am - 12:00noon

Venue: Room 2132C, Lift 19

Committee Members:
Prof. Shing-Chi Cheung (Supervisor)
Dr. Sunghun Kim (Chairperson)
Dr. Qiong Luo

**** ALL are Welcome ****