ACCURATE PROBABILITY ESTIMATES FROM LARGE-SCALE DATA IN THE APPLICATIONS OF DISPLAY ADVERTISING
MPhil Thesis Defence

Title: "ACCURATE PROBABILITY ESTIMATES FROM LARGE-SCALE DATA IN THE APPLICATIONS OF DISPLAY ADVERTISING"

By

Miss Liya JI

Abstract

Class membership probability estimates are important for many applications, especially Click-Through Rate (CTR) prediction in online advertising, in which classification outputs are combined with other sources, such as bid price, for decision-making. Existing calibration models can learn a mapping function from predicted probabilities to empirical CTRs and thus reduce the systematic bias (the difference between the average predicted and observed CTRs on some slices of data). Yet current methods have some theoretical issues, and the classifiers used in display advertising have some special properties.

In this thesis, in order to address those limitations, we propose a model called Calibration Trees (CT), applied as a post-processing step to calibrate the bias of predictions. CT is scalable to large-scale data and robust to extremely imbalanced data. The experimental results on two data sets from display advertising systems show that our model significantly outperforms the state-of-the-art calibration models in terms of accuracy and calibration quality. An advanced version of CT, called Calibration Forest, also allows implementation in a distributed system and further improves the performance of predictions.

Date: Tuesday, 5 May 2015

Time: 3:00pm - 5:00pm

Venue: Room 3501
Lifts 25/26

Committee Members:
Prof. Qiang Yang (Supervisor)
Dr. Raymond Wong (Chairperson)
Prof. Dit-Yan Yeung

**** ALL are Welcome ****