Attentive RSA: Accelerating All FCN-based Detection Methods with Higher Accuracy
Speaker: Yu Liu, Chinese University of Hong Kong

Title: "Attentive RSA: Accelerating All FCN-based Detection Methods with Higher Accuracy"

Date: Thursday, 19 October 2017
Time: 4:00pm - 5:00pm
Venue: Lecture Theater G (near lift 25/26), HKUST

Abstract:

The fully convolutional network (FCN) has dominated object detection for several years. It eliminates most redundant computation through its inherent ability to perform sliding-window search, and as a result, most recent state-of-the-art methods such as Faster R-CNN, SSD, YOLO and FPN use an FCN as their backbone. This raises a question: is there a fundamental method that accelerates the FCN itself, and thus all recent FCN-based methods?

Examining these pipelines reveals three speed bottlenecks: a) not every level of the image pyramid contains an object at a valid scale; b) the early (anterior) layers of an FCN require far more computation than the later (posterior) layers; c) a scale-friendly FCN needs a deep and wide structure to obtain both a large receptive field and good adaptivity across scales.

In this talk, I'd like to introduce our recent work, "Attentive RSA", which overcomes all of these difficulties. First, a scale and location attention module selects the valid levels of the image pyramid. Only the largest valid image is fed into the anterior part of the FCN, producing the largest feature map; this removes bottleneck (a). Next, instead of feeding each down-sampled image through the anterior part in turn, we approximate its feature map directly from the largest feature map with a recurrent scale approximation (RSA) unit, so most down-sampled images never pass through the anterior layers, eliminating bottleneck (b) (a toy code sketch of this idea follows the biography below). Finally, because multi-scale detection is handled by the image pyramid, the detector (FCN) only needs to distinguish foreground from background within a narrow scale range; unlike the general approach described in bottleneck (c), the network can therefore be designed in a very thin and shallow form. Experiments show that Attentive RSA accelerates the FCN by a factor of six while recalling 30% more previously missed faces on three face detection benchmarks, and it achieves first place on the FDDB benchmark.

*********************

Biography:

Yu Liu is a first-year PhD student at the Chinese University of Hong Kong, advised by Prof. Xiaogang Wang. Before that, he was a research intern at Microsoft Research Asia and SenseTime. His research interests include computer vision and machine learning, especially object detection and recognition. For more details, please visit http://liuyu.us/.
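To make the RSA idea from the abstract concrete, here is a minimal PyTorch-style sketch. Everything in it (the module names RSAUnit and build_feature_pyramid, the channel count, the stride-2 convolution, and the anterior/posterior split) is an illustrative assumption, not the speaker's actual architecture; it only shows how one might run the expensive anterior FCN once on the largest valid image and then recurrently approximate the feature maps of the smaller pyramid levels.

import torch
import torch.nn as nn

class RSAUnit(nn.Module):
    """Hypothetical recurrent scale approximation unit: maps the feature
    map of an image at scale s to an approximation of the feature map the
    anterior FCN would produce on the image at scale s/2."""
    def __init__(self, channels: int = 64):
        super().__init__()
        # Small conv block; the stride-2 convolution halves the spatial
        # resolution, mimicking the effect of a half-sized input image.
        self.reduce = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, stride=1, padding=1),
        )

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        return self.reduce(feat)

def build_feature_pyramid(anterior_fcn, rsa: RSAUnit,
                          image: torch.Tensor, num_scales: int = 4):
    """Run the expensive anterior FCN once on the largest valid image,
    then let the RSA unit recurrently approximate every smaller level,
    addressing bottlenecks (a) and (b) from the abstract."""
    feats = [anterior_fcn(image)]      # single full forward pass
    for _ in range(num_scales - 1):
        feats.append(rsa(feats[-1]))   # cheap recurrent approximation
    return feats

During training, the RSA output at each level would presumably be regressed (e.g. with an L2 loss) against the feature map the anterior FCN produces on the actual half-sized image, so the unit learns to mimic the FCN's response one scale down; the thin, shallow detector from the talk would then run on every level of the resulting pyramid.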