A Deep Learning Approach for Video Object Detection
The Hong Kong University of Science and Technology
Department of Computer Science and Engineering

Final Year Thesis Oral Presentation

Title: "A Deep Learning Approach for Video Object Detection"

by

Mr. Hengyuan HU

Abstract:

Object detection is one of the most fundamental problems in computer vision. For the past several decades, object detection research has focused mainly on single images, owing to limited computational power and the lack of well-labeled video datasets. The landscape has since changed, notably in the second half of 2015, when the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) introduced a dataset for object detection from video. With the ample data generously offered by the ILSVRC organizers, a mature deep learning framework and implementation (Caffe), and advanced GPU acceleration technology, it is an opportune time to intensify research on object detection from video, particularly using state-of-the-art convolutional neural networks and related deep learning methods.

In this final year thesis, I studied this problem in two stages. In the first stage, I tailored an existing image object detection architecture, Fast R-CNN, to improve its performance on the video dataset. With the experience gained in the first stage, I devised several new architectures in the second stage, which are engineered to detect objects in video by taking into account the temporal information inherent in the data.

Date: 9 May 2016 (Monday)
Time: 3:30pm to 4:30pm
Venue: Room 5510 (lift 25/26)

Advisor: Prof. C.K. TANG
2nd Reader: Prof. Long QUAN