Object Detection from Videos

The Hong Kong University of Science and Technology 
Department of Computer Science and Engineering 

Final Year Thesis Oral Presentation

Title: "Object Detection from Videos"

by

Mr. Rui PENG


Abstract:
 
Object detection from videos is an emerging area in very large scale 
visual recognition. While previous state-of-the-art image object detection 
algorithms can be directly employed in video object detection in a 
frame-by-frame manner to produce acceptable results, this straightforward 
solution fails to exploit the rich temporal information inherent in video 
input. In this thesis, I propose a comprehensive video object detection 
pipeline consisting of a number of flexible modules to produce end-to-end 
video object detection results. Working in tandem with Fast-RCNN on the 
VGG16 architecture, our pipeline achieves an mAP of 42.1% in the ILSVRC2015 
"Object Detection from Video" contest track, placing us fifth in one of 
the best-known worldwide big-data competitions in visual recognition. 
Along the direction of incorporating temporal information, 
we present the Fusion-Net, the main network architecture of our detector. 
Fusion-Net uses convolution layers (or an RNN) that operate on feature 
maps to fuse features along the temporal axis. Preliminary experiments 
show that it can raise the mAP by at least 0.5%.
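
The idea of fusing features along the temporal axis can be illustrated with a minimal sketch. It is not the Fusion-Net implementation: the abstract does not specify the layer configuration, so the function name `fuse_temporal`, the fixed kernel weights (standing in for learned ones), the flattened per-frame feature vectors, and the edge-replication padding are all assumptions made for illustration.

```python
from typing import List, Sequence


def fuse_temporal(
    frames: Sequence[Sequence[float]], kernel: Sequence[float]
) -> List[List[float]]:
    """Fuse per-frame features with a 1-D convolution along the time axis.

    frames: T flattened feature maps, all of the same length (hypothetical layout).
    kernel: odd-length weights, ordered from earliest to latest neighbour and
            centred on the current frame. A trained network would learn these;
            here they are fixed by the caller.
    The clip is padded by repeating its first and last frame.
    """
    if len(kernel) % 2 == 0:
        raise ValueError("kernel length must be odd")
    if not frames:
        return []

    half = len(kernel) // 2
    last = len(frames) - 1
    fused = []
    for t in range(len(frames)):
        out = [0.0] * len(frames[t])
        for k, weight in enumerate(kernel):
            # Clamp the neighbour index so edge frames reuse themselves.
            src = min(max(t + k - half, 0), last)
            for i, value in enumerate(frames[src]):
                out[i] += weight * value
        fused.append(out)
    return fused


# Three frames with two features each: the first feature changes over time,
# the second stays constant.
frames = [[0.0, 1.0], [3.0, 1.0], [6.0, 1.0]]
fused = fuse_temporal(frames, kernel=[0.25, 0.5, 0.25])
print(fused)
```

Each output frame is a weighted blend of itself and its temporal neighbours, so a feature that is constant across frames is left unchanged while a varying one is smoothed. A frame-by-frame detector corresponds to the degenerate kernel `[1.0]`, which uses no temporal context.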
  

Date                 : 9 May 2016 (Monday)

Time                 : 2:30pm to 3:30pm

Venue                : Room 5510 (lift 25/26)

Advisor              : Prof. C.K. TANG
                              
2nd Reader           : Dr. Pedro SANDER