PhD Thesis Proposal Defence
Title: "Synthesizing Images and Videos from Large-scale Datasets"
by
Miss Mingming HE
Abstract:
In this digital era, the explosion of large-scale visual data has
inspired increasingly sophisticated algorithms to process, understand,
and augment these resources. In particular, visual data synthesis
techniques are in high demand in practice, as they make visual content
editing accessible to non-experts. However, even with the rapid
advancement of data-processing techniques, a number of problems in
visual synthesis remain unresolved due to the lack of specific domain
knowledge, the broad variety of target subjects, and the complexity of
human perception and visual data. In this thesis, we focus on
developing algorithms for synthesizing both static color effects and
dynamic motion behaviors, helping to create context-consistent and
photo-realistic visual content by leveraging these large-scale datasets.
First, we propose a novel algorithm to transfer photo color style from one
image to another based on semantically meaningful dense correspondence,
which is built with latent feature descriptors extracted from a
large-scale image dataset. To achieve accurate color transfer results that
respect the semantic relationship between image content, our algorithm
learns the latent features with deep neural networks to build the dense
correspondence and optimizes local linear color models to enforce both
local and global consistency. Our proposed approach jointly optimizes
semantic matching and color models in a coarse-to-fine manner.
Furthermore, it can be extended from "one-to-one" to "one-to-many"
color transfer, boosting matching reliability by introducing more
reference candidates.
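To illustrate the local linear color model idea, here is a minimal sketch (the function name and patch-based layout are illustrative assumptions; the actual method jointly optimizes these models with semantic matching in a coarse-to-fine manner, whereas this toy version assumes patches are already in dense correspondence):

```python
import numpy as np

def local_linear_transfer(source, reference, patch=32):
    """Toy per-patch linear color model: scale and shift each local
    patch of `source` so its mean and std match the corresponding
    patch of `reference` (assumed already in dense correspondence).
    The thesis method additionally enforces global consistency."""
    out = np.empty_like(source, dtype=np.float64)
    h, w = source.shape[:2]
    for y in range(0, h, patch):
        for x in range(0, w, patch):
            s = source[y:y+patch, x:x+patch].astype(np.float64)
            r = reference[y:y+patch, x:x+patch].astype(np.float64)
            a = r.std() / (s.std() + 1e-6)  # local gain
            b = r.mean() - a * s.mean()     # local offset
            out[y:y+patch, x:x+patch] = a * s + b
    return out
```

Because each patch gets its own gain and offset, the model adapts to spatially varying color styles while remaining smooth within each local region.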
However, for various synthesis applications including color transfer and
colorization, it is still challenging to handle image pairs involving
unrelated content elements, even with multiple references. Our next
work leverages deep neural networks to predict consistent chrominance
across the whole image, including mismatched elements, achieving
robust single-reference image colorization.
Specifically, we design a convolutional neural network to directly map a
grayscale image to the colorized image. Rather than using hand-crafted
rules as in traditional exemplar-based methods, our end-to-end
colorization network learns how to select, propagate, and predict colors
from the large-scale data. This approach performs robustly and generalizes
well even when using reference images that are unrelated to the input
grayscale image.
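For contrast, here is a toy version of the kind of hand-crafted exemplar-based rule that the end-to-end network replaces: match each grayscale pixel to the reference by luminance alone and copy its chrominance (ab) channels. All names here are illustrative; such pixel-wise matching is exactly what breaks down on unrelated references, motivating the learned approach:

```python
import numpy as np

def exemplar_colorize(gray, ref_lum, ref_ab):
    """Hand-crafted baseline: for each grayscale pixel, find the
    reference pixel with the closest luminance and copy its ab
    (chrominance) values. No semantic understanding is involved,
    so unrelated references produce inconsistent colors."""
    flat_gray = gray.reshape(-1, 1)                    # (N, 1)
    flat_ref = ref_lum.reshape(1, -1)                  # (1, M)
    idx = np.abs(flat_gray - flat_ref).argmin(axis=1)  # nearest luminance
    ab = ref_ab.reshape(-1, 2)[idx]                    # copy chrominance
    return ab.reshape(*gray.shape, 2)
```

A learned network can instead condition color decisions on semantic context rather than raw intensity, which is why it degrades gracefully when the reference is unrelated.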
Finally, besides synthesizing static images, we also explore video
synthesis techniques from large-scale captures to manipulate their
dynamism. We present an approach to create wide-angle, high-resolution
looping panoramic videos. Starting with a 2D grid of registered videos
acquired on a robotic mount, we formulate a combinatorial optimization to
determine for each output pixel the source video and looping parameters
that jointly maximize spatiotemporal consistency. Optimizing such a
large volume of video data is challenging. We accelerate the optimization by
reducing the set of source labels using a graph-coloring scheme, and
parallelize the computation and implement it out-of-core by partitioning
the domain along low-importance paths. These techniques can be combined to
create gigapixel-sized looping panoramas.
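A minimal sketch of the per-pixel term of this optimization (names are illustrative, and the full method also enforces spatial consistency between neighboring pixels and reduces labels via graph coloring): for one pixel's intensity time series, pick the loop start and period whose seam cost is smallest, i.e. the loop that restarts most seamlessly.

```python
def best_loop_params(pixel_series, min_period=2):
    """Brute-force search over (start, period) for one pixel,
    minimizing the loop-seam cost |V[s] - V[s + p]| -- a toy
    stand-in for the per-pixel temporal-consistency term of the
    combinatorial optimization."""
    T = len(pixel_series)
    best = (0, min_period, float("inf"))
    for p in range(min_period, T):
        for s in range(0, T - p):
            cost = abs(pixel_series[s] - pixel_series[s + p])
            if cost < best[2]:
                best = (s, p, cost)
    return best  # (start, period, seam_cost)
```

In the full problem this choice is coupled across pixels, so the search becomes a combinatorial optimization over the whole panorama; reducing the label set and partitioning the domain are what make it tractable at gigapixel scale.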
In our current work, we are exploring a factored analysis of video
textures in terms of appearance and temporal dynamics. This
factorization enables the generation of novel dynamic texture
instances with learned dynamic features, and enables color transfer of
video textures when combined with our color synthesis techniques. This
method will bring static photos to life and extend easily to large
panoramas.
Date: Thursday, 28 June 2018
Time: 3:00pm - 5:00pm
Venue: Room 5501
(lifts 25/26)
Committee Members: Dr. Pedro Sander (Supervisor)
Prof. Huamin Qu (Chairperson)
Prof. Long Quan
Prof. Chiew-Lan Tai
**** ALL are Welcome ****