PhD Thesis Proposal Defence
Title: "Synthesizing Images and Videos from Large-scale Datasets"
by
Miss Mingming HE
Abstract:
In this digital era, the explosion of large-scale visual data has
inspired increasingly sophisticated algorithms to process, understand,
and augment these resources. In particular, visual data synthesis
techniques are in high demand in practice, as they make visual content
editing accessible to non-experts. However, even with the rapid
advancement of data-processing techniques, a number of problems in
visual synthesis remain unresolved due to the lack of specific domain
knowledge, the broad variety of target subjects, and the complexity of
human perception and visual data. In this thesis, we focus on
developing algorithms for synthesizing both static color effects and
dynamic motion behaviors, helping to create context-consistent and
photo-realistic visual content by leveraging these large-scale datasets.
First, we propose a novel algorithm to transfer photo color style from one
image to another based on semantically meaningful dense correspondence,
which is built with latent feature descriptors extracted from a
large-scale image dataset. To achieve accurate color transfer results that
respect the semantic relationship between image content, our algorithm
learns the latent features with deep neural networks to build the dense
correspondence and optimizes local linear color models to enforce both
local and global consistency. Our proposed approach jointly optimizes
semantic matching and color models in a coarse-to-fine manner.
Furthermore, it can be extended from "one-to-one" to "one-to-many"
color transfer, boosting matching reliability by introducing more
reference candidates.
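To illustrate the local linear color model idea, here is a minimal sketch (the function name and patch-based layout are illustrative assumptions; the actual method jointly optimizes these models with semantic matching in a coarse-to-fine manner, whereas this toy version assumes patches are already in dense correspondence):

```python
import numpy as np

def local_linear_transfer(source, reference, patch=32):
    """Toy per-patch linear color model: scale and shift each local
    patch of `source` so its mean and std match the corresponding
    patch of `reference` (assumed already in dense correspondence).
    The thesis method additionally enforces global consistency."""
    out = np.empty_like(source, dtype=np.float64)
    h, w = source.shape[:2]
    for y in range(0, h, patch):
        for x in range(0, w, patch):
            s = source[y:y+patch, x:x+patch].astype(np.float64)
            r = reference[y:y+patch, x:x+patch].astype(np.float64)
            a = r.std() / (s.std() + 1e-6)  # local gain
            b = r.mean() - a * s.mean()     # local offset
            out[y:y+patch, x:x+patch] = a * s + b
    return out
```

Because each patch gets its own gain and offset, the model adapts to spatially varying color styles while remaining smooth within each local region.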
However, for various synthesis applications including color transfer and
colorization, it is still challenging to handle image pairs involving
unrelated content elements, even with multiple references. Our next
work leverages deep neural networks to predict consistent chrominance
across the whole image, including mismatched elements, achieving
robust single-reference image colorization.
Specifically, we design a convolutional neural network to directly map a
grayscale image to the colorized image. Rather than using hand-crafted
rules as in traditional exemplar-based methods, our end-to-end
colorization network learns how to select, propagate, and predict colors
from the large-scale data. This approach performs robustly and generalizes
well even when using reference images that are unrelated to the input
grayscale image.
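For contrast, here is a toy version of the kind of hand-crafted exemplar-based rule that the end-to-end network replaces: match each grayscale pixel to the reference by luminance alone and copy its chrominance (ab) channels. All names here are illustrative; such pixel-wise matching is exactly what breaks down on unrelated references, motivating the learned approach:

```python
import numpy as np

def exemplar_colorize(gray, ref_lum, ref_ab):
    """Hand-crafted baseline: for each grayscale pixel, find the
    reference pixel with the closest luminance and copy its ab
    (chrominance) values. No semantic understanding is involved,
    so unrelated references produce inconsistent colors."""
    flat_gray = gray.reshape(-1, 1)                    # (N, 1)
    flat_ref = ref_lum.reshape(1, -1)                  # (1, M)
    idx = np.abs(flat_gray - flat_ref).argmin(axis=1)  # nearest luminance
    ab = ref_ab.reshape(-1, 2)[idx]                    # copy chrominance
    return ab.reshape(*gray.shape, 2)
```

A learned network can instead condition color decisions on semantic context rather than raw intensity, which is why it degrades gracefully when the reference is unrelated.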
Finally, besides synthesizing static images, we also explore video
synthesis techniques from large-scale captures to manipulate their
dynamism. We present an approach to create wide-angle, high-resolution
looping panoramic videos. Starting with a 2D grid of registered videos
acquired on a robotic mount, we formulate a combinatorial optimization to
determine for each output pixel the source video and looping parameters
that jointly maximize spatiotemporal consistency. Optimizing such a
large volume of video data is challenging. We accelerate the optimization by
reducing the set of source labels using a graph-coloring scheme, and
parallelize the computation and implement it out-of-core by partitioning
the domain along low-importance paths. These techniques can be combined to
create gigapixel-sized looping panoramas.
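A minimal sketch of the per-pixel term of this optimization (names are illustrative, and the full method also enforces spatial consistency between neighboring pixels and reduces labels via graph coloring): for one pixel's intensity time series, pick the loop start and period whose seam cost is smallest, i.e. the loop that restarts most seamlessly.

```python
def best_loop_params(pixel_series, min_period=2):
    """Brute-force search over (start, period) for one pixel,
    minimizing the loop-seam cost |V[s] - V[s + p]| -- a toy
    stand-in for the per-pixel temporal-consistency term of the
    combinatorial optimization."""
    T = len(pixel_series)
    best = (0, min_period, float("inf"))
    for p in range(min_period, T):
        for s in range(0, T - p):
            cost = abs(pixel_series[s] - pixel_series[s + p])
            if cost < best[2]:
                best = (s, p, cost)
    return best  # (start, period, seam_cost)
```

In the full problem this choice is coupled across pixels, so the search becomes a combinatorial optimization over the whole panorama; reducing the label set and partitioning the domain are what make it tractable at gigapixel scale.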
In our current work, we are exploring a factored analysis of video
textures in terms of appearance and temporal dynamics. This
factorization enables the generation of novel dynamic texture
instances with learned dynamic features, and enables color transfer of
video textures when combined with our color synthesis techniques. This
method will bring static photos to life and extend easily to large
panoramas.
Date: Thursday, 28 June 2018
Time: 3:00pm - 5:00pm
Venue: Room 5501
(lifts 25/26)
Committee Members: Dr. Pedro Sander (Supervisor)
Prof. Huamin Qu (Chairperson)
Prof. Long Quan
Prof. Chiew-Lan Tai
**** ALL are Welcome ****