Synthesizing Images and Videos from Large-scale Datasets
PhD Thesis Proposal Defence

Title: "Synthesizing Images and Videos from Large-scale Datasets"

by

Miss Mingming HE

Abstract:

In this digital era, the explosion of large-scale visual data has inspired increasingly sophisticated algorithms to process, understand and augment these resources. In particular, visual data synthesis techniques are in high demand in practice, because they make visual content editing easy for non-experts. However, even with the rapid advancement of data-processing techniques, a number of problems in visual synthesis remain unresolved due to the lack of specific domain knowledge, the broad variety of target subjects, and the complexity of human perception and visual data. In this thesis, we focus on developing algorithms for synthesizing both static color effects and dynamic motion behaviors, to help create context-consistent and photo-realistic visual content by leveraging these large-scale datasets.

First, we propose a novel algorithm to transfer photo color style from one image to another based on semantically meaningful dense correspondence, which is built with latent feature descriptors extracted from a large-scale image dataset. To achieve accurate color transfer results that respect the semantic relationship between image content, our algorithm learns the latent features with deep neural networks to build the dense correspondence and optimizes local linear color models to enforce both local and global consistency. Our proposed approach jointly optimizes semantic matching and color models in a coarse-to-fine manner. Furthermore, it can be extended from "one-to-one" to "one-to-many" color transfer to boost the matching reliability by introducing more reference candidates. However, for various synthesis applications including color transfer and colorization, it is still challenging to handle image pairs involving unrelated content elements, even with multiple references.

Our next work takes advantage of deep neural networks to better predict consistent chrominance across the whole image, including mismatched elements, to achieve robust single-reference image colorization. Specifically, we design a convolutional neural network to directly map a grayscale image to the colorized image. Rather than using hand-crafted rules as in traditional exemplar-based methods, our end-to-end colorization network learns how to select, propagate, and predict colors from the large-scale data. This approach performs robustly and generalizes well even when using reference images that are unrelated to the input grayscale image.

Finally, besides synthesizing static images, we also explore video synthesis techniques from large-scale captures to manipulate their dynamism. We present an approach to create wide-angle, high-resolution looping panoramic videos. Starting with a 2D grid of registered videos acquired on a robotic mount, we formulate a combinatorial optimization to determine for each output pixel the source video and looping parameters that jointly maximize spatiotemporal consistency. Optimizing such a large volume of video data is challenging. We accelerate the optimization by reducing the set of source labels using a graph-coloring scheme, and we parallelize the computation and implement it out-of-core by partitioning the domain along low-importance paths. These techniques can be combined to create gigapixel-sized looping panoramas.

In our current work, we are exploring a factored analysis of video textures in terms of appearance and temporal dynamics. This factorization can enable generation of a novel dynamic texture instance with learned dynamic features, and enable color transfer of video textures by combining our color synthesis techniques. This method will bring static photos to life and easily extend to large panoramas.

Date: Thursday, 28 June 2018

Time: 3:00pm - 5:00pm

Venue: Room 5501 (lifts 25/26)

Committee Members:
Dr. Pedro Sander (Supervisor)
Prof. Huamin Qu (Chairperson)
Prof. Long Quan
Prof. Chiew-Lan Tai

**** ALL are Welcome ****