Data-driven Sketch Analysis

PhD Thesis Proposal Defence


Title: "Data-driven Sketch Analysis"

by

Mr. Lei LI


Abstract:

Freehand sketching is a form of artistic expression frequently adopted in
human-human and human-computer communication. Interpreting the semantics
of sketches is an algorithmic challenge for machines. Humans commonly
introduce various levels of abstraction and distortion in their creations,
which cannot be easily accommodated by hand-crafted features or rules. The
recent availability of large-scale sketch datasets and 3D geometry datasets
opens up new opportunities to analyze sketches in a data-driven manner. In
this thesis, we draw on these two types of data and present a line of
data-driven techniques for semantic sketch analysis, including recognition
with vector inputs, segmentation with 3D geometry labeling transfer, and
reconstruction with 3D geometry templates. We also propose a robust local
multi-view descriptor for processing the 3D geometries collected online.

First, as a global analysis of sketches, we introduce an end-to-end network
architecture named Sketch-R2CNN for sketched object recognition. Existing
studies commonly cast the problem as an image recognition task by
rasterizing input sketches to pixel images. Instead, we propose to extract
descriptive features from the widely available vector sketch data with
recurrent neural networks (RNNs). We design a differentiable line
rasterization module that converts the vector sketches and their RNN
features into point feature maps. Subsequent convolutional neural networks (CNNs)
readily take the informative point feature maps as inputs for object
category prediction.
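The core idea of such a differentiable rasterizer can be illustrated with a minimal numpy sketch (an illustration under our own assumptions, not the thesis implementation): each vertex of a vector stroke carries a feature value (in Sketch-R2CNN, an RNN output), which is splatted onto a raster grid with bilinear weights. Because the weights vary smoothly with the point coordinates, gradients can flow back through the rasterization to both the features and the coordinates.

```python
import numpy as np

def splat_points(points, feats, size):
    """Bilinearly splat per-point feature values onto a size x size grid.

    Bilinear weights are smooth functions of the point coordinates,
    which is what makes this style of rasterization differentiable.
    """
    grid = np.zeros((size, size))
    for (x, y), f in zip(points, feats):
        gx, gy = x * (size - 1), y * (size - 1)   # continuous grid coords
        x0, y0 = int(np.floor(gx)), int(np.floor(gy))
        for dx in (0, 1):                         # 4 neighboring cells
            for dy in (0, 1):
                xi, yi = x0 + dx, y0 + dy
                if 0 <= xi < size and 0 <= yi < size:
                    w = (1 - abs(gx - xi)) * (1 - abs(gy - yi))
                    grid[yi, xi] += w * f
    return grid

# two stroke points with hypothetical per-point feature values
pts = np.array([[0.25, 0.25], [0.75, 0.5]])
feature_map = splat_points(pts, np.array([1.0, 2.0]), size=8)
```

Because each point's mass is spread over its four neighboring cells with weights summing to one, the total feature mass is preserved while remaining differentiable in the point positions.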

Second, as a step towards finer-level analysis, we introduce an efficient
learning-based segmentation method to identify semantic parts of sketched
objects. Due to the lack of sketch datasets with segmentation labels, we
resort to segmented 3D geometry datasets for synthesizing line drawings.
Our method, combining CNNs and multi-label graph cuts, can effectively
transfer segmentations from 3D geometries to freehand sketches.

Third, building on the above global and part-level analyses, we explore a
template-based method for freehand sketch reconstruction. We retrieve 3D
geometries from a large repository with part structures similar to input
sketches. Then the 3D geometries serve as proxies for lifting 2D sketches
to 3D, which is formulated as a quadratic energy minimization problem.
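As a toy illustration of such a quadratic formulation (the abstract does not spell out the exact energy terms): assigning a depth to each sketch point by balancing fidelity to proxy depths from a retrieved 3D template against smoothness along the stroke reduces to a small linear system.

```python
import numpy as np

def lift_depths(proxy_depth, lam=1.0):
    """Assign a depth z_i to each sketch point by minimizing
    sum_i (z_i - d_i)^2 + lam * sum_i (z_{i+1} - z_i)^2.
    The minimizer of this quadratic energy solves (I + lam*L) z = d,
    where L is the chain Laplacian."""
    n = len(proxy_depth)
    A = np.eye(n)                     # data term: identity
    for i in range(n - 1):
        # smoothness term couples consecutive points
        A[i, i] += lam; A[i + 1, i + 1] += lam
        A[i, i + 1] -= lam; A[i + 1, i] -= lam
    return np.linalg.solve(A, np.asarray(proxy_depth, float))

# hypothetical noisy proxy depths sampled from a retrieved 3D template
z = lift_depths([0.0, 1.0, 0.0, 1.0], lam=10.0)
```

With a large smoothness weight the solution contracts toward the mean depth while preserving it exactly; with `lam=0` the proxy depths are reproduced unchanged.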

Lastly, to aid the analysis of 3D geometries collected online, such as
orientation alignment or component grouping, we propose a robust
learning-based local multi-view descriptor. Extending the differentiable
rasterization idea of Sketch-R2CNN, we represent 3D local geometry as
multi-view images through a differentiable renderer within neural networks.
The rendering viewpoints thus become optimizable, rather than being fixed
by hand-crafted rules. A novel soft-view pooling module is developed to
adaptively integrate the convolutional features extracted from each view
image into a single compact descriptor.
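A softmax-weighted average over per-view features is one minimal way to realize such adaptive pooling; the sketch below is an assumption-laden stand-in (the view scores would be learned), not the proposed module itself.

```python
import numpy as np

def soft_view_pool(view_feats, scores):
    """Fuse per-view feature vectors into one descriptor with softmax
    weights, so more informative views contribute more.
    view_feats: (V, D) per-view features; scores: (V,) view scores."""
    w = np.exp(scores - scores.max())   # stable softmax
    w /= w.sum()
    return w @ view_feats               # (D,) weighted average

feats = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])  # 3 views, D = 2
desc = soft_view_pool(feats, np.array([0.0, 0.0, 0.0]))  # equal weights
```

Unlike max pooling, every view contributes, and the weighting adapts continuously: pushing one view's score up concentrates the descriptor on that view.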


Date:                   Wednesday, 22 April 2020

Time:                   9:30am - 11:30am

Zoom Meeting:           https://hkust.zoom.com.cn/j/641316405

Committee Members:      Prof. Chiew-Lan Tai (Supervisor)
                        Dr. Qifeng Chen (Chairperson)
                        Dr. Xiaojuan Ma
                        Dr. Pedro Sander


**** ALL are Welcome ****