Large Scale Machine Learning for Multi-Label and Multi-Modal Applications

PhD Thesis Proposal Defence


Title: "Large Scale Machine Learning for Multi-Label and Multi-Modal 
Applications"

by

Miss Elham JEBALBAREZI SARBIJAN


Abstract:

Large-scale machine learning deals with learning problems on data sets that 
are too large or complex for traditional methods. In this work, we address 
extreme classification problems, which involve a very large number of 
labels, and multimodal data integration for prediction tasks.

Extreme classification is a classification task with an extremely large 
number of labels (tags). User-generated labels for any type of online data 
can be sparse for an individual user yet intractably large across all 
users. For example, in web and document categorization, image semantic 
analysis, protein function detection, and social network analysis, multiple 
outputs must be predicted simultaneously. In these problems, modeling 
dependencies among the output labels improves the predictions, yet many 
existing algorithms do not adequately address multi-label classification 
with label dependencies and a large number of labels. In this research, we 
investigate multi-label classification with label dependencies and many 
labels, which allows us to efficiently solve multi-label learning problems 
with an intractably large number of interdependent labels, such as 
automatic tagging of Wikipedia pages.
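
To illustrate one standard way of modeling label dependencies, the Python 
sketch below (assuming scikit-learn; it is not the method developed in this 
work) contrasts independent per-label classifiers with a classifier chain, 
which feeds earlier label predictions into later classifiers:

    # Minimal sketch: independent per-label classifiers vs. a classifier
    # chain that conditions each label on previously predicted labels.
    from sklearn.datasets import make_multilabel_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.multioutput import ClassifierChain, MultiOutputClassifier

    # Synthetic multi-label data: 10 interdependent labels.
    X, Y = make_multilabel_classification(n_samples=2000, n_features=50,
                                          n_classes=10, random_state=0)
    X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, random_state=0)

    independent = MultiOutputClassifier(LogisticRegression(max_iter=1000))
    chain = ClassifierChain(LogisticRegression(max_iter=1000),
                            order='random', random_state=0)

    for name, model in [('independent', independent), ('chain', chain)]:
        model.fit(X_tr, Y_tr)
        # score() here is subset accuracy (exact match of all 10 labels).
        print(name, model.score(X_te, Y_te))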

In this research, our objective is an efficient approach to multi-label 
classification that accounts for both output-space dependencies and large 
scale. We make several contributions to large-scale multi-label learning. 
First, we have studied the nature of label dependencies and the efficiency 
of distributed multi-label learning methods. Then, we have proposed an 
assumption-free label sampling approach to handle huge numbers of labels. 
Finally, we have investigated and compared chain-ordered and order-free 
label dependency learning methods for multi-label datasets.
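
The following rough sketch conveys the general idea behind label sampling 
for extreme multi-label learning: train classifiers only on a sampled 
subset of the label columns, so per-round cost does not grow with the full 
label set. The uniform sampling scheme, names, and toy data here are 
illustrative assumptions, not the proposed assumption-free method itself:

    # Illustrative label sampling (hypothetical procedure, not the
    # thesis method): fit binary classifiers for a random subset of
    # labels; repeated rounds can be aggregated to cover all labels.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def fit_sampled_labels(X, Y, n_sampled, rng):
        """Fit one binary classifier per uniformly sampled label column."""
        label_ids = rng.choice(Y.shape[1], size=n_sampled, replace=False)
        models = {}
        for j in label_ids:
            yj = Y[:, j]
            if yj.min() == yj.max():   # skip labels absent from the sample
                continue
            models[j] = LogisticRegression(max_iter=1000).fit(X, yj)
        return models

    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 20))                         # toy features
    Y = (rng.random(size=(500, 5000)) < 0.01).astype(int)  # 5000 sparse labels
    models = fit_sampled_labels(X, Y, n_sampled=100, rng=rng)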

In the second part of our investigation of large-scale challenges, we turn 
to the complexities of multimodal learning, since most learning tasks 
around us involve several sensory modalities, such as vision or touch, 
which represent our primary channels of communication and sensation. 
Multimodal data poses various challenges; among them, we concentrate on 
multimodal fusion, which integrates information from two or more modalities 
to perform a prediction.

Our aim is to understand and modulate the relative contribution of each 
modality in multimodal inference tasks. Moreover, we concentrate on the 
curse of dimensionality that arises when integrating data from several 
sources, and we propose solutions for it. We make several contributions to 
multimodal data processing: First, we have investigated various basic 
fusion methods with an application to personality recognition tasks. In 
contrast to previous approaches, which use simple linear or concatenation 
methods, we propose to generate an $(M + 1)$-way high-order dependency 
structure (tensor) to capture the high-order relationships between $M$ 
modalities and the output layer of a neural network model. Applying a 
modality-based tensor factorization method, which adopts different factors 
for different modalities, removes information in a modality that can be 
compensated for by the other modalities, with respect to the model outputs. 
This helps us understand the relative utility of the information in each 
modality and handle the scale of the problem. In addition, it leads to a 
simpler model with fewer parameters and can therefore act as a regularizer 
that avoids overfitting.
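
As a concrete illustration for $M = 2$, the NumPy sketch below (dimensions, 
names, and the particular low-rank form are illustrative assumptions, not 
the thesis code) shows explicit outer-product fusion next to a 
modality-based low-rank factorization that never materializes the full 
tensor:

    import numpy as np

    rng = np.random.default_rng(0)
    d_a, d_b, R = 32, 16, 4             # modality dims and rank (assumed)

    a = rng.normal(size=d_a)            # e.g. a text embedding
    b = rng.normal(size=d_b)            # e.g. an audio embedding
    za = np.append(a, 1.0)              # appended 1 keeps unimodal terms
    zb = np.append(b, 1.0)

    # Full bimodal fusion: an explicit outer-product tensor plus a linear
    # readout -- O(d_a * d_b) parameters per output unit.
    W = rng.normal(size=(d_a + 1, d_b + 1))
    y_full = np.sum(W * np.outer(za, zb))

    # Modality-based low-rank factorization: W ~ sum_r U[:, r] (x) V[:, r]
    # with separate factors U, V per modality. The fused score becomes a
    # sum of products of per-modality projections, so the full tensor is
    # never built -- O((d_a + d_b) * R) parameters instead.
    U = rng.normal(size=(d_a + 1, R))
    V = rng.normal(size=(d_b + 1, R))
    y_lowrank = np.sum((za @ U) * (zb @ V))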


Date:			Wednesday, 3 June 2020

Time:                  	10:00am - 12:00noon

Zoom Meeting:		https://hkust.zoom.us/j/96784264870

Committee Members:	Prof. Pascale Fung (Supervisor, ECE)
  			Dr. Qifeng Chen (Chairperson)
 			Dr. Ming Liu
 			Prof. Tong Zhang


**** ALL are Welcome ****