Exploring Dependencies in Complex Input and Complex Output Machine Learning Problems
The Hong Kong University of Science and Technology
Department of Computer Science and Engineering

PhD Thesis Defence

Title: "Exploring Dependencies in Complex Input and Complex Output Machine Learning Problems"

By

Miss Elham JEBALBAREZI SARBIJAN

Abstract

Multi-input and multi-output machine learning are among the chief challenges in the era of big data (the "variety" dimension of the data). Such datasets are too large and too complex to be handled by traditional machine learning methods, and new solutions must be found. In this thesis, we investigate the effect of dependencies between multiple inputs and multiple outputs, and we show that these dependencies help to solve the problems more accurately and less expensively, with fewer parameters. As case studies, we choose prediction tasks on multi-label data (each label corresponds to an output task) and multimodal data (each modality corresponds to an input channel).

Multi-label learning is an example of extreme classification over an extremely large number of labels (tags). User-generated labels for any type of online data can be sparse for an individual user but intractably large across all users. For example, in web and document categorization, image semantic analysis, protein function detection, and social network analysis, multiple outputs must be predicted simultaneously, and modelling output label dependencies improves the predictions. Many existing algorithms do not adequately address multi-label classification with label dependencies and a large number of labels. In this thesis, we investigate multi-label classification with dependencies between many labels, so that we can efficiently solve problems with an intractably large number of interdependent labels, such as the automatic tagging of Wikipedia pages. We have studied the nature of label dependencies and the efficiency of distributed multi-label learning methods. We have then proposed an assumption-free label sampling approach to handle a huge number of labels. Finally, we have investigated and compared chain-ordered and order-free label dependency learning methods on multi-label datasets.

In the second part of our investigation of dependencies, we study the complexities of multimodal learning, as most learning tasks involve several sensory modalities, such as vision and speech, which represent our primary channels of communication and perception. We focus on how to exploit modality dependencies for multimodal fusion, integrating information from two or more modalities for better prediction. Our aim is to understand and modulate the relative contribution of each modality in multimodal inference tasks by investigating input modality dependencies. Moreover, we propose solutions to the curse of dimensionality that arises from the high-order integration of data from several sources. We make several contributions to multimodal data processing. First, we have investigated various basic fusion methods. In contrast to previous approaches, which use simple linear or concatenation-based fusion, we propose to generate an $(M+1)$-way high-order dependency structure (a tensor) that captures the high-order relationships between $M$ modalities and the output layer of a neural network model, as sketched below.
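To make the high-order fusion concrete, the following is a minimal sketch (in PyTorch) for $M = 2$ modalities, showing both the explicit $(M+1)$-way outer-product tensor and a low-rank, per-modality factorization of the kind discussed next. All names, dimensions, and the rank here are illustrative assumptions, not the thesis implementation.

```python
import torch

D_TEXT, D_AUDIO, D_OUT = 32, 16, 8  # toy dimensions (assumptions)

def fuse_outer(text_h, audio_h):
    """Explicit (M+1)-way interaction structure for M = 2 modalities.

    A constant 1 is appended to each modality vector so the outer
    product contains unimodal as well as bimodal interaction terms.
    """
    t = torch.cat([text_h, torch.ones(1)])   # (D_TEXT + 1,)
    a = torch.cat([audio_h, torch.ones(1)])  # (D_AUDIO + 1,)
    return torch.einsum('i,j->ij', t, a)     # (D_TEXT + 1, D_AUDIO + 1)

# A full-rank linear predictor on this structure needs a weight tensor of
# shape (D_TEXT + 1, D_AUDIO + 1, D_OUT): 33 * 17 * 8 = 4488 parameters,
# and the size grows polynomially as more modalities are added.

# Low-rank, modality-based factorization: one factor matrix per modality
# plus one for the output, so the full tensor is never materialized.
R = 4  # rank of the factorization (a hypothetical hyperparameter)
W_text = torch.randn(R, D_TEXT + 1)
W_audio = torch.randn(R, D_AUDIO + 1)
W_out = torch.randn(R, D_OUT)
# 4 * (33 + 17 + 8) = 232 parameters instead of 4488.

def predict_factored(text_h, audio_h):
    """Prediction equivalent to contracting the (M+1)-way tensor with the
    modality vectors, computed without ever building the tensor."""
    t = torch.cat([text_h, torch.ones(1)])
    a = torch.cat([audio_h, torch.ones(1)])
    # Sum over the rank dimension of the per-modality projections.
    return torch.einsum('r,r,ro->o', W_text @ t, W_audio @ a, W_out)

y = predict_factored(torch.randn(D_TEXT), torch.randn(D_AUDIO))
print(y.shape)  # torch.Size([8])
```

Because every operation above is differentiable, such a factored fusion layer can be dropped into a neural model and trained end-to-end; the parameter count grows additively rather than multiplicatively in the modality dimensions.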
Applying a modality-based tensor factorization method, which adopts different factors for different modalities, removes information in a modality that can be compensated for by the other modalities, with respect to the model outputs. Moreover, this modality-based tensor factorization helps to reveal the relative utility of the information in each modality and handles the scale issues of the problem. In addition, it leads to a less complicated model with fewer parameters, and can therefore be applied as a regularizer to avoid overfitting.

According to our investigations and experimental results, we find that incorporating dependencies into prediction tasks leads to approaches with simpler models and fewer parameters, while improving the prediction results. We aim to turn the dimensionality challenge of big data into an opportunity by extracting dependencies and using them as extra information to solve prediction problems. We have shown that divide-and-conquer based on label dependencies yields a smaller yet more accurate method than methods that ignore these dependencies. We have then shown that a small subset of the labels can provide substantial information about the remaining labels, so that this small subset suffices for the prediction tasks. We have also compared order-based dependency extraction with order-free methods, concluding that order-free methods are more general and more accurate, especially on larger datasets. We have shown that high-order integration of the modalities captures more of the inter- and intra-modality dependencies, but suffers from polynomial growth in dimensionality; we therefore propose a fully differentiable framework based on tensor factorization that can be incorporated into any neural learning method. In a nutshell, our results demonstrate that the dependencies between multiple inputs or outputs can make the problem simpler, smaller, and easier to train by combining the prediction tasks with dependency-based sampling, compression, or clustering methods.

Date: Tuesday, 22 September 2020

Time: 10:00am - 12:00noon

Zoom Meeting: https://hkust.zoom.us/j/97120508164?pwd=ejQrcGtzT0RhNWRBRVBQZ3FDNWN5Zz09

Chairperson: Prof. Wenjing YE (MAE)

Committee Members: Prof. Pascale FUNG (Supervisor)
                   Prof. Ming LIU
                   Prof. Tong ZHANG
                   Prof. Daniel PALOMAR (ECE)
                   Prof. Rada MIHALCEA (University of Michigan)