Weak Supervision for Information Extraction
PhD Thesis Proposal Defence

Title: "Weak Supervision for Information Extraction"

by

Mr. Hongliang DAI

Abstract:

Deep learning models have gained much success in Information Extraction (IE) from text. Such models usually require a large number of labeled samples to train. Since human annotation can be difficult and time-consuming, automatically generated weak supervision is widely leveraged. We investigate the creation and the use of weak annotations for IE with two tasks: Aspect and Opinion Term Extraction (AOTE), and Entity Typing. These two tasks belong to two main categories of tasks in IE: extraction tasks and classification tasks, respectively.

First, we are interested in generating context-dependent weak annotations without much human effort. For AOTE, we propose an approach to annotating a large number of training samples with automatic annotation rules. The rules are mined from a small human-labeled sample set, and thus do not need to be designed manually. For the task of entity typing, we plan to propose an approach that generates entity type labels by exploiting a pretrained masked language model.

For the use of the generated weak annotations, we consider two settings. One setting is that only a set of weakly labeled samples is available. Under this setting, we propose to improve the performance of an entity typing model by leveraging external knowledge. Another setting is that both a set of weakly labeled samples and a small set of human-annotated samples are available. We show that pretraining neural models with weak supervision, then fine-tuning them on human-annotated data can yield good results. Then, with the task of entity typing, we investigate a framework that obtains a better-performing system by first training multiple models with the weakly labeled data, then stacking them with the help of a small high-quality sample set.

Date: Tuesday, 13 April 2021

Time: 3:00pm - 5:00pm

Zoom Meeting: https://hkust.zoom.com.cn/j/91403747618?pwd=Y1FPbHZLVHFKaTNFLzIwRGY4N3Jxdz09

Committee Members:
    Dr. Yangqiu Song (Supervisor)
    Dr. Qifeng Chen (Chairperson)
    Prof. Fangzhen Lin
    Dr. Xiaojuan Ma

**** ALL are Welcome ****