MOOC Data Analytics: Probabilistic Topic Modeling of Discussion Forum Data

The Hong Kong University of Science and Technology
Department of Computer Science and Engineering

Final Year Thesis Oral Presentation

Title: "MOOC Data Analytics: Probabilistic Topic Modeling of Discussion 
Forum Data"

By

Lei SUN

Abstract

Massive Open Online Courses (MOOCs) have expanded rapidly over the Internet 
in the last few years. Because the learning experience is not satisfying 
for some students, researchers are exploring more features of the data 
collected on MOOC platforms to analyze enrolled students' performance 
during a course session, so that teaching content can be adjusted in time. 
Besides obvious indicators of learning progress such as assignment grades 
and video attendance, this project gives special attention to students' 
forum posts as features that can characterize users. Techniques including 
probabilistic topic modeling are applied to extract the dominant topics 
behind discussion forum posts. Specifically, the Latent Dirichlet 
Allocation (LDA) model is used to analyze users' forum posts and represent 
them as topic vectors, so that the similarity between users can be 
calculated using KL divergence, Euclidean distance and other distance 
measures, and users can be clustered based on these similarity measures. 
The cluster information can help us predict other users' performance and 
their likelihood of dropping out during the course session. Besides user 
clustering, we use topic modeling to learn the similarity between forum 
threads and aggregate threads based on their similarity. The aggregated 
threads can also provide new statistics illustrating the performance of 
the users involved in those threads. Other aspects, including the 
evolution of users' topics over time, will be covered as well.
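
As a rough illustration of the similarity step described above, the 
following minimal sketch compares users by their topic vectors using KL 
divergence and Euclidean distance. It is not the thesis implementation: 
the user names, the three-topic vectors, the symmetrized form of KL 
divergence and the helper nearest_user are all assumptions made for this 
example, and in practice the vectors would come from a fitted LDA model.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same topics.

    eps is a small smoothing constant (an assumption of this sketch) that
    avoids division by zero when q assigns no mass to a topic.
    """
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def symmetric_kl(p, q):
    """Average of KL(p || q) and KL(q || p), so the measure is symmetric."""
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))


def euclidean(p, q):
    """Euclidean distance between two topic vectors."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))


# Hypothetical per-user topic proportions (each row sums to 1).
user_topics = {
    "user_a": [0.70, 0.20, 0.10],
    "user_b": [0.65, 0.25, 0.10],
    "user_c": [0.05, 0.15, 0.80],
}


def nearest_user(name, topics, distance=symmetric_kl):
    """Return the other user whose topic vector is closest to `name`."""
    return min(
        (other for other in topics if other != name),
        key=lambda other: distance(topics[name], topics[other]),
    )
```

Calling nearest_user("user_a", user_topics) picks the user with the most 
similar topic mix under the chosen measure; passing distance=euclidean 
switches the measure. A pairwise distance matrix built this way could then 
be given to any standard clustering routine.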

Date:                   Tuesday, 28 April 2015

Time:                   11:20 - 12:00noon

Venue:                  Room 5560
                        Lifts 27/28

Committee Members:      Prof. Dit-Yan Yeung (Supervisor)
                        Dr. Raymond Wong (Reader)