COMP5331: Knowledge Discovery in Databases
Project
Important Dates
- Group forming deadline: 19 Sept (send your group name, ID and emails to the TA)
- Proposal deadline: 26 Sept 10:30 (in class)
- Final report deadline: 6 December 10:30 (in class)
- PPT/source code submission deadline: one day before your presentation date (11:59pm)
No. of Students
Each group has no more than two people.
Please send an email with the title "COMP5331 Group Forming" to our TA, Mr. Zheng Liu (zliual@cse.ust.hk), giving your group information. Our TA will assign a group ID to your group.
- Information of each member of your group
Proposal
Please write a proposal including the following items.
- A specific topic (or title) for this project
- Type of this project: Survey/Implementation/Research
- Group No.
- Information of each member of your group
  - Student ID
  - Student name
  - Your research supervisor (if any)
  - Your own research topic (if any) and an explanation of why this project is different from your own research topic
- A brief description of this project (about 1000~2000 words)
- A list of papers to be read in this project
Final Report
Please write a final report including the following items.
- A specific topic (or title) for this project
- Type of this project: Survey/Implementation/Research
- Group No.
- Information of each member of your group
  - Student ID
  - Student name
  - Your research/FYP supervisor (if any)
  - Your own research/FYP topic (if any) and an explanation of why this project is different from your own research topic
- Content
  - Number of words
    - Survey Type
    - Implementation Type
    - Research Type
    - For each type, you may write more words if you like
  - Guideline
    - Write a normal report (e.g., Introduction, Related Work, Algorithm, Conclusion, References, ...)
Suggested Topics
You can select any topic you want. The following topics are suggested for students who do not yet have a topic in knowledge discovery in databases in mind.
- If you have no idea, you can select one of the following topics. If you select a topic, please think of a specific topic (or title) under it for your project that is related to the papers you chose in this topic. You can select any papers under a topic. After that, you can also find related work for these papers to form the paper list of your project.
- If you have some ideas, you can find papers yourself and propose your own specific topic for this project. You can choose papers from data mining conferences (such as KDD, ICDM and SDM), database conferences (such as SIGMOD, VLDB and ICDE), machine learning conferences (such as NIPS, AAAI and ICML) and IR conferences (such as SIGIR). Please ask for the instructor's approval to include your selected papers in your project.
- The following papers are chosen from KDD, SDM and ICDM, but this does not mean that you must read papers from these conferences. In fact, many other data mining papers appear in database conferences, machine learning conferences and IR conferences. You can select papers from those conferences for the project.
Topic 1: Social Networks
- Jie Tang, Sen Wu, Jimeng Sun
Confluence: Conformity Influence in Large Social Networks
KDD 2013 (pdf)
- Hossein Maserrat,
Jian Pei
Community Preserving Lossy Compression of Social
Networks
ICDM 2012 (pdf)
- B. Aditya Prakash, Jilles Vreeken, and Christos
Faloutsos
Spotting Culprits in Epidemics: How many and Which
ones?
ICDM 2012 (pdf)
- A Majumder,
Samik Datta, Naidu
KVM,
Capacitated Team Formation Problem on Social Networks
KDD 2012 (pdf)
- Diego Saez-Trumper,
Giovanni Comarela, Virgilio Almeida, Ricardo Baeza-Yates, Fabricio Benevenuto,
Finding Trendsetters in Information Networks
KDD 2012 (pdf)
- Rui Li, Shengjie
Wang, Hongbo Deng, Kevin Chang,
Towards Social User Profiling: Unified and Discriminative Influence Model
for Inferring Home Locations
KDD 2012 (pdf)
- Xingjie Liu, Qi He, Yuanyuan Tian, Wang-Chien Lee, John McPherson, Jiawei Han,
Event-based Social Networks: Linking the Online and Offline Social Worlds
KDD 2012 (pdf)
- Xiwang Yang, Harald Steck, Yong Liu,
Circle-based Recommendation in Online Social Networks
KDD 2012 (pdf)
- Yasuhiro Fujiwara, Makoto Nakatsuji, Takeshi Yamamuro,
Hiroaki Shiokawa, Makoto Onizuka,
Efficient Personalized PageRank with Accuracy Assurance
KDD 2012 (pdf)
- Bahman Bahmani, Ravi Kumar, Mohammad Mahdian, Eli Upfal,
PageRank on an Evolving Graph
KDD 2012 (pdf)
- Anirban Dasgupta,
Ravi Kumar, D Sivakumar,
Social Sampling
KDD 2012 (pdf)
- Jiliang Tang, Huiji
Gao, Huan Liu, Atish
Das Sarma,
eTrust: Understanding Trust Evolution in an
Online World
KDD 2012 (pdf)
- Guan Wang, Yuchen
Zhao, Xiaoxiao Shi, Philip S Yu,
Magnet Community Identification on Social Networks
KDD 2012 (pdf)
- David Gleich,
C Seshadhri,
Vertex Neighborhoods, Low Conductance Cuts, and Good Seeds for Local
Community Methods
KDD 2012 (pdf)
- Liaoruo Wang, Tiancheng
Lou, Jie Tang, and John Hopcroft
Detecting Community Kernels in Large Social Networks
ICDM 2011 (pdf)
Topic 2: Web
- Reza Zafarani,
Huan Liu
Connecting Users across Social Media Sites: A Behavioral-Modeling Approach
KDD 2013 (pdf)
- Weinan Zhang, Ying Zhang, Bin Gao,
Yong Yu, Xiaojie Yuan, Tie-Yan Liu,
Joint Optimization of Bid and Budget Allocation in Sponsored Search
KDD 2012 (pdf)
- Vijay Bharadwaj,
Peiji Chen, Wenjing
Ma, Chandrashekar Nagarajan,
John Tomlin, Sergei Vassilvitskii, Erik Vee, Jian Yang,
SHALE: An Efficient Algorithm for Allocation of Guaranteed Display
Advertising
KDD 2012 (pdf)
- Neha Gupta, Abhimanyu Das, Sandeep Pandey, Vijay Narayanan
Factoring Past Exposure in Display Advertising Targeting
KDD 2012 (pdf)
- Anand Bhalgat,
Jon Feldman, Vahab Mirrokni,
Online Allocation of Display Ads with Smooth Delivery
KDD 2012 (pdf)
- Michel Speiser,
Gianluca Antonini, and
Abderrahim Labbi
Ranking Web-based Partial Orders by Significance Using a Markov Reference
Model
ICDM 2011 (pdf)
- Bing Li, Weihua Xiong, and Weiming Hu
Web Horror Image Recognition Based on Context-Aware Multi-Instance
Learning
ICDM 2011 (pdf)
Topic 3: Association
- Stephan Guennemann,
Ines Faerber, Kittipat
Virochsiri, Thomas Seidl,
Subspace Correlation Clustering: Finding Locally Correlated Dimensions in
Subspace Projections of the Data
KDD 2012 (pdf)
- Kui Yu, Xindong
Wu, Wei Ding, and Hao Wang
Causal Associative Classification
ICDM 2011 (pdf)
- Nikolaj Tatti
and Fabian Moerchen
Finding Robust Itemsets under Subsampling
ICDM 2011 (pdf)
Topic 4: Graph Mining
- Charalampos Tsourakakis,
Francesco Bonchi, Aristides Gionis,
Francesco Gullo, Maria Tsiarli
Denser than the densest subgraph: extracting optimal quasi-cliques with
quality guarantees
KDD 2013 (pdf)
- Jia Wang, James Cheng, Ada Wai-Chee
Fu
Redundancy-Aware Maximal Cliques
KDD 2013 (pdf)
- Theodoros Lappas,
George Valkanas, Dimitrios Gunopulos,
Efficient and Domain-Invariant Competitor Mining
KDD 2012 (pdf)
- Isabelle Stanton, Gabriel Kliot,
Streaming Graph Partitioning for Large Distributed Graphs
KDD 2012 (pdf)
- Keith Henderson, Brian
Gallagher, Tina Eliassi-Rad, Hanghang
Tong, Sugato Basu,
Leman Akoglu, Danai Koutra, Lei Li, Christos Faloutsos,
RolX: Structural Role Extraction & Mining in
Large Graphs
KDD 2012 (pdf)
- James Cheng, Linhong
Zhu, Yiping Ke, Shumo Chu,
Fast Algorithms for Maximal Clique Enumeration with Limited Memory
KDD 2012 (pdf)
- Jing Feng, Xiao He, Bettina Konte, Christian Boehm, Claudia Plant,
Summarization-based Mining Bipartite Graphs
KDD 2012 (pdf)
- Brigitte Boden, Stephan Guennemann, Holger Hoffmann, Thomas Seidl,
Mining Coherent Subgraphs in Multi-Layer Graphs with Edge Labels
KDD 2012 (pdf)
- C. Seshadhri,
Ali Pinar, and Tamara Kolda
An In-Depth Study of Stochastic Kronecker Graphs
ICDM 2011 (pdf)
- Petko Bogdanov,
Misael Mongiovi, and Ambuj
K. Singh
Mining Heavy Subgraphs in Time-Evolving Networks
ICDM 2011 (pdf)
Topic 5: Classification
- Wenlin Chen, Yixin
Chen, Yi Mao, Baolong Guo
Density-Based Logistic Regression
KDD 2013 (pdf)
- Mithat Poyraz,
Zeynep Hilal Urhan, and Murat Can Ganiz
A Novel Semantic Smoothing Method based on Higher Order Paths for Text
Classification
ICDM 2012 (pdf)
- Byron Wallace, Kevin Small, Carla E. Brodley, and Thomas Trikalinos
Class Imbalance, Redux
ICDM 2011 (pdf)
- Tushar Khot,
Sriraam Natarajan, Kristian Kersting,
and Jude Shavlik
Learning Markov Logic Networks via Functional Gradient Boosting
ICDM 2011 (pdf)
- Luite Stegeman
and Ad Feelders
On Generating all Optimal Monotone Classifications
ICDM 2011 (pdf)
- Dong Liu, Shuicheng
Yan, Yadong Mu, Xian-Sheng Hua, and Hong-Jiang
Zhang
Towards Optimal Discriminating Order for Multiclass Classification
ICDM 2011 (pdf)
- Charles Parker
An Analysis of Performance Measures for Binary Classifiers
ICDM 2011 (pdf)
Topic 6: Pattern
- Guimei Liu, Haojun Zhang, Limsoon Wong,
Finding Minimum Representative Pattern Sets
KDD 2012 (pdf)
- Cheng-Wei Wu, Bai-En Shie, Philip Yu, Vincent
Tseng,
Mining Top-K High Utility Itemsets
KDD 2012 (pdf)
- Geng Li, Mohammed Zaki,
Sampling Minimal Frequent Boolean (DNF) Patterns
KDD 2012 (pdf)
- Zhenhui Li, Jingjing
Wang, Jiawei Han,
Mining Event Periodicity from Incomplete Observations
KDD 2012 (pdf)
- Fei Wang, Noah Lee, Jianying Hu, Jimeng Sun, Shahram Ebadollahi,
Towards Heterogeneous Temporal Clinical Event Pattern Discovery: A
Convolutional Approach
KDD 2012 (pdf)
- Arnaud Soulet,
Chedy Raïssi, Marc Plantevit, and Bruno Crémilleux
Mining Dominant Patterns in the Sky
ICDM 2011 (pdf)
- Hyungsul Kim, David Sheridan, Sungjin Im, Shobha Vasudevan, Tarek Abdelzaher, and Jiawei Han
Signature Pattern Covering via Local Greedy Algorithm and Pattern Shrink
ICDM 2011 (pdf)
- Mikalai Tsytsarau,
Themis Palpanas, Francesco Bonchi,
and Aristides Gionis
Diverse Dimension Decomposition of an Itemset Space
ICDM 2011 (pdf)
Topic 7: Time Series
- Lu-An Tang, Xiao Yu, Quanquan Gu, Jiawei Han,
Alice Leung, Thomas La Porta
Mining Lines in the Sand: On Trajectory Discovery From Untrustworthy Data
in Cyber-Physical System
KDD 2013 (pdf)
- Thanawin Rakthanmanon,
Bilson Campana, Abdullah Mueen,
Gustavo Batista, Brandon Westover, Qiang Zhu, Jesin Zakaria, Eamonn Keogh,
Searching and Mining Trillions of Time Series Subsequences under Dynamic
Time Warping
KDD 2012 (pdf)
- Thanawin Rakthanmanon,
Eamonn Keogh, Stefano Lonardi,
and Scott Evans
Time Series Epenthesis: Clustering Time Series Streams Requires Ignoring
Some Data
ICDM 2011 (pdf)
- Zhenxing Wang and Laiwan
Chan
Using Bayesian Network Learning Algorithm to Discover Causal Relations in
Multivariate Time Series
ICDM 2011 (pdf)
Topic 8: Clustering
- Frank Lin and William W. Cohen
A General and Scalable Approach to Mixed Membership Clustering
ICDM 2012 (pdf)
- Francesco Bonchi,
Aristides Gionis, Francesco Gullo,
Antti Ukkonen,
Chromatic Correlation Clustering
KDD 2012 (pdf)
- Fabian Wauthier,
Nebojsa Jojic, Michael
Jordan,
Active Spectral Clustering via Iterative Uncertainty Reduction
KDD 2012 (pdf)
- Stephan Guennemann,
Ines Faerber, Thomas Seidl,
Multi-View Clustering Using Mixture Models in Subspace Projections
KDD 2012 (pdf)
- Peter Haider,
Luca Chiarandini, Ulf Brefeld,
Discriminative Clustering for Market Segmentation
KDD 2012 (pdf)
- Madeleine Seeland,
Andreas Karwath, Stefan Kramer,
A Structural Cluster Kernel for Learning on Graphs
KDD 2012 (pdf)
- Wouter Duivesteijn,
Ad Feelders, Arno Knobbe,
Different Slopes for Different Folks
KDD 2012 (pdf)
- Xiaoran Xu and Zhi-Hong
Deng
BibClus: A Clustering Algorithm of Bibliographic
Networks by Message Passing on Center Linkage Structure
ICDM 2011 (pdf)
- Hao Huang, Shinjae
Yoo, Hong Qin, and Dantong
Yu
A Robust Clustering Algorithm Based on Aggregated Heat Kernel Mapping
ICDM 2011 (pdf)
- Stephan Günnemann,
Emmanuel Müller, Sebastian Raubach, and Thomas Seidl
Flexible Fault Tolerant Subspace Clustering
ICDM 2011 (pdf)
Topic 9: Recommendation
- Mahashweta Das, Gianmarco
De Francisci Morales, Aristides Gionis, Ingmar Weber
Learning to question: Leveraging user preferences for shopping advice
KDD 2013 (pdf)
- Hongzhi Yin, Yizhou
Sun, Bin Cui, Zhiting Hu, Ling Chen
LCARS: A Location-Content-Aware Recommender System
KDD 2013 (pdf)
- Yuandong Tian, Jun Zhu,
Learning from Crowds in the Presence of Schools of Thought
KDD 2012 (pdf)
- Ke Zhou, Hongyuan
Zha,
Learning Binary Codes for Collaborative Filtering
KDD 2012 (pdf)
- Qi Liu, Yong Ge, Zhongmou Li, EnHong Chen,
and Hui Xiong
Personalized Travel Package Recommendation
ICDM 2011 (pdf)
- Jinoh Oh, Sun Park, Hwanjo Yu, Min Song, and Seung-Taek
Park
Novel Recommendation based on Personal Popularity Tendency
ICDM 2011 (pdf)
- Jaewoo Lee, Chris Clifton,
Differential Identifiability
KDD 2012 (pdf)
- Yan Zhou, Murat Kantarcioglu, Bhavani Thuraisingham, Bowei
Xi,
Adversarial Support Vector Machine Learning
KDD 2012 (pdf)
Topic 10: Big Data
- Wook-Shin Han, Sangyeon
Lee, Kyungyeol Park, Jeong-Hoon
Lee, Min-Soo Kim, Jinha Kim, Hwanjo
Yu
TurboGraph: A Fast Parallel Graph Engine
Handling Billion-scale Graphs in a Single PC
KDD 2013 (pdf)
- Karthik Raman, Adith
Swaminathan, Thorsten Joachims,
Johannes Gehrke
A Probabilistic Framework for Big Data Pipelines
KDD 2013 (pdf)
- John Canny, Huasha
Zhao
Big Data Analytics with Small Footprint: Squaring the Cloud
KDD 2013 (pdf)
- Mingdong Ou,
Peng Cui, Fei Wang, Jun Wang
Comparing Apples to Oranges: A Scalable Solution with Heterogeneous
Hashing
KDD 2013 (pdf)
- En-Hsu Yen, Chun-Fu Chang,
Ting-Wei Lin, Shan-Wei Lin, Shou-De Lin
Indexed Block Coordinate Descent for Large-Scale Linear Classification
with Limited Memory
KDD 2013 (pdf)
- Siddharth Gopal, Yiming
Yang
Recursive Regularization for Large-scale Classification with Hierarchical
and Graphical Dependencies
KDD 2013 (pdf)
Topic 11: Crowd Mining
- Y. Amsterdamer, Y. Grossman, T. Milo, and P. Senellart.
Crowd Mining.
SIGMOD’13 (pdf)
- Antoine Amarilli, Yael Amsterdamer, Tova Milo
On the Complexity of Mining Itemsets from the Crowd Using Taxonomies.
ICDT 2014: 15-25 (pdf)
- Yael Amsterdamer, Susan B. Davidson, Tova Milo, Slava Novgorodov, Amit Somech
OASSIS: query driven crowd mining.
SIGMOD Conference 2014: 589-600 (pdf)
- Yael Amsterdamer, Susan B. Davidson, Tova Milo, Slava Novgorodov, Amit Somech
Ontology Assisted Crowd Mining
VLDB 2014 (pdf)
Acknowledgement: Thanks to Prof. Raymond Wong for the topics and paper list provided on his COMP5331 course webpage.