This course homepage is accessible from http://www.cs.ust.hk/~dlee/4321/
Spring 2020
Instructor:
Email: dlee@cse.ust.hk
Office: 3534 (Lift 25/26)
Office Hours: Email is the best way to get a quick response from me. If you want to meet and cannot make the office hours, try to make an appointment with me by email.
Lectures: Tue/Thu 9:00am – 10:20am
Lecture Room: Rm 4620 (Lift 31/32)
TAs:
NI Wangze wniab@ust.hk
YU Manli myuae@connect.ust.hk
CHEN Hongkai hchencf@connect.ust.hk
Lab LA1: Monday 7:00pm – 7:50pm, Rm 4213 (Lift 19)
Course grades will normally fall within the following
percentage bands:
A 15%
B 40%
C 40%
D/F 5%
No particular distribution is prescribed for the subgrades within a grade, but they can be assumed to be equally divided.
Grades are first assigned to all students according to the distribution above, without considering bonus points, and the thresholds between subgrades are set. Then bonus points are added to students' scores, and a student's grade is re-assigned (moved up) according to his/her new score. The end result is that students who do not have bonus points will not be penalized by other students having bonus points.
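The two-step procedure above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the official grading script: the names `BANDS` and `assign_grades` are hypothetical, the sketch works on the four letter bands rather than on subgrades, band sizes are rounded down, and each threshold is taken to be the lowest pre-bonus score that earned the grade.

```python
from typing import Dict, List, Tuple

# Percentage bands from this page, best grade first.
BANDS: List[Tuple[str, int]] = [("A", 15), ("B", 40), ("C", 40), ("D/F", 5)]


def assign_grades(base: Dict[str, float], bonus: Dict[str, float]) -> Dict[str, str]:
    """Grade on pre-bonus scores first, then let bonus points move students up."""
    ranked = sorted(base, key=base.get, reverse=True)  # highest score first
    n = len(ranked)

    # Step 1: assign grades by rank, ignoring bonus points, and record
    # the threshold of each band (assumption: lowest score in the band).
    grades: Dict[str, str] = {}
    thresholds: List[Tuple[str, float]] = []
    start = 0
    cumulative = 0
    for grade, percent in BANDS:
        cumulative += percent
        end = (cumulative * n) // 100  # assumption: band sizes round down
        for student in ranked[start:end]:
            grades[student] = grade
        if end > start:
            thresholds.append((grade, base[ranked[end - 1]]))
        start = end

    # Step 2: add bonus points and compare against the fixed thresholds.
    # A grade can only move up, so students without bonus are unaffected.
    order = [grade for grade, _ in BANDS]
    for student in ranked:
        total = base[student] + bonus.get(student, 0)
        for grade, cutoff in thresholds:  # best grade first
            if total >= cutoff:
                if order.index(grade) < order.index(grades[student]):
                    grades[student] = grade
                break
    return grades


if __name__ == "__main__":
    # Hypothetical class of 20 students with pre-bonus scores 100, 99, ..., 81.
    scores = {f"s{i}": 100 - i for i in range(20)}
    print(assign_grades(scores, {"s3": 2, "s19": 1}))
```

Because the thresholds are fixed before any bonus is added, a student who earns bonus points can cross into a higher band without pushing anyone else out of it.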
Both the mid-term and final exams are open book. You can bring your lecture notes (slides and notes) and one book to the exam venue. While you do not need to memorize everything (formulas, pseudocode, etc.) by heart, the examinations are set assuming you know the material well. That is, the notes/slides are there to help you with “is my cosine similarity formula correct?” and “is the PR formula 1-p… or p – 1 …?” etc., but flipping through the slides page by page to find the answer to a question would waste too much time, and in the end you would not have enough time to finish all of the questions. Bear in mind that you still need to study hard!
On successful completion of this course, students are expected to be able to:
(1) Design and implement a complete and functional search engine.
(2) Test and evaluate the effectiveness of a search engine.
(3) Identify the limitations of search engine technologies and develop solutions to meet application requirements.
1. Introduction and course overview
2. Business models
3. Information retrieval models and Inverted Files
4. Web-based information retrieval
5. Pattern matching and extended Boolean model
6. Retrieval effectiveness, benchmarking
7. Document preprocessing
8. Query expansion and relevance feedback
9. Applications: text summarization
10. Applications: recommendation systems
Course Description
Text retrieval models, vector space model, document ranking, performance evaluation; indexing, pattern matching, relevance feedback, clustering; web search engines, authority-based ranking; enterprise data management, content creation, metadata, taxonomy, ontology; semantic web, digital libraries and knowledge management applications.
After completing the course, students will have acquired:
Homework/lab assignments must be done individually. Collaboration between students is strictly forbidden. Any violation will be passed to the Department's Undergraduate/Postgraduate Studies Committee for assessment. The result may lead to dismissal from the University.
The term project must be done by each group on its own. Sharing code between groups and copying code from previous projects are not allowed.