Information Retrieval and Web Mining

Course Description

This course introduces recent developments in information retrieval and web mining. The first part of the course gives an overview of the fundamental concepts of information retrieval, such as crawling, parsing, indexing, searching, scoring, and compression. These techniques enable students to handle web-scale datasets. The second part discusses how to extract knowledge from web-scale datasets using link analysis, clustering, and recommendation techniques. In addition, current implementation tools (such as Apache Hadoop, Pig, and Lucene) will be studied in depth through the course project. The course aims to help students explore the latest techniques in information retrieval and web mining. Research-oriented projects will be assigned according to students’ background knowledge. The course combines lectures, tutorials, and group discussions.


