|Lecture hours||3 hours|
|Lab hours||2 hours|
|Digital resources||View on Aristarchus (Open e-Class)|
The aim of this course is to teach the fundamental concepts of information retrieval systems. The course covers all stages of system design and implementation for collecting, indexing and searching text documents, as well as evaluation methods. It also covers recent trends in information retrieval, such as information retrieval from the WWW.
Upon successful completion of the course, students will be able:
- to know representation models for text documents.
- to use techniques for indexing, compression, retrieval and scoring of documents.
- to develop applications that manage large volumes of text.
- to build the functionality of a search engine.
- to apply machine learning techniques for text classification.
- Introduction and basic IR concepts
- System architecture of IR systems
- Dictionaries and inverted indexes
- Construction and compression of dictionaries
- Information retrieval models (Boolean model, vector space model, probabilistic models)
- Scoring and ranking documents
- Language models
- Information retrieval from XML documents
- Basic concepts of information retrieval from the WWW
- Web crawling and indexing
- Text classification with machine learning techniques, including support vector machines and other text classification algorithms
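As a small illustration of two of the topics listed above (inverted indexes and the Boolean retrieval model), the following is a minimal sketch and not course material. The function names `build_inverted_index` and `boolean_and`, the whitespace tokenizer, and the three sample documents are all assumptions made for this example.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase term to the sorted list of document IDs containing it."""
    postings = defaultdict(set)
    for doc_id, text in enumerate(docs):
        # Naive tokenization (assumption): lowercase and split on whitespace.
        for term in text.lower().split():
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def boolean_and(index, terms):
    """Return the IDs of documents that contain every query term."""
    result = None
    for term in terms:
        ids = set(index.get(term.lower(), []))
        # Intersect the postings of each term with the running result.
        result = ids if result is None else result & ids
    return sorted(result or [])


# Hypothetical sample collection, used only for illustration.
docs = [
    "web crawling and indexing",
    "indexing and compression of dictionaries",
    "text classification with support vector machines",
]
index = build_inverted_index(docs)
matches = boolean_and(index, ["indexing", "and"])
```

Here `matches` holds the IDs of the documents that contain both query terms. A real system would add proper tokenization, normalization, index compression, and ranked scoring on top of this structure.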
- Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze. 2008. Introduction to Information Retrieval. Cambridge University Press.
- Ricardo A. Baeza-Yates and Berthier Ribeiro-Neto. 1999. Modern Information Retrieval. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA.