The dramatic increase in recent years in the amount of data available on the Web means that automatic methods of Information Retrieval (IR) have acquired greater significance. Furthermore, this data exists in multiple forms (text, image, video, etc.), and it is becoming increasingly important that the techniques deployed in IR can perform search and retrieval operations across these distinct formats. For the purposes of this course, IR is the study of the indexing, processing, and querying of both textual and image data.
The aims of the course are to provide an introduction to the basic principles and techniques used in IR; to demonstrate how statistical models of language can be used to solve the document retrieval problem; to explore a range of image processing techniques used in IR; and to show how combined models for language and image processing can enhance document retrieval.
Information Retrieval (Text Processing)
Text representation and processing
Retrieval models (Boolean, vector space, language model)
Indexing
Evaluation
Relevance feedback - real feedback, pseudo-relevance feedback
Document and concept clustering - hierarchical clustering, k-means
Web retrieval - PageRank, difficulties of Web retrieval
Information Retrieval (Image Processing)
Operations on images
Motion detection
Object recognition
Automatic image annotation and retrieval
Combined models of language and image processing