Aegis School of Business, Data Science, Cyber Security & Telecommunication

Application fee: 13.63 USD
Course fee: 340.85 USD
GST: 18%

Information Retrieval


Certification Body: Aegis School of Data Science
Location: Online Live interactive
Type: Certificate course
Director: Dr. Ilija Subasic
Coordinator: Ritin Joshi
Language: English
Course fee: 340.85 USD
GST: 18%
Total course fee: 402.2 USD

Course Details

Information retrieval

Information retrieval (IR) is the activity of obtaining information resources relevant to an information need from a collection of information resources. Searches can be based on metadata or on full-text (or other content-based) indexing.
Automated information retrieval systems are used to reduce what has been called "information overload". Many universities and public libraries use IR systems to provide access to books, journals and other documents. Web search engines are the most visible IR applications.

Overview:

An information retrieval process begins when a user enters a query into the system. Queries are formal statements of information needs, for example search strings in web search engines. In information retrieval, a query does not uniquely identify a single object in the collection. Instead, several objects may match the query, perhaps with different degrees of relevance.

An object is an entity that is represented by information in a database, and user queries are matched against that information. Depending on the application, the data objects may be text documents, images, audio, mind maps or videos. Often the documents themselves are not stored directly in the IR system, but are instead represented by document surrogates or metadata.

Most IR systems compute a numeric score for how well each object in the database matches the query, and rank the objects by this value. The top-ranking objects are then shown to the user. The process may be repeated if the user wishes to refine the query.
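
The score-and-rank loop described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the scoring function just counts query-term occurrences, and the `collection` dictionary and its document ids are made up for the example. Real systems use weighting schemes such as those covered in Week 2.

```python
from collections import Counter


def score(query: str, document: str) -> int:
    """Toy relevance score: total occurrences of the query terms in the document."""
    term_counts = Counter(document.lower().split())
    return sum(term_counts[term] for term in query.lower().split())


def rank(query: str, documents: dict[str, str], top_k: int = 3) -> list[tuple[str, int]]:
    """Score every document, drop non-matches, and return the top_k best."""
    scored = [(doc_id, score(query, text)) for doc_id, text in documents.items()]
    matches = [(doc_id, s) for doc_id, s in scored if s > 0]
    return sorted(matches, key=lambda pair: pair[1], reverse=True)[:top_k]


# Hypothetical three-document collection, keyed by document id.
collection = {
    "d1": "web search engines rank web pages",
    "d2": "libraries catalogue books and journals",
    "d3": "a search engine answers a query",
}
results = rank("web search", collection)
print(results)
```

Several documents match the query with different scores, and only the matching ones are returned in ranked order, which mirrors the process in the overview.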

Course Curriculum:
The course is organized into 24 lessons (4 hours per week), delivered in batches of two. It combines an overview of information retrieval theory with a practical part using the Elasticsearch platform.
Week 1 (Introduction):

  • Introduction to information retrieval
  • Example use cases
  • Introduction to large-scale web search and its problems
  • Unstructured data
  • Boolean retrieval example
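
As a taste of the Week 1 material, Boolean retrieval can be sketched with an inverted index that maps each term to the set of documents containing it. The three short documents below are invented for the illustration, and whitespace splitting stands in for proper tokenization.

```python
from collections import defaultdict


def build_index(documents: dict[int, str]) -> dict[str, set[int]]:
    """Build an inverted index: term -> set of ids of documents containing it."""
    index: dict[str, set[int]] = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


# Hypothetical toy collection.
docs = {
    1: "brutus killed caesar",
    2: "caesar praised brutus",
    3: "calpurnia warned caesar",
}
index = build_index(docs)

# Boolean operators map directly onto set operations.
both = index["brutus"] & index["caesar"]          # brutus AND caesar
only_caesar = index["caesar"] - index["brutus"]   # caesar AND NOT brutus
either = index["brutus"] | index["calpurnia"]     # brutus OR calpurnia
print(both, only_caesar, either)
```

Note that Boolean retrieval returns an unranked set: a document either satisfies the query or it does not.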

Week 2 (Models I):

  • Term weighting
  • Vector space modeling (cosine distance)
  • Okapi BM25
  • Probabilistic models (KL divergence)
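
Two of the Week 2 models can be sketched briefly in Python. The code below is a minimal illustration, not a reference implementation: it uses whitespace tokenization, raw term counts for the vector space model, and one common BM25 variant whose IDF is `ln((N - n + 0.5) / (n + 0.5) + 1)`. The defaults `k1 = 1.2` and `b = 0.75` are conventional choices assumed here, and the sample documents are made up.

```python
import math
from collections import Counter


def cosine_similarity(a: str, b: str) -> float:
    """Cosine of the angle between two raw term-count vectors."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def bm25_scores(query: str, documents: list[str], k1: float = 1.2, b: float = 0.75) -> list[float]:
    """BM25 score of each document for the query (one common formulation)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        total = 0.0
        for term in query.lower().split():
            if tf[term] == 0:
                continue
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0)
            # Length normalization: long documents are penalized via b.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            total += idf * tf[term] * (k1 + 1) / norm
        scores.append(total)
    return scores


# Hypothetical sample documents.
sample_docs = [
    "information retrieval ranks documents",
    "cooking recipes for pasta",
    "retrieval of information from text documents",
]
scores = bm25_scores("information retrieval", sample_docs)
print(scores)
```

Cosine distance is simply one minus the similarity computed above. In the BM25 example, the first and third documents contain both query terms once, but the shorter first document scores higher because of length normalization.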

Week 3 (Models II + data preprocessing):

  • Topic modeling
  • Text classification
  • Linguistic processing: tokenization, NLP theory and resources (stemmers, WordNet, NER resolvers)
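
The preprocessing steps in Week 3 can be illustrated with a small pipeline: tokenize, remove stopwords, then stem. The sketch below is deliberately crude and everything in it is an assumption made for the example: the stopword list is tiny, and `crude_stem` only strips a few suffixes, so it is not the Porter stemmer or any other published algorithm.

```python
import re

# Tiny illustrative stopword list; real lists are much longer.
STOPWORDS = {"the", "a", "an", "of", "and", "is", "are"}


def tokenize(text: str) -> list[str]:
    """Lowercase the text and extract runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def crude_stem(token: str) -> str:
    """Strip one common suffix, keeping at least three characters of stem."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text: str) -> list[str]:
    """Tokenize, drop stopwords, and stem the remaining tokens."""
    return [crude_stem(t) for t in tokenize(text) if t not in STOPWORDS]


terms = preprocess("The crawler indexed the linked documents.")
print(terms)
```

A suffix stripper this simple makes mistakes on many English words, which is one reason the course points to established stemmers and resources instead.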

Week 4 (Elasticsearch + project assignment):

  • Elasticsearch installation walkthrough
  • Hands-on tutorial
  • Project assignment
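
To give a flavor of the hands-on part, the sketch below builds two Elasticsearch request bodies as plain Python dictionaries and prints them as JSON. It does not contact a server. The index name `articles` and the fields `title`, `body` and `published` are hypothetical. The bodies use the standard `match`, `bool` and `range` constructs of the Elasticsearch Query DSL.

```python
import json

# Hypothetical mapping, as could be sent with: PUT /articles
index_settings = {
    "mappings": {
        "properties": {
            "title": {"type": "text"},
            "body": {"type": "text"},
            "published": {"type": "date"},
        }
    }
}

# Hypothetical search, as could be sent with: POST /articles/_search
# Full-text match on "body", restricted to documents published since 2015.
search_body = {
    "query": {
        "bool": {
            "must": [{"match": {"body": "information retrieval"}}],
            "filter": [{"range": {"published": {"gte": "2015-01-01"}}}],
        }
    },
    "size": 5,  # return at most five hits
}

print(json.dumps(index_settings, indent=2))
print(json.dumps(search_body, indent=2))
```

The `must` clause contributes to the relevance score, while the `filter` clause only includes or excludes documents.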

Week 5 (Web IR):

  • HITS, PageRank
  • Web search engine dissection (crawlers, scrapers)
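
PageRank, one of the Week 5 topics, can be sketched as a simple power iteration. The version below is a minimal illustration under stated assumptions: the damping factor 0.85 is the conventional choice, every link target must also appear as a key of `links`, pages without outgoing links spread their rank evenly, and the four-page graph is invented.

```python
def pagerank(links: dict[str, list[str]], damping: float = 0.85, iterations: int = 50) -> dict[str, float]:
    """Approximate PageRank by repeated redistribution of rank along links."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page receives the "random jump" share first.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            if outlinks:
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical link graph: page -> pages it links to.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(graph)
print(sorted(ranks.items(), key=lambda item: item[1], reverse=True))
```

Page `c` is linked from three pages and ends up with the highest rank, while `d`, which nobody links to, keeps only the random-jump share.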

Week 6 (IR evaluation + advanced topics/project discussions):

  • Precision/recall
  • DCG
  • Duplicate detection
  • Text summarization
  • Open student discussion
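
The evaluation measures from Week 6 are short enough to sketch directly. The code below assumes set-based relevance judgments for precision and recall, and the common DCG formulation that divides each graded gain by `log2(position + 1)`. The document ids and gain values are made up for the example.

```python
import math


def precision_recall(retrieved: list[str], relevant: set[str]) -> tuple[float, float]:
    """Precision: share of retrieved docs that are relevant. Recall: share of relevant docs retrieved."""
    hits = sum(1 for doc_id in retrieved if doc_id in relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def dcg(gains: list[float]) -> float:
    """Discounted cumulative gain of a ranked list of graded relevance values."""
    return sum(g / math.log2(pos + 1) for pos, g in enumerate(gains, start=1))


def ndcg(gains: list[float]) -> float:
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal > 0 else 0.0


# Hypothetical run: four documents retrieved, three known to be relevant.
precision, recall = precision_recall(["d1", "d2", "d3", "d4"], {"d1", "d3", "d5"})
# Hypothetical graded relevance of the ranked results (3 = best).
ranking_quality = ndcg([3, 2, 0, 1])
print(precision, recall, ranking_quality)
```

Precision and recall ignore the order of results, whereas DCG rewards placing highly relevant documents near the top.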