Students should be able to:
- Understand/describe the basics of language modelling, tokenization, normalisation, stemming, lemmatization and Part-of-Speech (POS) tagging.
- Compare Bag-of-Words, N-Grams, TF-IDF and learned word-embedding-based representations of text.
- Understand/describe common algorithms for question answering, text classification, text/document summarization, sentiment analysis, sentence similarity estimation, speech recognition and neural machine translation.
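
As a rough illustration of the count-based representations named in the second outcome, the sketch below builds Bag-of-Words counts, bigrams and TF-IDF weights for a two-sentence toy corpus. The function names, the whitespace tokenizer and the corpus are all assumptions made for this example, and the TF-IDF formula used (term frequency divided by document length, times log(N / df)) is only one of several common variants. Learned word embeddings are not shown because they require a trained model.

```python
import math
from collections import Counter

def tokenize(text):
    # Naive tokenizer (assumption): lowercase, then split on whitespace.
    return text.lower().split()

def ngrams(tokens, n):
    # Contiguous windows of n tokens, returned as tuples.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bag_of_words(tokens):
    # Raw term counts; word order is discarded.
    return Counter(tokens)

def tf_idf(docs):
    # docs is a list of token lists.
    # tf = count / document length, idf = log(N / document frequency).
    n_docs = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Toy corpus, invented for illustration only.
corpus = ["the cat sat on the mat", "the dog sat on the log"]
docs = [tokenize(d) for d in corpus]

bow = bag_of_words(docs[0])   # counts for the first sentence
bigrams = ngrams(docs[0], 2)  # word pairs for the first sentence
weights = tf_idf(docs)        # one weight dictionary per sentence
```

In this sketch, a word that occurs in every document (such as "the") receives a TF-IDF weight of zero, while a word unique to one document (such as "cat") receives a positive weight. Bag-of-Words alone cannot make that distinction, which is one axis along which the representations can be compared.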