Syllabus

The syllabus consists of:

  • Weekly slides
  • Weekly exercises
  • Mandatory assignments
  • Readings

The detailed syllabus for each week is listed on that week's page.

This is an overview of the mandatory readings so far. For recommended readings, exercises, etc., see the weekly pages.

Jurafsky and Martin, Speech and Language Processing, 3rd ed. (edition of 21 Sept. 2021!)

  • For chapters 1-6, there are only minor corrections in the 2021 edition compared to the edition of 30 Dec. 2020
  • Ch. 2 Regular expressions, etc.
    • Sec. 2.0
    • Sec. 2.2 Words
    • Sec. 2.3 Corpora
    • Sec. 2.4 Normalization, except 2.4.3 and the technical details of 2.4.1
    • Sec. 2.5 Edit distance
  • Ch. 3, "N-gram Language Models"
    • Sections 3.0-3.4
  • Ch. 4, "Naïve Bayes and Sentiment Classification"
    • Except (for now) section 4.9 Statistical significance testing
  • Ch. 5, "Logistic Regression"
    • Except some of the technicalities of sections 5.3, 5.4, 5.5, 5.8
  • Ch. 6, "Vector Semantics and Embeddings", everything except
    • Section 6.6 Pointwise Mutual Information (PMI)
  • Ch. 7 "Neural Networks and Neural Language Models"
  • Ch. 8 "Sequence labeling"
    • Sec. 8.0-8.2
    • Sec. 8.4 "HMM POS tagging"
      • Except 8.4.5-8.4.6 "The Viterbi Algorithm"
    • Sec. 8.5 CRF
    • Sec. 8.7-8.8
  • Ch. 9 Deep Learning Architectures for Sequence Processing
    • Sec. 9.1-9.5
  • Ch. 10 Machine Translation and Encoder-Decoder Models
    • Sec. 10.0, 10.2-10.4
  • Ch. 18, "Word Senses and WordNet"
    • Sec. 18.0-18.3
  • Ch. 24, "Dialogue systems and chatbots"
    • Sections 24.1-24.6
  • Ch. 25, "Phonetics"
    • Sections 25.1-25.5 (excluding the details not discussed in class)
  • Ch. 26, "Speech Recognition and ASR"
    • Sections 26.1 and 26.5 (excluding the part on statistical significance)

NLTK Book

  • Ch. 3, sec. 6 Normalizing Text
  • Ch. 3, sec. 8 Segmentation
  • Ch. 5, sec. 1 Using a tagger
  • Ch. 5, sec. 2 Tagged corpora

Wikipedia

Other:

Published Aug. 18, 2021 12:16 PM - Last modified Dec. 2, 2021 2:28 PM