Wed 24 Aug |
Introduction, Machine Translation
Lecturer: Jan Tore
Mandatory reading
- Jurafsky and Martin, Speech and Language Processing, Ch. 25-25.2
- Koehn, Statistical Machine Translation (K:SMT), Ch. 1
|
Thu 25 Aug |
Probabilities
Lecturer: Jan Tore
Recommended reading
It is a good idea to review the parts of INF1080 Logic on combinatorics ("Kombinatorikk").
|
Wed 31 Aug |
Machine translation evaluation (a small BLEU example follows the reading list)
Mandatory reading
- Jurafsky and Martin, Speech and Language Processing, Ch. 25.9
- Koehn, Statistical Machine Translation Ch. 8 up to sec. 8.2.3 (BLEU)
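As a quick, unofficial illustration of BLEU scoring (the sentences are toy examples, and NLTK is assumed to be installed):

    # Sentence-level BLEU with NLTK; smoothing avoids a zero score when
    # some higher-order n-grams have no matches at all.
    from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

    reference = ["the", "cat", "is", "on", "the", "mat"]   # human reference
    candidate = ["the", "cat", "sat", "on", "the", "mat"]  # system output
    smooth = SmoothingFunction().method1
    print(sentence_bleu([reference], candidate, smoothing_function=smooth))

Note that BLEU is designed as a corpus-level metric; a sentence-level score like this is for illustration only.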
|
Thu 1 Sep |
Probabilities and mathematical notation (a short formula refresher follows the list)
- Probability
- Conditional probability
- Bayes’ rule
- Independent events
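The standard definitions behind these topics, stated here as a refresher (in LaTeX notation, not course-specific):

    P(A \mid B) = \frac{P(A \cap B)}{P(B)}, \quad P(B) > 0
    \text{Bayes' rule: } P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}
    \text{Independence: } P(A \cap B) = P(A)\, P(B)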
|
Wed 7 Sep |
MT evaluation, the noisy channel model, language models (the central noisy-channel equation is spelled out after the reading list)
Presentation
Mandatory reading
- Koehn, Statistical Machine Translation Ch. 8, sec. 8.2 and 8.4, with errata page
- Jurafsky and Martin, Speech and Language Processing, Sec. 25.3-25.4
- Koehn, Statistical Machine Translation Ch. 4, sec. 4.3
Recommended reading
- Koehn, Statistical Machine Translation sec. 8.3 (for those with sufficient background)
- Kishore Papineni et al.: "Bleu: a Method for Automatic Evaluation of Machine Translation". Full text.
- Chris Callison-Burch, Miles Osborne and Philipp Koehn: "Re-evaluating the Role of Bleu in Machine Translation Research". Full text.
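The noisy channel model covered in this lecture rests on one central equation (the standard formulation, consistent with the readings): translating a foreign sentence f into English e is framed as

    \hat{e} = \operatorname*{argmax}_{e} P(e \mid f) = \operatorname*{argmax}_{e} P(f \mid e)\, P(e)

where P(e) is the language model and P(f | e) the translation model; the denominator P(f) is constant in e and can be dropped.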
|
Thu 8 Sep |
|
Wed 14 Sep |
Word-based models and alignment (a toy IBM Model 1 sketch follows the reading list)
Presentation
Mandatory reading
- Jurafsky and Martin, Speech and Language Processing, Sec. 25.4-25.6
- Koehn, Statistical Machine Translation Ch. 4:
- Sec. 4.1-4.2 (except the technical details of 4.2.4)
- Sec. 4.4 (except technical details, to be explained)
- Sec. 4.5-4.6
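For orientation, here is a toy sketch of the EM training loop for IBM Model 1 as presented in Koehn Ch. 4; the corpus, iteration count, and variable names are made up for illustration, and the NULL token is omitted:

    from collections import defaultdict

    # Toy parallel corpus: (foreign sentence, English sentence) pairs.
    corpus = [
        (["das", "haus"], ["the", "house"]),
        (["das", "buch"], ["the", "book"]),
        (["ein", "buch"], ["a", "book"]),
    ]

    # Uniform initialisation of the translation probabilities t(f|e).
    f_vocab = {f for fs, _ in corpus for f in fs}
    t = defaultdict(lambda: 1.0 / len(f_vocab))

    for _ in range(10):                  # EM iterations
        count = defaultdict(float)       # expected counts c(f, e)
        total = defaultdict(float)       # expected counts c(e)
        for fs, es in corpus:            # E-step: collect expected counts
            for f in fs:
                z = sum(t[(f, e)] for e in es)
                for e in es:
                    delta = t[(f, e)] / z
                    count[(f, e)] += delta
                    total[e] += delta
        for (f, e), c in count.items():  # M-step: re-estimate t(f|e)
            t[(f, e)] = c / total[e]

    print(t[("haus", "house")])          # converges towards 1.0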
|
Thu 15 Sep |
No class |
Wed 21 Sep |
More on alignment, higher-order models
|
Thu 22 Sep |
Completing obligatory assignment 1
|
Fri 23 Sep |
|
Wed 28 Sep |
Phrase-based alignment (a phrase-extraction sketch follows the reading list)
Presentation
Mandatory reading
- Jurafsky and Martin, Speech and Language Processing, Sec. 25.4, 25.6
- Koehn, Statistical Machine Translation Ch. 5, except
- The technical details of sec. 5.3.3-5.3.6
- Sec. 5.5
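As a rough companion to the chapter, a simplified sketch of consistent phrase-pair extraction from a word alignment; the function name and the max_len limit are illustrative, and the usual extension over unaligned boundary words is left out:

    def extract_phrases(alignment, e_len, max_len=4):
        """alignment: set of (f_pos, e_pos) word-alignment points."""
        phrases = set()
        for e_start in range(e_len):
            for e_end in range(e_start, min(e_start + max_len, e_len)):
                # Foreign positions aligned into the English span.
                f_points = [f for (f, e) in alignment if e_start <= e <= e_end]
                if not f_points:
                    continue
                f_start, f_end = min(f_points), max(f_points)
                # Consistency: no alignment point may leave the box.
                if all(e_start <= e <= e_end
                       for (f, e) in alignment if f_start <= f <= f_end):
                    phrases.add(((f_start, f_end), (e_start, e_end)))
        return phrases

    # Example: "das haus" / "the house" with a one-to-one alignment.
    print(extract_phrases({(0, 0), (1, 1)}, e_len=2))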
|
Thu 29 Sep |
|
Wed 5 Oct |
Decoding (a toy beam-search decoder follows the reading list)
Presentation
Mandatory reading
- Jurafsky and Martin, Speech and Language Processing, Sec. 25.8
- Koehn, Statistical Machine Translation Ch. 6: sec. 6.0-6.3
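To make the hypothesis-expansion idea concrete, a toy monotone phrase-based decoder with beam pruning; the phrase table, probabilities, and beam size are invented, and reordering and the language model are omitted:

    import math

    # Invented phrase table: foreign phrase -> [(English phrase, log-prob)].
    phrase_table = {
        ("das",): [("the", math.log(0.7)), ("this", math.log(0.3))],
        ("haus",): [("house", math.log(0.8))],
        ("das", "haus"): [("the house", math.log(0.6))],
    }

    def decode(f_words, beam=3):
        # Hypothesis: (log-prob, next uncovered foreign position, output).
        hyps = [(0.0, 0, [])]
        for _ in range(len(f_words)):
            new = []
            for lp, i, out in hyps:
                if i == len(f_words):          # already complete
                    new.append((lp, i, out))
                    continue
                for j in range(i + 1, len(f_words) + 1):
                    for e, p in phrase_table.get(tuple(f_words[i:j]), []):
                        new.append((lp + p, j, out + [e]))
            hyps = sorted(new, key=lambda h: h[0], reverse=True)[:beam]
        done = [h for h in hyps if h[1] == len(f_words)]
        if not done:
            return None
        return " ".join(max(done, key=lambda h: h[0])[2])

    print(decode(["das", "haus"]))  # "the house"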
|
Thu 6 Oct |
|
Wed 12 Oct |
Refinements
Presentation
Mandatory reading
- Koehn, Statistical Machine Translation Ch. 2:
- Sec. 2.1.1 Tokenization
- Sec. 2.3 Corpora
- Koehn, Statistical Machine Translation Ch. 9:
- Sec. 9.0
- Sec. 9.2, except 9.2.4
- Sec. 9.3 up to "finding threshold points", p. 266
- Koehn, Statistical Machine Translation Ch. 10:
- Sec. 10.0
- Sec. 10.1.1 and 10.1.2
- Sec. 10.2
Recommended reading
- Koehn, Statistical Machine Translation Ch. 2
|
Thu 13 Oct |
No class! Dagen at IFI |
Wed 19 Oct |
Alternative translation strategies
|
Thu 20 Oct |
Work on obligatory assignment 2 |
Distributional Semantics: Extracting Meaning from Data
|
Wed 26 Oct |
Introduction: linguistic foundations of distributional semantics (a tiny vector-space example follows the reading list)
Lecturer: Andrei Kutuzov
Presentation
Mandatory reading
- (Optional) Distributional Structure. Zellig Harris, 1954.
- The Distributional Hypothesis. Magnus Sahlgren, 2008.
- Speech and Language Processing. Daniel Jurafsky and James Martin. 3rd edition draft of April 9, 2016. Chapter 15, 'Vector Semantics'.
- From Frequency to Meaning: Vector Space Models of Semantics. Peter Turney and Patrick Pantel, 2010. Skip Section 5.
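As a tiny illustration of the vector-space idea from the Jurafsky and Martin chapter (the co-occurrence counts below are invented):

    import numpy as np

    # Invented co-occurrence vectors over four context words.
    digital = np.array([0.0, 1.0, 2.0, 1.0])
    information = np.array([1.0, 6.0, 4.0, 0.0])

    def cosine(u, v):
        """Cosine similarity between two count vectors."""
        return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

    print(cosine(digital, information))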
|
Thu 27 Oct |
|
Wed 2 Nov |
Distributional and distributed: inner mechanics of modern word embedding models (including word2vec); a skip-gram update sketch follows the reading list
Lecturer: Andrei Kutuzov
Presentation
Mandatory reading
- (Optional) A neural probabilistic language model. Bengio, Yoshua, et al., 2003
- Extracting semantic representations from word co-occurrence statistics: A computational study. Bullinaria, John A., and Joseph P. Levy, 2007
- Distributed representations of words and phrases and their compositionality. Mikolov, Tomas, et al., 2013.
- Word2vec parameter learning explained. Rong, Xin, 2014
- Speech and Language Processing. Daniel Jurafsky and James Martin. 3rd edition draft of April 11, 2016. Chapter 16, 'Semantics with dense vectors'.
- (Optional) Glove: Global Vectors for Word Representation. Pennington, Jeffrey, Richard Socher, and Christopher D. Manning, 2014.
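For those who want the mechanics in code, a compact sketch of one skip-gram negative-sampling update in the spirit of Rong (2014); the dimensions, learning rate, and word indices are placeholders:

    import numpy as np

    rng = np.random.default_rng(0)
    V, d = 100, 16                       # vocabulary size, vector dimension
    W_in = rng.normal(0, 0.1, (V, d))    # input (word) vectors
    W_out = rng.normal(0, 0.1, (V, d))   # output (context) vectors

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def sgns_step(center, context, negatives, lr=0.025):
        """One stochastic update for a (center, context) pair."""
        v = W_in[center].copy()
        grad_v = np.zeros_like(v)
        # The true context gets label 1, sampled negatives get label 0.
        for c, label in [(context, 1.0)] + [(n, 0.0) for n in negatives]:
            g = sigmoid(v @ W_out[c]) - label   # log-loss gradient
            grad_v += g * W_out[c]
            W_out[c] -= lr * g * v
        W_in[center] -= lr * grad_v

    sgns_step(center=3, context=17, negatives=[42, 7, 93])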
|
Thu 3 Nov |
No class |
Wed 9 Nov |
Practical aspects of training and using distributional models (a minimal Gensim example follows the reading list)
Presentation
Mandatory reading
- Don't count, predict! A systematic comparison of context-counting vs. context-predicting semantic vectors. Marco Baroni, Georgiana Dinu, and Germán Kruszewski. ACL 2014.
- Improving Distributional Similarity with Lessons Learned from Word Embeddings. Omer Levy, Yoav Goldberg, and Ido Dagan. TACL 2015.
- SimLex-999: Evaluating Semantic Models with (Genuine) Similarity Estimation. Felix Hill, Roi Reichart and Anna Korhonen. Computational Linguistics. 2015
- (Optional) Correlation-based Intrinsic Evaluation of Word Vector Representations. Yulia Tsvetkov, Manaal Faruqui, and Chris Dyer. RepEval, ACL 2016
- Word2vec in Gensim tutorial
- (Optional) Vector representation of words in TensorFlow
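In the spirit of the Gensim tutorial linked above, a minimal training example; parameter names follow Gensim 4.x (older releases use size instead of vector_size), and the two-sentence corpus is a toy stand-in:

    from gensim.models import Word2Vec

    sentences = [
        ["the", "cat", "sat", "on", "the", "mat"],
        ["the", "dog", "sat", "on", "the", "rug"],
    ]
    # sg=1 selects skip-gram; min_count=1 keeps every toy word.
    model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1)
    print(model.wv.most_similar("cat", topn=3))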
|
Thu 10 Nov |
Work on obligatory assignment 3.
Slides on setting up your isolated Python environment on the IFI cluster.
|
Wed 16 Nov |
Beyond words: distributional representations of texts (a Doc2Vec sketch follows the reading list)
Presentation
Mandatory reading
- Distributed Representations of Sentences and Documents. Quoc Le, Tomas Mikolov. ICML 2014
- Learning Distributed Representations of Sentences from Unlabelled Data. Felix Hill, Kyunghyun Cho, Anna Korhonen. arXiv:1602.03483, 2016
- (Optional) Composition in Distributional Models of Semantics. Jeff Mitchell, Mirella Lapata. Cognitive Science, 2010
- (Optional) Document Classification by Inversion of Distributed Language Representations. Matt Taddy. ACL 2015
- (Optional) An Empirical Evaluation of doc2vec with Practical Insights into Document Embedding Generation. Jey Lau, Timothy Baldwin. ACL 2016
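A minimal Gensim Doc2Vec sketch in the spirit of Le and Mikolov (2014); the documents, tags, and parameters are invented, and model.dv is Gensim 4.x naming (older releases use docvecs):

    from gensim.models.doc2vec import Doc2Vec, TaggedDocument

    docs = [
        TaggedDocument(words=["machine", "translation", "is", "fun"], tags=["d0"]),
        TaggedDocument(words=["word", "vectors", "capture", "meaning"], tags=["d1"]),
    ]
    model = Doc2Vec(docs, vector_size=50, min_count=1, epochs=40)

    # Infer a vector for an unseen document and find its nearest neighbour.
    vec = model.infer_vector(["translation", "with", "vectors"])
    print(model.dv.most_similar([vec], topn=1))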
|
Thu 17 Nov |
Work on obligatory assignment 3. |
Wed 23 Nov |
Kings and queens, men and women: semantic relations between word embeddings (an analogy example follows the reading list)
Presentation
Mandatory reading
- Exploiting similarities among languages for machine translation. Tomas Mikolov, Quoc Le, Ilya Sutskever. arXiv:1309.4168, 2013
- Diachronic Word Embeddings Reveal Statistical Laws of Semantic Change. William Hamilton, Jure Leskovec, Dan Jurafsky. ACL 2016
- (Optional) Do Supervised Distributional Methods Really Learn Lexical Inference Relations? Omer Levy et al. NAACL 2015
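The title's king/queen analogy in code, assuming a trained Gensim model such as the one from the 9 Nov sketch (or pre-trained vectors loaded into model):

    # king - man + woman should land near "queen" in a well-trained model.
    print(model.wv.most_similar(positive=["king", "woman"],
                                negative=["man"], topn=1))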
|
Thu 24 Nov |
No class |
Wed 30 Nov |
What's going on: recent advances and trends in the word embeddings world
(+ exam information and discussion of the obligatory assignment)
Presentation
Mandatory reading
- (Optional) Defining words with words: beyond the distributional hypothesis. Pontus Stenetorp et al. RepEval, ACL 2016.
|
Thu 1 Dec |
Discussion of exam-like problems.
Attention: the room has changed to Datastue Fortress!
|
Thu 8 Dec |
Discussion of exam problems in MT.
Attention: the room has changed to seminar room Pascal!
|