Syllabus

Textbook: Azzalini & Scarpa (2012). Data Analysis and Data Mining: An Introduction. Oxford University Press

For all chapters, the main focus is on understanding and interpreting the different methods. Where mathematics is present, the points that you should at least know are listed. In addition, you should look at the exercises that include mathematical derivations.

In addition to the chapters listed below, all exercises given are also part of the syllabus.

Some links to relevant alternative reading in the ISLR book are given. Note, however, that not all topics we have been through are covered there.

Ch. 1: Introduction (not sec 1.1.3)

  • Mainly describing the setting
  • a statistical versus a machine learning approach;
  • Alternative reading: Ch 1 and sec 2.1 of ISLR

Ch. 2: A–B–C (not sec 2.2.1)

  • linear regression;
    • Mathematics: You should be able to derive the recursive equations in sec 2.2.3 and to show that, for linear regression, the ML estimates and the LS estimates coincide
    • Alternative reading: Ch 3 of ISLR
  • logistic regression
    • Mathematics: You should be able to derive the likelihood
    • Alternative reading: Sec 4.2-3 of ISLR
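
As a reminder of the kind of derivation meant for Ch. 2, here is a minimal sketch. The notation (y_i, x_i, beta, sigma^2, pi_i) and the assumptions of independent Gaussian errors and a 0/1 response are standard choices made for illustration; they are not copied from the textbook.

```latex
% Linear regression, assuming y_i = x_i^T \beta + \varepsilon_i with
% \varepsilon_i \sim N(0, \sigma^2) independent (illustrative notation).
\begin{align*}
\ell(\beta, \sigma^2)
  &= -\frac{n}{2}\log(2\pi\sigma^2)
     - \frac{1}{2\sigma^2}\sum_{i=1}^{n} (y_i - x_i^\top \beta)^2 .
\end{align*}
% For any fixed \sigma^2 > 0, maximizing \ell over \beta is the same as
% minimizing the residual sum of squares, so the ML and LS estimates of
% \beta coincide. If X^\top X is invertible:
\begin{align*}
\hat\beta_{\mathrm{ML}} = \hat\beta_{\mathrm{LS}} = (X^\top X)^{-1} X^\top y .
\end{align*}

% Logistic regression, assuming y_i \in \{0, 1\} independent with
% \pi_i = P(y_i = 1 \mid x_i) = \exp(x_i^\top\beta) / (1 + \exp(x_i^\top\beta)).
\begin{align*}
L(\beta) &= \prod_{i=1}^{n} \pi_i^{\,y_i} (1 - \pi_i)^{1 - y_i}, \\
\ell(\beta) &= \sum_{i=1}^{n}
  \Bigl[ y_i\, x_i^\top \beta - \log\bigl(1 + \exp(x_i^\top \beta)\bigr) \Bigr].
\end{align*}
```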

Ch. 3: Optimism, Conflicts, and Trade-offs

  • optimism, overfitting and bias-variance trade-off;
    • Mathematics: You should be able to show that the MSE can be decomposed into squared bias and variance, and to carry out the derivations related to ridge/lasso
  • Alternative reading: Sec 2.2 and sec 5.1 of ISLR
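
The bias-variance decomposition mentioned above can be sketched as follows. This assumes a fixed (non-random) target value f and an estimator \hat f with finite variance; the symbols are illustrative and may differ from those used in the textbook.

```latex
% Add and subtract E[\hat f] inside the square.
\begin{align*}
\mathrm{MSE}(\hat f)
  &= E\bigl[(\hat f - f)^2\bigr] \\
  &= E\bigl[(\hat f - E[\hat f])^2\bigr]
     + \bigl(E[\hat f] - f\bigr)^2
     + 2\,\bigl(E[\hat f] - f\bigr)\,E\bigl[\hat f - E[\hat f]\bigr] \\
  &= \mathrm{Var}(\hat f) + \mathrm{Bias}(\hat f)^2 ,
\end{align*}
% The cross term vanishes because E[\hat f - E[\hat f]] = 0.
```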

Ch. 4: Prediction of Quantitative Variables (excluding secs 4.2.3, 4.4.5 and 4.7.2)

  • Mathematics: Derivation of the weighted least squares estimator
  • Alternative reading: Ch 3, 6, 7, 8 of ISLR
  • Sec 4.10 is mainly support material; details about the specific case studies are not part of the syllabus
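
For the weighted least squares derivation listed above, a minimal sketch is given here. It assumes known positive weights w_i collected in a diagonal matrix W and an invertible X^T W X; the notation is an illustrative choice, not taken from the textbook.

```latex
% Weighted least squares criterion with W = \mathrm{diag}(w_1, \dots, w_n).
\begin{align*}
D(\beta) &= \sum_{i=1}^{n} w_i\,(y_i - x_i^\top \beta)^2
          = (y - X\beta)^\top W\, (y - X\beta), \\
\frac{\partial D}{\partial \beta}
         &= -2\, X^\top W\, (y - X\beta) = 0
  \quad\Longrightarrow\quad
  \hat\beta = (X^\top W X)^{-1} X^\top W y .
\end{align*}
```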

Ch. 5: Methods of Classification (not 5.2.4)

  • Mathematics: Deriving the class probabilities in different settings; deriving the classification rules for LDA and QDA
  • Alternative reading: Ch 4, 8, 9 of ISLR
  • Sec 5.10 is mainly support material; details about the specific case studies are not part of the syllabus
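
The class probabilities and the LDA/QDA rules referred to above can be sketched as follows. This assumes K classes with prior probabilities \pi_k and Gaussian class densities p_k; the symbols are illustrative and not necessarily those of the textbook.

```latex
% Class probabilities by Bayes' theorem.
\begin{align*}
P(k \mid x) &= \frac{\pi_k\, p_k(x)}{\sum_{j=1}^{K} \pi_j\, p_j(x)} .
\end{align*}
% Classify x to the class maximizing \log(\pi_k p_k(x)), dropping terms
% that do not depend on k.
% LDA: p_k = N(\mu_k, \Sigma) with a common covariance matrix.
\begin{align*}
\delta_k(x) &= x^\top \Sigma^{-1} \mu_k
             - \tfrac{1}{2}\, \mu_k^\top \Sigma^{-1} \mu_k + \log \pi_k .
\end{align*}
% QDA: p_k = N(\mu_k, \Sigma_k) with class-specific covariance matrices.
\begin{align*}
\delta_k(x) &= -\tfrac{1}{2} \log \lvert \Sigma_k \rvert
             - \tfrac{1}{2}\, (x - \mu_k)^\top \Sigma_k^{-1} (x - \mu_k)
             + \log \pi_k .
\end{align*}
```

The LDA rule is linear in x because the quadratic term x^T Sigma^{-1} x is the same for all classes and cancels; in QDA it does not cancel, which gives quadratic boundaries.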

Ch. 6: Methods of Internal Analysis

  • Only Section 6.1, clustering
  • Model-based clustering, as presented in the lecture notes
  • Alternative reading: Ch 10 of ISLR

All given exercises

Published May 31, 2021 8:28 AM - Last modified May 31, 2021 8:28 AM