Syllabus/learning requirements

Textbook: Azzalini & Scarpa (2012). Data Analysis and Data Mining: An Introduction. Oxford University Press.

Syllabus:

Ch. 1: Introduction

  • Big Data and high-dimensional data;
  • a statistical versus a machine learning approach.

Ch. 2: A–B–C

  • linear regression;
  • variable transformations;
  • multivariate responses;
  • computational aspects;
  • likelihood-based approaches;
  • logistic regression.

Ch. 3: Optimism, Conflicts, and Trade-offs

  • optimism, overfitting and bias-variance trade-off;
  • data split and cross-validation methods;
  • bootstrapping;
  • information-based criteria;
  • methods for variable selection;
  • principal component analysis and principal component regression;
  • methods of regularization (mainly lasso and ridge regression).

Ch. 4: Prediction of Quantitative Variables

  • k-nearest-neighbors and kernel-based methods;
  • the curse of dimensionality;
  • splines;
  • additive models and generalized additive models;
  • projection pursuit;
  • regression trees;
  • neural networks.

Ch. 5: Methods of Classification

  • classification and performance evaluation;
  • logistic regression;
  • classification via linear regression;
  • linear discriminant analysis;
  • quadratic discriminant analysis;
  • regularized approaches for classification;
  • k-nearest-neighbor for classification;
  • classification trees;
  • neural networks for classification;
  • support vector machines;
  • bagging, AdaBoost and random forest (only general ideas).

Ch. 6: Methods of Internal Analysis

  • cluster analysis;
  • distances and dissimilarities;
  • nonhierarchical methods;
  • hierarchical methods.
Published Dec. 12, 2018 20:41 - Last modified Apr. 17, 2019 16:52