Time and place:
The course consists of two sessions:
Wednesday April 17th, 09:15-12:00, in seminar room Prolog, Ole-Johan Dahls hus
Friday April 19th, 09:15-12:00, in seminar room Prolog, Ole-Johan Dahls hus
Language:
English
Target audience:
UiO researchers and students who want to get started with machine learning in R.
A video of approximately 25 minutes, with example use cases in research, has been prepared and may be useful for those who are completely new to machine learning.
Prerequisites:
Experience writing code in R is an advantage but not required. Basic knowledge of descriptive statistics and tidyverse is a plus.
Contents:
- Exploratory data analysis
- Binary classification
- Feature importance
- Multiclass classification
- Cross-validation
- Additional topics
  - Preprocessing data with "recipe"
  - Building and evaluating multiple models simultaneously
  - Statistically comparing models
  - Hyperparameter tuning
  - Predicting a continuous variable
About the course:
The focus will be on building and evaluating machine learning models in R rather than an in-depth breakdown of specific algorithms. We will be building models to distinguish between different categories of text based on linguistic features (including number of nouns, adjectives, etc.) using XGBoost.
Important: Participants must use their own PC or Mac (laptop) with both R and RStudio installed. Both R (≥ 3.3.0) and RStudio are free and do not require a licence. R can be installed from https://cran.r-project.org and RStudio from https://www.rstudio.com/products/rstudio/download/.
Contact IT support at your faculty or department if you need help with installation. You can use UiO Programkiosk ("Statistikk fullskjerm") if it is not possible to install R or RStudio on your own computer.
Install the following packages in R (or RStudio) before the start of the course:
tidyverse, tidymodels, xgboost, vip, patchwork, workflowsets
Extra packages: doParallel, discrim, kernlab
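The installation can also be scripted. The following is a minimal sketch, assuming a working internet connection and a default CRAN mirror (RStudio sets one automatically). The package names and the R version requirement come from this page; the names `course_pkgs`, `extra_pkgs` and `install_missing` are made up for illustration.

```r
# Package names are taken from the lists above; the variable and function
# names in this sketch are illustrative only.
course_pkgs <- c("tidyverse", "tidymodels", "xgboost",
                 "vip", "patchwork", "workflowsets")
extra_pkgs <- c("doParallel", "discrim", "kernlab")

# The course requires R 3.3.0 or newer; stop early if this R is older.
stopifnot(getRversion() >= "3.3.0")

# Install only the packages that are not already present.
# Returns (invisibly) the names of the packages that were missing.
install_missing <- function(pkgs) {
  missing <- setdiff(pkgs, rownames(installed.packages()))
  if (length(missing) > 0) {
    install.packages(missing)
  }
  invisible(missing)
}
```

Running `install_missing(c(course_pkgs, extra_pkgs))` in the R or RStudio console should then install whatever is not yet on your computer. Packages that are already installed are left untouched, so it is safe to run more than once.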