MF9385 – Introduction to machine learning in biomedical research

Schedule, syllabus and examination date

Course content

The course will provide an introduction to machine learning (ML), with a focus on methods commonly encountered in biomedical research. Modern biomedical research is characterised by an abundance of data from diverse sources, e.g. medical imaging, large-scale in vitro drug screens, or various omics outputs from high-throughput technologies such as massively parallel sequencing. Statistical and machine learning methods are essential for analysing such data and fully exploiting their explanatory and predictive potential. For example, UMAP and other unsupervised ML methods help visualise single-cell RNA-seq data to identify different cell types; multi-omics biomarker signatures derived by supervised ML methods such as random forests can predict a patient’s survival prognosis or treatment response; and both supervised and unsupervised deep learning methods can be used to segment and interpret medical images. Nonetheless, a critical understanding of the aims, characteristics, and potential of these ML methods is essential to use them correctly.

This course will introduce you to this fascinating and quickly evolving field of science, starting with an overview of unsupervised, supervised and reinforcement learning, and of typical tasks where ML is important, such as pattern recognition, prediction, and classification. We will discuss the available choices of loss functions, why they matter, and how to decide what makes a good machine learning model. Setups for correct (unbiased) evaluation of model performance will be introduced, including training/test splits and resampling methods such as cross-validation.
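To make the idea of resampling-based evaluation concrete, the following is a minimal sketch of k-fold cross-validation. It is an illustration only, not course material: the course itself uses R, whereas this sketch uses plain Python, and the names `k_fold_indices`, `cross_validate`, `fit_mean` and `mse`, as well as the toy data, are assumptions made for this example.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Split the indices 0..n_samples-1 into k disjoint, shuffled folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most 1.
    return [indices[i::k] for i in range(k)]


def cross_validate(xs, ys, k, fit, loss, seed=0):
    """Average the held-out loss over k folds.

    Each fold serves as the test set exactly once; the model is always
    fitted on the remaining folds only, so test data never leak into training.
    """
    folds = k_fold_indices(len(xs), k, seed)
    fold_losses = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(len(xs)) if i not in held_out]
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        predictions = [model(xs[i]) for i in test_idx]
        fold_losses.append(loss([ys[i] for i in test_idx], predictions))
    return sum(fold_losses) / k


def fit_mean(train_x, train_y):
    """Hypothetical baseline model: always predict the training mean."""
    mean_y = sum(train_y) / len(train_y)
    return lambda x: mean_y


def mse(y_true, y_pred):
    """Mean squared error, one possible choice of loss function."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy data (made up for this example): y depends linearly on x.
xs = list(range(10))
ys = [2.0 * x + 1.0 for x in xs]
cv_error = cross_validate(xs, ys, k=5, fit=fit_mean, loss=mse)
print(f"5-fold CV estimate of MSE: {cv_error:.2f}")
```

Any model-fitting function and loss with the same call pattern can be passed as `fit` and `loss`, which keeps the evaluation setup separate from the choice of method.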

We will dive into some of the most commonly used ML methods in more detail, with a focus on concepts and key properties rather than mathematical detail, and we will identify some of the main characteristics that make these methods so successful.

For unsupervised learning, we will cover two common tasks that are important to pattern recognition:

  1. Clustering, e.g. k-means, hierarchical clustering, partitioning methods, self-organising maps
  2. Dimension reduction, e.g. principal component analysis, non-negative matrix factorisation, independent component analysis, uniform manifold approximation and projection (UMAP)

For supervised learning, the main task is often prediction or classification. While many methods exist, their success is due to the same key building blocks:

  1. Regularisation, e.g. in penalised regression, Bayesian hierarchical models
  2. Tree-based methods, e.g. in random forests
  3. Kernel-based methods, e.g. in support vector machines
  4. Ensemble methods, e.g. boosting, bagging
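As a small illustration of the first building block, the sketch below shows how a ridge (L2) penalty shrinks a regression coefficient towards zero. It is a simplified example under stated assumptions, not course material: it uses plain Python rather than R, a single predictor without an intercept, and the hypothetical names `ridge_slope`, `xs`, `ys` and `penalty` with made-up data.

```python
def ridge_slope(xs, ys, penalty):
    """Slope b minimising sum((y - b*x)^2) + penalty * b^2.

    Single predictor, no intercept. Setting the derivative to zero gives
    the closed form b = sum(x*y) / (sum(x^2) + penalty).
    """
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + penalty)


# Made-up, centred toy data with a slope close to 2.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [-4.1, -1.9, 0.2, 2.1, 3.9]

# penalty = 0 is ordinary least squares; larger penalties shrink the slope.
for penalty in (0.0, 1.0, 10.0, 100.0):
    print(f"penalty={penalty:6.1f}  slope={ridge_slope(xs, ys, penalty):.3f}")
```

In practice the penalty would be chosen by a resampling method such as cross-validation rather than fixed by hand.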

While the unsupervised and supervised ML methods mentioned above are static, i.e., they assume that all the data are available at the same time and can be used for training the model, reinforcement learning methods can be used if the data are constantly being updated over time (streaming data).

Deep neural networks (deep learning) have revolutionised ML in recent years. The availability of huge amounts of data and increasing computing power have enabled the training of enormously complex multi-layer neural networks with millions of parameters. Such methods have drastically improved model performance, which has contributed to the rise of artificial intelligence in many areas of daily life, including medicine. Examples are convolutional neural networks for medical image analysis and transformers in large language models.

Learning outcome

Knowledge

  • Understand the concepts of supervised learning, unsupervised learning, and reinforcement learning.
  • Know which steps are involved in a typical ML task and which pitfalls need to be avoided.
  • Understand the issue of overfitting and the roles of cross-validation and of training, validation, and test sets.
  • Be able to identify appropriate methods for a given problem, and to perform common ML tasks using appropriate R packages.
  • Be able to assess ML methods you read about and to put them in the wider context.
  • Be able to assess the performance of models as it is typically reported in publications.

Skills

  • Be able to specify appropriate ML models for new research questions.
  • Implement ML models using appropriate R packages.
  • Be able to understand and critically evaluate ML procedures in biomedical research publications.

Admission to the course

Applicants admitted to a PhD programme at UiO sign up for classes and the exam for this course in StudentWeb.

Applicants who are not admitted to a PhD programme at UiO must apply for a right to study before they can sign up for classes and the exam for this course. See information here: How to apply for a right to study and admission to elective PhD courses in medicine and health sciences

Upon registration, applicants will receive an immediate reply in StudentWeb as to whether or not they have been granted a seat on this course.

Formal prerequisite knowledge

Passed exam in an introductory statistics course that includes multiple regression (e.g. MF9130/MF9130E at UiO).

A solid background in statistics is beneficial, e.g. through successful participation in more advanced statistics courses beyond a single introductory course.

We expect students to have a working knowledge of, and practical experience in, analysing data with the statistical programming language R and the development environment RStudio.

Overlapping courses

Teaching

The teaching will be organized as an intensive course over five days.

There will be lectures coupled with hands-on practical sessions and example data analyses in the computer labs as well as group project work.

Students should allocate sufficient time for course preparations, including recommended reading.

You have to participate in at least 80 % of the teaching to be allowed to take the exam. Attendance will be registered.

Examination

Individual written assignment in the form of a comprehensive practical data analysis task. To be submitted four weeks after the course.

The candidates will be required to submit:

a) a paper that describes and interprets the data analysis results, and

b) all code with documentation.

Both parts will be included in the exam evaluation.

Language of examination

The examination text is given in English, and you submit your response in English.

Grading scale

Grades are awarded on a pass/fail scale. Read more about the grading system.

More about examinations at UiO

You will find further guides and resources at the web page on examinations at UiO.

Last updated from FS (Common Student System) Dec. 24, 2024 3:37:41 AM

Facts about this course

Level: PhD
Credits: 5
Teaching: Spring

Sign up period spring 2025: StudentWeb opens for registration 1.12.2024. The sign up deadline is published on the semester page.

Teaching: Dates will be posted on the semester page.

Examination: Spring
Teaching language: English