INF-STK5010 – Statistical Bioinformatics - Learning from big data in the life sciences
Course description
Course content
The course focuses on statistical and bioinformatics methods commonly used to analyze big datasets in the life sciences. Examples are taken from single nucleotide polymorphism (SNP) analyses, high-throughput sequence analyses, expression analyses, genome-wide association studies (GWAS) and other genomics-related fields. The methods taught include quality control of big datasets, statistical testing and significance for big datasets, clustering, classification, multidimensional analysis, visualization, resampling, bootstrapping and network analysis. The statistical environment R will be used. The course is intended both for students with a background in biology or medicine who wish to extend their knowledge into the statistical domain, and for statisticians or computer scientists who wish to gain insight into how statistical methods are used for big datasets in general and within biology in particular.
Learning outcome
- The student will get an overview of the unique statistical problems that accompany big datasets within the life sciences.
- The student will gain knowledge about how big datasets can be analyzed with statistical and bioinformatics methods.
- By the end of the course, the student will be able to write their own workflows in R and use several existing R packages to analyze their own big datasets.
Admission
Students who are admitted to study programmes at UiO must register each semester in Studentweb for the courses and exams they wish to take.
If you are not already enrolled as a student at UiO, please see our information about admission requirements and procedures.
Prerequisites
Recommended previous knowledge
There are no formal requirements. The first two weeks will consist of introductory lectures. Students with a background in statistics or computer science will get an introduction to biology and biological datasets. Students with a background in biology or medicine will get an introduction to statistics and the R environment. The two sets of introductory lectures will not take place at the same time, so students who want to follow both introductions are welcome to do so.
Overlapping courses
- 10 credits overlap with INF-STK9010 – Statistical Bioinformatics - Learning from big data in the life sciences (discontinued)
- 3 credits overlap with STK4030 – Statistical Learning: Advanced Regression and Classification (discontinued)
- 3 credits overlap with STK9030 – Statistical Learning: Advanced Regression and Classification (discontinued)
If needed, it is possible to take a special syllabus worth three credits.
Teaching
The course has 2 hours of lectures and 2 hours of exercises per week. Mandatory assignments must be completed during the course and are subject to the rules for mandatory assignments.
Examination
4-hour written exam. The mandatory assignments must be approved prior to the exam.
Examination support material
No examination support material is allowed.
Language of examination
The examination text is given in English, and you submit your response in English.
Grading scale
Grades are awarded on a scale from A to F, where A is the best grade and F is a fail. Read more about the grading system.
Resit an examination
Students who can document a valid reason for absence from the regular examination are offered a postponed examination at the beginning of the next semester.
Re-scheduled examinations are not offered to students who withdraw during, or do not pass, the original examination.
Special examination arrangements
Application form, deadline and requirements for special examination arrangements.
Evaluation
The course is subject to continuous evaluation. At regular intervals we also ask students to participate in a more comprehensive evaluation.