Project 1: Predicting the monthly number of battle deaths (Gudmund Hermansen - gudmunhh@math.uio.no). In this project you will explore various statistical and machine learning models for the monthly number of battle deaths in a given country. We will work with conflict data from https://viewsforecasting.org/, which is based on https://ucdp.uu.se/ (a comprehensive database of conflict data), and investigate battle-death time series for several countries. The main focus of the project is to build a model that predicts the number of battle deaths in the following month, and to learn techniques for evaluating and comparing the performance of competing prediction models. No prior knowledge of time series analysis is required, and a significant part of the project will be practical data analysis and exploration.
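As a rough illustration of what evaluating and comparing competing one-month-ahead predictions can look like, the sketch below scores two simple baselines by mean absolute error on a rolling basis. The series is made up for illustration and is not taken from the VIEWS or UCDP data; the function names and the choice of error measure are assumptions, not part of the project description.

```python
def one_step_mae(series, forecaster, start):
    """Mean absolute error of one-step-ahead forecasts.

    At each month t >= start, the forecaster only sees series[:t],
    so no future information leaks into the prediction.
    """
    errors = []
    for t in range(start, len(series)):
        prediction = forecaster(series[:t])
        errors.append(abs(series[t] - prediction))
    return sum(errors) / len(errors)


def last_value(history):
    """Baseline 1: predict that next month equals the latest month."""
    return history[-1]


def moving_average(history, window=3):
    """Baseline 2: predict the mean of the most recent `window` months."""
    recent = history[-window:]
    return sum(recent) / len(recent)


# Made-up monthly counts, for illustration only.
deaths = [0, 2, 5, 3, 0, 0, 12, 9, 4, 1, 0, 7]

mae_last = one_step_mae(deaths, last_value, start=3)
mae_ma = one_step_mae(deaths, moving_average, start=3)
print(f"last value MAE: {mae_last:.3f}, moving average MAE: {mae_ma:.3f}")
```

Any candidate model can be plugged in as a `forecaster` function, which makes it easy to compare it against such baselines on exactly the same months.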
Project 2: Ridge regression for sparse data (Gudmund Hermansen - gudmunhh@math.uio.no). In this project, you will work through the main theory behind what is known as ridge regression. Ridge is one of several popular regularisation techniques used in statistics and machine learning. We can interpret ridge as giving additional weights (which could also be zero) to the input features, or covariates, in a regression model. The project will be based on the lecture notes https://arxiv.org/pdf/1509.09169.pdf. In addition to understanding the theory underlying ridge regression, the main focus of this project is to understand the effect of regularisation with sparse input data. Imagine estimating a linear regression model when some of the input features are mostly 0 (sparse). Regularisation techniques, such as ridge, have a tendency to give such sparse features either too high or too low a weight, and we will investigate potential solutions to this problem.
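To make the sparse-feature issue concrete, here is a minimal sketch that solves the ridge normal equations (X'X + lambda I) beta = X'y for two features without an intercept. The data are synthetic and noise-free, and the function name and the penalty value 5.0 are assumptions chosen for illustration. The example only shows the "too low" case: with the same true coefficient, the mostly-zero feature is shrunk far more than the dense one.

```python
def ridge_two_features(x1, x2, y, lam):
    """Closed-form ridge estimate for two features, no intercept.

    Solves (X'X + lam * I) beta = X'y using an explicit 2x2 inverse.
    With lam = 0 this reduces to ordinary least squares.
    """
    a = sum(v * v for v in x1) + lam          # x1'x1 + lam
    b = sum(u * v for u, v in zip(x1, x2))    # x1'x2
    d = sum(v * v for v in x2) + lam          # x2'x2 + lam
    c1 = sum(u * v for u, v in zip(x1, y))    # x1'y
    c2 = sum(u * v for u, v in zip(x2, y))    # x2'y
    det = a * d - b * b
    return ((d * c1 - b * c2) / det, (a * c2 - b * c1) / det)


n = 100
# Dense feature: nonzero in every row.
x_dense = [1.0 if i % 2 == 0 else -1.0 for i in range(n)]
# Sparse feature: nonzero in only 4 of 100 rows.
x_sparse = [1.0 if i < 4 else 0.0 for i in range(n)]
# Synthetic response; both true coefficients equal 2.
y = [2.0 * u + 2.0 * v for u, v in zip(x_dense, x_sparse)]

beta_ols = ridge_two_features(x_dense, x_sparse, y, lam=0.0)
beta_ridge = ridge_two_features(x_dense, x_sparse, y, lam=5.0)
print("OLS:  ", beta_ols)
print("ridge:", beta_ridge)
```

The two columns are orthogonal here, so each coefficient is shrunk by the factor s / (s + lambda), where s is the sum of squares of that column. The sparse column has a small s, which is why the same penalty hits it much harder.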
Project 3: Machine learning and high frequency financial time series (Gudmund Hermansen - gudmunhh@math.uio.no). In this project you will compare more traditional statistical models developed for high frequency financial time series with competing methods from machine learning. You will work with several examples of high frequency tick data from foreign exchange, and explore possibilities and limitations of both approaches.
Project 4: Principal component analysis (PCA) for high frequency data (Gudmund Hermansen - gudmunhh@math.uio.no). This project is based on the work in http://galton.uchicago.edu/~mykland/paperlinks/PCA_v20180204.pdf. The first part of the project will be to read, understand and summarise the paper. The second part of the project focuses on implementation of the PCA method on high frequency foreign exchange data.
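For orientation only, one common starting point for PCA on high frequency data is the eigen-decomposition of a realized covariance matrix built from log returns. The sketch below does this for two price series; it is not claimed to be the estimator used in the paper, and the prices and function names are made up for illustration.

```python
import math


def log_returns(prices):
    """Log returns between consecutive price observations."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def realized_covariance(r1, r2):
    """Entries (a, b, d) of the 2x2 realized covariance [[a, b], [b, d]].

    Realized (co)variance sums products of returns without demeaning.
    """
    a = sum(x * x for x in r1)
    b = sum(x * y for x, y in zip(r1, r2))
    d = sum(y * y for y in r2)
    return a, b, d


def eigenvalues_sym_2x2(a, b, d):
    """Eigenvalues of the symmetric matrix [[a, b], [b, d]], largest first."""
    mid = (a + d) / 2.0
    rad = math.sqrt(((a - d) / 2.0) ** 2 + b * b)
    return mid + rad, mid - rad


# Made-up tick prices for two hypothetical currency pairs.
pair_1 = [1.1000, 1.1002, 1.1001, 1.1005, 1.1003, 1.1006]
pair_2 = [1.3000, 1.3003, 1.3001, 1.3006, 1.3004, 1.3007]

a, b, d = realized_covariance(log_returns(pair_1), log_returns(pair_2))
lam1, lam2 = eigenvalues_sym_2x2(a, b, d)
share_first = lam1 / (lam1 + lam2)
print(f"share of variation explained by the first component: {share_first:.3f}")
```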
Project 5: Applied data analysis and statistical modelling for a Kaggle-like competition or dataset (Gudmund Hermansen - gudmunhh@math.uio.no). In this project you will carry out applied data analysis and predictive modelling. You may choose a competition or dataset that interests you from one of the popular data science platforms: https://www.kaggle.com/, https://www.topcoder.com/thrive/tracks?track=Data%20Science or https://archive.ics.uci.edu/datasets. You will then perform a preliminary data analysis, followed by careful statistical modelling and inference, and finally evaluate the predictions and explain the results.
Project 6: Climate change and rain (Thordis Thorarinsdottir - thordist@math.uio.no). According to climate projections, Norway will become warmer and wetter over the course of the century. There will be more precipitation overall, and events with heavy rainfall will increase in magnitude and occur more frequently. In this project, we will investigate to what extent such changes can already be observed in Oslo.
We will use statistical methods to analyse over 50 years of hourly and daily observed precipitation data from Oslo Blindern. We will investigate various aspects of precipitation patterns related to both how often it rains and how much it rains each time, and search for changes in each of these patterns.
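The distinction between how often it rains and how much it rains each time can be sketched as two summaries per year: the share of wet days and the mean amount on wet days. The numbers below are made up and are not Blindern observations, and the 1 mm wet-day threshold is an assumption (a commonly used convention, not something specified in the project).

```python
def wet_day_summary(daily_mm, threshold=1.0):
    """Return (wet-day frequency, mean precipitation on wet days).

    A day counts as wet when its precipitation is at least `threshold` mm.
    """
    wet = [x for x in daily_mm if x >= threshold]
    frequency = len(wet) / len(daily_mm)
    intensity = sum(wet) / len(wet) if wet else 0.0
    return frequency, intensity


# Made-up daily precipitation (mm) for two hypothetical years, 10 days each.
daily_by_year = {
    1970: [0.0, 2.0, 0.0, 0.5, 4.0, 0.0, 0.0, 6.0, 0.0, 0.0],
    2020: [0.0, 3.0, 0.0, 0.0, 9.0, 1.0, 0.0, 12.0, 0.2, 5.0],
}

summaries = {year: wet_day_summary(values) for year, values in daily_by_year.items()}
for year, (frequency, intensity) in sorted(summaries.items()):
    print(f"{year}: wet-day frequency {frequency:.2f}, mean wet-day amount {intensity:.1f} mm")
```

Tracking the two summaries separately makes it possible to tell a change in the number of rainy days apart from a change in how heavy the rain is when it falls.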
Project 7: Analysis of genetic data (Geir Storvik - geirs@math.uio.no). There are many freely available datasets relating genetic data to different outcomes (diseases, expressions, etc.). In this project, the aim will be to download one of these datasets and apply some of the methods you have learned in introductory courses, as well as to look at some more advanced methods. The main aim will be to evaluate the strengths and weaknesses of the chosen methods for the particular application.
Project 8: Stochastic analysis, finance, insurance and risk. Students who are interested in a project within stochastic analysis, finance, or insurance and risk should contact Gudmund Hermansen (gudmunhh@math.uio.no) for more information.
Project 9: Simulation of particles in the solar atmosphere (Geir Storvik & Gudmund Hermansen - geirs@math.uio.no). A separate project description for the simulation of particles in the solar atmosphere is available (in Norwegian).