Data Science Analytics Group
Accelerating integration of data science in research projects
The core mission of the Data Science Analytics Group is to accelerate the integration of data science technologies (including data management, high-performance computing, machine learning, artificial intelligence, and interactive data analysis) into research projects. We collaborate closely with the IT Department at the University of Oslo (UiO) to apply the existing infrastructure and knowledge around data and computational science.
Our aim is to bolster the competitiveness and increase the productivity of researchers at UiO and among our dScience partners. In support of this vision, we concentrate our expertise and resources on two primary areas:
1. Courses and workshops
2. Direct involvement in the data and computational science aspects of research projects
Most of our courses and workshops are developed in response to issues that researchers commonly raise, and they are tailored to the challenges researchers currently face. The Data Science Analytics Group regularly holds the following workshops (the list will evolve as we prepare material for more):
- Version control using Git and GitHub
- Project management for PhD students
- Building packages in R
- Building packages in Python
- HPC data visualisation using ParaView
- Introduction to modern tools and packages in R
- Getting up-to-speed with Python
The workshops are open to all PhD students, postdocs, and researchers within the Faculty of Mathematics and Natural Sciences.
Direct involvement in research projects
Every year around April, dScience issues a support call in which research groups within the Faculty of Mathematics and Natural Sciences can submit a structured application describing the data and computational challenges they face. We assess each application based on the scope of the challenge and its relevance to our team’s expertise and capacity. Once a support project is deemed feasible, we work with the research group to design an action plan to tackle the challenge. The support is organized into three main pillars, “Data preparation”, “Data analysis” and “Dissemination”, with detailed sub-categories as follows:
Data preparation
- Data storage and access (systematic storage and sharing of data)
- Big data (challenges with storage, backup, and memory for processing)
- Data organization and management (need to develop a system)
- Data pipeline and automation (use the data efficiently and effectively in a pipeline)
- Data wrangling and transformation (prepare and structure data for analysis)
Data analysis
- High-performance computing
- Programming support (need limited programming support on particular issues)
- Prototyping and implementation (turning your idea into a model)
- Statistical data analysis (you know what you want to do but not how to do it)
- Machine learning (from brainstorming on ideas to implementation and interpretation)
- Exploratory data analysis (need to explore and understand your data)
Dissemination
- Data visualization and communication (effective visualization of results for publications)
- Platform development (turning code into packages and graphical user interfaces)
The direct involvement support is open to all researchers and group leaders within the Faculty of Mathematics and Natural Sciences. If you have any questions, please send us an email at contact@dscience.uio.no.
Projects
Insight Oslofjord
Insight Oslofjord is an online platform using data from student-led boat trips on the Oslofjord, organized by Inspiria Science Center. The data, collected with sensors, is processed and visualized in an R Shiny app. The platform supports both education and research by providing tools for data exploration and hypothesis testing.
Drøbak Lander
The lander is an underwater observatory in Drøbak sund, Oslofjord, monitoring fish and currents as part of the Frisk Oslofjord project. In collaboration with dScience, data is collected for research, with findings shared through the Insight Oslofjord app for educational use.
GeoVis
The GeoVis project, initiated by METOS researchers at the University of Oslo, aimed to develop a custom 3D visualization tool for climate datasets. It was designed to overcome the limitations of existing tools such as “ncview” by enabling geographic projections and enhanced user interaction, for better communication of complex climate data.
OCR-GPT pipeline
The OCR-GPT pipeline was developed to improve text data preparation for the IUROPA project. It combines Adobe’s OCR, for better paragraph recognition, with GPT-4, for correcting character recognition errors in older EU court judgments.
dScience collaborates closely with USIT – The University Centre for Information Technology and their Department for Research Computing.