Abstract
Insight Oslofjord is an online analytical platform built on the data collected in the Frisk Oslofjord project. Students from different high schools around Oslo attend daily exploratory boat trips organized by Inspiria Science Center, conduct excursions into the Oslofjord and record their observations on the Marine Citizen Science website. The boat is equipped with various sensors provided by Kongsberg Discovery, which collect live data on the water column and weather conditions along the trip. The data from the different sources are pulled to an Educloud repository using various APIs and preprocessed there. The data are then analyzed and integrated using a multi-step automated data pipeline. The results are visualized in an R Shiny application, which is available to students online as a complement to teaching material. Furthermore, domain researchers can use the platform as a data exploration tool to form and test data-driven hypotheses, and their findings can later be integrated into the same platform to enrich its content.
Background
Oceans play a crucial role in preserving life on earth. They absorb 25% of human-produced CO2 and produce around 50% of the O2 in the atmosphere, while also acting as giant thermal stabilizers for the earth. Alterations in ocean ecosystems caused by human activities can therefore have catastrophic implications in the long run. The first step in preventing such alterations and preserving the ocean is monitoring and observation. The Oslofjord, as one of the portals to the ocean and an area heavily influenced by human activities, is an interesting environment for such observations. The Frisk Oslofjord project, initiated by the Institute of Marine Research, is a step towards better observation and management of the ocean, with three well-defined overall goals:
- Generate new knowledge and a basis for future management of the fjord
- Test and verify new technologies for monitoring
- Communicate to policy-makers, the general public and especially younger generations
Methodology
Due to the multi-faceted nature of in situ ocean observations, data collection and analysis play a very important role. The data used in the Insight Oslofjord application are collected by the Inspiria Science Center in collaboration with Kongsberg Discovery and are analyzed by data scientists and domain experts at the dScience Center at the University of Oslo. Data from the different sources are initially pulled to an Educloud repository at the University of Oslo using the “Blue Insight” automatic data streaming service and various APIs. Afterwards, the machine-native sensor data are preprocessed and combined with the data that students collect. At this point, the data add up to around 6 GB per day. To make the analysis more efficient, we extract the relevant parameters and transform them into a tidy data format, which reduces the volume to around 30 KB per day.
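The extraction step described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the project's actual code: the field names (`temperature_c`, `salinity_psu`, `debug_flag`) and the sample values are hypothetical, since the real sensor parameters are not listed here. It keeps only the chosen parameters and emits one observation per row, which is the idea behind a tidy layout.

```python
# Hypothetical parameter names; the real sensor fields are not given in the text.
RELEVANT = ("temperature_c", "salinity_psu")


def to_tidy(records, relevant=RELEVANT):
    """Turn wide sensor records into tidy rows: one observation per row."""
    tidy = []
    for rec in records:
        for name in relevant:
            value = rec.get(name)
            if value is None:
                continue  # this parameter is missing from the record
            tidy.append(
                {"time": rec["time"], "parameter": name, "value": float(value)}
            )
    return tidy


# Made-up sample records, including a field that is not needed downstream.
raw = [
    {"time": "2024-05-14T09:00:00", "temperature_c": 11.2,
     "salinity_psu": 24.8, "debug_flag": 0},
    {"time": "2024-05-14T09:00:10", "temperature_c": 11.3, "debug_flag": 1},
]
tidy_rows = to_tidy(raw)
```

Dropping unused fields and storing one value per row is what makes a large reduction in volume plausible, although the actual reduction depends on how many raw parameters and samples are discarded.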
The tidy data are used to produce various visualizations inside an R Shiny application. The pipeline is a multi-step analysis procedure comprising several R, Python and Shell scripts that run sequentially, starting with the raw data and ending with the final product. It runs automatically: when new data arrive, all preprocessing, analysis and visualization steps take place inside Educloud. The automatic data analysis pipeline and the persistent servers on Educloud were set up in close collaboration with the IT Department at the University of Oslo.
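A sequential pipeline of this kind can be sketched as follows. This is a hedged illustration rather than the project's implementation: the step names are hypothetical, and short inline Python commands stand in for the real R, Python and Shell scripts, which are not described in detail here. The runner executes each step in order and stops at the first failure, because later steps depend on the output of earlier ones.

```python
import subprocess
import sys


def run_pipeline(steps):
    """Run each (name, command) step in order; stop at the first failure.

    Returns the names of the steps that completed successfully.
    """
    completed = []
    for name, command in steps:
        result = subprocess.run(command)
        if result.returncode != 0:
            break  # later steps depend on this one, so do not continue
        completed.append(name)
    return completed


# Hypothetical steps: placeholders for the real scripts in the pipeline.
demo_steps = [
    ("preprocess", [sys.executable, "-c", "pass"]),
    ("analyse", [sys.executable, "-c", "raise SystemExit(1)"]),  # simulated failure
    ("visualise", [sys.executable, "-c", "pass"]),
]
done = run_pipeline(demo_steps)
```

In this demonstration the second step fails on purpose, so only the first step is reported as completed and the visualization step never runs on incomplete results.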
The application includes four main modules and one complementary module visualizing data from the Lander observatory (link to the lander project page). In the first tab (Hjem), students get an initial explanation of the application. After they select the date of their excursion, the application loads the corresponding data, and they can view initial details such as the name of the school, the area of exploration and the number of observations. In the second tab (Ditt tokt), the geolocation of the trip is visualized along with the student observations and the echosounder signal.
In the third tab (Dine data), students can see details of the collected data as well as secondary extracted parameters over time for the entire trip. Finally, in the fourth tab (Analyser data), students can plot all the parameters and visually explore the correlations between them. They can produce up to eight different plots with different parameters and settings in each round of analysis.
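As a numeric counterpart to exploring correlations visually, the relationship between two parameters can be summarized with a Pearson correlation coefficient. The sketch below is an assumption for illustration only: the text does not say that the application computes this statistic, and the sample values are made up.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (spread_x * spread_y)


# Made-up example values for two parameters measured at the same times.
depth_m = [1.0, 2.0, 3.0, 4.0]
temperature_c = [12.0, 11.0, 10.0, 9.0]
r = pearson(depth_m, temperature_c)
```

A value close to 1 or -1 indicates a strong linear relationship, while a value near 0 indicates little linear relationship; a scatter plot remains useful for spotting non-linear patterns that the coefficient misses.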
The visualization in the fifth tab (Fjordobservatoriet) is explained here. Based on feedback from the students and teachers who use the application, the content will be enriched by domain experts, and the features of the platform will continue to evolve to better fit educational needs. Furthermore, researchers can use the platform to explore the data and to form and test hypotheses for further analysis. They can also integrate the results of their research into the platform and visualize their findings interactively. Once mature enough, such a platform can be a great portal for communicating important findings to the public and policy-makers effectively.