Overview
HF1 is a Hardanger fiddle dataset of polyphonic performances spanning five emotional expressions: normal, angry, sad, happy, and tender. The performances thus cover the four quadrants of the activity/valence space. For each note in each performance, the onset, offset, and pitch were annotated by the fiddle players themselves. First, they annotated the normal version from scratch using the software Annotemus (see below). These annotations were then transferred to the expressive performances using music alignment and finally verified by the annotators. Two separate music alignment methods based on image registration were developed for this purpose, as presented in the associated ISMIR 2021 paper. This strategy has several advantages:
- Does not introduce bias concerning timing.
- Ensures that annotators work on tunes they are familiar with. A familiar tune is easier to annotate accurately and efficiently, and sheet music is not exhaustive, since it does not cover the rich ornamentation of Hardanger fiddle music.
- Provides five times the training and testing data for polyphonic transcription.
- Creates a dataset that can be used for additional tasks in future studies. Our experimental design provides us with both audio and symbolic data of performances with varying emotional expressions.
- Enables us to scale future annotation tasks within the same framework.
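To illustrate the annotation-transfer step described above, here is a minimal sketch of how note annotations could be mapped from the normal performance onto an expressive one once an alignment is available. It is not one of the image-registration methods from the paper: it assumes the alignment is already given as matching anchor times in the two recordings, and it interpolates linearly between them. The function names `warp_time` and `transfer_annotations`, the `(onset, offset, pitch)` tuple layout, and the toy numbers are all hypothetical.

```python
from bisect import bisect_right


def warp_time(t, ref_times, target_times):
    """Piecewise-linear map of a time t (seconds) from reference to target.

    ref_times and target_times are matching anchor times from some alignment.
    ref_times must be strictly increasing. Hypothetical helper.
    """
    # Clamp to the anchored range so that we never extrapolate.
    if t <= ref_times[0]:
        return target_times[0]
    if t >= ref_times[-1]:
        return target_times[-1]
    # Index of the anchor segment that contains t.
    i = bisect_right(ref_times, t) - 1
    frac = (t - ref_times[i]) / (ref_times[i + 1] - ref_times[i])
    return target_times[i] + frac * (target_times[i + 1] - target_times[i])


def transfer_annotations(notes, ref_times, target_times):
    """Warp the onset and offset of each (onset, offset, pitch) note.

    Pitch is copied unchanged, since the same tune is played in both versions.
    """
    return [
        (
            warp_time(onset, ref_times, target_times),
            warp_time(offset, ref_times, target_times),
            pitch,
        )
        for onset, offset, pitch in notes
    ]


# Toy data (assumed): the expressive version is played at half speed after 1 s.
normal_notes = [(0.5, 1.0, 62), (1.0, 2.0, 69)]
ref_anchors = [0.0, 1.0, 2.0]
target_anchors = [0.0, 1.0, 3.0]

expressive_notes = transfer_annotations(normal_notes, ref_anchors, target_anchors)
print(expressive_notes)
```

In this toy case the first note keeps its timing and the second note is stretched to twice its duration. Transferred annotations of this kind would still need the human verification step mentioned above.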
The dataset will be used in MIR research for tasks involving polyphonic transcription, score alignment, beat tracking, downbeat tracking, tempo estimation, and classification of emotional expressions.
The final dataset is around 43 minutes long and consists of 19 734 notes of Hardanger fiddle music, across 40 stereo recordings of 8 tunes. The recordings were made by two Hardanger fiddle musicians, Astrid Garmo and Henrik Nordtun Gjertsen, who were students at the Norwegian Academy of Music. The same musicians made the annotations using our software Annotemus, developed within the context of the MIRAGE project (see the video presentation of the Annotemus software).
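The figures above can be cross-checked with a few lines of arithmetic. The sketch below assumes that each of the 8 tunes was recorded once per emotional expression, which is consistent with the 40 recordings but is our reading rather than an explicit statement. Because the total duration is only given as "around 43 minutes", the derived values are approximate.

```python
# Figures taken from the dataset description.
n_tunes = 8
n_expressions = 5        # normal, angry, sad, happy, tender
n_notes = 19_734
total_minutes = 43       # approximate ("around 43 minutes")

# Assumption: every tune is performed once in every expression.
n_recordings = n_tunes * n_expressions

total_seconds = total_minutes * 60
mean_recording_seconds = total_seconds / n_recordings
notes_per_second = n_notes / total_seconds

print(f"recordings: {n_recordings}")
print(f"mean recording length: about {mean_recording_seconds:.1f} s")
print(f"note density: about {notes_per_second:.2f} notes/s")
```

Under these assumptions, the recordings average roughly one minute each, at a density of between seven and eight annotated notes per second.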
Data and source code
Dataset (2.71 GB)
Source code (93 kB)
Acknowledgments
We thank Astrid Garmo and Henrik Nordtun Gjertsen for their fiddle performances and annotations. This study was supported by the Research Council of Norway through its Centers of Excellence scheme, project number 262762, and the MIRAGE project, grant number 287152. The ablation study was performed on resources provided by UNINETT Sigma2 - the National Infrastructure for High Performance Computing and Data Storage in Norway.