Oluremi Falowo

Embodied Music Emotion Representation Learning with Deep Multimodal Networks

When

Thematic Session 3: Modeling and Analysis (Tuesday, 09:40)

Abstract

This presentation will outline work in progress from a PhD project on embodiment for intelligent music systems. The research investigates which specific concepts and mechanisms from embodied music cognition, and from embodied cognition more broadly, can be applied to intelligent music systems. In a first study, the mechanism of multimodal representations will be explored in the context of a music emotion recognition task.

In the embodied music cognition literature, music has been described as a sonic representation of a gestural language. The proposed study investigates whether a deep neural network can learn this representation by examining whether musical gesture data improves a deep learning model's performance on music emotion recognition. Specifically, we will first train a multimodal recurrent network on music signals paired with skeletal positioning data of participants moving freely to the music, with each example matched to valence-arousal scores. The data will come from the AIST++ (Li et al., 2021) and MUSE (Akiki & Burghardt, 2020) datasets, as well as from a collaborative research project with researchers at Goldsmiths, University of London, who will be running a large-sample study recording movements to music in order to examine how children acquire music emotion discrimination. We will then compare the emotion recognition performance of this model when given only music data with that of a model trained solely on audio.
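To make the architecture described above concrete, the sketch below shows one way such a two-branch recurrent model could be set up in PyTorch. The feature choices (mel-spectrogram frames for audio, flattened 3D joint coordinates for motion), the layer sizes, and the late-fusion strategy are illustrative assumptions, not the study's actual design; the audio-only comparison model would simply drop the motion branch.

import torch
import torch.nn as nn

class MultimodalEmotionRNN(nn.Module):
    """Two-branch GRU model: audio frames + skeletal motion -> valence/arousal."""

    def __init__(self, n_mels=64, n_joints=17, hidden=128):
        super().__init__()
        # Audio branch: a sequence of mel-spectrogram frames.
        self.audio_rnn = nn.GRU(n_mels, hidden, batch_first=True)
        # Motion branch: flattened 3D joint coordinates per video frame.
        self.motion_rnn = nn.GRU(n_joints * 3, hidden, batch_first=True)
        # Late fusion: concatenate final hidden states, regress two scores.
        self.head = nn.Sequential(
            nn.Linear(hidden * 2, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 2),  # outputs: (valence, arousal)
        )

    def forward(self, audio, motion):
        # audio:  (batch, time, n_mels); motion: (batch, time, n_joints * 3)
        _, h_audio = self.audio_rnn(audio)
        _, h_motion = self.motion_rnn(motion)
        fused = torch.cat([h_audio[-1], h_motion[-1]], dim=-1)
        return self.head(fused)

# Example with random tensors standing in for real features:
model = MultimodalEmotionRNN()
audio = torch.randn(8, 200, 64)        # 8 clips, 200 frames, 64 mel bands
motion = torch.randn(8, 200, 17 * 3)   # matching skeletal sequences
scores = model(audio, motion)          # shape: (8, 2)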

The results of the study may give insight into how a more modern, radical perspective on cognition could help improve intelligent music systems, rather than treating their tasks as mostly engineering problems.

Bio

 
