
Durham University



Publication details for Professor Tuomas Eerola

Saari, Pasi, Eerola, Tuomas, Fazekas, György, Barthet, Mathieu, Lartillot, Olivier & Sandler, Mark (2013), The Role of Audio and Tags in Music Mood Prediction: A Study Using Semantic Layer Projection, in de Souza Britto Junior, Alceu, Gouyon, Fabien & Dixon, Simon eds, Proceedings of the International Society for Music Information Retrieval Conference (ISMIR). Curitiba, Brazil, International Society for Music Information Retrieval, 201-206.

Author(s) from Durham

Professor Tuomas Eerola

Abstract
Semantic Layer Projection (SLP) is a method for automatically annotating music tracks according to expressed mood based on audio. We evaluate this method by comparing it to a system that infers the mood of a given track using associated tags only. SLP differs from conventional auto-tagging algorithms in that it maps audio features to a low-dimensional semantic layer congruent with the circumplex model of emotion, rather than training a separate model for each tag. We build the semantic layer using two large-scale data sets – crowd-sourced tags, and editorial annotations from the I Like Music (ILM) production music corpus – and use subsets of these corpora to train SLP to map audio features to the semantic layer. The performance of the system is assessed by predicting mood ratings on continuous scales in the two data sets. The results show that audio is in general more effective than tags in predicting perceived mood. Furthermore, we demonstrate analytically the benefit of combining semantic tags and audio features in automatic mood annotation.
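The core idea sketched in the abstract – regressing audio features onto a low-dimensional semantic layer (e.g. the valence–arousal plane of the circumplex model) instead of training one model per tag – can be illustrated with a minimal ridge-regression sketch. This is a hypothetical illustration with synthetic data, not the paper's actual implementation; all variable names and the choice of ridge regression are assumptions.

```python
import numpy as np

# Hypothetical SLP-style sketch: map audio features directly to a
# 2-D semantic layer (valence, arousal) rather than to individual tags.
# All data below are synthetic stand-ins for real audio features and
# mood annotations.
rng = np.random.default_rng(0)

n_tracks, n_features = 200, 20
X = rng.normal(size=(n_tracks, n_features))            # audio features per track
W_true = rng.normal(size=(n_features, 2))              # unknown feature-to-mood mapping
Y = X @ W_true + 0.1 * rng.normal(size=(n_tracks, 2))  # noisy valence/arousal coordinates

# Closed-form ridge regression from audio features to the semantic layer.
lam = 1.0
W = np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ Y)

# A new track is annotated by projecting its features into the layer;
# mood tags could then be read off from nearby regions of the plane.
x_new = rng.normal(size=(1, n_features))
valence, arousal = (x_new @ W).ravel()
print(valence, arousal)
```

Because every mood concept lives in the same low-dimensional layer, a single learned mapping covers all tags at once, which is the contrast with per-tag auto-tagging that the abstract draws.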