Extracting Coarse Body Movements from Video in Music Performance: A Comparison of Automated Computer Vision Techniques with Motion Capture Data

Jakubowski, Kelly; Eerola, Tuomas; Alborno, Paolo; Volpe, Gualtiero; Camurri, Antonio; Clayton, Martin

Authors

Paolo Alborno

Gualtiero Volpe

Antonio Camurri



Abstract

The measurement and tracking of body movement within musical performances can provide valuable sources of data for studying interpersonal interaction and coordination between musicians. The continued development of tools to extract such data from video recordings will offer new opportunities to research musical movement across a diverse range of settings, including field research and other ecological contexts in which the implementation of complex motion capture systems is not feasible or affordable. Such work might also make use of the multitude of video recordings of musical performances that are already available to researchers. The present study made use of such existing data, specifically three video datasets of ensemble performances from different genres, settings, and instrumentation (a pop piano duo, three jazz duos, and a string quartet). Three different computer vision techniques, namely frame differencing, optical flow, and kernelized correlation filters (KCF), were applied to these video datasets with the aim of quantifying and tracking movements of the individual performers. All three computer vision techniques exhibited high correlations with motion capture data collected from the same musical performances, with median correlation (Pearson’s r) values of .75 to .94. The techniques that track movement in two dimensions (optical flow and KCF) provided more accurate measures of movement than a technique that provides a single estimate of overall movement change by frame for each performer (frame differencing). Measurements of performers’ movements were also more accurate when the computer vision techniques were applied to more narrowly defined regions of interest (head) than when the same techniques were applied to larger regions (entire upper body, above the chest or waist).
Some differences in movement tracking accuracy emerged between the three video datasets, which may have been due to instrument-specific motions that resulted in occlusions of the body part of interest (e.g. a violinist’s right hand occluding the head whilst tracking head movement). These results indicate that computer vision techniques can be effective in quantifying body movement from videos of musical performances, while also highlighting constraints that must be dealt with when applying such techniques in ensemble coordination research.
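
As an illustration of the simplest of the three techniques, the following is a minimal sketch of frame differencing within a region of interest, followed by a Pearson correlation against a reference movement series. It is not the study's implementation: the function names (`frame_difference_series`, `pearson_r`), the synthetic clip, and the use of known per-frame displacements in place of motion capture data are all assumptions made for the example.

```python
import numpy as np


def frame_difference_series(frames, roi):
    """Return one movement estimate per frame transition.

    frames: array of shape (n_frames, height, width), grayscale.
    roi:    (top, bottom, left, right) pixel bounds of the region of interest.
    """
    top, bottom, left, right = roi
    # Convert to float first so uint8 subtraction cannot wrap around.
    region = frames[:, top:bottom, left:right].astype(np.float64)
    # Sum of absolute pixel changes between consecutive frames.
    return np.abs(np.diff(region, axis=0)).sum(axis=(1, 2))


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    return float(np.corrcoef(x, y)[0, 1])


# Synthetic clip (assumption): a bright 10x10 square moving horizontally.
# The known per-frame displacement stands in for a motion capture reference.
steps = np.array([0, 1, 2, 3, 2, 1, 0, 1, 3])
positions = 5 + np.concatenate(([0], np.cumsum(steps)))

frames = np.zeros((len(positions), 40, 80), dtype=np.uint8)
for i, x in enumerate(positions):
    frames[i, 15:25, x:x + 10] = 255

# Hypothetical ROI: a horizontal band that contains the moving square.
motion = frame_difference_series(frames, roi=(10, 30, 0, 80))
r = pearson_r(motion, steps)
print(f"Pearson r between frame-difference signal and reference: {r:.2f}")
```

In this idealized clip the frame-difference signal is proportional to the size of each displacement, so the correlation is essentially perfect. Real footage adds noise, lighting changes, and occlusions, and frame differencing yields only a single magnitude per frame rather than a two-dimensional trajectory.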

Citation

Jakubowski, K., Eerola, T., Alborno, P., Volpe, G., Camurri, A., & Clayton, M. (2017). Extracting Coarse Body Movements from Video in Music Performance: A Comparison of Automated Computer Vision Techniques with Motion Capture Data. Frontiers in Digital Humanities, 4, Article 9. https://doi.org/10.3389/fdigh.2017.00009

Journal Article Type: Article
Acceptance Date: Mar 21, 2017
Online Publication Date: Apr 6, 2017
Publication Date: Apr 6, 2017
Deposit Date: Mar 24, 2017
Publicly Available Date: Mar 29, 2024
Journal: Frontiers in Digital Humanities
Publisher: Frontiers Media
Peer Reviewed: Peer Reviewed
Volume: 4
Article Number: 9
DOI: https://doi.org/10.3389/fdigh.2017.00009

Files

Accepted Journal Article (7.8 Mb)
PDF

Publisher Licence URL
http://creativecommons.org/licenses/by/4.0/

Copyright Statement
Copyright: © 2017 Jakubowski, Eerola, Alborno, Volpe, Camurri and Clayton. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
