Introduction to the alignment demo

On this webpage, we present some results from audio-to-score alignment. The synchronization procedure is described in detail in [1]. In the paper, we reported quantitative alignment results on the MAPS dataset [2]. Here, we additionally prepared real-world data to test how well the method generalizes. Audio files were gathered from IMSLP [4] and the Vienna 4x22 corpus [3]. For the IMSLP data, we used MIDI files from the MAPS dataset; for the Vienna 4x22 corpus, the MIDI of the first player was aligned to the other players' audio. Please check the license on the original webpage before using the MIDI or audio.

Abstract

We propose a framework for audio-to-score alignment of piano performances that employs automatic music transcription (AMT) using neural networks. Even though the AMT result may contain some errors, the note prediction output can be regarded as a learned feature representation that is directly comparable to a MIDI note or chroma representation. To this end, we employ two recurrent neural networks that work as AMT-based feature extractors for the alignment algorithm. One predicts the presence of 88 notes or 12 chroma at the frame level, and the other detects note onsets in 12 chroma. We combine the two types of learned features for the audio-to-score alignment. For comparability, we apply dynamic time warping as the alignment algorithm without any additional post-processing. We evaluate the proposed framework on the MAPS dataset and compare it to previous work. The results show that the alignment framework with the learned features significantly improves the accuracy, achieving a mean onset error of less than 10 ms.
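As a rough illustration of the alignment step mentioned in the abstract, the following minimal Python sketch runs plain dynamic time warping over two frame-wise feature sequences. It is not the implementation from [1]: the one-hot toy features merely stand in for the RNN outputs, and the cosine distance, the step pattern, and all function names (`frame_distance`, `dtw_path`, `score_onsets_in_audio`) are assumptions made for this example.

```python
import math


def frame_distance(a, b):
    """Cosine distance between two feature frames (assumed metric, not from [1])."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        # Silent frames: identical if both are silent, otherwise maximally distant.
        return 0.0 if na == nb else 1.0
    return 1.0 - dot / (na * nb)


def dtw_path(audio_feats, score_feats):
    """Return the minimum-cost warping path as (audio_frame, score_frame) pairs."""
    n, m = len(audio_feats), len(score_feats)
    inf = float("inf")
    # cost[i][j] = best accumulated cost aligning the first i audio frames
    # with the first j score frames (row 0 and column 0 act as borders).
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(audio_feats[i - 1], score_feats[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])

    # Backtrack from the last cell to the first one along the cheapest predecessors.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        candidates = [
            (cost[i - 1][j - 1], i - 1, j - 1),  # diagonal step
            (cost[i - 1][j], i - 1, j),          # advance audio only
            (cost[i][j - 1], i, j - 1),          # advance score only
        ]
        _, i, j = min(candidates)
        path.append((i - 1, j - 1))
    path.reverse()
    return path


def score_onsets_in_audio(path):
    """Map each score frame to the first audio frame aligned with it."""
    mapping = {}
    for audio_frame, score_frame in path:
        mapping.setdefault(score_frame, audio_frame)
    return mapping


# Toy data: three "notes" as one-hot frames (hypothetical, for illustration only).
score = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
# The "performance" holds the notes for 2, 1 and 3 frames respectively.
audio = [[1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 1], [0, 0, 1]]

path = dtw_path(audio, score)
onsets = score_onsets_in_audio(path)
print(path)
print(onsets)
```

In this toy case the path stays on the first score frame for two audio frames and on the last one for three, so the score onsets land on audio frames 0, 2 and 3. Multiplying such frame indices by the hop size would give onset times that can be compared with ground-truth onsets to obtain a mean onset error.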

Results

IMSLP data

Identifier Audio MIDI MIDI (aligned) Combined (left: audio, right: synthesized from MIDI)
Bach_BMV_850 Link download download
Beethoven_Sonata_no.8_2nd Link download download
Beethoven_Sonata_no.8_3rd Link download download
Chopin_Fantasia_impromptu Link download download
Chopin_Etude_Op.10_No.1 Link download download
Chopin_Etude_Op.10_No.12 Link download download
Chopin_Etude_Op.25_No.2 Link download download
Chopin_Etude_Op.25_No.3 Link download download
Chopin_Etude_Op.25_No.4 Link download download

Vienna 4x22 corpus

Identifier Audio MIDI MIDI (aligned) Combined (left: audio, right: synthesized from MIDI)
Chopin_Ballade_p02 Link download
Chopin_Ballade_p03 Link download
Chopin_Ballade_p04 Link download
Chopin_Ballade_p05 Link download
Chopin_etude_p02 Link download
Chopin_etude_p03 Link download
Chopin_etude_p04 Link download
Chopin_etude_p05 Link download
Mozart_K331_1st-mov_p02 Link download
Mozart_K331_1st-mov_p03 Link download
Mozart_K331_1st-mov_p04 Link download
Mozart_K331_1st-mov_p05 Link download
Schubert_D783_no15_p02 Link download
Schubert_D783_no15_p03 Link download
Schubert_D783_no15_p04 Link download
Schubert_D783_no15_p05 Link download

References

[1]
Taegyun Kwon, Dasaem Jeong and Juhan Nam
Audio-to-Score Alignment of Piano Music Using RNN-based Automatic Music Transcription
Proceedings of the 14th Sound and Music Computing Conference (SMC), 2017 (Accepted)
[2]
Valentin Emiya
MAPS Database ‐ A piano database for multipitch estimation and automatic transcription of music
Link
[3]
Werner Goebl
The Vienna 4x22 Piano Corpus
Link
[4]
International Music Score Library Project
Link