
Supplementary Materials

Album Selection

Twelve of the 17 albums we use here appear in an analysis of tempi (Palmer/Bach, 2004). We excluded the 13th album in Palmer’s analysis because it is available only on vinyl, whereas the others are available on CD. Four of the albums we use here appear in an analysis of cues (Anderson & Schutz, 2023). Gulda’s recording appears in both Palmer’s analysis and Anderson & Schutz (2023). Two additional recordings by Newman came from a currently unpublished project and met our criteria as prominent performances of the WTC commercially available on CD.

Table 1: Performances of the Well Tempered Clavier used, with details. For recordings made across multiple years, Year Recorded is the latest year listed.
Performer | Year Recorded | Instrument | Label
Edwin Fischer | 1936 | Piano | EMI Records
Wanda Landowska | 1951 | Harpsichord | RCA Victor Red Seal
Rosalyn Tureck | 1953 | Harpsichord | Deutsche Grammophon
Jörg Demus | 1956 | Piano | MCA Records
Ralph Kirkpatrick | 1959 | Piano | Deutsche Grammophon
João Carlos Martins | 1964 | Piano | Labour Records
Martin Galling | 1964 | Harpsichord | Vox Records
Malcolm Hamilton | 1964 | Harpsichord | Everest
Glenn Gould | 1965 | Piano | Sony Classical
Sviatoslav Richter | 1970 | Piano | BMG Classics
Friedrich Gulda | 1972 | Piano | Decca
Gustav Leonhardt | 1973 | Piano | BMG Classics
Anthony Newman | 2000 | Harpsichord | Vox Cum Laude
Anthony Newman | 2001 | Piano | KHAEON World Music
Daniel Barenboim | 2003 | Piano | Warner Classics
Vladimir Ashkenazy | 2005 | Piano | Decca
Pietro De Maria | 2014 | Piano | Decca
Synthesized MIDI | NA | Piano | NA

Algorithm Selection

With the exception of Spectral Centroid, there are multiple methods for extracting each feature, as described above. Here we select one method for each extracted feature/tool combination using a two-step process. First, we began with MIDI representations of the first eight measures of the 24 preludes from Bach’s Well Tempered Clavier, extracting the tempo encoded in the MIDI file, the number of onsets, and the mode (using the pitch class distribution-based mode extraction algorithm described previously). Second, we repeated this process on audio synthesized from the MIDI with a generic piano sound font, calculated the mean squared error (MSE) of the per-piece differences between the MIDI and audio values (Table 2), and selected the algorithm with the lowest MSE within each extracted feature/tool combination for subsequent version analysis.

For mode, MIRtoolbox’s mirmode algorithm uses a Short-time Fourier Transform by default. MIRtoolbox also implements a Constant-Q Transform that could be applied to the mode algorithm, but we do not consider it here. Essentia implements both a Constant-Q Transform and a Short-time Fourier Transform, and Librosa implements a Short-time Fourier Transform, Constant-Q Transform, Constant-Q Transform with CENS, and Variable-Q Transform. Details on these extractors can be found in each toolbox’s extensive documentation.

For onsets, Librosa implements only one extraction algorithm, whereas Essentia has six. MIRtoolbox’s mirevents algorithm has a number of additional parameters, including variants for the envelope used and filtering options; however, we consider only its default options here.

For tempo, MIRtoolbox and Librosa each have two extraction methods, whereas we implement three methods from Essentia.
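The comparison between MIDI-derived and audio-derived values can be sketched as follows. This is a minimal illustration, not the code used in the study: the function name, the algorithm labels, and the onset counts are all hypothetical, and only the rule itself (compute the MSE per algorithm, keep the lowest) comes from the description above.

```python
def mean_squared_error(reference, estimate):
    """Average of squared differences between paired per-piece values."""
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")
    squared = [(r - e) ** 2 for r, e in zip(reference, estimate)]
    return sum(squared) / len(squared)


# Hypothetical onset counts for three excerpts (NOT values from the study).
midi_onsets = [120, 96, 150]            # reference values read from MIDI
audio_onsets = {                        # estimates from synthesized audio
    "algorithm_a": [118, 99, 140],
    "algorithm_b": [130, 80, 171],
}

# One MSE per candidate algorithm, then keep the algorithm with the lowest MSE.
mse_by_algorithm = {
    name: mean_squared_error(midi_onsets, values)
    for name, values in audio_onsets.items()
}
best_algorithm = min(mse_by_algorithm, key=mse_by_algorithm.get)
print(best_algorithm, mse_by_algorithm)
```

Because the errors are squared, a single excerpt with a large miss can dominate the score, which is worth keeping in mind when reading the onset and tempo values in Table 2.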

Table 2: Mean Squared Error (MSE) of synthesized audio compared to MIDI for each feature and algorithm. The algorithm ‘org’ refers to a tool’s regular algorithm where the tool does not provide multiple algorithms for the same feature.
(a) Relative Mode
Tool | Algorithm | Feature | MSE
Essentia | cqt | Relative Mode | 0.0273
Essentia | stft | Relative Mode | 0.0336
Librosa | cens | Relative Mode | 0.0157
Librosa | stft | Relative Mode | 0.0161
Librosa | cqt | Relative Mode | 0.0162
Librosa | vqt | Relative Mode | 0.0261
MIRtoolbox | stft | Relative Mode | 0.0327
(b) Onsets (#)
Tool | Algorithm | Feature | MSE
Essentia | complex | Onsets (#) | 887
Essentia | flux | Onsets (#) | 895
Essentia | rms | Onsets (#) | 1172
Essentia | hfc | Onsets (#) | 1198
Essentia | complexphase | Onsets (#) | 1256
Essentia | melflux | Onsets (#) | 2625
Librosa | org | Onsets (#) | 896
MIRtoolbox | org | Onsets (#) | 644
(c) Tempo (BPM)
Tool | Algorithm | Feature | MSE
Essentia | percival | Tempo (BPM) | 2379
Essentia | multifeature | Tempo (BPM) | 3206
Essentia | degara | Tempo (BPM) | 3633
Librosa | beattrack | Tempo (BPM) | 2053
Librosa | org | Tempo (BPM) | 2483
MIRtoolbox | metre | Tempo (BPM) | 1982
MIRtoolbox | classical | Tempo (BPM) | 3663
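
To make the selection rule concrete, the sketch below applies "lowest MSE within each feature/tool combination" to the values reported in Table 2. The row values are copied from the table; the data layout and the helper name `select_lowest_mse` are illustrative assumptions, not the authors' code.

```python
# (tool, algorithm, feature, MSE) rows copied from Table 2.
TABLE_2 = [
    ("Essentia", "cqt", "Relative Mode", 0.0273),
    ("Essentia", "stft", "Relative Mode", 0.0336),
    ("Librosa", "cens", "Relative Mode", 0.0157),
    ("Librosa", "stft", "Relative Mode", 0.0161),
    ("Librosa", "cqt", "Relative Mode", 0.0162),
    ("Librosa", "vqt", "Relative Mode", 0.0261),
    ("MIRtoolbox", "stft", "Relative Mode", 0.0327),
    ("Essentia", "complex", "Onsets (#)", 887),
    ("Essentia", "flux", "Onsets (#)", 895),
    ("Essentia", "rms", "Onsets (#)", 1172),
    ("Essentia", "hfc", "Onsets (#)", 1198),
    ("Essentia", "complexphase", "Onsets (#)", 1256),
    ("Essentia", "melflux", "Onsets (#)", 2625),
    ("Librosa", "org", "Onsets (#)", 896),
    ("MIRtoolbox", "org", "Onsets (#)", 644),
    ("Essentia", "percival", "Tempo (BPM)", 2379),
    ("Essentia", "multifeature", "Tempo (BPM)", 3206),
    ("Essentia", "degara", "Tempo (BPM)", 3633),
    ("Librosa", "beattrack", "Tempo (BPM)", 2053),
    ("Librosa", "org", "Tempo (BPM)", 2483),
    ("MIRtoolbox", "metre", "Tempo (BPM)", 1982),
    ("MIRtoolbox", "classical", "Tempo (BPM)", 3663),
]


def select_lowest_mse(rows):
    """Return {(feature, tool): (algorithm, mse)} keeping the lowest MSE."""
    best = {}
    for tool, algorithm, feature, mse in rows:
        key = (feature, tool)
        if key not in best or mse < best[key][1]:
            best[key] = (algorithm, mse)
    return best


selected = select_lowest_mse(TABLE_2)
for (feature, tool), (algorithm, mse) in sorted(selected.items()):
    print(f"{feature:<14} {tool:<11} {algorithm:<10} {mse}")
```

Under this rule the retained algorithms are cqt, cens, and stft for Relative Mode (Essentia, Librosa, MIRtoolbox respectively), complex, org, and org for Onsets, and percival, beattrack, and metre for Tempo.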

Mode Algorithm Implementation

mirmode values for all 408 audio files in our corpus. The x-axis shows values from the MIRtoolbox mirmode algorithm, while the y-axis shows values obtained by passing MIRtoolbox mirchromagram output to our Python reproduction of the mirmode algorithm. The values are identical, indicating that our reproduction was successful.
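
For readers unfamiliar with pitch-class-based mode estimation, one common formulation is sketched below: correlate a 12-bin chroma vector with major and minor key profiles at every transposition, then take the best major correlation minus the best minor correlation. This is an illustrative sketch only. The Krumhansl-Kessler profiles, the Pearson correlation, and every function name here are assumptions of the example; it is not the reproduction described above and may differ from mirmode in its profiles and preprocessing.

```python
import math

# Krumhansl-Kessler key profiles, tonic at index 0 (assumed for illustration).
KK_MAJOR = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88]
KK_MINOR = [6.33, 2.68, 3.52, 5.38, 2.60, 3.53, 2.54, 4.75, 3.98, 2.69, 3.34, 3.17]


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input has no variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def rotate(profile, tonic):
    """Transpose a profile so that pitch class `tonic` carries the tonic weight."""
    return [profile[(i - tonic) % 12] for i in range(12)]


def relative_mode(chroma):
    """Best major key correlation minus best minor key correlation.

    Positive values lean major, negative values lean minor.
    `chroma` is a 12-element pitch-class distribution starting at C.
    """
    best_major = max(pearson(chroma, rotate(KK_MAJOR, t)) for t in range(12))
    best_minor = max(pearson(chroma, rotate(KK_MINOR, t)) for t in range(12))
    return best_major - best_minor


# A chroma vector shaped like the A minor profile should score below zero.
print(relative_mode(rotate(KK_MINOR, 9)))
```

A continuous score of this kind, rather than a binary major/minor label, is what allows mode values to be compared with a mean squared error.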

Pairwise Permutation Test Results

Table 3: Results of pairwise permutation tests. All tests used the difference of means as the test statistic and 10,000 permutations.
(a) Q1: How do tools compare when extracting the same feature?
Group 1 | Group 2 | Feature | P Value | Test Statistic | Confidence Interval
Essentia | Librosa | Relative Mode | 0.000 | -0.345 | [-0.123, 0.117]
Essentia | MIRtoolbox | Relative Mode | 0.004 | -0.149 | [-0.09, 0.105]
Librosa | MIRtoolbox | Relative Mode | 0.000 | 0.196 | [-0.092, 0.099]
Essentia | Librosa | Onsets (#) | 0.047 | 0.173 | [-0.169, 0.179]
Essentia | MIRtoolbox | Onsets (#) | 0.205 | 0.087 | [-0.131, 0.141]
Librosa | MIRtoolbox | Onsets (#) | 0.298 | -0.086 | [-0.157, 0.165]
Essentia | Librosa | Tempo (BPM) | 0.732 | -0.022 | [-0.134, 0.128]
Essentia | MIRtoolbox | Tempo (BPM) | 0.561 | -0.038 | [-0.124, 0.123]
Librosa | MIRtoolbox | Tempo (BPM) | 0.826 | -0.015 | [-0.134, 0.134]
Essentia | Librosa | Spectral Centroid (Hz) | 0.829 | 0.008 | [-0.066, 0.062]
Essentia | MIRtoolbox | Spectral Centroid (Hz) | 0.739 | 0.011 | [-0.062, 0.063]
Librosa | MIRtoolbox | Spectral Centroid (Hz) | 0.898 | 0.003 | [-0.048, 0.052]
(b) Q2: Which features are extracted most consistently?
Group 1 | Group 2 | Tool | P Value | Test Statistic | Confidence Interval
Relative Mode | Onsets (#) | Essentia | 0.000 | 0.418 | [-0.166, 0.161]
Relative Mode | Tempo (BPM) | Essentia | 0.144 | 0.080 | [-0.113, 0.108]
Relative Mode | Spectral Centroid (Hz) | Essentia | 0.000 | 0.165 | [-0.093, 0.095]
Onsets (#) | Tempo (BPM) | Essentia | 0.000 | 0.498 | [-0.193, 0.185]
Onsets (#) | Spectral Centroid (Hz) | Essentia | 0.000 | 0.583 | [-0.188, 0.205]
Tempo (BPM) | Spectral Centroid (Hz) | Essentia | 0.121 | -0.085 | [-0.108, 0.106]
Relative Mode | Onsets (#) | Librosa | 0.163 | -0.100 | [-0.141, 0.14]
Relative Mode | Tempo (BPM) | Librosa | 0.000 | 0.403 | [-0.149, 0.153]
Relative Mode | Spectral Centroid (Hz) | Librosa | 0.000 | 0.518 | [-0.153, 0.157]
Onsets (#) | Tempo (BPM) | Librosa | 0.003 | 0.303 | [-0.179, 0.186]
Onsets (#) | Spectral Centroid (Hz) | Librosa | 0.000 | 0.418 | [-0.172, 0.179]
Tempo (BPM) | Spectral Centroid (Hz) | Librosa | 0.041 | -0.115 | [-0.112, 0.109]
Relative Mode | Onsets (#) | MIRtoolbox | 0.000 | 0.182 | [-0.118, 0.121]
Relative Mode | Tempo (BPM) | MIRtoolbox | 0.004 | 0.192 | [-0.127, 0.125]
Relative Mode | Spectral Centroid (Hz) | MIRtoolbox | 0.000 | 0.325 | [-0.112, 0.124]
Onsets (#) | Tempo (BPM) | MIRtoolbox | 0.000 | 0.374 | [-0.156, 0.155]
Onsets (#) | Spectral Centroid (Hz) | MIRtoolbox | 0.000 | 0.507 | [-0.16, 0.169]
Tempo (BPM) | Spectral Centroid (Hz) | MIRtoolbox | 0.011 | -0.133 | [-0.1, 0.113]
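
A two-sided permutation test of the kind summarized in Table 3 can be sketched as follows, using the difference of means as the test statistic and 10,000 random relabelings. Several details are assumptions of this sketch rather than facts from the source: the function name, the seed, the p-value convention (proportion of permuted differences at least as extreme as the observed one), and the reading of the Confidence Interval column as the central 95% of the permutation distribution. The example data are hypothetical.

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_test(group1, group2, n_permutations=10000, seed=0):
    """Two-sided permutation test on the difference of means.

    Returns (observed difference, p-value, (lower, upper)), where the
    interval is the central 95% of the permuted differences (assumed
    interpretation, see the lead-in).
    """
    rng = random.Random(seed)
    observed = mean(group1) - mean(group2)
    pooled = list(group1) + list(group2)
    n1 = len(group1)

    null = []
    for _ in range(n_permutations):
        rng.shuffle(pooled)                      # random relabeling of groups
        null.append(mean(pooled[:n1]) - mean(pooled[n1:]))

    extreme = sum(1 for d in null if abs(d) >= abs(observed))
    p_value = extreme / n_permutations

    null.sort()
    lower = null[int(0.025 * n_permutations)]
    upper = null[int(0.975 * n_permutations) - 1]
    return observed, p_value, (lower, upper)


# Hypothetical error scores for two tools (NOT data from the study).
tool_a = [0.42, 0.38, 0.51, 0.47, 0.40, 0.45]
tool_b = [0.12, 0.20, 0.15, 0.18, 0.10, 0.16]
statistic, p_value, interval = permutation_test(tool_a, tool_b)
print(statistic, p_value, interval)
```

With this convention a reported P Value of 0.000 would mean that none of the 10,000 permuted differences was as extreme as the observed one, so it is better read as p < 0.0001 than as exactly zero.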