Does AI Hear What We Hear?

Testing Music Technology’s Human Touch

Konrad Swierczek

Overview

  • Introduction
  • Part 1: Three Case Studies
    • Consonance & Dissonance
    • Concert Studies
    • Emotion in Music
  • Part 2: Lessons Learned
  • Part 3: How Do We Move Forward?

Why Music, Technology, & The Mind?

  • Performance
  • Composition & Music Theory
  • Music Community
  • Audio Engineering
  • Open Source Technology
  • Computation & Music
  • Music Perception & Cognition
  • Open Science
  • Science Communication

What Are We Talking About?

  • Artificial Intelligence
  • Machine Learning
  • Deep Learning
  • Neural Network
  • Data Science
  • Unsupervised Learning
  • Natural Language Processing
  • Generative AI
  • Reinforcement Learning

It’s All About Prediction!

AI & Music???

Music Content Analysis

“Extracting meaningful aspects of music from audio files.”
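
As a minimal sketch of what "extracting meaningful aspects" can look like at its simplest, the Python below computes two low-level descriptors from raw audio samples: RMS energy (a rough correlate of loudness) and zero-crossing rate (a rough correlate of brightness or pitch height). The function names and the synthetic sine tones are assumptions made for illustration and do not come from the talk; practical systems usually rely on dedicated audio libraries and much richer features.

```python
import math

def rms_energy(samples):
    """Root-mean-square amplitude of the signal."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs whose sign differs."""
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(samples) - 1)

def sine_tone(freq_hz, duration_s=1.0, sample_rate=8000, amplitude=0.5):
    """Hypothetical stand-in for audio loaded from a file."""
    n = int(duration_s * sample_rate)
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate)
        for i in range(n)
    ]

low_tone = sine_tone(220.0)
high_tone = sine_tone(880.0)

# Same amplitude, so the energies match; the higher tone crosses zero more often.
low_features = (rms_energy(low_tone), zero_crossing_rate(low_tone))
high_features = (rms_energy(high_tone), zero_crossing_rate(high_tone))
print(low_features, high_features)
```

Note that both numbers describe the acoustic signal only. Whether they correspond to anything a listener would call a musical feature is a separate question.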

PART 1: THREE CASE STUDIES

Consonance & Dissonance

Concert Studies

Emotion in Music

PART 2: LESSONS LEARNED

Chasing Wild Horses

Clever Hans with his trainer, Wilhelm von Osten, 1904.

“we propose to determine whether a MIR system is actually a ‘horse:’ a system appearing capable of a remarkable human feat, e.g., music genre recognition, but actually working by using irrelevant characteristics (confounds).”

The Tip of the Iceberg

  • Music is defined by humans!
  • Do acoustic features directly represent music features?
  • Measuring human sensation/perception/cognition of music is hard!
  • “…only cognitive models are likely to succeed in processing Music in a human-like way.”

Perspectives

“Any conclusion from this experiment that is more general than ‘the model has learned something about this dataset’ lacks validity. One must resist the urge to conclude that a model must be doing whatever is hoped for.”

  • Who does the model represent?
  • What assumptions does the model make?
  • Is the data a good representation of the phenomenon it’s predicting?
  • How often should the model be updated?

PART 3: HOW TO GET AI TO CARNEGIE HALL?

Investigating Variation

Irrelevant Transformations
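
One way to read the "horse" test is as an invariance check: apply a transformation that should not change the label a listener would give, then see whether the system's prediction changes anyway. The sketch below is a hypothetical toy, not the method from the talk or any published system. `toy_genre_classifier` is a deliberately flawed classifier that keys on loudness (a confound), `apply_gain` is the irrelevant transformation, and the threshold, labels, and test tone are all assumptions.

```python
import math

def rms_energy(samples):
    """Root-mean-square amplitude of the signal."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def toy_genre_classifier(samples, threshold=0.3):
    """Hypothetical 'horse': decides the genre from loudness alone."""
    return "metal" if rms_energy(samples) > threshold else "folk"

def apply_gain(samples, gain):
    """Scale the amplitude; turning the volume down does not change the genre."""
    return [gain * s for s in samples]

def is_invariant(classifier, samples, transform):
    """True if the prediction survives the irrelevant transformation."""
    return classifier(samples) == classifier(transform(samples))

# Assumed test signal: one second of a 440 Hz tone sampled at 8 kHz.
excerpt = [0.8 * math.sin(2 * math.pi * 440 * i / 8000) for i in range(8000)]

original_label = toy_genre_classifier(excerpt)
quiet_label = toy_genre_classifier(apply_gain(excerpt, 0.25))
survives = is_invariant(
    toy_genre_classifier, excerpt, lambda x: apply_gain(x, 0.25)
)
print(original_label, quiet_label, survives)
```

Here the label flips from "metal" to "folk" when the same excerpt is simply played more quietly. That suggests the classifier responds to a confound rather than to anything musical, however accurate it might look on a dataset where loudness and genre happen to correlate.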

Summary

  • Prediction about music from audio is challenging!
  • Music practice and science can inform prediction
  • Accuracy is not the full story!
  • Music is HUMAN
  • Your point of view matters!
  • Novel approaches solve problems
  • Wherever there is prediction, there is noise—and more of it than you think.

Thank You!

Visit my GitHub

Works Cited