GCT634 (AI613) Fall 2021

Musical Applications of Machine Learning

Course Description

This course teaches machine learning with applications in the music and audio domains. Specifically, we cover tasks in music and audio classification, automatic music transcription, source separation, sound synthesis, and music generation. Students will gain hands-on experience with audio processing and machine learning libraries through the assignments, and will experience the full cycle of research through the final research project.

General Information

  • Instructor: Juhan Nam (남주한)
  • TAs
    • GSCT: Wonil Kim (김원일), Minuk Choi (최민석), Seungheon Doh (도승헌)
    • GSAI: Jinhyeon Kim (김진현), Yeongho Jeong (정영호), Minyeong Hwang (황민영)
  • Course Format
    • Pre-recorded video lectures: KLMS
    • Online session: Mon 13:00 - 14:30 (Zoom meeting)

Grading Policy

  • Assignments: 50%
  • Research Project: 50%
    • Paper review
    • Presentation
    • Report

Textbooks


Schedule

Week Topics
1
2
  • [Sep 6, Zoom Meeting] Audio Data Representations Using Librosa
  • [Video] Audio Feature Extraction [Slides]
  • [Video] Traditional Machine Learning: Unsupervised Learning [Slides]
  • Suggested Readings
    • The PRML book (Chapter 9: Mixture Models and EM)
    • The PRML book (Chapter 12: Continuous Latent Variables)
3
  • [Sep 13, Zoom Meeting] Audio Feature Extraction, Dimensionality Reduction, Data Compression, Vector Quantization and Classification Using scikit-learn
  • [Video] Traditional Machine Learning: Supervised Learning [Slides]
  • [Homework #1] Musical Instrument Recognition (Due Sep 26, 11:59 PM) [link]
  • Suggested Readings
4
5
  • [Sep 27, Zoom Meeting] TBD
  • [Video] Convolutional Neural Network (CNN) [Slides]
  • [Video] Music Classification and Tagging
  • Suggested Readings
    • The DL book (Chapter 9: Convolutional Networks)
6
  • [Oct 6 (Wed), Zoom Meeting] Music Classification with PyTorch
  • [Video] Representation Learning and Metric Learning [Slides]
  • [Video] Music Retrieval and Recommendation [Slides]
  • [Homework #2] Music Tagging and Retrieval [link]
  • Suggested Readings
    • The FMP book (Chapter 7: Content-based Audio Retrieval)
7
  • No Class (Hangeul Day)
  • [Video] Sequence Modeling and Recurrent Neural Network (RNN) [Slides]
  • [Video] Pitch Estimation and Melody Extraction [Slides]
  • [Video] Automatic Music Transcription [Slides]
  • Suggested Readings
    • The DL book (Chapter 10: Sequence Modeling: Recurrent and Recursive Nets)
    • The FMP book (Chapter 3: Music Synchronization)
8
  • Midterm Break
9
  • [Oct 25, Zoom Meeting]
  • [Video] Dynamic Time Warping [Slides]
  • [Video] Audio-to-Score Alignment [Slides]
  • [Video] Onset Detection, Tempo Estimation and Beat-Tracking [Slides]
  • Suggested Readings
    • The DL book (Chapter 14: Autoencoders)
10
  • [Nov 1, Zoom Meeting]
  • [Video] Auto-Encoder and U-Net [Slides]
  • [Video] Music Source Separation [Slides]
  • Suggested Readings
    • The DL book (Chapter 14: Autoencoders)
11
  • [Nov 8, Zoom Meeting]
  • [Video] Generative Models: Variational Auto-Encoder (VAE), Generative Adversarial Network (GAN) [Slides]
  • [Video] Audio Vocoder (WaveNet, DDSP)
  • [Video] Sound Synthesis and Digital Audio Effects
  • Suggested Readings
    • The DL book
12
  • [Nov 15, Zoom Meeting]
  • [Video] Symbolic Music Representation: MIDI and Music Score
  • [Video] Musical Language Models and Symbolic Music Generation
13
  • [Nov 22, Zoom Meeting]
  • [Video] Transformer
  • [Video] Symbolic Music Generation Using Transformer
14
  • Invited Talks or Advanced Topics
15
  • Invited Talks or Advanced Topics
16
  • Final Project