GCT634 (AI613) Fall 2021

Musical Applications of Machine Learning

Course Description

This course covers machine learning with applications in the music and audio domains. Specifically, we handle various tasks in music and audio classification, automatic music transcription, source separation, sound synthesis, and music generation. Students will gain hands-on experience with audio processing and machine learning libraries through the assignments, and will experience the full cycle of research through the final research project.

General Information

  • Instructor: Juhan Nam (남주한)
  • TAs
    • GSCT: Wonil Kim (김원일), Minsuk Choi (최민석), Seungheon Doh (도승헌)
    • GSAI: Jinhyeon Kim (김진현), Yeongho Jeong (정영호), Minyeong Hwang (황민영)
  • Course Format
    • Pre-recorded video lectures: KLMS
    • Online session: Mon 13:00 - 14:30 (Zoom meeting)

Grading Policy

  • Assignments: 50%
  • Research Project: 50%
    • Paper review
    • Presentation
    • Report

Textbooks


Schedule

Week Topics
1
2
  • [Sep 6, Zoom Meeting] Audio Data Representations Using Librosa
  • [Video] Audio Feature Extraction [slides]
  • [Video] Traditional Machine Learning: Unsupervised Learning [slides]
  • Suggested Readings
    • The PRML book (Chapter 9: Mixture Models and EM)
    • The PRML book (Chapter 12: Continuous Latent Variables)
3
  • [Sep 13, Zoom Meeting] Audio Feature Extraction, Dimensionality Reduction, Data Compression, Vector Quantization, and Classification Using scikit-learn
  • [Video] Traditional Machine Learning: Supervised Learning [slides]
  • [Homework #1] Musical Instrument Recognition (Due Sep 26, 11:59 PM) [link]
  • Suggested Readings
4
5
  • No Zoom Meeting
  • [Video] Convolutional Neural Network (CNN) [slides]
  • Suggested Readings
    • The DL book (Chapter 9: Convolutional Networks)
6
7
  • No Class (Hangeul Day)
  • [Video] Representation Learning for Music [slides]
  • [Video] Cross-Modality Representation Learning for Music [slides]
8
  • Midterm Break
9
  • [Oct 25, Zoom Meeting] Review and Cross-Modal Representation Learning for Music
  • [Video] Recurrent Neural Network [slides]
  • Suggested Readings
    • The DL book (Chapter 10: Sequence Modeling: Recurrent and Recursive Nets)
10
  • [Nov 3 (Wed), Zoom Meeting] HW2 review and HW3 introduction
  • [Video] Automatic Music Transcription [slides]
  • [Homework #3] Polyphonic Piano Transcription (Due Nov 14, 11:59 PM) [link]
  • Suggested Readings
11
  • No Zoom Meeting (ISMIR Conference)
  • [Video] Auto-Encoder, U-Net, and Source Separation [slides]
  • Suggested Readings
    • The DL book (Chapter 14: Autoencoders)
12
  • [Nov 15 (Mon), Zoom Meeting]
  • [Video] Symbolic Music Generation: Basics [slides]
  • [Video] Symbolic Music Generation with VAE [slides]
13
  • [Nov 24, Zoom Meeting] HW4 introduction
  • Final Project Meetings
  • [Homework #4] Symbolic Music Generation (Due Dec 12, 11:59 PM) [link]
14
  • [Nov 29, Zoom Meeting] Invited Talk: "Controllable Synthesis of Speech and Singing Voice with Neural Networks", Hyeong-Suk Choi (SNU)
  • [Dec 1, Zoom Meeting] HW3 review
  • [Video] Symbolic Music Generation with Transformer [slides]
15
  • No Class
16