Music and Audio Computing Lab


  • Children's Song Dataset (CSD): this dataset contains 50 Korean and 50 English songs sung by one Korean female professional pop singer. Each song is recorded in two separate keys, resulting in a total of 200 audio recordings. Each audio recording is paired with a MIDI transcription and lyric annotations at both the grapheme level and the phoneme level.
  • dim-sim: The dim-sim dataset is a collection of user-annotated music similarity triplet ratings used to evaluate music similarity search and related algorithms. Our similarity ratings are linked to the Million Song Dataset (MSD) and were collected for the following paper.
  • K-pop Vocal Tagging (KVT): the KVT dataset provides semantic labels of singing voices from K-pop songs.

Source Code and Software Libraries

  • PyTSMod: Python-based Time Scale Modification (TSM) algorithms including Overlap-Add (OLA), Time-Domain Pitch Synchronous Overlap-Add (TD-PSOLA), Waveform Similarity Overlap-Add (WSOLA), Phase Vocoder (PV), and TSM based on harmonic-percussive source separation (HPTSM).
  • JDC vocal melody extraction: a pre-trained deep neural network model for singing voice detection and melody extraction.
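
As a rough illustration of the simplest algorithm named in the PyTSMod entry, plain Overlap-Add (OLA), the sketch below stretches a signal by reading windowed frames at one hop size and writing them back at another. It is a minimal standard-library sketch, not PyTSMod's implementation or API: the function names `hann` and `ola_stretch`, the parameters `alpha`, `frame_len`, and `syn_hop`, and their default values are all assumptions made for this example.

```python
import math


def hann(n):
    """Periodic Hann window of length n (hypothetical helper for this sketch)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def ola_stretch(x, alpha, frame_len=1024, syn_hop=512):
    """Time-stretch the sample list x by factor alpha using plain OLA.

    alpha > 1 lengthens the signal, alpha < 1 shortens it.
    Illustrative only; parameter names and defaults are assumptions.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")

    ana_hop = syn_hop / alpha            # step between frames read from the input
    win = hann(frame_len)
    out_len = int(round(len(x) * alpha))
    n_frames = int(math.ceil(out_len / syn_hop)) + 1

    # Buffers large enough to hold the last frame in full.
    buf_len = (n_frames - 1) * syn_hop + frame_len
    y = [0.0] * buf_len                  # overlap-added output
    norm = [0.0] * buf_len               # summed window, used for normalization

    for m in range(n_frames):
        a = int(round(m * ana_hop))      # read position in the input
        s = m * syn_hop                  # write position in the output
        for i in range(frame_len):
            j = a + i
            sample = x[j] if j < len(x) else 0.0   # zero-pad past the end
            y[s + i] += win[i] * sample
            norm[s + i] += win[i]

    # Divide by the summed window so overlapping frames do not change the gain.
    return [y[k] / norm[k] if norm[k] > 1e-8 else 0.0 for k in range(out_len)]
```

Calling `ola_stretch(samples, 2.0)` returns a list roughly twice as long as `samples`. Plain OLA ignores waveform periodicity, so it tends to produce audible artifacts on pitched material; that limitation is what the other listed methods (WSOLA, TD-PSOLA, the phase vocoder) are designed to address.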

Courses and Tutorial (by Prof. Juhan Nam)