Music and Audio Computing Lab


Content-based Music Annotation and Retrieval

In recent years, music has become ubiquitous as digital data, and the scale of music content in streaming services has grown dramatically. This has posed challenges for music search and recommendation. The goal of this research is to explore various machine learning algorithms, particularly deep learning, to extract high-level musical features from audio signals and to apply them to various music annotation and retrieval problems.
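Music annotation is commonly cast as multi-label classification: a network maps audio to independent per-tag probabilities via a sigmoid output and is trained with binary cross-entropy. A minimal NumPy sketch of that output stage (the tag vocabulary and values are illustrative, not from any specific model of ours):

```python
import numpy as np

TAGS = ["rock", "jazz", "piano", "female vocal"]  # illustrative tag vocabulary

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def predict_tags(logits, threshold=0.5):
    """Turn raw network outputs into independent per-tag probabilities."""
    probs = sigmoid(logits)
    return {tag: float(p) for tag, p in zip(TAGS, probs) if p >= threshold}

def bce_loss(logits, targets, eps=1e-7):
    """Per-tag binary cross-entropy, averaged over the vocabulary."""
    probs = np.clip(sigmoid(logits), eps, 1 - eps)
    return float(-np.mean(targets * np.log(probs)
                          + (1 - targets) * np.log(1 - probs)))

logits = np.array([2.0, -1.5, 0.8, -3.0])   # one clip's raw outputs
targets = np.array([1.0, 0.0, 1.0, 0.0])    # ground-truth annotations
print(predict_tags(logits))
print(bce_loss(logits, targets))
```

Because each tag gets its own sigmoid, a clip can carry any number of tags at once, which is what distinguishes tagging from single-label genre classification.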

Music Galaxy Hitchhiker: 3D web music navigation system through learned audio feature space



We have worked on various topics in content-based music annotation and retrieval. Highlighted papers are listed below.


General

  • Deep Learning for Audio-based Music Classification and Tagging
    Juhan Nam, Keunwoo Choi, Jongpil Lee, Szu-Yu Chou, and Yi-Hsuan Yang
    IEEE Signal Processing Magazine, 2019 [pdf]

Representation Learning and Deep Metric Learning

  • Metric Learning vs Classification for Disentangled Music Representation Learning
    Jongpil Lee, Nicholas J. Bryan, Justin Salamon, Zeyu Jin, and Juhan Nam
    Proceedings of the 21st International Society for Music Information Retrieval Conference (ISMIR), 2020 (accepted)
  • Drum Sample Retrieval from Mixed Audio via a Joint Embedding Space of Mixed and Single Audio Samples
    Wonil Kim and Juhan Nam
    Proceedings of the 149th Audio Engineering Society Convention (AES), 2020 (accepted)
  • Disentangled Multidimensional Metric Learning for Music Similarity
    Jongpil Lee, Nicholas J. Bryan, Justin Salamon, Zeyu Jin, and Juhan Nam
    Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020 [pdf] [dataset]
    *** IEEE SPS Student Travel Grant ***
  • Representation Learning of Music Using Artist, Album, and Track Information
    Jongpil Lee, Jiyoung Park, and Juhan Nam
    Machine Learning for Music Discovery Workshop, the 36th International Conference on Machine Learning (ICML), 2019 [pdf]
  • Learning a Joint Embedding Space of Monophonic and Mixed Music Signals for Singing Voice
    Kyungyun Lee and Juhan Nam
    Proceedings of the 20th International Society for Music Information Retrieval Conference (ISMIR), 2019 [pdf] [website] [code]
  • Representation Learning of Music Using Artist Labels
    Jiyoung Park, Jongpil Lee, Jangyeon Park, Jung-Woo Ha and Juhan Nam
    Proceedings of the 19th International Society for Music Information Retrieval Conference (ISMIR), 2018 [pdf]
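The metric-learning papers above train an embedding space in which similar tracks (e.g. same artist or shared tags) sit close together, and a standard objective for this is a triplet loss. A hedged NumPy sketch of that objective, where the embedding dimension, margin, and synthetic "tracks" are illustrative assumptions rather than details from the papers:

```python
import numpy as np

def l2_normalize(x, eps=1e-9):
    """Project embeddings onto the unit sphere, as is common in metric learning."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Push the positive closer to the anchor than the negative, by `margin`."""
    a, p, n = map(l2_normalize, (anchor, positive, negative))
    d_pos = np.sum((a - p) ** 2, axis=-1)   # squared distance to similar track
    d_neg = np.sum((a - n) ** 2, axis=-1)   # squared distance to dissimilar track
    return float(np.mean(np.maximum(d_pos - d_neg + margin, 0.0)))

rng = np.random.default_rng(0)
anchor = rng.normal(size=(8, 64))                     # batch of track embeddings
positive = anchor + 0.05 * rng.normal(size=(8, 64))   # similar tracks: near anchors
negative = rng.normal(size=(8, 64))                   # unrelated tracks
print(triplet_loss(anchor, positive, negative))
```

Once trained, retrieval reduces to nearest-neighbor search in the embedding space, which is what makes this formulation attractive for music similarity and search.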

Connections with Natural Language Processing

  • Musical Word Embedding: Bridging the Gap between Listening Contexts and Music
    Seungheon Doh, Jongpil Lee, Tae Hong Park, and Juhan Nam
    Machine Learning for Media Discovery Workshop, International Conference on Machine Learning (ICML), 2020 [pdf] [website]
  • Zero-shot Learning for Audio-based Music Classification and Tagging
    Jeong Choi, Jongpil Lee, Jiyoung Park, and Juhan Nam
    Proceedings of the 20th International Society for Music Information Retrieval Conference (ISMIR), 2019 [pdf] [code]

Neural Network Architectures for Music and Audio Classification

  • Comparison and Analysis of SampleCNN Architectures for Audio Classification
    Taejun Kim, Jongpil Lee, and Juhan Nam
    IEEE Journal of Selected Topics in Signal Processing, 2019 [pdf]
  • Sample-level CNN Architectures for Music Auto-tagging Using Raw Waveforms
    Taejun Kim, Jongpil Lee, and Juhan Nam
    Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2018 [pdf]
  • SampleCNN: End-to-End Deep Convolutional Neural Networks Using Very Small Filters for Music Classification
    Jongpil Lee, Jiyoung Park, Keunhyoung Luke Kim and Juhan Nam
    Applied Sciences, 2018 [pdf]
  • Sample-level Deep Convolutional Neural Networks for Music Auto-Tagging Using Raw Waveforms
    Jongpil Lee, Jiyoung Park, Keunhyoung Luke Kim and Juhan Nam
    Proceedings of the 14th Sound and Music Computing Conference, 2017 [pdf]
  • Multi-Level and Multi-Scale Feature Aggregation Using Pre-trained Convolutional Neural Networks for Music Auto-Tagging
    Jongpil Lee and Juhan Nam
    IEEE Signal Processing Letters, 2017 [pdf]
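The sample-level models above read the raw waveform directly through stacks of 1-D convolutions with very small filters (e.g. size 3, stride 3), so the receptive field grows exponentially with depth. A minimal NumPy sketch of one such strided block, stacked a few times; the layer count, channel width, and random weights are illustrative, not the published SampleCNN configuration:

```python
import numpy as np

def conv1d_block(x, weight, stride=3):
    """One sample-level block: filter-size-3 strided 1-D convolution + ReLU.

    x:      (channels_in, time) raw-waveform-like input
    weight: (channels_out, channels_in, 3) small learned filters
    """
    c_out, c_in, k = weight.shape
    t_out = (x.shape[1] - k) // stride + 1
    out = np.zeros((c_out, t_out))
    for t in range(t_out):
        window = x[:, t * stride: t * stride + k]        # (c_in, 3) slice
        out[:, t] = np.maximum(np.einsum("oik,ik->o", weight, window), 0.0)
    return out

rng = np.random.default_rng(0)
x = rng.normal(size=(1, 3 ** 5))    # 243 "samples" of mono audio
for depth in range(5):              # each layer shrinks the time axis by 3x
    w = rng.normal(size=(16, x.shape[0], 3)) * 0.1
    x = conv1d_block(x, w)
print(x.shape)                      # time axis collapses from 243 to 1
```

Five stride-3 layers reduce 3^5 = 243 input samples to a single time step, leaving one feature vector per segment; this is the sense in which tiny filters can cover long audio contexts.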

Visualizations

  • Music Galaxy Hitchhiker: 3D Web Music Navigation Through Audio Space Trained with Tag and Artist Labels
    Dongwoo Suh, Kyungyun Lee, Jongpil Lee, Jiyoung Park and Juhan Nam
    Late Breaking Demo in the 18th International Society for Music Information Retrieval Conference (ISMIR), 2017 [pdf]

Funding

We have received the following funding to support this research.

  • Adobe Research - gift fund, 2019-2020
  • Naver - industry research fund, 2017-2019
  • National Research Foundation of Korea, 2015-2018
  • KAIST - research start-up fund for new faculty, 2014-2017