PhonoNet: Multi-Stage Deep Neural Networks for Raga Identification in Hindustani Classical Music


Audio information retrieval (AIR) is a field with potential applications in automatic annotation, music recommendation, and music tutoring and accuracy-verification systems. Extracting the raga, or melodic style, of improvisational Hindustani Classical music is a challenging problem in AIR due to the music's melodic variation and inconsistent temporal spacing. In this work, we propose PhonoNet, a hierarchical deep learning system for extracting information from audio data with temporal variation. Applied to a comprehensive Hindustani Classical music dataset, PhonoNet achieves a new state-of-the-art 98.9% accuracy in raga prediction.
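The abstract above does not spell out PhonoNet's architecture, but the "multi-stage" / hierarchical idea can be sketched as a two-stage pipeline: a chunk-level classifier scores fixed-size windows of a spectrogram, and a second stage aggregates those scores across the whole recording. The sketch below is a minimal NumPy illustration of that pattern only; the chunk classifier is a stand-in random linear layer (a trained CNN in a real system), the averaging step stands in for a temporal model, and all constants (`N_RAGAS`, `CHUNK_FRAMES`, `N_BINS`) are assumed values, not taken from the paper.

```python
import numpy as np

N_RAGAS = 30          # assumed number of raga classes
CHUNK_FRAMES = 100    # assumed chunk length in spectrogram frames
N_BINS = 12           # e.g. a 12-bin chromagram per frame

rng = np.random.default_rng(0)

def chunk_spectrogram(spec, chunk_frames=CHUNK_FRAMES):
    """Split a (frames, bins) spectrogram into fixed-size chunks,
    dropping any trailing partial chunk."""
    n = spec.shape[0] // chunk_frames
    return spec[: n * chunk_frames].reshape(n, chunk_frames, spec.shape[1])

# Stage 1 (stand-in): a linear "classifier" mapping a flattened chunk
# to per-raga logits. A real system would use a trained network here.
W = rng.normal(size=(CHUNK_FRAMES * N_BINS, N_RAGAS))

def chunk_logits(chunks):
    flat = chunks.reshape(chunks.shape[0], -1)
    return flat @ W

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Stage 2: aggregate chunk-level predictions over the full recording.
# A simple average of chunk probabilities stands in for a temporal model.
def predict_raga(spec):
    probs = softmax(chunk_logits(chunk_spectrogram(spec)))
    return int(probs.mean(axis=0).argmax())

spec = rng.normal(size=(1000, N_BINS))  # a fake 1000-frame recording
pred = predict_raga(spec)
print(pred)  # an index in [0, N_RAGAS)
```

The two-stage split matters for recordings of varying length: the chunk classifier sees a fixed-size input regardless of how long the performance is, while the aggregation stage absorbs the temporal variation.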

International Conference on Multimedia Retrieval (ICMR) 2019