Digital music assets are increasingly abundant and play a growing role in
everyday life. This raises the problem of how to organize musical assets
efficiently so that they are easy to browse and retrieve. Because the
volume of digital music material is growing rapidly, manual indexing and
retrieval are impractical. In this project, we focus on building systems
and methods for the automatic categorization and retrieval of musical
assets. Applications of this work include online music shopping, personal
music library organization, searching for preferred music channels on the
web, and retrieving video segments based on music content. Three
technologies developed in this project are described below.
Semi-Automatic Approach for Music Classification
Audio categorization is essential when managing a music database, whether
a professional library or a personal collection. However, fully automatic
categorization of music into classes suitable for browsing and searching
is not yet supported by today's technology. Music classification is also
subjective to some extent, since each user may have their own criteria
for categorizing music. We therefore proposed a semi-automatic approach
to music classification. In this approach, a music browsing system
provides a set of tools that separate music into a number of broad types
(e.g., male solo, female solo, string instrument performance) using
existing music analysis methods. Starting from the results of this
automatic process, the user may cluster pieces in the database into finer
classes and/or correct misclassifications manually according to their own
preferences and definitions. Such a system can greatly improve the
efficiency of music browsing and retrieval while ensuring the accuracy of
the results and the user's satisfaction.
Here are illustrations of the system.
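The classify-then-correct workflow can be sketched as follows. This is a
minimal illustration, not the system's implementation: the class names,
the feature dictionary, and the toy classifier are all hypothetical
placeholders standing in for the paper's music analysis methods.

```python
# Sketch of the semi-automatic workflow: an automatic classifier assigns
# coarse types, then manual user corrections override them. The classes
# and the feature-based rule below are illustrative assumptions only.

def auto_classify(features):
    """Hypothetical coarse classifier mapping a feature dict to a broad type."""
    if features["has_vocals"]:
        return "male solo" if features["pitch_mean"] < 200.0 else "female solo"
    return "instrumental"

class SemiAutoLibrary:
    def __init__(self):
        self.labels = {}        # track id -> current label
        self.user_fixed = set() # tracks the user has manually corrected

    def add(self, track_id, features):
        # Automatic pass: assign a broad type from existing analysis methods.
        self.labels[track_id] = auto_classify(features)

    def relabel(self, track_id, label):
        # Manual pass: a user correction always wins over the automatic result.
        self.labels[track_id] = label
        self.user_fixed.add(track_id)

lib = SemiAutoLibrary()
lib.add("t1", {"has_vocals": True, "pitch_mean": 150.0})
lib.add("t2", {"has_vocals": False, "pitch_mean": 0.0})
lib.relabel("t2", "string instruments performance")  # user refines the class
```

The key design point is that the automatic result is only a starting
point; every label remains editable, so the user's own taxonomy takes
precedence wherever the two disagree.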
For more details of this work, please refer to the following paper:
Tong Zhang, "Semi-automatic approach for music
classification," SPIE's Conference on Internet Multimedia
Management Systems IV (part of ITCom'03), vol. 5242, Orlando, Sep.
2003. (PDF Download)
Automatic Music Instrument Classification
While most previous work on musical instrument recognition has focused on
classifying single notes in monophonic music, in this work we proposed a
scheme for distinguishing instruments in continuous music pieces that may
contain one or more kinds of
instruments. Highlights of the system include music segmentation
into notes, harmonic partial estimation in polyphonic sound, note
feature calculation and normalization, note classification using a
set of neural networks, and music piece categorization with fuzzy
logic principles. Example outputs of the system are "the music
piece is 100% guitar (with 90% likelihood)" and "the music
piece is 60% violin and 40% piano, thus a violin/piano duet".
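The final aggregation step can be sketched as follows. This is a
simplified stand-in for the paper's fuzzy-logic categorization: it
assumes each note has already been classified into per-instrument
probabilities, and it summarizes the piece as instrument proportions plus
an average confidence.

```python
# Sketch of piece-level aggregation under simplified assumptions: each
# note carries per-instrument probabilities from the note classifier, and
# the piece is summarized by the share of notes each instrument wins.
# The paper's actual fuzzy-logic rules are not reproduced here.
from collections import Counter

def summarize_piece(note_probs):
    """note_probs: list of dicts mapping instrument -> probability, one per note."""
    votes = Counter()
    confidence = 0.0
    for probs in note_probs:
        best = max(probs, key=probs.get)  # winning instrument for this note
        votes[best] += 1
        confidence += probs[best]         # how sure the note classifier was
    n = len(note_probs)
    proportions = {inst: count / n for inst, count in votes.items()}
    return proportions, confidence / n

# Ten notes: six lean violin, four lean piano.
notes = [{"violin": 0.9, "piano": 0.1}] * 6 + [{"violin": 0.2, "piano": 0.8}] * 4
props, conf = summarize_piece(notes)
# props -> {"violin": 0.6, "piano": 0.4}, i.e. "60% violin and 40% piano"
```

An output such as `{"violin": 0.6, "piano": 0.4}` corresponds to the
"60% violin and 40% piano, thus a violin/piano duet" verdict quoted
above, with the averaged note confidence playing the role of the
likelihood figure.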
The system has been tested with twelve kinds of musical instruments, with
very promising experimental results: an accuracy of about 80% is
achieved, rising to about 90% if misclassifications within the same
instrument family (e.g., cello, viola, and violin) are tolerated. A
demonstration system for musical instrument classification and music
timbre retrieval was also developed.
For algorithmic details, please refer to the following paper:
Tong Zhang, "Instrument classification in polyphonic music
based on timbre analysis," SPIE's Conference on Internet
Multimedia Management Systems II (part of ITCom'01), vol. 4519,
p. 136-147, Denver, Aug. 2001. (PDF Download)
Automatic Singer Identification
The singer's identity is essential information for organizing, browsing,
and retrieving music collections. In this work, a system for automatic
singer identification is developed which recognizes the singer of a song
by analyzing the music signal; in addition, songs that are similar in
terms of the singer's voice are clustered. The proposed scheme follows
the framework of common speaker identification systems, but special
effort is made to distinguish the singing voice from instrumental sounds
in a song. A statistical model is trained for each singer's voice using
typical song(s) of that singer. Then, for a song to be identified, the
starting point of the singing voice is detected and a portion of the song
is excerpted from that point. Audio features are extracted and matched
against the singers' voice models in the database, and the song is
assigned to the best-matching model. Promising results are obtained on a
small set of samples, with accuracy rates of around 80%.
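The matching step can be sketched as follows. The paper specifies only "a
statistical model" per singer; a single diagonal Gaussian is assumed here
for brevity (common speaker identification systems typically use Gaussian
mixtures), and the feature frames are synthetic placeholders rather than
real audio features.

```python
# Sketch of model training and matching, assuming one diagonal-Gaussian
# voice model per singer. Feature vectors are synthetic stand-ins for the
# audio features extracted from the excerpted portion of the song.
import numpy as np

def train_model(frames):
    """Fit a diagonal Gaussian to a singer's feature frames (n x d)."""
    return frames.mean(axis=0), frames.var(axis=0) + 1e-6

def log_likelihood(frames, model):
    mean, var = model
    # Sum of per-frame diagonal-Gaussian log densities.
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (frames - mean) ** 2 / var)

def identify(frames, models):
    """Assign the excerpt to the singer model with the best match."""
    return max(models, key=lambda name: log_likelihood(frames, models[name]))

rng = np.random.default_rng(0)
models = {
    "singer_a": train_model(rng.normal(0.0, 1.0, (200, 12))),
    "singer_b": train_model(rng.normal(3.0, 1.0, (200, 12))),
}
excerpt = rng.normal(3.0, 1.0, (50, 12))  # frames resembling singer_b's voice
# identify(excerpt, models) -> "singer_b"
```

The song is assigned to whichever model yields the highest likelihood for
the excerpted frames, mirroring the "best match" rule described above.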
In the proposed approach, four audio features are extracted and
integrated to detect the start of the singer's voice in a song, which
helps to remove the prelude of the song. Besides singer identification,
this technique can be used in several other music management
applications, such as audio thumbnailing. Here is an
example.
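The detection step can be sketched as a weighted combination of per-frame
scores. The four features used in the paper are not named here, so the
feature tracks, weights, and threshold below are illustrative assumptions
only.

```python
# Illustrative sketch of singing-onset detection: several per-frame
# feature scores are combined, and the prelude ends at the first frame
# where the combined score crosses a threshold. Feature names, weights,
# and the threshold are hypothetical placeholders.

def detect_singing_start(feature_tracks, weights, threshold, frame_sec=0.5):
    """feature_tracks: dict name -> list of per-frame scores in [0, 1]."""
    n = len(next(iter(feature_tracks.values())))
    for i in range(n):
        combined = sum(w * feature_tracks[name][i] for name, w in weights.items())
        if combined >= threshold:
            return i * frame_sec  # time (s) where the singing voice starts
    return None  # no singing voice detected

tracks = {
    "energy":      [0.2, 0.3, 0.8, 0.9],  # placeholder vocal-evidence scores
    "harmonicity": [0.1, 0.2, 0.7, 0.9],
}
start = detect_singing_start(tracks, {"energy": 0.5, "harmonicity": 0.5}, 0.6)
# start -> 1.0  (third frame crosses the threshold; frames are 0.5 s long)
```

Everything before the returned time would be treated as prelude and
skipped when excerpting the portion of the song used for identification.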
For more details, please refer to the following paper:
Tong Zhang, "Automatic singer identification," Proc.
of ICME'03, Baltimore, July 2003. (PDF
Download)
Contact
For more information about the technology, please contact Tong
Zhang (tong.zhang@hp.com) at
Imaging Technology Dept., HP Labs.