Sophie Wagner

Bachelor's Thesis

Development and Evaluation of Musical Instrument Recognition Algorithms

Gabriel Gomez, Prof. Dr. Björn Eskofier

11/2018 – 03/2019

Music Information Retrieval (MIR) deals with the analysis of music files using digital audio signal processing [1]. The starting point is a music file available as a PCM (pulse-code modulation) audio signal, for example in WAV format, usually with 16 or 24 bits per sample at a sampling frequency of 44.1 kHz. Files that are not already in PCM format, such as MP3 files, must first be decoded to PCM. The signal is then partitioned into frames of 30-100 ms length, and each frame is analyzed separately in both the time domain and the frequency domain. Features extracted from each frame include, in the time domain, the zero-crossing rate and statistics of the amplitude or squared amplitude (min, max, mean, median) and, in the frequency domain, the fundamental frequency, energy distribution, spectral centroid and harmonic energy ratios, to name a few. These features give information about the harmonics, pitch, key, tempo, rhythm and other characteristics of a musical piece [2, 3]. Since music is dynamic, the features can change from frame to frame. Thus, the change of features across two or more frames can also give important information about a musical piece. Features are usually stored in vectors and matrices and can be analyzed mathematically to find patterns, sequences and dependencies between them. Each musical piece has a highly individual digital fingerprint, which can be used for song identification, as in software such as Shazam, even when only a few seconds of the song are compared against a feature database.
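The framing and feature extraction steps described above can be illustrated with a minimal sketch. It is not the method of the thesis. It assumes NumPy, a mono signal already decoded to floating-point PCM samples, non-overlapping frames of 50 ms (within the 30-100 ms range mentioned above) and a Hann window before the Fourier transform. All function names are hypothetical and chosen only for this example.

```python
import numpy as np

def frame_signal(x, sample_rate, frame_ms=50.0):
    """Split a mono signal into non-overlapping frames of frame_ms milliseconds."""
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    n_frames = len(x) // frame_len
    # Samples that do not fill a complete frame at the end are dropped.
    return x[:n_frames * frame_len].reshape(n_frames, frame_len)

def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs whose sign differs, per frame."""
    negative = np.signbit(frames)
    return np.mean(negative[:, 1:] != negative[:, :-1], axis=1)

def spectral_centroid(frames, sample_rate):
    """Magnitude-weighted mean frequency in Hz, per frame."""
    window = np.hanning(frames.shape[1])  # assumption: Hann window
    magnitude = np.abs(np.fft.rfft(frames * window, axis=1))
    freqs = np.fft.rfftfreq(frames.shape[1], d=1.0 / sample_rate)
    total = magnitude.sum(axis=1)
    total = np.where(total > 0, total, 1.0)  # silent frames get centroid 0
    return (magnitude * freqs).sum(axis=1) / total

def extract_features(x, sample_rate, frame_ms=50.0):
    """Return a (frames x features) matrix and its frame-to-frame differences."""
    frames = frame_signal(x, sample_rate, frame_ms)
    features = np.column_stack([
        zero_crossing_rate(frames),              # time domain
        np.mean(frames ** 2, axis=1),            # time domain: mean squared amplitude
        spectral_centroid(frames, sample_rate),  # frequency domain
    ])
    deltas = np.diff(features, axis=0)  # change of each feature between frames
    return features, deltas

# Usage with a synthetic test signal: one second of a 440 Hz sine tone.
sample_rate = 44100
t = np.arange(sample_rate) / sample_rate
tone = 0.5 * np.sin(2 * np.pi * 440.0 * t)
features, deltas = extract_features(tone, sample_rate)
print(features.shape, deltas.shape)
```

Each row of the feature matrix describes one frame, and each row of the difference matrix describes the change between two consecutive frames. For a stationary tone like the one above the differences stay close to zero, while real music would show larger changes at note onsets or instrument changes.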

The goal of this bachelor's thesis is first to find the most significant features for instrument recognition and then to adapt and/or develop musical instrument recognition algorithms [4, 5] that can differentiate between types of instruments such as piano, violin, guitar, trumpet and voice. To develop and evaluate these algorithms, labeled data will be created synthetically: for each of several musical pieces, one version per instrument will be generated from MIDI files using virtual instruments.


  1. Müller, M. (2007). Information retrieval for music and motion (Vol. 2). Heidelberg: Springer.
  2. Brown, J. C., Houix, O., & McAdams, S. (2001). Feature dependence in the automatic identification of musical woodwind instruments. The Journal of the Acoustical Society of America, 109(3), 1064-1072.
  3. Eronen, A. (2003, July). Musical instrument recognition using ICA-based transform of features and discriminatively trained HMMs. In Signal Processing and Its Applications, 2003. Proceedings. Seventh International Symposium on (Vol. 2, pp. 133-136). IEEE.
  4. Little, D., & Pardo, B. (2008, December). Learning Musical Instruments from Mixtures of Audio with Weak Labels. In ISMIR (Vol. 8, pp. 127-132).
  5. Kobayashi, Y. (2009). Automatic Generation of Musical Instrument Detector by Using Evolutionary Learning Method. In ISMIR (pp. 93-98).