Auditory-based Acoustic Distinctive Features and Formant Cues for Robust Automatic Speech Recognition in Low-SNR Car Environments

H. Tolba, S.-A. Selouani, and D. O'Shaughnessy (Canada)


Automatic Speech Recognition, Formants, Ear-Model, MFCCs, Multistream Paradigm, HMMs


In this paper, a multi-stream paradigm is proposed to improve the performance of automatic speech recognition (ASR) systems in the presence of highly interfering car noise. It was found that combining the classical MFCCs with auditory-based acoustic distinctive cues and the main formant frequencies of the speech signal in a multi-stream paradigm improves recognition performance in noisy car environments. The Hidden Markov Model Toolkit (HTK) was used throughout our experiments to test the new multi-stream feature vector in noisy environments. A series of speaker-independent continuous-speech recognition experiments was carried out using a noisy version of the TIMIT database. We found that the proposed multi-stream paradigm outperforms the conventional recognition process based on MFCCs alone in interfering car noise environments over a wide range of SNRs, from 16 dB down to -4 dB.
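The abstract does not give implementation details; a minimal sketch of how the three frame-synchronous streams (MFCCs, auditory-based cues, formant frequencies) might be combined into one feature vector per frame, with all names, dimensions, and weights purely illustrative:

```python
import numpy as np

def build_multistream_vector(mfcc, aud_cues, formants,
                             weights=(1.0, 1.0, 1.0)):
    """Concatenate three frame-synchronous feature streams into one
    vector per frame, as in a multi-stream front end.

    mfcc     : (T, D1) array of MFCC frames
    aud_cues : (T, D2) array of auditory-based distinctive cues
    formants : (T, D3) array of main formant frequencies in Hz
    weights  : per-stream scaling (illustrative; multi-stream HMM
               systems such as HTK typically weight streams at the
               likelihood level instead of the feature level)
    """
    streams = [
        np.asarray(mfcc) * weights[0],
        np.asarray(aud_cues) * weights[1],
        np.asarray(formants) / 1000.0 * weights[2],  # Hz -> kHz to balance dynamic ranges
    ]
    return np.hstack(streams)

# Toy example: 5 frames, 13 MFCCs + 4 cues + 3 formants -> 20-dim vectors
combined = build_multistream_vector(np.zeros((5, 13)),
                                    np.zeros((5, 4)),
                                    np.zeros((5, 3)))
print(combined.shape)  # (5, 20)
```

In an HTK-style setup the streams would instead be kept separate in the HMM definition, each with its own output distribution and stream weight; the feature-level concatenation above is only the simplest way to picture the combined observation vector.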
