Impulsive noise usually introduces sudden mismatches between the observation features and acoustic models trained on clean speech, which drastically degrades the performance of automatic speech recognition (ASR) systems. This paper presents a novel method to directly suppress the adverse effect of impulsive noise on recognition. In this method, the observation vector is divided into several sub-vectors according to the noise sensitivity of each feature dimension, and each sub-vector is assigned a suitable flooring threshold. In the recognition stage, the observation probability of each feature sub-vector is floored at the Gaussian mixture level. Thus, the unreliable relative probability difference caused by impulsive noise is eliminated, and the expected correct state sequence regains its priority in decoding. Experimental evaluations on the Aurora2 database show that the proposed method achieves average error rate reductions (ERR) of 61.62% and 84.32% in simulated impulsive noise and machinegun noise environments, respectively, while maintaining high performance for clean speech recognition.
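The flooring step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes diagonal-covariance Gaussian mixtures, applies the floor in the log domain to each sub-vector's likelihood within each mixture component, and uses a hypothetical partition and hypothetical thresholds. The function names `log_gauss_diag` and `floored_log_likelihood` are invented for illustration.

```python
import math


def log_gauss_diag(x, mean, var, dims):
    """Log density of the dimensions `dims` of x under a diagonal Gaussian."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * var[d]) + (x[d] - mean[d]) ** 2 / var[d])
        for d in dims
    )


def floored_log_likelihood(x, weights, means, variances, subvectors, log_floors):
    """Log observation probability of x under a GMM with per-sub-vector flooring.

    subvectors: list of lists of dimension indices (assumed partition of the features)
    log_floors: one log-domain floor per sub-vector (assumed values)
    """
    per_mixture = []
    for w, mean, var in zip(weights, means, variances):
        total = math.log(w)
        for dims, floor in zip(subvectors, log_floors):
            # Floor each sub-vector's contribution inside this mixture component.
            total += max(log_gauss_diag(x, mean, var, dims), floor)
        per_mixture.append(total)
    # Log-sum-exp over mixture components for numerical stability.
    peak = max(per_mixture)
    return peak + math.log(sum(math.exp(v - peak) for v in per_mixture))


# Hypothetical two-component model over three feature dimensions.
weights = [0.5, 0.5]
means = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
variances = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
subvectors = [[0, 1], [2]]    # assumed split by noise sensitivity
log_floors = [-20.0, -10.0]   # assumed thresholds

clean = [0.1, -0.2, 0.3]
corrupted = [0.1, -0.2, 50.0]  # one dimension hit by an impulse

clean_score = floored_log_likelihood(
    clean, weights, means, variances, subvectors, log_floors)
corrupted_score = floored_log_likelihood(
    corrupted, weights, means, variances, subvectors, log_floors)
```

In this sketch a frame whose features all lie close to the model keeps its original score, because no sub-vector falls below its floor, while the corrupted frame's score is bounded from below rather than collapsing to an extremely small value that would dominate the decoding path score.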
@inproceedings{ding03_eurospeech,
  title = {Flooring the observation probability for robust ASR in impulsive noise},
  author = {Pei Ding and Bertram E. Shi and Pascale Fung and Zhigang Cao},
  year = {2003},
  booktitle = {8th European Conference on Speech Communication and Technology (Eurospeech 2003)},
  pages = {1777--1780},
  doi = {10.21437/Eurospeech.2003-491},
  issn = {1018-4074},
}
Cite as: Ding, P., Shi, B.E., Fung, P., Cao, Z. (2003) Flooring the observation probability for robust ASR in impulsive noise. Proc. 8th European Conference on Speech Communication and Technology (Eurospeech 2003), 1777-1780, doi: 10.21437/Eurospeech.2003-491