In this work we propose an online filtering algorithm that aims to alleviate the degradation in ASR performance when speech is corrupted by additive noise. Starting from an initial estimate of the noise distribution, the algorithm updates the noise model on a frame-synchronous basis. Minimum mean square error (MMSE) filtering is also performed frame by frame, always using the most recent noise model estimate. The algorithm is compared to a batch version that uses several iterations of the EM algorithm over the complete utterance to estimate the noise model, and it is demonstrated that, when the noise is non-stationary, the online algorithm performs as well or better at a fraction of the computational complexity.
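The frame-synchronous loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a single scalar feature, a linear additive model y = x + n, one Gaussian each for speech and noise, a fixed noise variance, and a constant step size, none of which are stated in the abstract. The names `Gaussian`, `mmse_step`, `online_filter`, and `step` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Gaussian:
    mean: float
    var: float


def mmse_step(y: float, speech: Gaussian, noise: Gaussian,
              step: float) -> Tuple[float, Gaussian]:
    """Filter one frame, then update the noise mean (assumed model: y = x + n)."""
    total_var = speech.var + noise.var
    residual = y - speech.mean - noise.mean
    # MMSE (posterior mean) estimates of speech and noise given y
    # under the current noise model.
    x_hat = speech.mean + speech.var / total_var * residual
    n_hat = noise.mean + noise.var / total_var * residual
    # Sequential EM-style update: move the noise mean toward its
    # posterior expectation. The variance is kept fixed in this sketch.
    new_mean = (1.0 - step) * noise.mean + step * n_hat
    return x_hat, Gaussian(new_mean, noise.var)


def online_filter(frames: List[float], speech: Gaussian, noise: Gaussian,
                  step: float = 0.1) -> Tuple[List[float], Gaussian]:
    """Process frames one at a time, always using the latest noise model."""
    estimates = []
    for y in frames:
        x_hat, noise = mmse_step(y, speech, noise, step)
        estimates.append(x_hat)
    return estimates, noise


if __name__ == "__main__":
    speech_prior = Gaussian(mean=0.0, var=1.0)
    initial_noise = Gaussian(mean=0.0, var=1.0)
    cleaned, final_noise = online_filter([5.0] * 50, speech_prior, initial_noise)
    print(cleaned[0], final_noise.mean)
```

Each frame is filtered exactly once, which is where the computational saving over a batch EM pass across the whole utterance would come from. A practical system would presumably use multivariate mixture models and the nonlinear cepstral-domain relation between speech and noise rather than this scalar linear model.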
@inproceedings{myrvoll04_interspeech,
  title     = {Online minimum mean square error filtering of noisy cepstral coefficients using a sequential EM algorithm},
  author    = {Tor Andre Myrvoll and Satoshi Nakamura},
  year      = {2004},
  booktitle = {Interspeech 2004},
  pages     = {117--120},
  doi       = {10.21437/Interspeech.2004-97},
  issn      = {2958-1796},
}
Cite as: Myrvoll, T.A., Nakamura, S. (2004) Online minimum mean square error filtering of noisy cepstral coefficients using a sequential EM algorithm. Proc. Interspeech 2004, 117-120, doi: 10.21437/Interspeech.2004-97