We present a rapid compensation technique aimed at reducing the detrimental effect of environmental noise and channel distortion on server-based mobile speech recognition. It solves two key problems for such systems: first, how to accurately separate non-speech events (or background noise) from noise introduced by network artifacts; second, how to reduce the latency created by the extra computation required for a codebook-based linear channel compensation technique. We address the first problem by modifying an existing energy-based endpoint-detection algorithm to provide segment-type information to the compensation module. We tackle the latency issue of the codebook-based scheme by employing a tree-structured vector quantization technique with dynamic thresholds, which avoids evaluating every codeword. Our technique is evaluated using a speech-in-car database recorded at three different speeds. Our results show that our method leads to an 8.7% reduction in error rate and a 35% reduction in computational cost.
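To illustrate why a tree-structured codebook search can avoid evaluating every codeword, the following is a minimal sketch and not the authors' implementation. The tree construction (median split on the widest axis), the `slack` parameter, and the rule that scales the pruning threshold by the best child distance at each node are all assumptions made for illustration; the abstract does not specify how the dynamic thresholds are computed.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Sequence, Tuple

Vector = Tuple[float, ...]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


@dataclass
class Node:
    centroid: Vector                  # mean of the codewords below this node
    index: Optional[int] = None       # codeword index (leaves only)
    children: List["Node"] = field(default_factory=list)


def build_tree(items: Sequence[Tuple[int, Vector]]) -> Node:
    """Build a binary tree over (index, codeword) pairs.

    Hypothetical construction: split at the median of the widest axis.
    """
    if len(items) == 1:
        idx, vec = items[0]
        return Node(centroid=vec, index=idx)
    dim = len(items[0][1])
    centroid = tuple(sum(v[d] for _, v in items) / len(items) for d in range(dim))
    axis = max(
        range(dim),
        key=lambda d: max(v[d] for _, v in items) - min(v[d] for _, v in items),
    )
    ordered = sorted(items, key=lambda it: it[1][axis])
    mid = len(ordered) // 2
    return Node(centroid, children=[build_tree(ordered[:mid]), build_tree(ordered[mid:])])


def tree_search(root: Node, x: Vector, slack: float = 0.0) -> Tuple[int, int]:
    """Return (codeword index, number of distance evaluations).

    A child subtree is explored only if its centroid distance is within
    (1 + slack) times the best child distance at that node. This threshold
    rule is an assumed stand-in for the paper's dynamic thresholds.
    """
    if root.index is not None:        # single-codeword codebook
        return root.index, 0
    best_d, best_idx, evals = float("inf"), -1, 0
    stack = [root]
    while stack:
        node = stack.pop()
        scored = [(sq_dist(x, c.centroid), c) for c in node.children]
        evals += len(scored)
        limit = min(d for d, _ in scored) * (1.0 + slack)
        for d, child in scored:
            if child.index is not None:
                # Leaf: its centroid is the codeword itself, so d is exact.
                if d < best_d:
                    best_d, best_idx = d, child.index
            elif d <= limit:
                stack.append(child)
    return best_idx, evals


# Usage: a 16-codeword codebook on a 4x4 grid.
codebook = [(float(i), float(j)) for i in range(4) for j in range(4)]
root = build_tree(list(enumerate(codebook)))
index, evals = tree_search(root, (0.1, 0.2), slack=0.0)
```

With `slack=0.0` the search descends greedily and evaluates only two centroids per tree level, so it can return a suboptimal codeword; raising `slack` explores more branches and trades computation for accuracy.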
@inproceedings{huerta04_interspeech,
  title     = {Rapid on-line environment compensation for server-based speech recognition in noisy mobile environments},
  author    = {Juan Huerta and Etienne Marcheret and Sreeram Balakrishnan},
  year      = {2004},
  booktitle = {Interspeech 2004},
  pages     = {1653--1656},
  doi       = {10.21437/Interspeech.2004-620},
  issn      = {2958-1796},
}
Cite as: Huerta, J., Marcheret, E., Balakrishnan, S. (2004) Rapid on-line environment compensation for server-based speech recognition in noisy mobile environments. Proc. Interspeech 2004, 1653-1656, doi: 10.21437/Interspeech.2004-620