
Nonlinear mixture autoregressive hidden Markov models for speech recognition

Sundar Srinivasan, Tao Ma, Daniel May, Georgios Lazarou, Joseph Picone

Gaussian mixture models are a very successful method for modeling the output distribution of a state in a hidden Markov model (HMM). However, this approach is limited by the assumption that the dynamics of speech features are linear and can be modeled with static features and their derivatives. In this paper, a nonlinear mixture autoregressive model is used to model state output distributions (MAR-HMM). Estimation of model parameters is extended to handle vector features. MAR-HMMs are shown to provide superior performance to comparable Gaussian mixture model-based HMMs (GMM-HMM) with lower complexity on two pilot classification tasks.
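The abstract does not spell out the model's form, so the following is only a minimal sketch of a scalar mixture autoregressive (MAR) density as it is commonly defined in the general literature, not the authors' implementation. Each mixture component predicts the current sample from past samples with its own linear autoregression, and the weighted mixture of these components is what lets the model capture nonlinear dynamics. The function name `mar_log_density` and all parameter names are hypothetical, and the vector-feature extension described in the paper is not shown.

```python
import math


def mar_log_density(x_t, history, weights, coeffs, sigmas):
    """Log density of a scalar sample x_t under a mixture autoregressive model.

    Assumed (illustrative) parameterization:
      history : past samples, most recent first: [x_{t-1}, x_{t-2}, ...]
      weights : mixture weights, positive and summing to 1
      coeffs  : per-component AR coefficients [a_0, a_1, ..., a_p]
      sigmas  : per-component standard deviations
    """
    log_terms = []
    for w, a, s in zip(weights, coeffs, sigmas):
        # Component mean: a_0 + sum_i a_i * x_{t-i}
        mean = a[0] + sum(a_i * x_i for a_i, x_i in zip(a[1:], history))
        # Log of a univariate Gaussian evaluated at x_t
        log_gauss = -0.5 * math.log(2.0 * math.pi * s * s) \
                    - 0.5 * ((x_t - mean) / s) ** 2
        log_terms.append(math.log(w) + log_gauss)
    # Log-sum-exp over components for numerical stability
    m = max(log_terms)
    return m + math.log(sum(math.exp(t - m) for t in log_terms))
```

In an HMM, a function of this kind would stand in for the per-state Gaussian mixture likelihood, with one set of MAR parameters per state. When every autoregressive coefficient beyond `a_0` is zero, the expression reduces to an ordinary Gaussian mixture density.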

@inproceedings{srinivasan08_interspeech,
  title     = {Nonlinear mixture autoregressive hidden Markov models for speech recognition},
  author    = {Sundar Srinivasan and Tao Ma and Daniel May and Georgios Lazarou and Joseph Picone},
  year      = {2008},
  booktitle = {Interspeech 2008},
  pages     = {960--963},
  doi       = {10.21437/Interspeech.2008-118},
  issn      = {2958-1796},
}

Cite as: Srinivasan, S., Ma, T., May, D., Lazarou, G., Picone, J. (2008) Nonlinear mixture autoregressive hidden Markov models for speech recognition. Proc. Interspeech 2008, 960-963, doi: 10.21437/Interspeech.2008-118

doi:10.21437/Interspeech.2008-118