Learning the hidden structure of speech
- PMID:3372872
- DOI: 10.1121/1.395916
Abstract
In the work described here, the backpropagation neural network learning procedure is applied to the analysis and recognition of speech. This procedure takes a set of input/output pattern pairs and attempts to learn their functional relationship; it develops the necessary representational features during the course of learning. A series of computer simulation studies was carried out to assess the ability of these networks to accurately label sounds, to learn to recognize sounds without labels, and to learn feature representations of continuous speech. These studies demonstrated that the networks can learn to label presegmented test tokens with accuracies of up to 95%. Networks trained on segmented sounds using a strategy that requires no external labels were able to recognize and delineate sounds in continuous speech. These networks developed rich internal representations that included units which corresponded to such traditional distinctions as vowels and consonants, as well as units that were sensitive to novel and nonstandard features. Networks trained on a large corpus of unsegmented, continuous speech without labels also developed interesting feature representations, which may be useful in both segmentation and label learning. The results of these studies, while preliminary, demonstrate that backpropagation learning can be used with complex, natural data to identify a feature structure that can serve as the basis for both analysis and nontrivial pattern recognition.
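The learning procedure named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the network size, learning rate, epoch count, and the names `TinyNet`, `train_step`, and `pairs` are all hypothetical, and toy one-hot patterns stand in for speech input. The abstract does not say which label-free strategy was used. The sketch assumes one common choice, in which each input pattern also serves as its own target, so the hidden layer has to develop a compact feature code.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyNet:
    """One-hidden-layer sigmoid network trained by backpropagation (illustrative only)."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        # One row of weights per unit; the last entry of each row is the bias.
        self.w_hidden = [[rng.uniform(-1, 1) for _ in range(n_in + 1)]
                         for _ in range(n_hidden)]
        self.w_out = [[rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
                      for _ in range(n_out)]

    def forward(self, x):
        hidden = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0])))
                  for row in self.w_hidden]
        output = [sigmoid(sum(w * v for w, v in zip(row, hidden + [1.0])))
                  for row in self.w_out]
        return hidden, output

    def train_step(self, x, target, lr):
        hidden, output = self.forward(x)
        # Output deltas: error times the sigmoid derivative.
        d_out = [(o - t) * o * (1 - o) for o, t in zip(output, target)]
        # Hidden deltas: output deltas propagated back through w_out.
        d_hid = [h * (1 - h) * sum(d_out[k] * self.w_out[k][j]
                                   for k in range(len(d_out)))
                 for j, h in enumerate(hidden)]
        for k, row in enumerate(self.w_out):
            for j, v in enumerate(hidden + [1.0]):
                row[j] -= lr * d_out[k] * v
        for j, row in enumerate(self.w_hidden):
            for i, v in enumerate(x + [1.0]):
                row[i] -= lr * d_hid[j] * v

    def loss(self, pairs):
        # Mean over patterns of the summed squared output error.
        total = 0.0
        for x, target in pairs:
            _, output = self.forward(x)
            total += sum((o - t) ** 2 for o, t in zip(output, target))
        return total / len(pairs)


# Assumption: label-free training by using each input pattern as its own target.
patterns = [[1.0 if i == j else 0.0 for i in range(4)] for j in range(4)]
pairs = [(p, p) for p in patterns]

net = TinyNet(n_in=4, n_hidden=2, n_out=4)
initial_loss = net.loss(pairs)
for _ in range(2000):
    for x, target in pairs:
        net.train_step(x, target, lr=0.5)
final_loss = net.loss(pairs)

print("loss before:", initial_loss, "after:", final_loss)
for p in patterns:
    hidden, _ = net.forward(p)
    print(p, "->", [round(h, 2) for h in hidden])
```

The printed hidden activations are the learned internal representation of each pattern. Replacing `pairs` with input patterns matched to label vectors turns the same code into the supervised labeling setup.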