Part of the book series: Lecture Notes in Computer Science (LNIP, volume 4225)
Abstract
This paper describes a detailed analysis and implementation of a robust gender detector for audio stream applications. The implementation, based on mel-cepstral features and a Gaussian mixture model classifier, is designed to maximize gender classification performance in continuous speech. The described detector outperforms other reported systems, based on a statistically significant number of gender verifications (2136 unique speakers) obtained from the FISHER speech corpus. The system yields high accuracies for both long and short utterances, while a confidence (figure-of-merit) score for each decision ensures reliability in continuous audio streams.
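The classification scheme summarized in the abstract can be illustrated with a minimal sketch. It assumes one diagonal-covariance Gaussian mixture model per gender, scores each feature frame under both models, and averages the per-frame log-likelihood ratio over the utterance. The function names, the toy two-dimensional model parameters, and the logistic mapping from the ratio to a confidence value are all illustrative assumptions; they are not the authors' implementation, their features, or their figure-of-merit definition.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_frame_loglik(x, gmm):
    """Log-likelihood of one frame under a GMM given as (weight, mean, var) tuples."""
    terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
    top = max(terms)
    # Log-sum-exp over mixture components for numerical stability.
    return top + math.log(sum(math.exp(t - top) for t in terms))

def classify_gender(frames, male_gmm, female_gmm):
    """Return (label, confidence) from the mean per-frame log-likelihood ratio."""
    llr = sum(
        gmm_frame_loglik(x, male_gmm) - gmm_frame_loglik(x, female_gmm)
        for x in frames
    ) / len(frames)
    label = "male" if llr >= 0 else "female"
    # Hypothetical confidence mapping: logistic of |LLR|, between 0.5 and 1.
    confidence = 1.0 / (1.0 + math.exp(-abs(llr)))
    return label, confidence

# Toy two-dimensional models with made-up parameters (not taken from the paper).
MALE_GMM = [(0.6, [-1.0, 0.0], [1.0, 1.0]), (0.4, [-2.0, 1.0], [0.5, 0.5])]
FEMALE_GMM = [(0.5, [1.0, 0.0], [1.0, 1.0]), (0.5, [2.0, -1.0], [0.5, 0.5])]

# Stand-in for a short utterance: three feature frames.
frames = [[-1.2, 0.1], [-0.8, -0.3], [-1.9, 0.9]]
label, confidence = classify_gender(frames, MALE_GMM, FEMALE_GMM)
print(label, round(confidence, 3))
```

In a real system the frames would be mel-cepstral vectors extracted from the audio stream and the mixture parameters would be trained on labelled speech. Averaging over frames is one simple way to make scores comparable between long and short utterances.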
Author information
Authors and Affiliations
Institute of Biomedical Engineering, University of New Brunswick, P.O. Box 4400, Fredericton, NB, E3B 5A3, Canada
Erik Scheme & Kevin Englehart
Dept. of Electrical and Computer Engineering, University of New Brunswick, P.O. Box 4400, Fredericton, NB, E3B 5A3, Canada
Eduardo Castillo-Guerra & Kevin Englehart
Diaphonics Inc., 1310 Hollis Street, Halifax, Nova Scotia, B3J 3P3, Canada
Arvind Kizhanatham
Editor information
Editors and Affiliations
Computer Science Department, National Institute of Astrophysics, Optics and Electronics (INAOE), Luis Enrique Erro No. 1, 72840 Sta. Maria Tonantzintla, Puebla, Mexico
José Francisco Martínez-Trinidad
Computer Science Department, National Institute of Astrophysics, Optics and Electronics (INAOE), Luis Enrique Erro No. 1, 72840 Sta. Maria Tonantzintla, Puebla, Mexico
Jesús Ariel Carrasco Ochoa
Centre for Vision, Speech and Signal Processing, University of Surrey, Guildford, GU2 7XH, UK
Josef Kittler
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Scheme, E., Castillo-Guerra, E., Englehart, K., Kizhanatham, A. (2006). Practical Considerations for Real-Time Implementation of Speech-Based Gender Detection. In: Martínez-Trinidad, J.F., Carrasco Ochoa, J.A., Kittler, J. (eds) Progress in Pattern Recognition, Image Analysis and Applications. CIARP 2006. Lecture Notes in Computer Science, vol 4225. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11892755_44
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-46556-0
Online ISBN: 978-3-540-46557-7
eBook Packages: Computer Science, Computer Science (R0)