Part of the book series: Lecture Notes in Computer Science (LNIP, volume 4225)

Included in the following conference series:

Iberoamerican Congress on Pattern Recognition

Abstract

This paper describes a detailed analysis and implementation of a robust gender detector for audio stream applications. The implementation, based on mel-cepstral features and a Gaussian mixture model classifier, is designed to maximize gender classification performance in continuous speech. Evaluated on a statistically significant number of gender verifications (2136 unique speakers) drawn from the FISHER speech corpus, the described detector outperforms other reported systems. The system yields high accuracy for both long and short utterances, and a confidence (figure-of-merit) score for each decision ensures reliability in continuous audio streams.
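The classification step named in the abstract can be illustrated with a minimal sketch. It assumes one diagonal-covariance Gaussian mixture model per gender and uses the mean per-frame log-likelihood ratio as the confidence score. That scoring rule is a common choice, not necessarily the figure of merit used in the paper, and the function names, the two-dimensional stand-in features, and all model parameters below are hypothetical. Feature extraction and model training are omitted.

```python
import math

def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(x, gmm):
    """Log-likelihood of one frame under a GMM given as (weight, mean, var) tuples."""
    logs = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in gmm]
    peak = max(logs)  # log-sum-exp trick for numerical stability
    return peak + math.log(sum(math.exp(l - peak) for l in logs))

def classify_gender(frames, male_gmm, female_gmm):
    """Return (label, confidence) for a non-empty list of feature frames.

    The confidence is the magnitude of the mean per-frame log-likelihood
    ratio (an assumed scoring rule, not taken from the paper).
    """
    llr = sum(
        gmm_log_likelihood(f, male_gmm) - gmm_log_likelihood(f, female_gmm)
        for f in frames
    ) / len(frames)
    return ("male" if llr > 0 else "female"), abs(llr)

# Toy 2-D models with made-up parameters (NOT values from the paper).
MALE_GMM = [(0.5, [-1.0, 0.0], [1.0, 1.0]), (0.5, [-2.0, 1.0], [1.0, 1.0])]
FEMALE_GMM = [(0.5, [1.0, 0.0], [1.0, 1.0]), (0.5, [2.0, 1.0], [1.0, 1.0])]

label, confidence = classify_gender(
    [[-1.5, 0.5], [-1.0, 0.2]], MALE_GMM, FEMALE_GMM
)
```

In a streaming setting, a decision whose confidence falls below a chosen threshold could be deferred until more frames arrive. The threshold would have to be tuned on held-out data.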

Author information

Authors and Affiliations

Institute of Biomedical Engineering, University of New Brunswick, P.O. Box 4400, Fredericton, NB, E3B 5A3, Canada
Erik Scheme & Kevin Englehart
Dept. of Electrical and Computer Engineering, University of New Brunswick, P.O. Box 4400, Fredericton, NB, E3B 5A3, Canada
Eduardo Castillo-Guerra & Kevin Englehart
Diaphonics Inc., 1310 Hollis Street, Halifax, Nova Scotia, B3J 3P3, Canada
Arvind Kizhanatham

Editor information

Editors and Affiliations

Computer Science Department, National Institute of Astrophysics, Optics and Electronics (INAOE), Luis Enrique Erro No. 1, 72840 Sta. Maria Tonantzintla, Puebla, Mexico
José Francisco Martínez-Trinidad
Computer Science Department, National Institute of Astrophysics, Optics and Electronics (INAOE), Luis Enrique Erro No. 1, 72840 Sta. Maria Tonantzintla, Puebla, Mexico
Jesús Ariel Carrasco Ochoa
Centre for Vision, Speech and Signal Processing, University of Surrey, Guildford, GU2 7XH, UK
Josef Kittler

About this paper

Cite this paper

Scheme, E., Castillo-Guerra, E., Englehart, K., Kizhanatham, A. (2006). Practical Considerations for Real-Time Implementation of Speech-Based Gender Detection. In: Martínez-Trinidad, J.F., Carrasco Ochoa, J.A., Kittler, J. (eds) Progress in Pattern Recognition, Image Analysis and Applications. CIARP 2006. Lecture Notes in Computer Science, vol 4225. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11892755_44

DOI: https://doi.org/10.1007/11892755_44
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-46556-0
Online ISBN: 978-3-540-46557-7
eBook Packages: Computer Science, Computer Science (R0)
