Practical Considerations for Real-Time Implementation of Speech-Based Gender Detection

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 4225)

Abstract

This paper describes a detailed analysis and implementation of a robust gender detector for audio stream applications. The implementation, based on mel-cepstral features and a Gaussian mixture model classifier, is designed to maximize gender classification performance in continuous speech. The described detector outperforms other reported systems, based on a statistically significant number of gender verifications (2136 unique speakers) obtained from the FISHER speech corpus. The system yields high accuracy for both long and short utterances, while a confidence figure-of-merit score for each decision ensures reliability in continuous audio streams.
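
As a rough illustration of the kind of classifier the abstract names, the sketch below scores a sequence of feature frames against two diagonal-covariance Gaussian mixture models and turns the average log-likelihood ratio into a confidence value. It is not the authors' implementation: the `DiagGMM` container, the `classify_gender` function, the logistic mapping used for the confidence, and the toy two-dimensional frames standing in for mel-cepstral features are all assumptions made for this example.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class DiagGMM:
    """Diagonal-covariance Gaussian mixture (hypothetical container, not from the paper)."""
    weights: List[float]
    means: List[List[float]]
    variances: List[List[float]]

    def frame_log_likelihood(self, x: Sequence[float]) -> float:
        # log sum_k w_k * N(x | mu_k, diag(var_k)), computed with log-sum-exp.
        logs = []
        for w, mu, var in zip(self.weights, self.means, self.variances):
            ll = math.log(w)
            for xi, m, v in zip(x, mu, var):
                ll += -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
            logs.append(ll)
        peak = max(logs)
        return peak + math.log(sum(math.exp(l - peak) for l in logs))

    def mean_log_likelihood(self, frames: Sequence[Sequence[float]]) -> float:
        # Average over frames so the score does not grow with utterance length.
        return sum(self.frame_log_likelihood(f) for f in frames) / len(frames)


def classify_gender(frames: Sequence[Sequence[float]],
                    male: DiagGMM, female: DiagGMM) -> Tuple[str, float]:
    """Return (label, confidence) from the average per-frame log-likelihood ratio."""
    llr = male.mean_log_likelihood(frames) - female.mean_log_likelihood(frames)
    label = "male" if llr >= 0 else "female"
    # Assumed confidence mapping: logistic of |LLR|, so values start at 0.5.
    confidence = 1.0 / (1.0 + math.exp(-abs(llr)))
    return label, confidence


# Toy models and frames; real systems would train the GMMs on mel-cepstral features.
male = DiagGMM([1.0], [[-1.0, 0.0]], [[1.0, 1.0]])
female = DiagGMM([1.0], [[1.0, 0.0]], [[1.0, 1.0]])
frames = [[0.9, 0.1], [1.2, -0.2], [0.8, 0.0]]
label, conf = classify_gender(frames, male, female)
print(label, round(conf, 3))
```

In a streaming setting, a decision could be withheld whenever the confidence falls below a chosen threshold; the threshold and the mapping itself would need to be tuned on held-out data.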

Author information

Authors and Affiliations

  1. Institute of Biomedical Engineering, University of New Brunswick, P.O. Box 4400, Fredericton, NB, E3B 5A3, Canada

    Erik Scheme & Kevin Englehart

  2. Dept. of Electrical and Computer Engineering, University of New Brunswick, P.O. Box 4400, Fredericton, NB, E3B 5A3, Canada

    Eduardo Castillo-Guerra & Kevin Englehart

  3. Diaphonics Inc., 1310 Hollis Street, Halifax, Nova Scotia, B3J 3P3, Canada

    Arvind Kizhanatham

Authors
  1. Erik Scheme

  2. Eduardo Castillo-Guerra

  3. Kevin Englehart

  4. Arvind Kizhanatham

Editor information

Editors and Affiliations

  1. Computer Science Department, National Institute of Astrophysics, Optics and Electronics (INAOE), Luis Enrique Erro No. 1, 72840 Sta. Maria Tonantzintla, Puebla, Mexico

    José Francisco Martínez-Trinidad

  2. Computer Science Department, National Institute of Astrophysics, Optics and Electronics (INAOE), Luis Enrique Erro No. 1, 72840 Sta. Maria Tonantzintla, Puebla, Mexico

    Jesús Ariel Carrasco Ochoa

  3. Centre for Vision, Speech and Signal Processing, University of Surrey, Guildford, GU2 7XH, UK

    Josef Kittler

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Scheme, E., Castillo-Guerra, E., Englehart, K., Kizhanatham, A. (2006). Practical Considerations for Real-Time Implementation of Speech-Based Gender Detection. In: Martínez-Trinidad, J.F., Carrasco Ochoa, J.A., Kittler, J. (eds) Progress in Pattern Recognition, Image Analysis and Applications. CIARP 2006. Lecture Notes in Computer Science, vol 4225. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11892755_44
