
Common Voice

From Wikipedia, the free encyclopedia
Voice dataset by Mozilla
Common Voice
Developer(s): Mozilla Foundation
Initial release: June 19, 2017
Repository: github.com/common-voice/common-voice
Available in: Multilingual
License: Creative Commons CC0
Website: commonvoice.mozilla.org

Common Voice is a crowdsourcing project started by Mozilla to create a free database for speech recognition software. The project is supported by volunteers who record sample sentences with a microphone and review recordings of other users. The transcribed sentences are collected in a voice database available under the public domain license CC0.[1] This license ensures that developers can use the database for voice-to-text applications without restrictions or costs.

Aims


Common Voice aims to provide diverse voice samples. According to Mozilla's Katharina Borchert, many existing projects took datasets from public radio or otherwise had datasets that underrepresented both women and people with pronounced accents.[2]

History


At the beginning of 2022, Bengali.AI partnered with Common Voice to launch the "Bangla Speech Recognition" project, which aims to make machines understand the Bangla language. 2,000 hours of voice data were collected, with a goal of more than 10,000 hours.[3]

Voice database


The first dataset was released in November 2017. More than 20,000 users worldwide had recorded 500 hours of English sentences.[4]

In February 2019, the first batch of languages was released for use. It covered 18 languages, including English, French, German and Mandarin Chinese, as well as less prevalent languages such as Welsh and Kabyle. In total, the release contained almost 1,400 hours of recorded voice data from more than 42,000 contributors.[5]

As of July 2020, the database had amassed 7,226 hours of voice recordings in 54 languages, of which 5,591 hours had been verified by volunteers.[6]

In May 2021, following the work to add Kinyarwanda, the project received a grant to add Kiswahili.[7]

In September 2022, it was announced that the Twi language of Ghana was the 100th language to be added to the Mozilla Common Voice database.[8]

As of October 2022, Mozilla Common Voice officially collects voice data for the languages listed on its website.[9]


References

  1. "Mozilla Common Voice". commonvoice.mozilla.org. Retrieved 2024-10-06.
  2. "Why do we gender AI? Voice tech firms move to be more inclusive". The Guardian. 11 January 2020. Archived from the original on 19 December 2022. Retrieved 19 April 2020.
  3. "Bengali.AI: Democratising AI research in Bangla". The Business Standard. 2022-12-23. Archived from the original on 2022-12-24. Retrieved 2022-12-25.
  4. "Announcing the Initial Release of Mozilla's Open Source Speech Recognition Model and Voice Dataset". blog.mozilla.org. November 29, 2017. Archived from the original on November 29, 2017. Retrieved November 19, 2019.
  5. "Mozilla updates Common Voice dataset with 1,400 hours of speech across 18 languages". VentureBeat. February 28, 2019. Archived from the original on March 4, 2019. Retrieved November 19, 2019.
  6. "Mozilla Common Voice updates will help train the 'Hey Firefox' wakeword for voice-based web browsing". VentureBeat. 1 July 2020. Archived from the original on March 10, 2021. Retrieved 1 April 2021.
  7. "Mozilla Common Voice Receives $3.4 Million Investment to Democratize and Diversify Voice Tech in East Africa". Mozilla Foundation. 2021-05-25. Archived from the original on 2022-12-19. Retrieved 2021-06-03.
  8. Onukwue, Alexander (23 September 2022). "Ghana's most popular language is now on Mozilla Common Voice". Quartz. Archived from the original on 2 December 2022. Retrieved 3 October 2022.
  9. "Languages". commonvoice.mozilla.org. Archived from the original on 24 December 2022. Retrieved 4 October 2022.
Retrieved from "https://en.wikipedia.org/w/index.php?title=Common_Voice&oldid=1281113460"