Latest Version: 1.1
License: Public Domain
Download Size: 2.6 GB

The LJ Speech Dataset

This is a public domain speech dataset consisting of 13,100 short audio clips of a single speaker reading passages from 7 non-fiction books. A transcription is provided for each clip. Clips vary in length from 1 to 10 seconds and have a total length of approximately 24 hours.

The texts were published between 1884 and 1964, and are in the public domain. The audio was recorded in 2016-17 by the LibriVox project and is also in the public domain.

Download Dataset (2.6 GB)

Sample Data

The examination and testimony of the experts enabled the Commission to conclude that five shots may have been fired,

Many animals of even complex structure which live parasitically within others are wholly devoid of an alimentary cavity.

File Format

Metadata is provided in metadata.csv. This file consists of one record per line, delimited by the pipe character (0x7c). The fields are:

  1. ID: this is the name of the corresponding .wav file
  2. Transcription: words spoken by the reader (UTF-8)
  3. Normalized Transcription: transcription with numbers, ordinals, and monetary units expanded into full words (UTF-8).
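The record layout above can be read with Python's standard csv module. The following is a minimal sketch, not the dataset's official loader: the Record type, the read_metadata function, and the two sample lines are made up for illustration. Quoting is disabled on the assumption that transcriptions may contain literal quotation marks.

```python
import csv
import io
from typing import Iterable, Iterator, NamedTuple


class Record(NamedTuple):
    """One metadata line (hypothetical type, named after the three fields)."""
    clip_id: str        # field 1: names the corresponding .wav file
    transcription: str  # field 2: words spoken by the reader
    normalized: str     # field 3: numbers, ordinals, monetary units expanded


def read_metadata(lines: Iterable[str]) -> Iterator[Record]:
    """Yield one Record per pipe-delimited line."""
    # QUOTE_NONE keeps any quotation marks in the text as ordinary characters
    # instead of treating them as CSV quoting.
    reader = csv.reader(lines, delimiter="|", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != 3:
            raise ValueError(f"expected 3 fields, got {len(row)}: {row!r}")
        yield Record(*row)


# Made-up sample lines in the documented layout (not taken from the dataset).
sample = io.StringIO(
    'CLIP-0001|He paid $5 for it.|He paid five dollars for it.\n'
    'CLIP-0002|"Yes," she said.|"Yes," she said.\n'
)
records = list(read_metadata(sample))
for record in records:
    print(record.clip_id, "->", record.normalized)
```

To read the real file, pass a file object opened with encoding="utf-8" and newline="" in place of the in-memory sample.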

Each audio file is a single-channel 16-bit PCM WAV with a sample rate of 22050 Hz.
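A quick way to confirm that a clip matches this format is to inspect its header with Python's standard wave module. This is a sketch under stated assumptions: describe_wav, matches_expected, and the EXPECTED table are hypothetical names, and the clip checked here is a synthetic one-second silent file generated in memory so the example runs without the dataset.

```python
import io
import wave

# Format stated above: mono, 16-bit (2 bytes per sample), 22050 Hz.
EXPECTED = {"channels": 1, "sample_width_bytes": 2, "sample_rate_hz": 22050}


def describe_wav(source) -> dict:
    """Return channel count, sample width, sample rate, and duration."""
    with wave.open(source, "rb") as wav:
        info = {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": wav.getframerate(),
        }
        info["duration_sec"] = wav.getnframes() / wav.getframerate()
    return info


def matches_expected(info: dict) -> bool:
    """True if the header fields agree with the documented format."""
    return all(info[key] == value for key, value in EXPECTED.items())


# Build a synthetic one-second silent clip in memory for demonstration.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)
    out.setframerate(22050)
    out.writeframes(b"\x00\x00" * 22050)
buffer.seek(0)

info = describe_wav(buffer)
print(info, matches_expected(info))
```

describe_wav also accepts a path string, so the same check can be pointed at an extracted .wav file.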

Statistics

Total Clips: 13,100
Total Words: 225,715
Total Characters: 1,308,678
Total Duration: 23:55:17 (hh:mm:ss)
Mean Clip Duration: 6.57 sec
Min Clip Duration: 1.11 sec
Max Clip Duration: 10.10 sec
Mean Words per Clip: 17.23
Distinct Words: 13,821
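The two mean values follow directly from the totals. As a short worked check (variable names are illustrative only), dividing the total duration and total word count by the number of clips reproduces the figures listed:

```python
# Totals as listed in the statistics above.
total_clips = 13_100
total_words = 225_715
hours, minutes, seconds = 23, 55, 17  # total duration 23:55:17

total_seconds = hours * 3600 + minutes * 60 + seconds  # 86117
mean_clip_duration = total_seconds / total_clips
mean_words_per_clip = total_words / total_clips

print(f"{mean_clip_duration:.2f} sec")  # 6.57 sec
print(f"{mean_words_per_clip:.2f}")     # 17.23
```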

Miscellaneous

Changelog

  • 1.1 (current release)
    Version 1.0 included 30 .wav files without corresponding annotations in metadata.csv. These have been removed in version 1.1. Thanks to Rafael Valle for spotting this.
  • 1.0
    Initial release
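A mismatch like the one fixed in 1.1 can be detected by comparing the .wav file names against the annotated IDs. The sketch below is illustrative rather than the check that was actually used: unannotated_clips is a hypothetical helper, it assumes each ID equals the file name without the .wav extension, and it runs against empty placeholder files in a temporary directory.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List


def unannotated_clips(wav_dir: Path, annotated_ids: Iterable[str]) -> List[str]:
    """Return stems of .wav files in wav_dir with no matching annotation ID."""
    # Assumption: an ID is the .wav file name without its extension.
    stems = {path.stem for path in wav_dir.glob("*.wav")}
    return sorted(stems - set(annotated_ids))


# Demonstration with placeholder files (names are made up).
with tempfile.TemporaryDirectory() as tmp:
    wav_dir = Path(tmp)
    for name in ("clip-a", "clip-b", "clip-c"):
        (wav_dir / f"{name}.wav").touch()
    orphans = unannotated_clips(wav_dir, {"clip-a", "clip-c"})

print(orphans)  # ['clip-b']
```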

License

This dataset is in the public domain in the US (and most likely other countries as well). There are no restrictions on its use. For more information, please see: librivox.org/pages/public-domain.

Credits

This dataset consists of excerpts from the following works:

  • Morris, William, et al. Arts and Crafts Essays. 1893.
  • Griffiths, Arthur. The Chronicles of Newgate, Vol. 2. 1884.
  • Roosevelt, Franklin D. The Fireside Chats of Franklin Delano Roosevelt. 1933-42.
  • Harland, Marion. Marion Harland's Cookery for Beginners. 1893.
  • Rolt-Wheeler, Francis. The Science - History of the Universe, Vol. 5: Biology. 1910.
  • Banks, Edgar J. The Seven Wonders of the Ancient World. 1916.
  • President's Commission on the Assassination of President Kennedy. Report of the President's Commission on the Assassination of President Kennedy. 1964.

Recordings by Linda Johnson from LibriVox. Alignment and annotation by Keith Ito. All text, audio, and annotations are in the public domain. We request that you use this dataset for good and not evil.

As this work is in the public domain, you may use it without attribution. However, if you'd like to cite it in a publication, please do so by linking to this page or using the following:

@misc{ljspeech17,
  author       = {Keith Ito and Linda Johnson},
  title        = {The LJ Speech Dataset},
  howpublished = {\url{https://keithito.com/LJ-Speech-Dataset/}},
  year         = 2017
}

