Oxford English Corpus

From Wikipedia, the free encyclopedia
Text corpus and database of 21st century English

The Oxford English Corpus (OEC) is a text corpus of 21st-century English, used by the makers of the Oxford English Dictionary and by Oxford University Press' language research programme. It is the largest corpus of its kind, containing nearly 2.1 billion words.[1] It includes language from the UK, the United States, Ireland, Australia, New Zealand, the Caribbean, Canada, India, Singapore, and South Africa.[2] The text is mainly collected from web pages; some printed texts, such as academic journals, have been collected to supplement particular subject areas.[2] The sources are writings of all sorts, from "literary novels and specialist journals to everyday newspapers and magazines and from Hansard to the language of blogs, emails, and social media".[2] This may be contrasted with similar databases that sample only a specific kind of writing. The corpus is generally available only to researchers at Oxford University Press, but other researchers who can demonstrate a strong need may apply for access.[2][3]

The digital version of the Oxford English Corpus is formatted in XML and usually analysed with Sketch Engine software.[4] By April 27, 2006, the dictionary database had 1 billion words.[5]

Each document in the Oxford English Corpus is accompanied by metadata including:

  • title
  • author (if known; many websites make this difficult to determine reliably)
  • author gender (if known)
  • language type (e.g. British English, American English)
  • source website
  • year (+ date, if known)
  • date of collection
  • domain + subdomain
  • document statistics (number of tokens, sentences, etc.)[4]
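
The metadata fields listed above can be pictured as a small record attached to each XML document. The following Python sketch is illustrative only: the element and attribute names (`meta`, `language_type`, `stats`, and so on), the `DocumentMetadata` class, and the `parse_metadata` function are assumptions made for this example, since the actual schema of the corpus is not described here.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Optional

# Hypothetical XML layout: every element and attribute name below is
# invented for illustration and is NOT the real corpus schema.
SAMPLE_DOC = """
<document>
  <meta>
    <title>Example article</title>
    <author/>
    <language_type>British English</language_type>
    <source_website>example.org</source_website>
    <year>2005</year>
    <collected>2006-01-15</collected>
    <domain subdomain="football">sport</domain>
    <stats tokens="6" sentences="1"/>
  </meta>
  <text>This is a short example sentence.</text>
</document>
"""


@dataclass
class DocumentMetadata:
    title: str
    author: Optional[str]         # None when the author is unknown
    author_gender: Optional[str]  # None when the gender is unknown
    language_type: str            # e.g. "British English"
    source_website: str
    year: int
    date_of_collection: str
    domain: str
    subdomain: Optional[str]
    tokens: int
    sentences: int


def _required_text(parent: ET.Element, tag: str) -> str:
    """Return the stripped text of a child element, or fail if it is missing."""
    value = parent.findtext(tag)
    if value is None or not value.strip():
        raise ValueError(f"missing required field: {tag}")
    return value.strip()


def parse_metadata(xml_text: str) -> DocumentMetadata:
    """Read the (hypothetical) <meta> block of one document."""
    root = ET.fromstring(xml_text.strip())
    meta = root.find("meta")
    if meta is None:
        raise ValueError("document has no <meta> element")

    author_el = meta.find("author")
    domain_el = meta.find("domain")
    stats_el = meta.find("stats")
    if domain_el is None or stats_el is None:
        raise ValueError("document lacks <domain> or <stats>")

    # An empty or absent <author> element is treated as "author unknown".
    author = (author_el.text or "").strip() if author_el is not None else ""
    gender = author_el.get("gender") if author_el is not None else None

    return DocumentMetadata(
        title=_required_text(meta, "title"),
        author=author or None,
        author_gender=gender,
        language_type=_required_text(meta, "language_type"),
        source_website=_required_text(meta, "source_website"),
        year=int(_required_text(meta, "year")),
        date_of_collection=_required_text(meta, "collected"),
        domain=(domain_el.text or "").strip(),
        subdomain=domain_el.get("subdomain"),
        tokens=int(stats_el.get("tokens", "0")),
        sentences=int(stats_el.get("sentences", "0")),
    )


record = parse_metadata(SAMPLE_DOC)
print(record.language_type, record.domain, record.subdomain)
```

Modelling the author and author gender as optional values mirrors the "if known" qualifiers in the list: a missing value is recorded as absent rather than guessed.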

References

  1. "The Oxford English Corpus". Sketch Engine. Lexical Computing CZ s.r.o. 6 June 2015. Retrieved 27 October 2016.
  2. "The Oxford English Corpus". Oxford Dictionaries Online. Oxford University Press. Archived from the original on 1 January 2012. Retrieved 8 November 2014.
  3. "Compare COCA". Corpus of Contemporary American English. Archived from the original on 7 November 2014. Retrieved 8 November 2014.
  4. The Oxford English Corpus. Retrieved 4 February 2014.
  5. "Dictionary database has billion words". Northwest Herald. 27 April 2006. p. 2. Retrieved 15 March 2020 – via Newspapers.com.
