
spaCy

From Wikipedia, the free encyclopedia
Software library for natural language processing
For the 1981 film, see Spacy (film).
Not to be confused with Scapy.
spaCy
Original author(s): Matthew Honnibal
Developer(s): Explosion AI, various
Initial release: February 2015[1]
Stable release: 3.8.4[2] / 14 January 2025
Repository:
Written in: Python, Cython
Operating system: Linux, Windows, macOS, OS X
Platform: Cross-platform
Type: Natural language processing
License: MIT License
Website: spacy.io

spaCy (/speɪˈsiː/ spay-SEE) is an open-source software library for advanced natural language processing, written in the programming languages Python and Cython.[3][4] The library is published under the MIT license and its main developers are Matthew Honnibal and Ines Montani, the founders of the software company Explosion.

Unlike NLTK, which is widely used for teaching and research, spaCy focuses on providing software for production usage.[5][6] spaCy also supports deep learning workflows that allow connecting statistical models trained by popular machine learning libraries like TensorFlow, PyTorch or MXNet through its own machine learning library, Thinc.[7][8] Using Thinc as its backend, spaCy features convolutional neural network models for part-of-speech tagging, dependency parsing, text categorization and named entity recognition (NER). Prebuilt statistical neural network models to perform these tasks are available for 23 languages, including English, Portuguese, Spanish, Russian and Chinese, and there is also a multi-language NER model. Additional support for tokenization for more than 65 languages allows users to train custom models on their own datasets as well.[9]
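
The tokenization support mentioned above can be tried without downloading any trained model. The following is a minimal sketch, assuming spaCy 3.x is installed; the example sentence is made up for illustration. A blank pipeline contains only the tokenizer, so part-of-speech tags, dependency labels and named entities are not predicted here. Those require a trained pipeline package, loaded with `spacy.load`.

```python
import spacy

# A blank English pipeline holds only the rule-based tokenizer,
# so no trained model package has to be downloaded.
nlp = spacy.blank("en")

# Illustrative sentence; any text can be passed to the pipeline.
doc = nlp("spaCy is written in Python and Cython.")

# Each Token keeps its original text; trailing punctuation is split off.
tokens = [token.text for token in doc]
print(tokens)

# No statistical components are present in a blank pipeline.
print(nlp.pipe_names)
```

Swapping `spacy.blank("en")` for a trained pipeline would add components such as a tagger, a parser and an entity recognizer to `nlp.pipe_names`.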

History

  • Version 1.0 was released on October 19, 2016, and included preliminary support for deep learning workflows by supporting custom processing pipelines.[10] It further included a rule matcher that supported entity annotations, and an officially documented training API.
  • Version 2.0 was released on November 7, 2017, and introduced convolutional neural network models for 7 different languages.[11] It also supported custom processing pipeline components and extension attributes, and featured a built-in trainable text classification component.
  • Version 3.0 was released on February 1, 2021, and introduced state-of-the-art transformer-based pipelines.[12] It also introduced a new configuration system and training workflow, as well as type hints and project templates. This version dropped support for Python 2.
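
Several features named in this history (the rule matcher, custom pipeline components and extension attributes) can be combined in a few lines. The sketch below assumes the spaCy 3.x API, which differs from the 1.0 and 2.0 releases in which these features first appeared. The component name `count_library_mentions`, the attribute `library_mentions`, the pattern label `LIBRARY` and the example sentence are hypothetical names chosen for illustration only.

```python
import spacy
from spacy.language import Language
from spacy.matcher import Matcher
from spacy.tokens import Doc

nlp = spacy.blank("en")

# Rule matcher: one single-token pattern matching either library name,
# compared case-insensitively through the LOWER token attribute.
matcher = Matcher(nlp.vocab)
matcher.add("LIBRARY", [[{"LOWER": {"IN": ["spacy", "nltk"]}}]])

# Extension attribute (hypothetical name), reachable as doc._.library_mentions.
Doc.set_extension("library_mentions", default=0, force=True)


@Language.component("count_library_mentions")
def count_library_mentions(doc):
    """Custom pipeline component: store the number of matches on the Doc."""
    doc._.library_mentions = len(matcher(doc))
    return doc


# Register the component in the processing pipeline by its name.
nlp.add_pipe("count_library_mentions")

doc = nlp("Unlike NLTK, spaCy focuses on production usage.")
print(nlp.pipe_names, doc._.library_mentions)
```

Because the component receives and returns a `Doc`, it can sit anywhere in a pipeline alongside trained components.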

Main features


Extensions and visualizers

Dependency parse tree visualization generated with the displaCy visualizer

spaCy comes with several extensions and visualizations that are available as free, open-source libraries:

References

  1. "Introducing spaCy". explosion.ai. Retrieved 2016-12-18.
  2. "Release 3.8.4". 14 January 2025. Retrieved 29 January 2025.
  3. Choi et al. (2015). It Depends: Dependency Parser Comparison Using A Web-based Evaluation Tool.
  4. "Google's new artificial intelligence can't understand these sentences. Can you?". Washington Post. Retrieved 2016-12-18.
  5. "Facts & Figures - spaCy". spacy.io. Retrieved 2020-04-04.
  6. Bird, Steven; Klein, Ewan; Loper, Edward; Baldridge, Jason (2008). "Multidisciplinary instruction with the Natural Language Toolkit" (PDF). Proceedings of the Third Workshop on Issues in Teaching Computational Linguistics, ACL: 62. doi:10.3115/1627306.1627317. ISBN 9781932432145. S2CID 16932735.
  7. "PyTorch, TensorFlow & MXNet". thinc.ai. Retrieved 2020-04-04.
  8. "explosion/thinc". GitHub. Retrieved 2016-12-30.
  9. "Models & Languages | spaCy Usage Documentation". spacy.io. Retrieved 2020-03-10.
  10. "explosion/spaCy". GitHub. Retrieved 2021-02-08.
  11. "explosion/spaCy". GitHub. Retrieved 2021-02-08.
  12. "explosion/spaCy". GitHub. Retrieved 2021-02-08.
  13. "Models & Languages - spaCy". spacy.io. Retrieved 2021-02-08.
  14. "Models & Languages | spaCy Usage Documentation". spacy.io. Retrieved 2021-02-08.
  15. "Benchmarks | spaCy Usage Documentation". spacy.io. Retrieved 2021-02-08.
  16. Trask et al. (2015). sense2vec - A Fast and Accurate Method for Word Sense Disambiguation In Neural Word Embeddings.
