Language | Treebank | Syntactic Formalism | Distribution / License |
---|---|---|---|
Abaza | Universal Dependencies, ATB | Dependency | CC BY-SA |
Afrikaans | Universal Dependencies, AfriBooms | Dependency | CC BY-SA |
Akkadian | Universal Dependencies, PISANDUB | Dependency | CC BY-SA |
Albanian | Universal Dependencies, TSA | Dependency | CC BY-SA |
Amharic | Universal Dependencies, ATT | Dependency | CC BY-SA |
Ancient Greek | Universal Dependencies, Perseus | Dependency | CC BY-NC-SA |
Ancient Greek | Universal Dependencies, PROIEL | Dependency | CC BY-NC-SA |
Greek (ancient) | Ancient Greek Dependency Treebank[7][8] | Dependency | Open source (Creative Commons license) |
Greek (ancient) | PROIEL Treebank[9] | Dependency | Open source (Creative Commons license) |
Arabic | Columbia Arabic Treebank (CATiB) | Dependency | Linguistic Data Consortium |
Arabic | Prague Arabic Dependency Treebank (PADT) | Dependency | Linguistic Data Consortium |
Arabic | Universal Dependencies, NYUAD | Dependency | CC BY-SA |
Arabic | Universal Dependencies, PADT | Dependency | CC BY-NC-SA |
Arabic | Universal Dependencies, PUD | Dependency | CC BY-SA |
Arabic | Penn Arabic Treebank | Phrase structure | Linguistic Data Consortium |
Armenian | Universal Dependencies, ArmTDP | Dependency | CC BY-SA |
Assyrian (Neo-Aramaic) | Universal Dependencies, AS | Dependency | CC BY-SA |
Bambara | Universal Dependencies, CRB | Dependency | CC BY-SA |
Basque | Universal Dependencies, BDT | Dependency | CC BY-NC-SA |
Belarusian | Universal Dependencies, HSE | Dependency | CC BY-SA |
Bhojpuri | Universal Dependencies, BhEn | Dependency | CC BY-SA |
Bhojpuri | Universal Dependencies, BHTB | Dependency | CC BY-SA |
Breton | Universal Dependencies, KEB | Dependency | CC BY-SA |
Bulgarian | Universal Dependencies, BTB | Dependency | CC BY-NC-SA |
Bulgarian | BulTreeBank | HPSG | Freely available for research |
Buryat | Universal Dependencies, BDT | Dependency | CC BY-SA |
Cantonese | Universal Dependencies, HK | Dependency | CC BY-SA |
Catalan | Cat3LB | Phrase structure | Freely available for research |
Catalan | Universal Dependencies, AnCora | Dependency | GPL |
Chinese | Sinica Treebank | Case grammar | Not freely available |
Chinese | Universal Dependencies, CFL | Dependency | CC BY-SA |
Chinese | Universal Dependencies, GSD | Dependency | CC BY-SA |
Chinese | Universal Dependencies, GSDSimp | Dependency | CC BY-SA |
Chinese | Universal Dependencies, HK | Dependency | CC BY-SA |
Chinese | Universal Dependencies, PUD | Dependency | CC BY-SA |
Chinese | Penn Chinese Treebank | Phrase structure | Linguistic Data Consortium |
Chinese | Chinese Dependency Treebank | Dependency | Linguistic Data Consortium |
Arabic (classical) | Quranic Arabic Dependency Treebank (QADT) (Quranic Arabic Corpus) | Dependency | Open source (GNU General Public License) |
Classical Armenian | PROIEL Treebank[9] | Dependency | Open source (Creative Commons license) |
Coptic | Universal Dependencies, Coptic Scriptorium | Dependency | CC BY |
Croatian | Croatian Dependency Treebank | Dependency | Open source (Creative Commons license) |
Croatian | Universal Dependencies, SET | Dependency | CC BY-SA |
Czech | Prague Dependency Treebank | Dependency | Open source (Creative Commons license) |
Czech | Universal Dependencies, CAC | Dependency | CC BY-SA |
Czech | Universal Dependencies, CLTT | Dependency | CC BY-SA |
Czech | Universal Dependencies, FicTree | Dependency | CC BY-NC-SA |
Czech | Universal Dependencies, PDT | Dependency | CC BY-NC-SA |
Czech | Universal Dependencies, PUD | Dependency | CC BY-SA |
Danish | Danish Dependency Treebank | Dependency | Open source (GNU General Public License) |
Danish | Arboretum: A syntactic tree corpus of Danish | Phrase structure | License fee |
Danish | Universal Dependencies, DDT | Dependency | CC BY-SA |
Danish | Universal Dependencies, DTB | Dependency | CC BY-SA |
Dutch | Spoken Dutch Corpus (CGN) | Phrase structure | License fee |
Dutch | Universal Dependencies, Alpino | Dependency | CC BY-SA |
Dutch | Universal Dependencies, LassySmall | Dependency | CC BY-SA |
Dutch | LASSY Small and Large | Dependency | License fee |
Dutch | Alpino Treebank | Dependency | Open source (GNU General Public License) |
Egyptian | Universal Dependencies, UJaen | Dependency | CC BY-SA |
English | CCGbank | Combinatory categorial grammar | Linguistic Data Consortium |
English | LinGO Redwoods | HPSG | ? |
English | Lancaster Parsed Corpus | Phrase structure | ? |
English | Prague English Dependency Treebank | Dependency | Linguistic Data Consortium |
English | Universal Dependencies, BhEn | Dependency | CC BY-SA |
English | Universal Dependencies, ESL | Dependency | CC BY-SA |
English | Universal Dependencies, EWT | Dependency | CC BY-SA |
English | Universal Dependencies, GUM | Dependency | CC BY-NC-SA |
English | Universal Dependencies, GUMReddit | Dependency | CC BY |
English | Universal Dependencies, LinES | Dependency | CC BY-NC-SA |
English | Universal Dependencies, ParTUT | Dependency | CC BY-NC-SA |
English | Universal Dependencies, Pronouns | Dependency | CC BY-SA |
English | Universal Dependencies, PUD | Dependency | CC BY-SA |
English | Treebank Semantics Parsed Corpus | Phrase structure | Open source (Creative Commons license) |
English | Christine Corpus | Phrase structure | Freely available for research |
English | Lucy Corpus | Phrase structure | Freely available for research |
English | Susanne Corpus | Phrase structure | Freely available for research |
English | BLLIP WSJ corpus | Phrase structure | Linguistic Data Consortium |
English | Tübingen Treebank of English / Spontaneous Speech (TüBa-E/S) | HPSG | Freely available for research |
English | Diachronic Corpus of Present-Day Spoken English (DCPSE) | Phrase structure | License fee |
English | British Component of the International Corpus of English (ICE-GB) | Phrase structure | License fee |
English | The PARC 700 Dependency Bank | Dependency | ? |
English | Yahoo Query Treebank | Dependency | Freely available for research |
English | Penn Treebank | Phrase structure | Linguistic Data Consortium |
English | Multi-Treebank | Phrase structure | Available online for comparison purposes |
English | CHILDES Brown Eve corpus with dependency annotation | Dependency | Open source (Creative Commons license) |
English | SMULTRON - Parallel Treebank EN-DE-SV | Phrase structure | Freely available for research |
Erzya | Universal Dependencies, JR | Dependency | CC BY-SA |
Estonian | Arborest | Phrase structure | ? |
Estonian | Syntactically analyzed and disambiguated text corpus | Dependency | Freely available for research |
Estonian | Universal Dependencies, EDT | Dependency | CC BY-NC-SA |
Estonian | Universal Dependencies, EWT | Dependency | CC BY-NC-SA |
Faroese | Universal Dependencies, FarPaHC | Dependency | CC BY-SA |
Faroese | Universal Dependencies, OFT | Dependency | CC BY-SA |
Finnish | Turku Dependency Treebank (TDT) | Dependency | Open source (Creative Commons license) |
Finnish | Universal Dependencies, FTB | Dependency | CC BY |
Finnish | Universal Dependencies, PUD | Dependency | CC BY-SA |
Finnish | Universal Dependencies, TDT | Dependency | CC BY-SA |
French (spoken) | Rhapsodie | Dependency and macrosyntactic annotation | Open source (Creative Commons license) |
French | L'Arboratoire | Phrase structure | ? |
French | Universal Dependencies, CrapBank | Dependency | CC BY-SA |
French | Universal Dependencies, FQB | Dependency | GPL |
French | Universal Dependencies, FTB | Dependency | GPL |
French | Universal Dependencies, GSD | Dependency | CC BY-SA |
French | Universal Dependencies, ParTUT | Dependency | CC BY-NC-SA |
French | Universal Dependencies, PUD | Dependency | CC BY-SA |
French | Universal Dependencies, Sequoia | Dependency | GPL |
French | Universal Dependencies, Spoken | Dependency | CC BY-SA |
French | French Treebank | Phrase structure | Freely available for research |
French | Free French Treebank | Phrase structure | Open Source license LGPL-LR |
French | Sequoia Treebank | Phrase structure and dependency | Open Source license LGPL-LR |
Galician | Universal Dependencies, CTG | Dependency | CC BY-NC-SA |
Galician | Universal Dependencies, TreeGal | Dependency | GPL |
German | Hamburg Dependency Treebank (HDT) | Dependency | Freely available for research |
German | Universal Dependencies, GSD | Dependency | CC BY-SA |
German | Universal Dependencies, LIT | Dependency | CC BY-NC-SA |
German | Universal Dependencies, PUD | Dependency | CC BY-SA |
German | SMULTRON - Parallel Treebank EN-DE-SV | Phrase structure | Freely available for research |
German | NEGRA | Phrase structure | Freely available for research |
German | TIGER | Phrase structure | Freely available for research |
German | Tübingen Treebank of German / Spontaneous Speech (TüBa-D/S) | Phrase structure | Freely available for research |
German | Tübingen Treebank of Written German (TüBa-D/Z) | Phrase structure | Freely available for research |
German | Tübingen Partially Parsed Corpus of Written German (TüPP-D/Z) | Phrase structure | License fee |
Gothic | PROIEL Treebank[9] | Dependency | Open source (Creative Commons license) |
Gothic | Universal Dependencies, PROIEL | Dependency | CC BY-NC-SA |
Greek | Greek Dependency Treebank | Dependency | Not freely available |
Greek | Universal Dependencies, GDT | Dependency | CC BY-NC-SA |
Hebrew | Universal Dependencies, HTB | Dependency | CC BY-NC-SA |
Hebrew | Hebrew Dependency Treebank | Dependency | Open source (GNU General Public License) |
Hindi English | Universal Dependencies, HIENCS | Dependency | CC BY-SA |
Hindi | Universal Dependencies, HDTB | Dependency | CC BY-NC-SA |
Hindi | Universal Dependencies, PUD | Dependency | CC BY-SA |
Hindi | AnnCorra | Dependency | ? |
English (historical) | Penn Parsed Corpora of Historical English | Phrase structure | Linguistic Data Consortium (as of April 2020) |
English (historical) | York-Toronto-Helsinki Parsed Corpus of Old English Prose (YCOE) | Phrase structure | Freely available for research |
French (historical) | Corpus MCVF | Phrase structure | Freely available for research |
Portuguese (historical) | Tycho Brahe corpus | Phrase structure | ? |
Hungarian | Universal Dependencies, Szeged | Dependency | CC BY-NC-SA |
Hungarian | Hungarian Treebank | Phrase structure | ? |
Icelandic | IcePaHC - Icelandic Parsed Historical Corpus | Phrase structure | Open source (GNU Lesser General Public License) |
Icelandic | Universal Dependencies, IcePaHC | Dependency | CC BY-SA |
Icelandic | Universal Dependencies, PUD | Dependency | CC BY-SA |
Indonesian | Universal Dependencies, GSD | Dependency | CC BY-SA |
Indonesian | Universal Dependencies, PUD | Dependency | CC BY-SA |
Indonesian | ICON | Phrase structure | ? |
Irish | Universal Dependencies, IDT | Dependency | CC BY-SA |
Italian | ISST - Italian Syntactic-Semantic Treebank | Phrase structure and dependency | License fee |
Italian | MIDT (Merged Italian Dependency Treebank) resulting from the merging and harmonization of the TUT and ISST-CoNLL/TANL treebanks | Dependency | Freely available for research |
Italian | VIT - Venice Italian Treebank | Phrase structure and dependency | License fee |
Italian | Universal Dependencies, ISDT | Dependency | CC BY-NC-SA |
Italian | Universal Dependencies, ParTUT | Dependency | CC BY-NC-SA |
Italian | Universal Dependencies, PoSTWITA | Dependency | CC BY-NC-SA |
Italian | Universal Dependencies, PUD | Dependency | CC BY-SA |
Italian | Universal Dependencies, TWITTIRO | Dependency | CC BY-SA |
Italian | Universal Dependencies, VIT | Dependency | CC BY-NC-SA |
Italian | Italian Syntactic-Semantic Treebank for the CoNLL-2007 Shared Task (ISST-CoNLL) | Dependency | Freely available for research |
Italian | SUT - Siena University Treebank | ? | ? |
Italian | TUT - Turin University Treebank | Dependency | Open source (Creative Commons license) |
Italian | ISDT (Italian Stanford Dependency Treebank) | Dependency | Freely available for research |
Japanese | Kyoto Text Corpus | ? | ? |
Japanese | Universal Dependencies, BCCWJ | Dependency | CC BY-NC-SA |
Japanese | Universal Dependencies, GSD | Dependency | CC BY-SA |
Japanese | Universal Dependencies, KTC | Dependency | CC BY-SA |
Japanese | Universal Dependencies, Modern | Dependency | CC BY-NC-ND |
Japanese | Universal Dependencies, PUD | Dependency | CC BY-SA |
Japanese | Keyaki Treebank | Phrase structure | Open source (Creative Commons license) |
Japanese | Tübingen Treebank of Japanese / Spontaneous Speech (TüBa-J/S) | Phrase structure | Freely available for research |
Japanese | ATR Dependency corpus | Dependency | ? |
Karelian | Universal Dependencies, KKPP | Dependency | CC BY-SA |
Kazakh | Universal Dependencies, KTB | Dependency | CC BY-SA |
Komi Permyak | Universal Dependencies, UH | Dependency | CC BY-SA |
Komi Zyrian | Universal Dependencies, IKDP | Dependency | CC BY-SA |
Komi Zyrian | Universal Dependencies, Lattice | Dependency | CC BY-SA |
Korean | Universal Dependencies, GSD | Dependency | CC BY-SA |
Korean | Universal Dependencies, Kaist | Dependency | CC BY-SA |
Korean | Universal Dependencies, Penn | Dependency | CC BY-SA |
Korean | Universal Dependencies, PUD | Dependency | CC BY-SA |
Korean | Universal Dependencies, Sejong | Dependency | CC BY-SA |
Korean | Korean Treebank | Phrase structure | Linguistic Data Consortium |
Kurmanji | Universal Dependencies, MG | Dependency | CC BY-SA |
Latin | Universal Dependencies, ITTB | Dependency | CC BY-NC-SA |
Latin | Universal Dependencies, LLCT | Dependency | CC BY-SA |
Latin | Universal Dependencies, Perseus | Dependency | CC BY-NC-SA |
Latin | Universal Dependencies, PROIEL | Dependency | CC BY-NC-SA |
Latin | Index Thomisticus Treebank | Dependency | Open source (Creative Commons license) |
Latin | PROIEL Treebank[9] | Dependency | Open source (Creative Commons license) |
Latin | Latin Dependency Treebank[10] | Dependency | Open source (Creative Commons license) |
Latvian | Universal Dependencies, LVTB | Dependency | CC BY-SA |
Lithuanian | Universal Dependencies, ALKSNIS | Dependency | CC BY-SA |
Lithuanian | Universal Dependencies, HSE | Dependency | CC BY-SA |
Livvi | Universal Dependencies, KKPP | Dependency | CC BY-SA |
Magahi | Universal Dependencies, MGTB | Dependency | CC BY-SA |
Maltese | Universal Dependencies, MUDT | Dependency | CC BY-SA |
Marathi | Universal Dependencies, UFAL | Dependency | CC BY-SA |
Mbya Guarani | Universal Dependencies, Dooley | Dependency | CC BY-NC-SA |
Mbya Guarani | Universal Dependencies, Thomas | Dependency | CC BY-NC-SA |
Middle Irish | Universal Dependencies, CritMITB | Dependency | CC BY-SA |
Middle Irish | Universal Dependencies, DipMITB | Dependency | CC BY-SA |
Moksha | Universal Dependencies, JR | Dependency | CC BY-SA |
Naija | Universal Dependencies, NSC | Dependency | CC BY-SA |
North Sami | Universal Dependencies, Giella | Dependency | CC BY-SA |
Norwegian | INESS treebanking infrastructure | LFG | ? |
Norwegian | Universal Dependencies, Bokmaal | Dependency | CC BY-SA |
Norwegian | Universal Dependencies, Nynorsk | Dependency | CC BY-SA |
Norwegian | Universal Dependencies, NynorskLIA | Dependency | CC BY-SA |
Old Church Slavonic | Universal Dependencies, PROIEL | Dependency | CC BY-NC-SA |
Old Church Slavonic | TOROT Treebank[9] | Dependency | Open source (Creative Commons license) |
Old French | Universal Dependencies, SRCMF | Dependency | CC BY-NC-SA |
Old Russian | Universal Dependencies, RNC | Dependency | CC BY-SA |
Old Russian | Universal Dependencies, TOROT | Dependency | CC BY-NC-SA |
Old Russian | TOROT Treebank[9] | Dependency | Open source (Creative Commons license) |
Persian | Persian Dependency Treebank (PerDT) | Dependency | Freely available for research |
Persian | PerTreeBank | HPSG | Freely available for research |
Persian | Universal Dependencies, Seraji | Dependency | CC BY-SA |
Polish | A Treebank / Test Suite for Polish | HPSG | ? |
Polish | Universal Dependencies, LFG | Dependency | GPL |
Polish | Universal Dependencies, PDB | Dependency | CC BY-NC-SA |
Polish | Universal Dependencies, PUD | Dependency | CC BY-SA |
Polish | Składnica | Phrase structure and dependency | Open source (GNU General Public License) |
Portuguese | Universal Dependencies, Bosque | Dependency | CC BY-SA |
Portuguese | Universal Dependencies, GSD | Dependency | CC BY-SA |
Portuguese | Universal Dependencies, PUD | Dependency | CC BY-SA |
Portuguese | Projecto Floresta Sintá(c)tica | Dependency, phrase structure | Open source (GNU General Public License) |
Romanian | Romanian Dependency Treebank | Dependency | ? |
Romanian | Universal Dependencies, Nonstandard | Dependency | CC BY-SA |
Romanian | Universal Dependencies, RRT | Dependency | CC BY-SA |
Romanian | Universal Dependencies, SiMoNERo | Dependency | CC BY-SA |
Russian | Universal Dependencies, GSD | Dependency | CC BY-SA |
Russian | Universal Dependencies, PUD | Dependency | CC BY-SA |
Russian | Universal Dependencies, SynTagRus | Dependency | CC BY-NC-SA |
Russian | Universal Dependencies, Taiga | Dependency | CC BY-SA |
Russian | SynTagRus Dependency Treebank (Russian National Corpus) | Dependency | Freely available for research |
Sanskrit | Universal Dependencies, UFAL | Dependency | CC BY-SA |
Sanskrit | Universal Dependencies, Vedic | Dependency | CC BY-SA |
Scottish Gaelic | Universal Dependencies, ARCOSG | Dependency | CC BY-SA |
Serbian | Universal Dependencies, SET | Dependency | CC BY-SA |
Sindhi | Universal Dependencies, MazharDootio | Dependency | CC BY-SA |
Skolt Sami | Universal Dependencies, Giellagas | Dependency | CC BY-SA |
Slovak | Universal Dependencies, SNK | Dependency | CC BY-SA |
Slovene | Slovene Dependency Treebank | Dependency | Freely available for research |
Slovenian | Universal Dependencies, SSJ | Dependency | CC BY-NC-SA |
Slovenian | Universal Dependencies, SST | Dependency | CC BY-NC-SA |
Spanish | Cast3LB | Phrase structure and dependency | Freely available for research |
Spanish | Universal Dependencies, AnCora | Dependency | GPL |
Spanish | Universal Dependencies, GSD | Dependency | CC BY-SA |
Spanish | Universal Dependencies, PUD | Dependency | CC BY-SA |
Spanish | UAM Treebank of Spanish | Phrase structure | Freely available for research |
Swedish | Talbanken05 | Phrase structure and dependency | Freely available for research |
Swedish | Swedish Treebank | Phrase structure | Freely available for research |
Swedish | Universal Dependencies, LinES | Dependency | CC BY-NC-SA |
Swedish | Universal Dependencies, PUD | Dependency | CC BY-SA |
Swedish | Universal Dependencies, Talbanken | Dependency | CC BY-SA |
Swedish | SMULTRON - Parallel Treebank EN-DE-SV | Phrase structure | Freely available for research |
Swedish Sign Language | Universal Dependencies, SSLC | Dependency | CC BY-SA |
Swiss German | Universal Dependencies, UZH | Dependency | CC BY-SA |
Tagalog | Universal Dependencies, TRG | Dependency | CC BY-SA |
Tagalog | Universal Dependencies, Ugnayan | Dependency | CC BY-NC-SA |
Tamil | Universal Dependencies, TTB | Dependency | CC BY-NC-SA |
Telugu | Universal Dependencies, MTG | Dependency | CC BY-SA |
Thai | NAiST Thai Treebank | Dependency | Open source (GNU General Public License) |
Thai | Universal Dependencies, PUD | Dependency | CC BY-SA |
Thai | THTB | Phrase structure | CC BY 4.0 |
Turkish | METU-Sabanci Turkish Treebank | Dependency | Freely available for research |
Turkish | Universal Dependencies, BOUN | Dependency | CC BY-SA |
Turkish | Universal Dependencies, GB | Dependency | CC BY-SA |
Turkish | Universal Dependencies, IMST | Dependency | CC BY-NC-SA |
Turkish | Universal Dependencies, PUD | Dependency | CC BY-SA |
Ukrainian | Institute for Ukrainian, NGO Gold Standard | Dependency | Open source (Creative Commons license) |
Ukrainian | Universal Dependencies, IU | Dependency | CC BY-NC-SA |
Upper Sorbian | Universal Dependencies, UFAL | Dependency | CC BY-SA |
Urdu | NU-FAST Treebank | Phrase structure | Contact at Computational Learning Strategies & Practices |
Urdu | The URDU.KON-TB Treebank | Phrase and Hyper Dependency Structure | Contact at Computational Learning Strategies & Practices |
Urdu | Universal Dependencies, UDTB | Dependency | CC BY-NC-SA |
Uyghur | Universal Dependencies, UDT | Dependency | CC BY-SA |
Vietnamese | Universal Dependencies, VTB | Dependency | CC BY-SA |
Vietnamese | Vietnamese Treebank | Phrase structure | Freely available for research |
Vietnamese | Vietnamese Dependency Treebank | Dependency | Freely available for research |
Warlpiri | Universal Dependencies, UFAL | Dependency | CC BY-SA |
Welsh | Universal Dependencies, CCG | Dependency | CC BY-SA |
Wolof | Universal Dependencies, WTB | Dependency | CC BY-SA |
Yoruba | Universal Dependencies, YTB | Dependency | CC BY-SA |