Compilation of concepts primarily for the purposes of lexicostatistics
A Swadesh list (/ˈswɑːdɛʃ/) is a compilation of tentatively universal concepts for the purposes of lexicostatistics. That is, a Swadesh list is a list of forms and concepts which all languages, without exception, have terms for, such as star, hand, water, kill, sleep, and so forth. The number of such terms is small: a few hundred at most, or possibly fewer than a hundred. The inclusion or exclusion of many terms is subject to debate among linguists; thus, there are several different lists, and some authors refer to "Swadesh lists" in the plural. The Swadesh list is named after the linguist Morris Swadesh.
Translations of a Swadesh list into a set of languages allow researchers to quantify the interrelatedness of those languages. Swadesh lists are used in lexicostatistics (the quantitative assessment of the genealogical relatedness of languages) and glottochronology (the dating of language divergence). For instance, the terms on a Swadesh list can be compared between two languages (since both languages will have them) to see whether they are related and how closely, which yields information that can be applied to further comparison of the languages. (Actual lexicostatistics is quite complicated, and usually sets of languages are compared.)
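The pairwise comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not a method taken from any of the cited works: it assumes that a linguist has already assigned a cognate-class label to each concept in each language, and the function name, the labels, and the two sample word lists are all hypothetical.

```python
def shared_cognate_fraction(judgments_a, judgments_b):
    """Return the fraction of concepts present in both lists whose
    cognate-class labels match (labels are assumed to be hand-assigned)."""
    # Only concepts attested in both lists can be compared.
    shared = [concept for concept in judgments_a if concept in judgments_b]
    if not shared:
        raise ValueError("the two lists have no concepts in common")
    matches = sum(1 for concept in shared
                  if judgments_a[concept] == judgments_b[concept])
    return matches / len(shared)


# Hypothetical cognate-class labels for four Swadesh concepts.
lang_a = {"hand": "H1", "water": "W1", "star": "S1", "sleep": "SL1"}
lang_b = {"hand": "H1", "water": "W1", "star": "S2", "sleep": "SL1"}

fraction = shared_cognate_fraction(lang_a, lang_b)
print(fraction)  # 0.75 (three of the four shared concepts match)
```

The hard part in practice is producing the cognate judgments themselves; the arithmetic that follows is simple by comparison.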
Morris Swadesh created several versions of his list. He started[1] with a list of 215 meanings (falsely introduced as a list of 225 meanings in the paper due to a spelling error[2]), which he reduced to 165 words for the Salish-Spokane-Kalispel language. In 1952, he published a list of 215 meanings,[3] of which he suggested the removal of 16 for being unclear or not universal, with one added to arrive at 200 words. In 1955,[4] he wrote, "The only solution appears to be a drastic weeding out of the list, in the realization that quality is at least as important as quantity. Even the new list has defects, but they are relatively mild and few in number." After minor corrections, the final 100-word list was published posthumously in 1971[5] and 1972.
Other versions of lexicostatistical test lists were published by, among others, Robert Lees (1953), John A. Rea (1958:145f), Dell Hymes (1960:6), E. Cross (1964, with 241 concepts), W. J. Samarin (1967:220f), D. Wilson (1969, with 57 meanings), Lionel Bender (1969), R. L. Oswald (1971), Winfred P. Lehmann (1984:35f), D. Ringe (1992, passim, different versions), Sergei Starostin (1984, passim, different versions), William S-Y. Wang (1994), M. Lohr (2000, 128 meanings in 18 languages), and B. Kessler (2002). The Concepticon,[6] a project hosted at the Cross-Linguistic Linked Data (CLLD) project, collects various concept lists (including classical Swadesh lists) across different linguistic areas and times, currently listing 240 different concept lists.[7]
A frequently used version, widely available on the internet, is the one by Isidore Dyen (1992, 200 meanings of 95 language variants). Since 2010, a team around Michael Dunn has tried to update and enhance that list.[8]
Originally, the words in the Swadesh lists were chosen for their universal, culturally independent availability in as many languages as possible, regardless of their stability (how resistant a word is to change; all words change over time to a greater or lesser extent, and the change can include borrowing from another language).
However, stability may be important. The stability of terms on a Swadesh list under language change, and the potential use of this fact for purposes of glottochronology (the study of how languages develop and branch apart over time), have been analyzed by numerous authors, including Marisa Lohr (1999, 2000).[9]
The Swadesh list was put together by Morris Swadesh on the basis of his intuition. Similar, more recent lists, such as the Dolgopolsky list (1964) or the Leipzig–Jakarta list (2009), are based on systematic data from many different languages, but they are not yet as widely known or as widely used as the Swadesh list.
Lexicostatistical test lists are used in lexicostatistics to define subgroupings of languages, and in glottochronology to "provide dates for branching points in the tree."[10] The task of identifying (and counting) the cognate words in the list is far from trivial and is often subject to dispute, because cognates do not necessarily look similar, and recognition of cognates presupposes knowledge of the sound laws of the respective languages.
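The dating step can be illustrated with the classic glottochronological formula, t = ln(c) / (2 ln(r)), where c is the fraction of shared cognates and r is the assumed retention rate per millennium. The formula is not quoted in the text above, and the sketch below is only an illustration of it: the default rate of 0.86 is the figure commonly quoted for the 100-item list and should be treated as an assumption, and the function name is hypothetical.

```python
import math


def divergence_time_millennia(shared_fraction, retention_rate=0.86):
    """Estimate the time since two languages split, in millennia,
    using t = ln(c) / (2 * ln(r)).

    shared_fraction: fraction of list items judged cognate (0 < c <= 1).
    retention_rate:  assumed fraction of the list retained per millennium
                     (0 < r < 1); 0.86 is an assumed default, not a constant.
    """
    if not 0.0 < shared_fraction <= 1.0:
        raise ValueError("shared_fraction must be in (0, 1]")
    if not 0.0 < retention_rate < 1.0:
        raise ValueError("retention_rate must be in (0, 1)")
    # The factor 2 reflects that both lineages change independently.
    return math.log(shared_fraction) / (2.0 * math.log(retention_rate))


# Sanity check: if each language kept 86% per millennium, two languages
# sharing 0.86 ** 2 of the list diverged roughly one millennium ago.
estimate = divergence_time_millennia(0.86 ** 2)
print(round(estimate, 6))
```

The result is only as reliable as the cognate judgments and the assumption of a constant retention rate, both of which are disputed in the literature.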
Swadesh's final list, published in 1971,[5] contains 100 terms. Explanations of the terms can be found in Swadesh 1952[3] or, where noted by a dagger (†), in Swadesh 1955. Note that only the original sequence makes the intended meaning clear; it is lost in alphabetical order, e.g., in the case of "27. bark" (originally without the specification added here).
"Claw" was only added in 1955, but many well-known specialists have since replaced it with (finger)nail, because expressions for "claw" are not available in many old, extinct, or lesser-known languages.
The 110-item Global Lexicostatistical Database list uses the original 100-item Swadesh list, in addition to 10 other words from the Swadesh–Yakhontov list.[11]
The most used list nowadays is the Swadesh 207-word list, adapted from Swadesh 1952.[3]
In Wiktionary ("Swadesh lists by language"), Panlex[12][13] and in Palisto's "Swadesh Word List of Indo-European languages",[14] hundreds of Swadesh lists in this form can be found.
The Swadesh–Yakhontov list is a 35-word subset of the Swadesh list posited as especially stable by Russian linguist Sergei Yakhontov around the 1960s, although the list was only officially published in 1991.[15] It has been used in lexicostatistics by linguists such as Sergei Starostin. With their Swadesh numbers, they are:[16]
I
you (singular)
this
who
what
one
two
fish
dog
louse
blood
bone
egg
horn
tail
ear
eye
nose
tooth
tongue
hand
know
die
give
sun
moon
water
salt
stone
wind
fire
year
full
new
name
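Because the Swadesh–Yakhontov list is a subset, a comparison can be restricted to it by filtering an existing word list. The following is a minimal sketch: the 35 concepts are copied from the list above, while the helper name and the sample data are hypothetical.

```python
# The 35 concepts of the Swadesh–Yakhontov list, as given above.
YAKHONTOV_35 = (
    "I", "you (singular)", "this", "who", "what", "one", "two",
    "fish", "dog", "louse", "blood", "bone", "egg", "horn", "tail",
    "ear", "eye", "nose", "tooth", "tongue", "hand", "know", "die",
    "give", "sun", "moon", "water", "salt", "stone", "wind", "fire",
    "year", "full", "new", "name",
)


def restrict_to_subset(wordlist, concepts=YAKHONTOV_35):
    """Keep only the entries of a concept-keyed word list that belong
    to the given subset, in the subset's own order."""
    return {c: wordlist[c] for c in concepts if c in wordlist}


# Hypothetical word list keyed by concept; "mountain" is not in the subset.
sample = {"fish": "F1", "dog": "D1", "mountain": "M1"}
subset = restrict_to_subset(sample)
print(subset)  # {'fish': 'F1', 'dog': 'D1'}
```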
Holman et al. (2008) found that in identifying the relationships between Chinese dialects the Swadesh–Yakhontov list was less accurate than the original Swadesh-100 list. Further, they found that a different (40-word) list (also known as the ASJP list) was just as accurate as the Swadesh-100 list. However, they calculated the relative stability of the words by comparing retentions between languages in established language families. They found no statistically significant difference in the correlations in the families of the Old versus the New World.
The ranked Swadesh-100 list, with Swadesh numbers and relative stability, is as follows (Holman et al., Appendix; asterisked words appear on the 40-word list):
In studying the sign languages of Vietnam and Thailand, linguist James Woodward noted that the traditional Swadesh list applied to spoken languages was unsuited for sign languages. The Swadesh list results in overestimation of the relationships between sign languages, due to indexical signs such as pronouns and parts of the body. The modified list is as follows, in mostly alphabetical order:[17]
Dolgopolsky list — the 15 words that change least as languages evolve
Leipzig–Jakarta list — 100 words resistant to borrowing, used to estimate chronological separation of languages, intended to improve on the Swadesh list
Holle lists — about 1500 words in more than 250 languages of Indonesia
Intercontinental Dictionary Series — a database of vocabulary lists in over 200 languages, especially indigenous South American and Northeast Caucasian
Linguistic concepts and fields
Cognate — a word derived from the same word as another
List, J.-M., M. Cysouw, and R. Forkel (2016). "Concepticon. A resource for the linking of concept lists." In: Proceedings of the Tenth International Conference on Language Resources and Evaluation, 2393–2400.
Marisa Lohr (2000), "New Approaches to Lexicostatistics and Glottochronology" in C. Renfrew, A. McMahon and L. Trask, ed., Time Depth in Historical Linguistics, Vol. 1, pp. 209–223.
Sheila Embleton (1992), in W. Bright, ed., International Encyclopaedia of Linguistics, Oxford University Press, p. 131.
Campbell, Lyle. (1998). Historical Linguistics: An Introduction. Edinburgh: Edinburgh University Press. ISBN 0-262-53267-0.
Embleton, Sheila (1995). Review of An Indo-European Classification: A Lexicostatistical Experiment by Isidore Dyen, J. B. Kruskal and P. Black. TAPS Monograph 82–5, Philadelphia. In Diachronica, Vol. 12, no. 2, 263–68.
Gudschinsky, Sarah. (1956). "The ABCs of Lexicostatistics (Glottochronology)." Word, Vol. 12, 175–210.
Hoijer, Harry. (1956). "Lexicostatistics: A Critique." Language, Vol. 32, 49–60.
Holm, Hans J. (2007). "The New Arboretum of Indo-European 'Trees': Can New Algorithms Reveal the Phylogeny and Even Prehistory of Indo-European?" Journal of Quantitative Linguistics, Vol. 14, 167–214.
Holman, Eric W., Søren Wichmann, Cecil H. Brown, Viveka Velupillai, André Müller, Dik Bakker (2008). "Explorations in Automated Language Classification." Folia Linguistica, Vol. 42, no. 2, 331–354.
Sankoff, David (1970). "On the Rate of Replacement of Word-Meaning Relationships." Language, Vol. 46, 564–569.
Starostin, Sergei (1991). Altajskaja Problema i Proisxozhdenie Japonskogo Jazyka [The Altaic Problem and the Origin of the Japanese Language]. Moscow: Nauka.
Swadesh, Morris. (1950). "Salish Internal Relationships." International Journal of American Linguistics, Vol. 16, 157–167.
Swadesh, Morris. (1952). "Lexicostatistic Dating of Prehistoric Ethnic Contacts." Proceedings of the American Philosophical Society, Vol. 96, 452–463.
Swadesh, Morris. (1971). The Origin and Diversification of Language. Ed. post mortem by Joel Sherzer. Chicago: Aldine. ISBN 0-202-01001-5. Contains final 100-word list on p. 283.
Swadesh, Morris, et al. (1972). "What is Glottochronology?" in Morris Swadesh and Joel Sherzer, ed., The Origin and Diversification of Language, pp. 271–284. London: Routledge & Kegan Paul. ISBN 0-202-30841-3.
Wittmann, Henri (1973). "The Lexicostatistical Classification of the French-Based Creole Languages." Lexicostatistics in Genetic Linguistics: Proceedings of the Yale Conference, April 3–4, 1971, dir. Isidore Dyen, 89–99. La Haye: Mouton.