
In linguistics, the comparative method is a technique for studying the development of languages by performing a feature-by-feature comparison of two or more languages with common descent from a shared ancestor and then extrapolating backwards to infer the properties of that ancestor. The comparative method may be contrasted with the method of internal reconstruction, in which the internal development of a single language is inferred by the analysis of features within that language.[1] Ordinarily, both methods are used together to reconstruct prehistoric phases of languages; to fill in gaps in the historical record of a language; to discover the development of phonological, morphological and other linguistic systems; and to confirm or to refute hypothesised relationships between languages.
The comparative method emerged in the early 19th century with the birth of Indo-European studies, then took a definite scientific approach with the works of the Neogrammarians in the late 19th–early 20th century.[2] Key contributions were made by the Danish scholars Rasmus Rask (1787–1832) and Karl Verner (1846–1896), and the German scholar Jacob Grimm (1785–1863). The first linguist to offer reconstructed forms from a proto-language was August Schleicher (1821–1868) in his Compendium der vergleichenden Grammatik der indogermanischen Sprachen, originally published in 1861.[3] Here is Schleicher's explanation of why he offered reconstructed forms:[4]
In the present work an attempt is made to set forth the inferred Indo-European original language side by side with its really existent derived languages. Besides the advantages offered by such a plan, in setting immediately before the eyes of the student the final results of the investigation in a more concrete form, and thereby rendering easier his insight into the nature of particular Indo-European languages, there is, I think, another of no less importance gained by it, namely that it shows the baselessness of the assumption that the non-Indian Indo-European languages were derived from Old-Indian (Sanskrit).
The aim of the comparative method is to highlight and interpret systematic phonological and semantic correspondences between two or more attested languages. If those correspondences cannot be rationally explained as the result of linguistic universals or language contact (borrowings, areal influence, etc.), and if they are sufficiently numerous, regular, and systematic that they cannot be dismissed as chance similarities, then it must be assumed that they descend from a single parent language called the 'proto-language'.[5][6]
A sequence of regular sound changes (along with their underlying sound laws) can then be postulated to explain the correspondences between the attested forms, which eventually allows for the reconstruction of a proto-language by the methodical comparison of "linguistic facts" within a generalized system of correspondences.[7]
Every linguistic fact is part of a whole in which everything is connected to everything else. One detail must not be linked to another detail, but one linguistic system to another.
— Antoine Meillet, La méthode comparative en linguistique historique, 1966 [1925], pp. 12–13.
Relation is considered to be "established beyond a reasonable doubt" if a reconstruction of the common ancestor is feasible.[8]
The ultimate proof of genetic relationship, and to many linguists' minds the only real proof, lies in a successful reconstruction of the ancestral forms from which the semantically corresponding cognates can be derived.
— Hans Henrich Hock, Principles of Historical Linguistics, 1991, p. 567.
In some cases, this reconstruction can only be partial, generally because the compared languages are too scarcely attested, the temporal distance between them and their proto-language is too great, or their internal evolution renders many of the sound laws obscure to researchers. In such cases, a relation is considered plausible but uncertain.[9]
Descent is defined as transmission across the generations: children learn a language from the parents' generation and, after being influenced by their peers, transmit it to the next generation, and so on. For example, a continuous chain of speakers across the centuries links Vulgar Latin to all of its modern descendants.
Two languages are genetically related if they descended from the same ancestor language.[10] For example, Italian and French both come from Latin and therefore belong to the same family, the Romance languages.[11] Having a large component of vocabulary from a certain origin is not sufficient to establish relatedness; for example, heavy borrowing from Arabic into Persian has caused more of the vocabulary of Modern Persian to be from Arabic than from the direct ancestor of Persian, Proto-Indo-Iranian, but Persian remains a member of the Indo-Iranian family and is not considered "related" to Arabic.[12]
However, it is possible for languages to have different degrees of relatedness. English, for example, is related to both German and Russian but is more closely related to the former than to the latter. Although all three languages share a common ancestor, Proto-Indo-European, English and German also share a more recent common ancestor, Proto-Germanic, but Russian does not. Therefore, English and German are considered to belong to a subgroup of Indo-European that Russian does not belong to, the Germanic languages.[13]
The division of related languages into subgroups is accomplished by finding shared linguistic innovations that differentiate them from the parent language. For instance, English and German both exhibit the effects of a collection of sound changes known as Grimm's Law, which Russian was not affected by. The fact that English and German share this innovation is seen as evidence of their more recent common ancestor, since the innovation took place within that common ancestor, before English and German diverged into separate languages. On the other hand, shared retentions from the parent language are not sufficient evidence of a subgroup. For example, German and Russian both retain from Proto-Indo-European a contrast between the dative case and the accusative case, which English has lost. However, that similarity between German and Russian is not evidence that German is more closely related to Russian than to English; it means only that the innovation in question, the loss of the accusative/dative distinction, happened in English after English had diverged from German.
In classical antiquity, Romans were aware of the similarities between Greek and Latin, but did not study them systematically. They sometimes explained them mythologically, as the result of Rome being a Greek colony speaking a debased dialect.[14]
Even though grammarians of Antiquity had access to other languages around them (Oscan, Umbrian, Etruscan, Gaulish, Egyptian, Parthian...), they showed little interest in comparing, studying, or just documenting them. Comparison between languages really began after classical antiquity.
In the 9th or 10th century AD, Yehuda Ibn Quraysh compared the phonology and morphology of Hebrew, Aramaic and Arabic but attributed the resemblance to the Biblical story of Babel, with Abraham, Isaac and Joseph retaining Adam's language and other languages, at various removes, becoming more altered from the original Hebrew.[15]

In publications of 1647 and 1654, Marcus Zuerius van Boxhorn first described a rigorous methodology for historical linguistic comparisons[16] and proposed the existence of an Indo-European proto-language, which he called "Scythian", unrelated to Hebrew but ancestral to Germanic, Greek, Romance, Persian, Sanskrit, Slavic, Celtic and Baltic languages. The Scythian theory was further developed by Andreas Jäger (1686) and William Wotton (1713), who made early forays to reconstruct the primitive common language. In 1710 and 1723, Lambert ten Kate first formulated the regularity of sound laws, introducing among others the term root vowel.[16]
Another early systematic attempt to prove the relationship between two languages on the basis of similarity of grammar and lexicon was made by the Hungarian János Sajnovics in 1770, when he attempted to demonstrate the relationship between Sami and Hungarian. That work was later extended to all Finno-Ugric languages in 1799 by his countryman Samuel Gyarmathi.[17] However, the origin of modern historical linguistics is often traced back to Sir William Jones, an English philologist living in India, who in 1786 made his famous observation:[18]
The Sanscrit language, whatever be its antiquity, is of a wonderful structure; more perfect than the Greek, more copious than the Latin, and more exquisitely refined than either, yet bearing to both of them a stronger affinity, both in the roots of verbs and the forms of grammar, than could possibly have been produced by accident; so strong indeed, that no philologer could examine them all three, without believing them to have sprung from some common source, which, perhaps, no longer exists. There is a similar reason, though not quite so forcible, for supposing that both the Gothick and the Celtick, though blended with a very different idiom, had the same origin with the Sanscrit; and the old Persian might be added to the same family.
The comparative method developed out of attempts to reconstruct the proto-language mentioned by Jones, which he did not name but subsequent linguists have labelled Proto-Indo-European (PIE). The first professional comparison between the Indo-European languages that were then known was made by the German linguist Franz Bopp in 1816. He did not attempt a reconstruction but demonstrated that Greek, Latin and Sanskrit shared a common structure and a common lexicon.[19] In 1808, Friedrich Schlegel first stated the importance of using the oldest possible form of a language when trying to prove its relationships;[20] in 1818, Rasmus Christian Rask developed the principle of regular sound-changes to explain his observations of similarities between individual words in the Germanic languages and their cognates in Greek and Latin.[21] Jacob Grimm, better known for his Fairy Tales, used the comparative method in Deutsche Grammatik (published 1819–1837 in four volumes), which attempted to show the development of the Germanic languages from a common origin and was the first systematic study of diachronic language change.[22]
Both Rask and Grimm were unable to explain apparent exceptions to the sound laws that they had discovered. Although Hermann Grassmann explained one of the anomalies with the publication of Grassmann's law in 1863,[23] Karl Verner made a methodological breakthrough in 1875, when he identified a pattern now known as Verner's law, the first sound-law based on comparative evidence showing that a phonological change in one phoneme could depend on other factors within the same word (such as neighbouring phonemes and the position of the accent[24]), which are now called conditioning environments.
Similar discoveries made by the Junggrammatiker (usually translated as "Neogrammarians") at the University of Leipzig in the late 19th century led them to conclude that all sound changes were ultimately regular, resulting in the famous statement by Karl Brugmann and Hermann Osthoff in 1878 that "sound laws have no exceptions".[2] That idea is fundamental to the modern comparative method since it necessarily assumes regular correspondences between sounds in related languages and thus regular sound changes from the proto-language. The Neogrammarian hypothesis led to the application of the comparative method to reconstruct Proto-Indo-European since Indo-European was then by far the most well-studied language family. Linguists working with other families soon followed suit, and the comparative method quickly became the established method for uncovering linguistic relationships.[17]
There is no fixed set of steps to be followed in the application of the comparative method, but some steps are suggested by Lyle Campbell[25] and Terry Crowley,[26] who are both authors of introductory texts in historical linguistics. This abbreviated summary is based on their concepts of how to proceed.
This step involves making lists of words that are likely cognates among the languages being compared. If there is a regularly-recurring match between the phonetic structure of basic words with similar meanings, a genetic kinship can probably then be established.[27] For example, linguists looking at the Polynesian family might come up with a list similar to the following (their actual list would be much longer):[28]
| Gloss | one | two | three | four | five | man | sea | taboo | octopus | canoe | enter | 
|---|---|---|---|---|---|---|---|---|---|---|---|
| Tongan | taha | ua | tolu | fā | nima | taŋata | tahi | tapu | feke | vaka | hū | 
| Samoan | tasi | lua | tolu | fā | lima | taŋata | tai | tapu | feʔe | vaʔa | ulu | 
| Māori | tahi | rua | toru | ɸā | rima | taŋata | tai | tapu | ɸeke | waka | uru | 
| Rapanui | -tahi | -rua | -toru | -ha | -rima | taŋata | tai | tapu | heke | vaka | uru | 
| Rarotongan | taʔi | rua | toru | ʔā | rima | taŋata | tai | tapu | ʔeke | vaka | uru | 
| Hawaiian | kahi | lua | kolu | hā | lima | kanaka | kai | kapu | heʔe | waʔa | ulu | 
Borrowings or false cognates can skew or obscure the correct data.[29] For example, English taboo ([tæbu]) is like the six Polynesian forms because of borrowing from Tongan into English, not because of a genetic similarity.[30] That problem can usually be overcome by using basic vocabulary, such as kinship terms, numbers, body parts and pronouns.[31] Nonetheless, even basic vocabulary can sometimes be borrowed. Finnish, for example, borrowed the word for "mother", äiti, from Proto-Germanic *aiþį̄ (compare to Gothic aiþei).[32] English borrowed the pronouns "they", "them", and "their(s)" from Norse.[33] Thai and various other East Asian languages borrowed their numbers from Chinese. An extreme case is represented by Pirahã, a Muran language of South America, which has been controversially[34] claimed to have borrowed all of its pronouns from Nheengatu.[35][36]
The next step involves determining the regular sound-correspondences exhibited by the lists of potential cognates. For example, in the Polynesian data above, it is apparent that words that contain t in most of the languages listed have cognates in Hawaiian with k in the same position. That is visible in multiple cognate sets: the words glossed as 'one', 'three', 'man' and 'taboo' all show the relationship. The situation is called a "regular correspondence" between k in Hawaiian and t in the other Polynesian languages. Similarly, a regular correspondence can be seen between Hawaiian and Rapanui h, Tongan and Samoan f, Māori ɸ, and Rarotongan ʔ, as in the words glossed as 'four' and 'octopus'.
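The tallying involved in this step can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: it looks at word-initial segments alone, treats each character as one segment, strips the hyphen from the Rapanui numeral forms, and uses a subset of the cognate sets from the table above. The names `ROWS` and `initial_sets` are invented for the example.

```python
from collections import Counter

# Column order: Tongan, Samoan, Māori, Rapanui, Rarotongan, Hawaiian.
# Forms are copied from the cognate table above.
ROWS = {
    "one":     ["taha", "tasi", "tahi", "-tahi", "taʔi", "kahi"],
    "three":   ["tolu", "tolu", "toru", "-toru", "toru", "kolu"],
    "man":     ["taŋata", "taŋata", "taŋata", "taŋata", "taŋata", "kanaka"],
    "sea":     ["tahi", "tai", "tai", "tai", "tai", "kai"],
    "taboo":   ["tapu", "tapu", "tapu", "tapu", "tapu", "kapu"],
    "four":    ["fā", "fā", "ɸā", "-ha", "ʔā", "hā"],
    "octopus": ["feke", "feʔe", "ɸeke", "heke", "ʔeke", "heʔe"],
}

def initial_sets(rows):
    """Count how often each tuple of word-initial segments recurs."""
    counts = Counter()
    for forms in rows.values():
        # Drop the leading hyphen of bound forms, then take the first segment.
        counts[tuple(form.lstrip("-")[0] for form in forms)] += 1
    return counts

sets = initial_sets(ROWS)
for segments, n in sets.most_common():
    print(" : ".join(segments), f"({n} sets)")
# t : t : t : t : t : k (5 sets)
# f : f : ɸ : h : ʔ : h (2 sets)
```

A tuple that recurs across many cognate sets is a candidate regular correspondence; a tuple that occurs once may be chance or borrowing. A real analysis would align every segment of each word, not only the first.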
Mere phonetic similarity, as between English day and Latin dies (both with the same meaning), has no probative value.[37] English initial d- does not regularly match Latin d-[38] since a large set of English and Latin non-borrowed cognates cannot be assembled such that English d repeatedly and consistently corresponds to Latin d at the beginning of a word, and whatever sporadic matches can be observed are due either to chance (as in the above example) or to borrowing (for example, Latin diabolus and English devil, both ultimately of Greek origin[39]). However, English and Latin exhibit a regular correspondence of t- : d-[38] (in which "A : B" means "A corresponds to B"), as in the following examples:[40]
| English | ten | two | tow | tongue | tooth |
|---|---|---|---|---|---|
| Latin | decem | duo | dūco | dingua | dent- |
If there are many regular correspondence sets of this kind (the more, the better), a common origin becomes a virtual certainty, particularly if some of the correspondences are non-trivial or unusual.[27]
During the late 18th to late 19th century, two major developments improved the method's effectiveness.
First, it was found that many sound changes are conditioned by a specific context. For example, in both Greek and Sanskrit, an aspirated stop evolved into an unaspirated one, but only if a second aspirate occurred later in the same word;[41] this is Grassmann's law, first described for Sanskrit by the Sanskrit grammarian Pāṇini[42] and promulgated by Hermann Grassmann in 1863.
Second, it was found that sometimes sound changes occurred in contexts that were later lost. For instance, in Sanskrit velars (k-like sounds) were replaced by palatals (ch-like sounds) whenever the following vowel was *i or *e.[43] Subsequent to this change, all instances of *e were replaced by a.[44] The situation could be reconstructed only because the original distribution of e and a could be recovered from the evidence of other Indo-European languages.[45] For instance, the Latin suffix que, "and", preserves the original *e vowel that caused the consonant shift in Sanskrit:
| Stage | Form | Description |
|---|---|---|
| 1. | *ke | Pre-Sanskrit "and" |
| 2. | *ce | Velars replaced by palatals before *i and *e |
| 3. | ca | The attested Sanskrit form: *e has become a |
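The dependence on rule ordering can be made concrete with a small sketch. It assumes a toy notation in which `c` stands for the palatal, as in the table, and models each sound change as a regular-expression substitution; `RULES` and `derive` are names invented for this example, not an established formalism.

```python
import re

# Ordered sound changes: (pattern, replacement, description).
RULES = [
    (r"k(?=[ie])", "c", "velar becomes palatal before i or e"),
    (r"e", "a", "e merges with a"),
]

def derive(form, rules=RULES):
    """Apply each rule in order and return every intermediate stage."""
    stages = [form]
    for pattern, replacement, _description in rules:
        form = re.sub(pattern, replacement, form)
        stages.append(form)
    return stages

print(derive("ke"))               # ['ke', 'ce', 'ca']
print(derive("ke", RULES[::-1]))  # ['ke', 'ka', 'ka']
```

Applied in the historical order, the rules yield the attested ca. Applied in the reverse order, the vowel merger removes the conditioning environment before palatalization can apply, and the output is the unattested ka. That is why the later merger obscures the cause of the palatal when Sanskrit is examined alone.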
Verner's Law, discovered by Karl Verner c. 1875, provides a similar case: the voicing of consonants in Germanic languages underwent a change that was determined by the position of the old Indo-European accent. Following the change, the accent shifted to initial position.[46] Verner solved the puzzle by comparing the Germanic voicing pattern with Greek and Sanskrit accent patterns.
This stage of the comparative method, therefore, involves examining the correspondence sets discovered in step 2 and seeing which of them apply only in certain contexts. If two (or more) sets apply in complementary distribution, they can be assumed to reflect a single original phoneme: "some sound changes, particularly conditioned sound changes, can result in a proto-sound being associated with more than one correspondence set".[47]
For example, the following potential cognate list can be established for Romance languages, which descend from Latin:
| | Italian | Spanish | Portuguese | French | Gloss |
|---|---|---|---|---|---|
| 1. | corpo | cuerpo | corpo | corps | body | 
| 2. | crudo | crudo | cru | cru | raw | 
| 3. | catena | cadena | cadeia | chaîne | chain | 
| 4. | cacciare | cazar | caçar | chasser | to hunt | 
They evidence two correspondence sets, k : k and k : ʃ:
| | Italian | Spanish | Portuguese | French |
|---|---|---|---|---|
| 1. | k | k | k | k | 
| 2. | k | k | k | ʃ | 
Since French ʃ occurs only before a where the other languages also have a, and French k occurs elsewhere, the difference is caused by different environments (being before a conditions the change), and the sets are complementary. They can, therefore, be assumed to reflect a single proto-phoneme (in this case *k, spelled ⟨c⟩ in Latin).[48] The original Latin words are corpus, crudus, catena and captiare, all with an initial k. If more evidence along those lines were given, one might conclude that an alteration of the original k took place because of a different environment.
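A check for complementary distribution can be sketched as follows. The sketch makes two loud assumptions: the second letter of the Italian spelling is used as a stand-in for the segment that followed the initial consonant, and the four cognate sets above are treated as the whole data set. `DATA`, `environments` and `complementary` are illustrative names.

```python
from collections import defaultdict

# (Italian form, French initial sound) for the four cognate sets above.
DATA = [
    ("corpo", "k"),
    ("crudo", "k"),
    ("catena", "ʃ"),
    ("cacciare", "ʃ"),
]

def environments(data):
    """Map each French reflex to the set of following segments it occurs with."""
    env = defaultdict(set)
    for italian, french_initial in data:
        env[french_initial].add(italian[1])  # assumption: letter 2 = next segment
    return dict(env)

def complementary(env):
    """True if no environment is shared by two different reflexes."""
    sets = list(env.values())
    return all(a.isdisjoint(b) for i, a in enumerate(sets) for b in sets[i + 1:])

env = environments(DATA)
print(env)                 # k occurs before o and r, ʃ only before a
print(complementary(env))  # True
```

Because the environments do not overlap, the two correspondence sets can be attributed to one proto-phoneme with a conditioned change. Four words are far too few to establish this in practice; the point is only the shape of the test.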
A more complex case involves consonant clusters in Proto-Algonquian. The Algonquianist Leonard Bloomfield used the reflexes of the clusters in four of the daughter languages to reconstruct the following correspondence sets:[49]
| | Ojibwe | Meskwaki | Plains Cree | Menomini |
|---|---|---|---|---|
| 1. | kk | hk | hk | hk | 
| 2. | kk | hk | sk | hk | 
| 3. | sk | hk | sk | t͡ʃk | 
| 4. | ʃk | ʃk | sk | sk | 
| 5. | sk | ʃk | hk | hk | 
Although all five correspondence sets overlap with one another in various places, they are not in complementary distribution and so Bloomfield recognised that a different cluster must be reconstructed for each set. His reconstructions were, respectively, *hk, *xk, *čk (= [t͡ʃk]), *šk (= [ʃk]), and *çk (in which 'x' and 'ç' are arbitrary symbols, rather than attempts to guess the phonetic value of the proto-phonemes).[50]
Typology assists in deciding what reconstruction best fits the data. For example, the voicing of voiceless stops between vowels is common, but the devoicing of voiced stops in that environment is rare. If a correspondence -t- : -d- between vowels is found in two languages, the proto-phoneme is more likely to be *-t-, with a development to the voiced form in the second language. The opposite reconstruction would represent a rare type.
However, unusual sound changes occur. The Proto-Indo-European word for two, for example, is reconstructed as *dwō, which is reflected in Classical Armenian as erku. Several other cognates demonstrate a regular change *dw- → erk- in Armenian.[51] Similarly, in Bearlake, a dialect of the Athabaskan language of Slavey, there has been a sound change of Proto-Athabaskan *ts → Bearlake kʷ.[52] It is very unlikely that *dw- changed directly into erk- and *ts into kʷ; they probably went through several intermediate steps before they arrived at the later forms. It is not phonetic similarity that matters for the comparative method but rather regular sound correspondences.[37]
By the principle of economy, the reconstruction of a proto-phoneme should require as few sound changes as possible to arrive at the modern reflexes in the daughter languages. For example, Algonquian languages exhibit the following correspondence set:[53][54]
| Ojibwe | Míkmaq | Cree | Munsee | Blackfoot | Arapaho | 
|---|---|---|---|---|---|
| m | m | m | m | m | b | 
The simplest reconstruction for this set would be either *m or *b. Both *m → b and *b → m are likely. Because m occurs in five of the languages and b in only one of them, if *b is reconstructed, it is necessary to assume five separate changes of *b → m, but if *m is reconstructed, it is necessary to assume only one change of *m → b, and so *m would be the most economical.
That argument assumes the languages other than Arapaho to be at least partly independent of one another. If they all formed a common subgroup, the development *b → m would have to be assumed to have occurred only once.
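The counting argument can be written out as a short sketch. It assumes, as the paragraph above cautions, that every language is an independent descendant of the proto-language, so each mismatch costs one change; `REFLEXES` and `changes_needed` are names made up for the illustration.

```python
# Reflexes from the correspondence set above.
REFLEXES = {
    "Ojibwe": "m", "Míkmaq": "m", "Cree": "m",
    "Munsee": "m", "Blackfoot": "m", "Arapaho": "b",
}

def changes_needed(proto, reflexes):
    """Number of daughters whose reflex differs from the candidate proto-sound.

    Assumption: each daughter descends independently (a flat tree), so
    every differing reflex counts as a separate sound change.
    """
    return sum(1 for sound in reflexes.values() if sound != proto)

costs = {candidate: changes_needed(candidate, REFLEXES) for candidate in ("m", "b")}
best = min(costs, key=costs.get)
print(costs)  # {'m': 1, 'b': 5}
print(best)   # m
```

If the five languages with m formed a single subgroup, the cost of *b would drop to one change and the two candidates would tie, so the count is only as reliable as the assumed family tree.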
In the final step, the linguist checks to see how the proto-phonemes fit the known typological constraints. For example, a hypothetical system,
| p | t | k |
|---|---|---|
| b | | |
| | n | ŋ |
| | l | |
has only one voiced stop, *b, and although it has an alveolar and a velar nasal, *n and *ŋ, there is no corresponding labial nasal. However, languages generally maintain symmetry in their phonemic inventories.[55] In this case, a linguist might attempt to investigate the possibilities that either what was earlier reconstructed as *b is in fact *m or that the *n and *ŋ are in fact *d and *g.
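A symmetry check of this kind amounts to looking for empty cells in a grid of manner against place of articulation. The sketch below encodes the hypothetical system that way; the feature labels are assumptions supplied for the example, and the lateral *l is left out because it has no counterparts to compare.

```python
PLACES = ["labial", "alveolar", "velar"]

# The hypothetical inventory, arranged by manner and place (labels assumed).
INVENTORY = {
    "voiceless stop": {"labial": "p", "alveolar": "t", "velar": "k"},
    "voiced stop":    {"labial": "b"},
    "nasal":          {"alveolar": "n", "velar": "ŋ"},
}

def gaps(inventory, places=PLACES):
    """List, for each manner series, the places of articulation left empty."""
    missing = {}
    for manner, cells in inventory.items():
        empty = [place for place in places if place not in cells]
        if empty:
            missing[manner] = empty
    return missing

print(gaps(INVENTORY))
# {'voiced stop': ['alveolar', 'velar'], 'nasal': ['labial']}
```

The two gappy series are exactly complementary: reading *b as *m fills the nasal row, and reading *n and *ŋ as *d and *g fills the voiced stop row. Either reinterpretation leaves a single complete series, which is what makes both worth investigating.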
Even a symmetrical system can be typologically suspicious. For example, here is the traditional Proto-Indo-European stop inventory:[56]
| | Labials | Dentals | Velars | Labiovelars | Palatovelars |
|---|---|---|---|---|---|
| Voiceless | p | t | k | kʷ | kʲ |
| Voiced | (b) | d | g | ɡʷ | ɡʲ |
| Voiced aspirated | bʱ | dʱ | ɡʱ | ɡʷʱ | ɡʲʱ |
An earlier voiceless aspirated row was removed on grounds of insufficient evidence. Since the mid-20th century, a number of linguists have argued that this phonology is implausible[57] and that it is extremely unlikely for a language to have a voiced aspirated (breathy voice) series without a corresponding voiceless aspirated series.
Thomas Gamkrelidze and Vyacheslav Ivanov provided a potential solution and argued that the series that are traditionally reconstructed as plain voiced should be reconstructed as glottalized: either implosive (ɓ, ɗ, ɠ) or ejective (pʼ, tʼ, kʼ). The plain voiceless and voiced aspirated series would thus be replaced by just voiceless and voiced, with aspiration being a non-distinctive quality of both.[58] That example of the application of linguistic typology to linguistic reconstruction has become known as the glottalic theory. It has a large number of proponents but is not generally accepted.[59]
The reconstruction of proto-sounds logically precedes the reconstruction of grammatical morphemes (word-forming affixes and inflectional endings), patterns of declension and conjugation and so on. The full reconstruction of an unrecorded protolanguage is an open-ended task.
The limitations of the comparative method were recognized by the very linguists who developed it,[60] but it is still seen as a valuable tool. In the case of Indo-European, the method seemed at least a partial validation of the centuries-old search for an Ursprache, the original language. The others were presumed to be ordered in a family tree, which was the tree model of the neogrammarians.
The archaeologists followed suit and attempted to find archaeological evidence of a culture or cultures that could be presumed to have spoken a proto-language, such as Vere Gordon Childe's The Aryans: a study of Indo-European origins, 1926. Childe was a philologist turned archaeologist. Those views culminated in the Siedlungsarchäologie, or "settlement-archaeology", of Gustaf Kossinna, becoming known as "Kossinna's Law". Kossinna asserted that cultures represent ethnic groups, including their languages, but his law was rejected after World War II. The fall of Kossinna's Law removed the temporal and spatial framework previously applied to many proto-languages. Fox concludes:[61]
The Comparative Method as such is not, in fact, historical; it provides evidence of linguistic relationships to which we may give a historical interpretation.... [Our increased knowledge about the historical processes involved] has probably made historical linguists less prone to equate the idealizations required by the method with historical reality.... Provided we keep [the interpretation of the results and the method itself] apart, the Comparative Method can continue to be used in the reconstruction of earlier stages of languages.
Proto-languages can be verified in many historical instances, such as Latin.[62][63] Although no longer a law, settlement-archaeology is known to be essentially valid for some cultures that straddle history and prehistory, such as the Celtic Iron Age (mainly Celtic) and Mycenaean civilization (mainly Greek). None of those models can be or have been completely rejected, but none is sufficient alone.
The foundation of the comparative method, and of comparative linguistics in general, is the Neogrammarians' fundamental assumption that "sound laws have no exceptions". When it was initially proposed, critics of the Neogrammarians put forward an alternative position that is summarised by the maxim "each word has its own history".[64] Several types of change actually alter words in irregular ways. Unless identified, they may hide or distort laws and cause false perceptions of relationship.
All languages borrow words from other languages in various contexts. Loanwords imitate the form of the donor language, as in Finnic kuningas, from Proto-Germanic *kuningaz ('king'), with possible adaptations to the local phonology, as in Japanese sakkā, from English soccer. At first sight, borrowed words may mislead the investigator into seeing a genetic relationship, although they can more easily be identified with information on the historical stages of both the donor and receiver languages. Inherently, words that were borrowed from a common source (such as English coffee and Basque kafe, ultimately from Arabic qahwah) do share a genetic relationship, although limited to the history of this word.
Borrowing on a larger scale occurs in areal diffusion, when features are adopted by contiguous languages over a geographical area. The borrowing may be phonological, morphological or lexical. A false proto-language over the area may be reconstructed for them or may be taken to be a third language serving as a source of diffused features.[65]
Several areal features and other influences may converge to form a Sprachbund, a wider region sharing features that appear to be related but are diffusional. For instance, the Mainland Southeast Asia linguistic area, before it was recognised, suggested several false classifications of such languages as Chinese, Thai and Vietnamese.
Sporadic changes, such as irregular inflections, compounding and abbreviation, do not follow any laws. For example, the Spanish words palabra ('word'), peligro ('danger') and milagro ('miracle') would have been parabla, periglo, miraglo by regular sound changes from the Latin parabŏla, perīcŭlum and mīrācŭlum, but the r and l changed places by sporadic metathesis.[66]
Analogy is the sporadic change of a feature to be like another feature in the same or a different language. It may affect a single word or be generalized to an entire class of features, such as a verb paradigm. An example is the Russian word for nine. The word, by regular sound changes from Proto-Slavic, should have been /nʲevʲatʲ/, but it is in fact /dʲevʲatʲ/. It is believed that the initial nʲ- changed to dʲ- under influence of the word for "ten" in Russian, /dʲesʲatʲ/.[67]
Those who study contemporary language changes, such as William Labov, acknowledge that even a systematic sound change is applied at first inconsistently, with the percentage of its occurrence in a person's speech dependent on various social factors.[68] The sound change seems to gradually spread in a process known as lexical diffusion. While it does not invalidate the Neogrammarians' axiom that "sound laws have no exceptions", the gradual application of the very sound laws shows that they do not always apply to all lexical items at the same time. Hock notes,[69] "While it probably is true in the long run every word has its own history, it is not justified to conclude as some linguists have, that therefore the Neogrammarian position on the nature of linguistic change is falsified".
The comparative method cannot recover aspects of a language that were not inherited by its daughter languages. For instance, the Latin declension pattern was lost in the Romance languages, so such a feature cannot be fully reconstructed via systematic comparison.[70]
The comparative method is used to construct a tree model (German Stammbaum) of language evolution,[71] in which daughter languages are seen as branching from the proto-language, gradually growing more distant from it through accumulated phonological, morpho-syntactic, and lexical changes.


The tree model features nodes that are presumed to be distinct proto-languages existing independently in distinct regions during distinct historical times. The reconstruction of unattested proto-languages lends itself to that illusion since they cannot be verified, and the linguist is free to select whatever definite times and places seem best. Right from the outset of Indo-European studies, however, Thomas Young said:[74]
It is not, however, very easy to say what the definition should be that should constitute a separate language, but it seems most natural to call those languages distinct, of which the one cannot be understood by common persons in the habit of speaking the other.... Still, however, it may remain doubtfull whether the Danes and the Swedes could not, in general, understand each other tolerably well... nor is it possible to say if the twenty ways of pronouncing the sounds, belonging to the Chinese characters, ought or ought not to be considered as so many languages or dialects.... But,... the languages so nearly allied must stand next to each other in a systematic order…
The assumption of uniformity in a proto-language, implicit in the comparative method, is problematic. Even small language communities always have differences in dialect, whether they are based on area, gender, class or other factors. The Pirahã language of Brazil is spoken by only several hundred people but has at least two different dialects, one spoken by men and one by women.[75] Campbell points out:[76]
It is not so much that the comparative method 'assumes' no variation; rather, it is just that there is nothing built into the comparative method which would allow it to address variation directly.... This assumption of uniformity is a reasonable idealization; it does no more damage to the understanding of the language than, say, modern reference grammars do which concentrate on a language's general structure, typically leaving out consideration of regional or social variation.
Different dialects, as they evolve into separate languages, remain in contact with and influence one another. Even after they are considered distinct, languages near one another continue to influence one another and often share grammatical, phonological, and lexical innovations. A change in one language of a family may spread to neighboring languages, and multiple changes spread like waves across language and dialect boundaries, each with its own randomly delimited range.[77] If a language is divided into an inventory of features, each with its own time and range (isoglosses), they do not all coincide. History and prehistory may not offer a time and place for a distinct coincidence, as may be the case for Proto-Italic, for which the proto-language is only a concept. However, Hock[78] observes:
The discovery in the late nineteenth century that isoglosses can cut across well-established linguistic boundaries at first created considerable attention and controversy. And it became fashionable to oppose a wave theory to a tree theory.... Today, however, it is quite evident that the phenomena referred to by these two terms are complementary aspects of linguistic change....
The reconstruction of unknown proto-languages is inherently subjective. In the Proto-Algonquian example above, the choice of *m as the parent phoneme is only likely, not certain. It is conceivable that a Proto-Algonquian language with *b in those positions split into two branches, one that preserved *b and one that changed it to *m instead, and while the first branch developed only into Arapaho, the second spread out more widely and developed into all the other Algonquian languages. It is also possible that the nearest common ancestor of the Algonquian languages used some other sound instead, such as *p, which eventually mutated to *b in one branch and to *m in the other.
Examples of strikingly complicated and even circular developments are indeed known to have occurred (such as Proto-Indo-European *t > Pre-Proto-Germanic *þ > Proto-Germanic *ð > Proto-West-Germanic *d > Old High German t in fater > Modern German Vater), but in the absence of any evidence or other reason to postulate a more complicated development, the preference for a simpler explanation is justified by the principle of parsimony, also known as Occam's razor. Since reconstruction involves many such choices, some linguists prefer to view the reconstructed features as abstract representations of sound correspondences, rather than as objects with a historical time and place.
The existence of proto-languages and the validity of the comparative method are verifiable if the reconstruction can be matched to a known language, which may be known only as a shadow in the loanwords of another language. For example, Finnic languages such as Finnish have borrowed many words from an early stage of Germanic, and the shape of the loans matches the forms that have been reconstructed for Proto-Germanic. Finnish kuningas 'king' and kaunis 'beautiful' match the Germanic reconstructions *kuningaz and *skauniz (> German König 'king', schön 'beautiful').[79]
The wave model was developed in the 1870s as an alternative to the tree model to represent the historical patterns of language diversification. Both the tree-based and the wave-based representations are compatible with the comparative method.[80]
By contrast, some approaches are incompatible with the comparative method, including the contentious glottochronology and the even more controversial mass lexical comparison, both of which most historical linguists consider flawed and unreliable.[81]