Somali is classified within the Cushitic branch of the Afroasiatic family, specifically within Lowland East Cushitic, along with Afar and Saho.[10] Somali is the best-documented of the Cushitic languages,[11] with academic studies of the language dating back to the late 19th century.[12]
The Somali language is spoken in Somali-inhabited areas of Somalia, Djibouti, Ethiopia, Kenya and Yemen, and by members of the Somali diaspora. It is also spoken as an adoptive language by a few ethnic minority groups and individuals in Somali-majority regions.
Somali is the most widely spoken Cushitic language in the region, followed by Oromo and Afar.[13]
As of 2021, there are approximately 24 million speakers of Somali spread across Greater Somalia, of whom around 17 million reside in Somalia.[14][15] The language is spoken by an estimated 95% of the country's inhabitants,[12] and also by a majority of the population in Djibouti.[11]
Following the start of the Somali Civil War in the early 1990s, the Somali-speaking diaspora increased in size, with newer Somali speech communities forming in parts of the Middle East, North America and Europe.[3]
Constitutionally, Somali and Arabic are the two official languages of Somalia.[16] Somali has been an official national language since January 1973, when the Supreme Revolutionary Council (SRC) declared it the Somali Democratic Republic's primary language of administration and education. Somali was thereafter established as the main language of academic instruction in forms 1 through 4, following preparatory work by the government-appointed Somali Language Committee. Its use was expanded to all 12 forms in 1979. In 1972, the SRC adopted a Latin orthography as the official national alphabet over several other writing scripts that were then in use. Concurrently, the Italian-language daily newspaper Stella d'Ottobre ("The October Star") was nationalized, renamed Xiddigta Oktoobar, and began publishing in Somali.[17] The state-run Radio Mogadishu has also broadcast in Somali since 1951.[18][19] Additionally, regional public networks such as Somaliland National TV and Puntland TV and Radio, as well as Eastern Television Network and Horn Cable Television, among other private broadcasters, air programs in Somali.[20]
Somali is recognized as an official working language in the Somali Region of Ethiopia.[21] Although it is not an official language of Djibouti, it constitutes a major national language there. Somali is used in television and radio broadcasts,[12][22] with the government-operated Radio Djibouti transmitting programs in the language from 1943 onwards.[23]
The Kenya Broadcasting Corporation also broadcasts in the Somali language in its Iftin FM programmes. The language is spoken in the Somali territories within North Eastern Kenya, namely Wajir County, Garissa County and Mandera County.[24][25]
The Somali language is regulated by the Regional Somali Language Academy, an intergovernmental institution established in June 2013 in Djibouti City by the governments of Djibouti, Somalia and Ethiopia. It is officially mandated with preserving the Somali language.[26]
Distribution of Somali dialectal groups in the Horn of Africa
The Somali languages are broadly divided into three main groups: Northern Somali, Benadir and Maay.[28] Northern Somali forms the basis for Standard Somali.[28] It is spoken by the majority of the Somali population,[29] with its speech area stretching from Djibouti and the Somali Region of Ethiopia to the Northern Frontier District.[30] This widespread modern distribution is a result of a long series of southward population movements over the past ten centuries from the Gulf of Aden littoral.[31] Lamberti subdivides Northern Somali into three dialects: Northern Somali proper (spoken in the northwest; he describes this dialect as Northern Somali in the proper sense), the Darod group (spoken in the northeast and along the eastern Ethiopia frontier; greatest number of speakers overall), and the Lower Juba group (spoken by northern Somali settlers in the southern riverine areas).[32] The subdialect of Northern Somali spoken by the Isaaq has the highest prestige of any Somali dialect.[33]
Speech sample in Standard Somali (an Islamic discourse containing many Arabic loanwords)
Maay is principally spoken by the Digil and Mirifle (Rahanweyn) clans in the southern regions of Somalia.[35] Its speech area extends from the southwestern border with Ethiopia to a region close to the coastal strip between Mogadishu and Kismayo, including the city of Baidoa.[34] Maay is partially mutually comprehensible with Northern Somali,[36] with the degree of divergence comparable to that between Spanish and Portuguese.[37] Despite these linguistic differences, Somali speakers collectively view themselves as speaking a common language.[38] Maay is not generally used in education or media. However, Maay speakers often use Standard Somali as a lingua franca,[34] which is learned via mass communications, internal migration and urbanization.[39]
Somali has five vowel articulations that all contrast murmured and harsh voice as well as vowel length. There is little change in vowel quality when the vowel is lengthened. Each vowel has a harmonic counterpart, and every vowel within a harmonic group (which notably can be larger than a word in Somali) must harmonize with the other vowels. The Somali orthography, however, does not distinguish between the two harmonic variants of each vowel.
Different analyses have proposed somewhat different vowel inventories and features for Somali, depending on the set of speakers whose dialects are studied. Up to four features may be phonologically distinctive: height, backness, tongue root, and length.
Saeed (1982) and Orwin (1994) both propose systems with five core vowels, but only Orwin's system makes a tongue root distinction.[40]: 3 [41]: 61 Gabbard (2010) proposes a system with six core vowels, with a tongue root distinction, but only on front vowels.[42]
Orwin argues that, in addition to the vowels listed above, each of these five vowels has a fronted (advanced tongue root) variant, based on the existence of minimal pairs.
Gabbard claims that only the front vowels (/i/ and /e/) have advanced variants, though his system includes a sixth vowel, /ɑ/. Both Orwin and Gabbard agree that the precise phonetic and phonological differences between the advanced and retracted tongue root vowels are unclear.[41]: 61 [42]
The retroflex plosive /ɖ/ may have an implosive quality for some Somali Bantu speakers, and intervocalically it can be realized as the flap [ɽ]. Some speakers produce /ħ/ with epiglottal trilling, as [ʜ].[49] /q/ is often epiglottalized.[47]
The letter ⟨dh⟩ is pronounced as a retroflex flap [ɽ] when it occurs intervocalically, as in qudhaanjo.
The letter ⟨kh⟩, found in Arabic loanwords, is rarely pronounced as a velar fricative. It is more often conflated with /q/, which is pronounced [χ] in syllabic coda position.
Pitch is phonemic in Somali, but it is debated whether Somali is a pitch accent language or a tonal language.[50] Andrzejewski (1954) posits that Somali is a tonal language,[51] whereas Banti (1988) suggests that it is a pitch system.
Root morphemes usually have a mono- or di-syllabic structure.
Clusters of two consonants do not occur word-initially or word-finally, i.e., they only occur at syllable boundaries. The following consonants can be geminate: /b/, /d/, /ɖ/, /ɡ/, /ɢ/, /m/, /n/, /r/ and /l/. The following cannot be geminate: /t/, /k/ and the fricatives.
Two vowels cannot occur together at syllable boundaries. Epenthetic consonants, e.g. [j] and [ʔ], are therefore inserted.
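The phonotactic constraints above can be expressed as a small checker. The following Python sketch is illustrative only: the function name `violations` and the input convention (a word given as a list of phonemic segments, with a long vowel written as one segment such as "aː") are assumptions made for this example, and diphthongs and clusters of more than two consonants are not modelled.

```python
VOWELS = set("aeiou")
# Consonants that can be geminate, as listed in the text above.
GEMINABLE = {"b", "d", "ɖ", "ɡ", "ɢ", "m", "n", "r", "l"}


def is_vowel(segment: str) -> bool:
    """Treat a short or long vowel (e.g. 'a' or 'aː') as a vowel segment."""
    return segment.rstrip("ː") in VOWELS


def violations(segments: list[str]) -> list[str]:
    """List the constraints from the text that a segment sequence breaks."""
    problems = []
    consonant = [not is_vowel(s) for s in segments]
    if len(segments) >= 2:
        if consonant[0] and consonant[1]:
            problems.append("word-initial cluster")
        if consonant[-1] and consonant[-2]:
            problems.append("word-final cluster")
    for a, b in zip(segments, segments[1:]):
        if is_vowel(a) and is_vowel(b):
            # Adjacent vowels would need an epenthetic consonant.
            problems.append(f"vowel hiatus {a}+{b}")
        if a == b and not is_vowel(a) and a not in GEMINABLE:
            problems.append(f"non-geminable {a}{a}")
    return problems
```

Under these assumptions, `violations(list("dibi"))` returns an empty list, while a sequence such as `["a", "t", "t", "a"]` is flagged because /t/ is not among the consonants that can be geminate.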
Somali is an agglutinative language, and also shows properties of inflection. Affixes mark many grammatical meanings, including aspect, tense and case.[52]
Somali has an old prefixal verbal inflection restricted to four common verbs, with all other verbs undergoing inflection by more obvious suffixation. This general pattern is similar to the stem alternation that typifies Cairene Arabic.[53]
Somali has two sets of pronouns: independent (substantive, emphatic) pronouns and clitic (verbal) pronouns.[54] The independent pronouns behave grammatically as nouns, and normally occur with the suffixed article -ka/-ta (e.g. adiga, "you").[54] This article may be omitted after a conjunction or focus word, as in adna, meaning "and you..." (from adi-na).[54] Clitic pronouns are attached to the verb and do not take nominal morphology.[55] Somali marks clusivity in the first person plural pronouns; this is also found in a number of other East Cushitic languages, such as Rendille and Dhaasanac.[56]
As in various other Afroasiatic languages, Somali is characterized by polarity of gender, whereby plural nouns usually take the opposite gender agreement of their singular forms.[57][58] For example, the plural of the masculine noun dibi ("bull") is formed by converting it into feminine dibi.[57] Somali is unusual among the world's languages in that the object is unmarked for case while the subject is marked, though this feature is found in other Cushitic languages such as Oromo.[59]
Somali is a subject–object–verb (SOV) language.[3] It is largely head-final, with postpositions and with obliques preceding verbs.[60] These are common features of the Cushitic and Semitic Afroasiatic languages spoken in the Horn region (e.g. Amharic).[61] However, Somali noun phrases are head-initial, whereby the noun precedes its modifying adjective.[60][62] This pattern of general head-finality with head-initial noun phrases is also found in other Cushitic languages (e.g. Oromo), but not generally in Ethiopian Semitic languages.[60][63]
Somali uses three focus markers: baa, ayaa and waxa(a), which generally mark new information or contrastive emphasis.[64] Baa and ayaa require the focused element to occur preverbally, while waxa(a) may be used following the verb.[65]
Somali loanwords can be divided into those derived from other Afroasiatic languages (mainly Arabic), and those of Indo-European extraction (mainly Italian).[66]
Somali's main lexical borrowings come from Arabic, and are estimated to constitute about 20% of the language's vocabulary.[67] This is a legacy of the Somali people's extensive social, cultural, commercial and religious links and contacts with nearby populations in the Arabian peninsula. Arabic loanwords are most commonly used in religious, administrative and education-related speech (e.g. aamiin for "faith in God"), though they are also present in other areas (e.g. kubbad-da, "ball").[66] Soravia (1994) noted a total of 1,436 Arabic loanwords in Agostini et al. (1985),[68] a prominent 40,000-entry Somali dictionary.[69] Most of the terms consisted of commonly used nouns. These lexical borrowings may have been more extensive in the past, since a few words that Zaborski (1967:122) observed in the older literature were absent in Agostini's later work.[68] In addition, the majority of personal names are derived from Arabic.[70]
The Somali language also contains a few Indo-European loanwords that were retained from the colonial period.[17] Most of these lexical borrowings come from English and Italian and are used to describe modern concepts (e.g. telefishen-ka, "the television"; raadia-ha, "the radio").[71] There are 300 loanwords from Italian, such as garawati for "tie" (from Italian cravatta), dimuqraadi from democratico ("democratic"), and mikroskoob from microscopio.
Additionally, Somali contains lexical terms from Persian, Urdu and Hindi that were acquired through historical trade with communities in the Near East and South Asia (e.g. khiyaar "cucumber" from Persian: خيار khiyār).[71] Other loanwords have also displaced their native synonyms in some dialects (e.g. jabaati "a type of flat bread" from Hindi: चपाती chapāti, displacing sabaayad). Some of these words were also borrowed indirectly via Arabic.[71][72]
As noted by Somali historian Mohammed Nuuh Ali, the Somali language also incorporates various loanwords from Old Harari.[73]
As part of a broader governmental effort of linguistic purism in the Somali language, the past few decades have seen a push in Somalia toward replacement of loanwords in general with their Somali equivalents or neologisms. To this end, the Supreme Revolutionary Council during its tenure officially prohibited the borrowing and use of English and Italian terms.[17]
The Osmanya writing script for Somali.
Shaláw Sabaean writing, Sanaag (photo by Sada Mire, 2007). The inscription dates to between 900 BCE and 300 CE.
Archaeological excavations and research in Somalia uncovered ancient inscriptions in a distinct writing system.[74] In an 1878 report to the Royal Geographical Society of Great Britain, scientist Johann Maria Hildebrandt noted upon visiting the area that "we know from ancient authors that these districts, at present so desert, were formerly populous and civilised[...] I also discovered ancient ruins and rock-inscriptions both in pictures and characters[...] These have hitherto not been deciphered."[75] According to the 1974 report for the Ministry of Information and National Guidance, this script represents the earliest written attestation of Somali.[74]
Much more recently, Somali archaeologist Sada Mire has published ancient inscriptions found throughout Somalia. Although for much of Somali linguistic history the language was not widely used for literature, Mire's publications show that writing as a technology was neither foreign nor scarce in the region.[76] These pieces of writing are in the Semitic Himyarite and Sabaean languages, which were largely spoken in what is modern-day Yemen; Mire posits that "there is an extensive and ancient relationship between the people and cultures of both sides of the Red Sea coast". Yet, while many more such ancient inscriptions are yet to be found or analyzed, many have been "bulldozed by developers, as the Ministry of Tourism could not buy the land or stop the destruction".[76]
Besides Ahmed's Latin script, other orthographies that have been used for centuries for writing the Somali language include the long-established Arabic script and Wadaad's writing.[77] According to Bogumił Andrzejewski, this usage was limited to Somali clerics and their associates, as sheikhs preferred to write in the liturgical Arabic language. Various such historical manuscripts in Somali nonetheless exist, which mainly consist of Islamic poems (qasidas), recitations and chants.[78] Among these texts are the Somali poems by Sheikh Uways and Sheikh Ismaaciil Faarah. The rest of the existing historical literature in Somali principally consists of translations of documents from Arabic.[79]
Since then a number of writing systems have been used for transcribing the Somali language. Of these, the Somali Latin alphabet, officially adopted in 1972, is the most widely used and is recognised as the official orthography of the state.[80] The script was developed by a number of leading scholars of Somali, including Musa Haji Ismail Galal, B. W. Andrzejewski and Shire Jama Ahmed, specifically for transcribing the Somali language, and uses all letters of the English Latin alphabet except p, v and z.[81][82] There are no diacritics or other special characters except the apostrophe, which is used for the glottal stop and does not occur word-initially. There are three consonant digraphs: DH, KH and SH. Tone is not marked, and front and back vowels are not distinguished.
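The conventions of the Latin orthography described above can be illustrated with a short Python sketch. It is a minimal illustration rather than an official tool: the function names `uses_somali_alphabet` and `segment` are hypothetical, and the code assumes that every written sequence dh, kh or sh is a digraph, which is a simplification.

```python
import string

# Letters of the Somali Latin alphabet as described above:
# the English alphabet without p, v and z.
SOMALI_LETTERS = set(string.ascii_lowercase) - set("pvz")
DIGRAPHS = ("dh", "kh", "sh")  # the three consonant digraphs
GLOTTAL_STOP = "'"             # the apostrophe marks the glottal stop


def uses_somali_alphabet(word: str) -> bool:
    """True if the word uses only Somali letters and apostrophes,
    with no apostrophe in word-initial position."""
    w = word.lower()
    if not w or w.startswith(GLOTTAL_STOP):
        return False
    return all(ch in SOMALI_LETTERS or ch == GLOTTAL_STOP for ch in w)


def segment(word: str) -> list[str]:
    """Split a word into written units, keeping digraphs together."""
    w = word.lower()
    units = []
    i = 0
    while i < len(w):
        if w[i:i + 2] in DIGRAPHS:
            units.append(w[i:i + 2])
            i += 2
        else:
            units.append(w[i])
            i += 1
    return units
```

For example, `segment("qudhaanjo")` returns `['q', 'u', 'dh', 'a', 'a', 'n', 'j', 'o']`, and `uses_somali_alphabet("pizza")` returns `False` because p and z are not part of the alphabet.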
Several digital collections of texts in the Somali language have been developed in recent decades. These corpora include Kaydka Af Soomaaliga (KAF), Bangiga Af Soomaaliga, the Somali Web Corpus (soWaC),[84] a Somali read-speech corpus, Asaas ("beginning" in Somali), and a web-based Somali language model and text corpus called Wargeys ("newspaper" in Somali).[85]
For all numbers between 11 (kow iyo toban) and 99 (sagaashan iyo sagaal), it is equally correct to switch the placement of the numerals, although for larger numbers some dialects prefer to place the tens numeral first. For example, 25 may be written either as labaatan iyo shan or as shan iyo labaatan (literally "twenty and five" and "five and twenty").
However, neither the Latin nor the Osmanya script accommodates this numerical switching.
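The two equally acceptable orderings of compound numerals can be sketched as follows. This Python example is illustrative only: the function name `both_orders` is hypothetical, and the lookup tables are deliberately partial, containing only numerals that appear in the examples above.

```python
# Partial lookup tables: only numerals attested in the text above.
UNITS = {1: "kow", 5: "shan"}
TENS = {10: "toban", 20: "labaatan"}


def both_orders(n: int) -> tuple[str, str]:
    """Return the (tens-first, units-first) forms of a compound numeral."""
    tens, unit = n - n % 10, n % 10
    if tens not in TENS or unit not in UNITS:
        raise KeyError(f"{n} is outside this partial table")
    tens_word, unit_word = TENS[tens], UNITS[unit]
    # The two parts are joined with "iyo" in either order.
    return (f"{tens_word} iyo {unit_word}", f"{unit_word} iyo {tens_word}")
```

Calling `both_orders(25)` returns `('labaatan iyo shan', 'shan iyo labaatan')`, matching the two forms given for 25.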
In recent years, the Somali language has become the subject of research in computational linguistics due to its complex morphology and low-resource status. Efforts have been made to develop lemmatization, part-of-speech tagging, and automatic speech recognition systems for Somali.[86]
Jones, Daniel (2003) [1917], Peter Roach; James Hartmann; Jane Setter (eds.), English Pronouncing Dictionary, Cambridge: Cambridge University Press, ISBN 3-12-539683-2
"Somali". Collins Dictionary. Retrieved 21 September 2013.
Andrzejewski, Bogumił Witalis (1954). "Is Somali a Tone-language?", Proceedings of the Twenty-Third International Congress of Orientalists. Royal Asiatic Society. pp. 367–368. OCLC 496050266.
Tosco, Mauro; Department of Anthropology; Indiana University (2000). "Is There an "Ethiopian Language Area"?". Anthropological Linguistics. 42 (3): 349. Retrieved 8 May 2013.
Ministry of Information and National Guidance, Somalia, The writing of the Somali language, (Ministry of Information and National Guidance: 1974), p. 5
Royal Geographical Society (Great Britain), Proceedings of the Royal Geographical Society of London, Volume 22, "Mr. J. M. Hildebrandt on his Travels in East Africa", (Edward Stanford: 1878), p. 447.
Nimaan, Abdillahi. 2014. Building and Evaluating Somali Language Corpora. In Jeff Good, Julia Hirschberg & Owen Rambow (eds.), Proceedings of the 2014 Workshop on the Use of Computational Methods in the Study of Endangered Languages, 73–76. Baltimore, Maryland, USA: Association for Computational Linguistics. https://doi.org/10.3115/v1/W14-2210.
Yusuf, A.; Hassan, M. (2023). "Advancing Somali Natural Language Processing: Morphological Analysis and Lemmatization". arXiv:2308.01785 [cs.CL].
Laitin, David (1977). Politics, Language, and Thought: The Somali Experience. University of Chicago Press.
Lecarme, Jacqueline; Maury, Carole (1987). "A software tool for research in linguistics and lexicography: Application to Somali". Computers and Translation. 2. Paradigm Press: 21–36. doi:10.1007/BF01540131. S2CID 6515240.
Saeed, John (1999). Somali. Amsterdam: John Benjamins. ISBN 1-55619-224-X.
Armstrong, Lilias E. (1969) [orig. pub. 1934, Mitteilungen des Seminars für Orientalische Sprachen zu Berlin, vol. 37]. The phonetic structure of Somali. Gregg International Publishers. hdl:2307/4698. ISBN 0576-11443-X.
Bell, C. R. V. (1953). The Somali Language. London: Longmans, Green & Co.
Berchem, Jörg (1991). Referenzgrammatik des Somali. Köln: Omimee. ISBN 3921008018.
Cana, Frank Richardson (1911). "Somaliland". Encyclopædia Britannica. Vol. 25 (11th ed.). pp. 378–384, see page 379. "Inhabitants. The Somali belong to the Eastern (Abyssinia) Hamitic family.... Their influence has been very slight even on the Somali language, whose structure and vocabulary are essentially Hamitic, with marked affinities to the Galla on the one hand and to the Dankali (Afar) on the other."
Cardona, G. R. (1981). "Profilo fonologico del somalo". In Cardona, G. R.; Agostini, F. (eds.). Studi Somali I: Fonologia e Lessico (in Italian). Roma: Ministero degli Affari Esteri, Dipartimento per la Cooperazione allo Sviluppo, Comitato Tecnico Linguistico per l'Universita Nazionale Somala. OCLC 15276449.
Diriye Abdullahi, Mohamed (2000). Le Somali, dialectes et histoire (PhD dissertation) (in French). Université de Montréal. hdl:1866/30162.
Dobnova, Elena Z. (1990). Sovremennyj somalijskij jazyk. Moskva: Nauka.
Lamberti, M. (1986). Die Somali-Dialekte. Hamburg: Buske.
Lamberti, M. (1986). Map of the Somali-Dialects in the Somali Democratic Republic. Hamburg: Buske.
Puglielli, Annarita (1997). "Somali Phonology". In Kaye, Alan S. (ed.). Phonologies of Asia and Africa. Vol. 1. Winona Lake: Eisenbrauns. pp. 521–535. ISBN 978-1-57506-019-4.