The hyphen ‐ is a punctuation mark used to join words and to separate syllables of a single word. The use of hyphens is called hyphenation.[1]
The hyphen is sometimes confused with dashes (en dash –, em dash — and others), which are wider, or with the minus sign −, which is also wider and usually drawn a little higher to match the crossbar in the plus sign +.
As an orthographic concept, the hyphen is a single entity. In character encoding for use with computers, it is represented in Unicode by any of several characters. These include the dual-use hyphen-minus, the soft hyphen, the nonbreaking hyphen, and an unambiguous form known familiarly as the "Unicode hyphen". The character most often used to represent a hyphen (and the one produced by the key on a keyboard) is called the "hyphen-minus" in the Unicode specification because it is also used as a minus sign. The name derives from its name in the original ASCII standard, where it was called "hyphen(minus)".[2]
The word is derived from Ancient Greek ὑφ' ἕν (huph' hén), contracted from ὑπό ἕν (hypó hén), "in one" (literally "under one").[3][4] An (ἡ) ὑφέν ((he) hyphén) was an undertie-like ‿ sign written below two adjacent letters to indicate that they belong to the same word when it was necessary to avoid ambiguity, before word spacing was practiced.
The first known documentation of the hyphen is in the grammatical works of Dionysius Thrax. At the time, hyphenation meant joining two words that would otherwise be read separately by placing a low tie mark between them.[5] In Greek these marks were known as enotikon, officially romanized as a hyphen.[6]
With the introduction of letter spacing in the Middle Ages, the hyphen, still written beneath the text, reversed its meaning. Scribes used the mark to connect two words that had been incorrectly separated by a space. This era also saw the introduction of the marginal hyphen, for words broken across lines.[7]
The modern format of the hyphen originated with Johannes Gutenberg of Mainz, Germany, c. 1455 with the publication of his 42-line Bible. His tools did not allow for a sublinear hyphen, and he thus moved it to the middle of the line.[8] Examination of an original copy on vellum (Hubay index #35) in the U.S. Library of Congress shows that Gutenberg's movable type was set justified in a uniform style, 42 equal lines per page. The Gutenberg printing press required words composed of individual movable type to be secured within a rigid, nonprinting frame. To ensure each line fit the frame uniformly, Gutenberg addressed differences in line length by inserting a hyphen at the end of a line at the right-hand margin. This interrupted the letters in the last word, requiring the remaining letters be carried over to the start of the line below. His double hyphen, ⸗, appears throughout the Bible as a short, double line inclined to the right at a 60-degree angle.[citation needed]
The English language does not have definitive hyphenation rules,[9] though various style guides provide detailed usage recommendations and have a significant amount of overlap in what they advise. Hyphens are mostly used to break single words into parts or to join ordinarily separate words into single words. Spaces are not placed between a hyphen and either of the elements it connects except when using a suspended or "hanging" hyphen that stands in for a repeated word (e.g., nineteenth- and twentieth-century writers). Style conventions that apply to hyphens (and dashes) have evolved to support ease of reading in complex constructions; editors often accept deviations if they aid rather than hinder easy comprehension.
The use of the hyphen in English compound nouns and verbs has, in general, been steadily declining. Compounds that might once have been hyphenated are increasingly left with spaces or are combined into one word. Reflecting this changing usage, in 2007, the sixth edition of the Shorter Oxford English Dictionary removed the hyphens from 16,000 entries, such as fig-leaf (now fig leaf), pot-belly (now pot belly), and pigeon-hole (now pigeonhole).[10] The increasing prevalence of computer technology and the advent of the Internet have given rise to a subset of common nouns that might have been hyphenated in the past (e.g., toolbar, hyperlink, and pastebin).
Despite decreased use, hyphenation remains the norm in certain compound-modifier constructions and, among some authors, with certain prefixes (see below). Hyphenation is also routinely used as part of syllabification in justified texts to avoid unsightly spacing (especially in columns with narrow line lengths, as when used with newspapers).
When flowing text, it is sometimes preferable to break a word into two so that it continues on another line rather than moving the entire word to the next line. The word may be divided at the nearest break point between syllables (syllabification) and a hyphen inserted to indicate that the letters form a word fragment, rather than a full word. This allows more efficient use of paper, allows flush appearance of right-side margins (justification) without oddly large word spaces, and decreases the problem of rivers. This kind of hyphenation is most useful when the width of the column (called the "line length" in typography) is very narrow. For example:
Justified text without hyphenation: We, therefore, the representatives of the United States of America ...
Justified text with hyphenation: We, therefore, the represen- tatives of the United States of America ...
Rules (or guidelines) for correct hyphenation vary between languages, and may be complex, and they can interact with other orthographic and typesetting practices. Hyphenation algorithms, when employed in concert with dictionaries, are sufficient for all but the most formal texts.
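The line-breaking behaviour described above can be illustrated with a minimal Python sketch. It is not any particular typesetting system's algorithm: it fills lines greedily and, when a word does not fit, splits it at the latest permitted break point that still fits. The function name `wrap_with_hyphenation` and the small `break_points` dictionary are assumptions made for illustration; a real system would use a full hyphenation dictionary or algorithm.

```python
def wrap_with_hyphenation(text, width, break_points):
    """Greedy line filling; words listed in break_points may be split.

    break_points maps a word to its permitted divisions, written with
    hyphens (hypothetical data, e.g. {"representatives": "rep-re-sen-ta-tives"}).
    """
    lines, current = [], ""
    for word in text.split():
        candidate = word if not current else current + " " + word
        if len(candidate) <= width:
            current = candidate
            continue
        # The whole word does not fit: try the latest break point that does.
        syllables = break_points.get(word, word).split("-")
        placed = False
        for i in range(len(syllables) - 1, 0, -1):
            head = "".join(syllables[:i]) + "-"   # fragment kept on this line
            tail = "".join(syllables[i:])         # remainder carried over
            trial = head if not current else current + " " + head
            if len(trial) <= width:
                lines.append(trial)
                current = tail
                placed = True
                break
        if not placed:
            # No usable break point: move the entire word to the next line.
            if current:
                lines.append(current)
            current = word
    if current:
        lines.append(current)
    return lines


sample = "We, therefore, the representatives of the United States"
breaks = {"representatives": "rep-re-sen-ta-tives"}  # illustrative entry only
for line in wrap_with_hyphenation(sample, 28, breaks):
    print(line)
# We, therefore, the represen-
# tatives of the United States
```

The sketch only breaks lines; distributing extra space between words to produce a flush right margin would be a separate step.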
Prefixes (such as de-, pre-, re-, and non-[15]) and suffixes (such as -less, -like, -ness, and -hood) are sometimes hyphenated, especially when the unhyphenated spelling resembles another word or when the affixation is deemed misinterpretable, ambiguous, or somehow "odd-looking" (for example, having two consecutive monographs that look like the digraphs of English, like e+a, e+e, or e+i). However, the unhyphenated style, which is also called closed up or solid, is usually preferred, particularly when the derivative has been relatively familiarized or popularized through extensive use in various contexts. As a rule of thumb, affixes are not hyphenated unless the lack of a hyphen would hurt clarity.
The hyphen may be used between vowel letters (e.g., ee, ea, ei) to indicate that they do not form a digraph. Some words have both hyphenated and unhyphenated variants: de-escalate/deescalate, co-operation/cooperation, re-examine/reexamine, de-emphasize/deemphasize, and so on. Words often lose their hyphen as they become more common, such as email instead of e-mail. When there are tripled letters, the hyphenated variant of these words is often more common (as in shell-like instead of shelllike).
Closed-up style is avoided in some cases: possible homographs, such as recreation (fun or sport) versus re-creation (the act of creating again), retreat (turn back) versus re-treat (give therapy again), and un-ionized (not in ion form) versus unionized (organized into trade unions); combinations with proper nouns or adjectives (un-American, de-Stalinisation);[16][17] acronyms (anti-TNF antibody, non-SI units); or numbers (pre-1949 diplomacy, pre-1492 cartography). Although proto-oncogene is still hyphenated by both Dorland's and Merriam-Webster's Medical, the solid (that is, unhyphenated) styling (protooncogene) is a common variant, particularly among oncologists and geneticists.[citation needed]
A diaeresis may also be used in a like fashion, either to separate and mark off monographs (as in coöperation) or to signalize a vocalic terminal e (for example, Brontë). This use of the diaeresis peaked in the late 19th and early 20th centuries, but it was never applied extensively across the language: only a handful of diaereses, including coöperation and Brontë, are encountered with any appreciable frequency in English; thus reëxamine, reïterate, deëmphasize, etc. are seldom encountered. In borrowings from Modern French, whose orthography utilizes the diaeresis as a means to differentiate graphemes, various English dictionaries list the diaeresis as optional (as in naive and naïve) despite the juxtaposition of a and i.[citation needed]
Hyphens are occasionally used to denote syllabification, as in syl-la-bi-fi-ca-tion. Various British and North American dictionaries use an interpunct (sometimes called a "middle dot" or "hyphenation point") for this purpose, as in syl·la·bi·fi·ca·tion. This practice allows the hyphen to be reserved exclusively for instances where a true hyphen is intended – for example, self-con·scious, un·self-con·scious, and long-stand·ing. Similarly, hyphens may be used to indicate the spelling of a word, as in W-O-R-D to represent word.
In nineteenth-century American literature, hyphens were also used irregularly to divide syllables in words from indigenous North American languages, without regard for etymology or pronunciation,[18] such as "Shuh-shuh-gah" (from Ojibwe zhashagi, "blue heron") in The Song of Hiawatha.[19] This usage is now rare and deprecated,[citation needed] except in some place names such as Ah-gwah-ching.
Compound modifiers are groups of two or more words that jointly modify the meaning of another word. When a compound modifier other than an adverb–adjective combination appears before a term, the compound modifier is often hyphenated to prevent misunderstanding, such as in American-football player or little-celebrated paintings. Without the hyphen, there is potential confusion about whether the writer means a "player of American football" or an "American player of football" and whether the writer means paintings that are "little celebrated" or "celebrated paintings" that are little.[20] Compound modifiers can extend to three or more words, as in ice-cream-flavored candy, and can be adverbial as well as adjectival (spine-tinglingly frightening). However, if the compound is a familiar one, it is usually unhyphenated. For example, some style guides prefer the construction high school students to high-school students.[21][22] Although the expression is technically ambiguous ("students of a high school"/"school students who are high"), it would normally be formulated differently if other than the first meaning were intended. Noun–noun compound modifiers may also be written without a hyphen when no confusion is likely: grade point average and department store manager.[22]
When a compound modifier follows the term to which it applies, a hyphen is typically not used if the compound is a temporary compound. For example, "that gentleman is well respected", not "that gentleman is well-respected"; or "a patient-centered approach was used" but "the approach was patient centered."[23] But permanent compounds, found as headwords in dictionaries, are treated as invariable, so if they are hyphenated in the cited dictionary, the hyphenation will be used in both attributive and predicative positions. For example, "A cost-effective method was used" and "The method was cost-effective" (cost-effective is a permanent compound that is hyphenated as a headword in various dictionaries). When one of the parts of the modifier is a proper noun or a proper adjective, there is no hyphen (e.g., "a South American actor").[24]
When the first modifier in a compound is an adverb ending in -ly (e.g., "a poorly written novel"), various style guides advise no hyphen.[24][additional citation(s) needed] However, some do allow for this use. For example, The Economist Style Guide advises: "Adverbs do not need to be linked to participles or adjectives by hyphens in simple constructions... Less common adverbs, including all those that end -ly, are less likely to need hyphens."[25] In the 19th century, it was common to hyphenate adverb–adjective modifiers with the adverb ending in -ly (e.g., "a craftily-constructed chair"). However, this has become rare. For example, wholly owned subsidiary and quickly moving vehicle are unambiguous, because the adverbs clearly modify the adjectives: "quickly" cannot modify "vehicle".
However, if an adverb can also function as an adjective, then a hyphen may be or should be used for clarity, depending on the style guide.[17] For example, the phrase more-important reasons ("reasons that are more important") is distinguished from more important reasons ("additional important reasons"), where more is an adjective. Similarly, more-beautiful scenery (with a mass-noun) is distinct from more beautiful scenery. (In contrast, the hyphen in "a more-important reason" is not necessary, because the syntax cannot be misinterpreted.) A few short and common words (such as well, ill, little, and much) attract special attention in this category.[25] The hyphen in "well-[past_participled] noun", such as in "well-differentiated cells", might reasonably be judged superfluous (the syntax is unlikely to be misinterpreted), yet plenty of style guides call for it. Because early has both adverbial and adjectival senses, its hyphenation can attract attention; some editors, due to comparison with advanced-stage disease and adult-onset disease, like the parallelism of early-stage disease and early-onset disease. Similarly, the hyphen in little-celebrated paintings clarifies that one is not speaking of little paintings.
Hyphens are usually used to connect numbers and words in modifying phrases. Such is the case when used to describe dimensional measurements of weight, size, and time, under the rationale that, like other compound modifiers, they take hyphens in attributive position (before the modified noun),[26] although not in predicative position (after the modified noun). This is applied whether numerals or words are used for the numbers. Thus 28-year-old woman and twenty-eight-year-old woman or 32-foot wingspan and thirty-two-foot wingspan, but the woman is 28 years old and a wingspan of 32 feet.[a] However, with symbols for SI units (such as m or kg), in contrast to the names of these units (such as metre or kilogram), the numerical value is always separated from the symbol with a space: a 25 kg sphere. When the unit names are spelled out, this recommendation does not apply: a 25-kilogram sphere, a roll of 35-millimetre film.[27]
In spelled-out fractions, hyphens are usually used when the fraction is used as an adjective but not when it is used as a noun: thus two-thirds majority[a] and one-eighth portion but I drank two thirds of the bottle or I kept three quarters of it for myself.[28] However, at least one major style guide[26] hyphenates spelled-out fractions invariably (whether adjective or noun).
In English, an en dash, –, sometimes replaces the hyphen in hyphenated compounds if either of its constituent parts is already hyphenated or contains a space (for example, San Francisco–area residents, hormone receptor–positive cells, cell cycle–related factors, and public-school–private-school rivalries).[29] A commonly used alternative style is the hyphenated string (hormone-receptor-positive cells, cell-cycle-related factors). (For other aspects of en dash–versus–hyphen use, see Dash § En dash.)
When an object is compounded with a verbal noun, such as egg-beater (a tool that beats eggs), the result is sometimes hyphenated. Some authors do this consistently, others only for disambiguation; in this case, egg-beater, egg beater, and eggbeater are all common.
An example of an ambiguous phrase appears in they stood near a group of alien lovers, which without a hyphen implies that they stood near a group of lovers who were aliens; they stood near a group of alien-lovers clarifies that they stood near a group of people who loved aliens, as "alien" can be either an adjective or a noun. On the other hand, in the phrase a hungry pizza-lover, the hyphen will often be omitted (a hungry pizza lover), as "pizza" cannot be an adjective and the phrase is therefore unambiguous.
Similarly, a man-eating shark is nearly the opposite of a man eating shark; the first refers to a shark that eats people, and the second to a man who eats shark meat. A government-monitoring program is a program that monitors the government, whereas a government monitoring program is a government program that monitors something else.
Some married couples compose a new surname (sometimes referred to as a double-barrelled name) for their new family by combining their two surnames with a hyphen. Jane Doe and John Smith might become Jane and John Smith-Doe, or Doe-Smith, for instance. In some countries only the woman hyphenates her birth surname, appending her husband's surname.
With already-hyphenated names, some parts are typically dropped. For example, Aaron Johnson and Samantha Taylor-Wood became Aaron Taylor-Johnson and Sam Taylor-Johnson. Not all hyphenated surnames are the result of marriage. For example, Julia Louis-Dreyfus is a descendant of Louis Lemlé Dreyfus, whose son was Léopold Louis-Dreyfus.
Connecting hyphens are used in a large number of miscellaneous compounds, other than modifiers, such as in lily-of-the-valley, cock-a-hoop, clever-clever, tittle-tattle and orang-utan. Use is often dictated by convention rather than fixed rules, and hyphenation styles may vary between authors; for example, orang-utan is also written as orangutan or orang utan, and lily-of-the-valley may be hyphenated or not.
A suspended hyphen (also called a suspensive hyphen or hanging hyphen, or less commonly a dangling or floating hyphen) may be used when a single base word is used with separate, consecutive, hyphenated words that are connected by "and", "or", or "to". For example, short-term and long-term plans may be written as short- and long-term plans. This usage is now common and specifically recommended in some style guides.[22] Suspended hyphens are also used, though less commonly, when the base word comes first, such as in "investor-owned and -operated". Uses such as "applied and sociolinguistics" (instead of "applied linguistics and sociolinguistics") are frowned upon; the Indiana University style guide uses this example and says "Do not 'take a shortcut' when the first expression is ordinarily open" (i.e., ordinarily two separate words).[22] This is different, however, from instances where prefixes that are normally closed up (styled solidly) are used suspensively. For example, preoperative and postoperative becomes pre- and postoperative (not pre- and post-operative) when suspended. Some editors prefer to avoid suspending such pairs, choosing instead to write out both words in full.[26]
In the ASCII character encoding, the hyphen (or minus) is character 45 (decimal).[33] As Unicode is identical to ASCII (the 1967 version) for all encodings up to 127 (decimal), the number 45 (2D in hexadecimal) is also assigned to this character in Unicode, where it is denoted as U+002D - HYPHEN-MINUS.[34] Unicode has, in addition, other encodings for minus and hyphen characters: U+2212 − MINUS SIGN and U+2010 ‐ HYPHEN, respectively. The unambiguous "Unicode hyphen" at U+2010 is generally inconvenient to enter on most keyboards, and the glyphs for this hyphen and the hyphen-minus are identical in most fonts (Lucida Sans Unicode is one of the few exceptions). Consequently, use of the hyphen-minus as the hyphen character is very common. Even the Unicode Standard regularly uses the hyphen-minus rather than the U+2010 hyphen.
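The three code points named above can be inspected with Python's standard `unicodedata` module. This is a small illustrative sketch; the helper name `describe` and the `HYPHEN_LIKE` table are assumptions introduced here, not part of any standard.

```python
import unicodedata

# The three characters discussed above, written as escapes so that
# visually identical glyphs cannot be confused in the source code.
HYPHEN_LIKE = {
    "hyphen-minus": "\u002D",
    "hyphen": "\u2010",
    "minus sign": "\u2212",
}


def describe(ch: str) -> str:
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for ch in HYPHEN_LIKE.values():
    print(describe(ch))
# U+002D HYPHEN-MINUS
# U+2010 HYPHEN
# U+2212 MINUS SIGN
```

Because the glyphs are often indistinguishable on screen, comparing code points in this way is a reliable way to tell which character a text actually contains.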
The hyphen-minus has limited use in indicating subtraction; for example, compare 4+3−2=5 (minus) and 4+3-2=5 (hyphen-minus). In most typefaces, the glyph for hyphen-minus will not have the optimal width, thickness, or vertical position, whereas the minus character is typically designed so that it does. Nevertheless, in many spreadsheet and programming applications the hyphen-minus must be typed to indicate subtraction, as use of the Unicode minus sign will not be recognised.
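Where typeset text containing the Unicode minus sign has to be fed to software that expects the hyphen-minus, one common workaround is to normalize the character before parsing. The following is a minimal sketch of that idea; the function name `parse_signed_int` is hypothetical.

```python
MINUS_SIGN = "\u2212"    # typographic minus
HYPHEN_MINUS = "\u002D"  # the keyboard character


def parse_signed_int(text: str) -> int:
    """Parse an integer, accepting either U+2212 or U+002D as the sign."""
    normalized = text.strip().replace(MINUS_SIGN, HYPHEN_MINUS)
    return int(normalized)


# Both spellings of "minus two" yield the same number.
print(4 + 3 + parse_signed_int("\u22122"))  # 5
print(4 + 3 + parse_signed_int("-2"))       # 5
```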
The hyphen-minus is often used instead of dashes or minus signs in situations where the latter characters are unavailable (such as type-written or ASCII-only text), where they take effort to enter (via dialog boxes or multi-key keyboard shortcuts), or when the writer is unaware of the distinction. Consequently, some writers use two or three hyphen-minuses (-- or ---) to represent an em dash.[35] In the TeX typesetting languages, a single hyphen-minus (-) renders a hyphen, a single hyphen-minus in math mode ($-$) renders a minus sign, two hyphen-minuses (--) render an en dash, and three hyphen-minuses (---) render an em dash.
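The text-mode part of the TeX convention can be imitated outside TeX with a simple substitution. The sketch below is not how TeX itself works (TeX resolves these sequences through font ligatures, and math mode is ignored here); it only shows the same mapping applied to a plain string. The function name `tex_dashes_to_unicode` is an assumption.

```python
EN_DASH = "\u2013"
EM_DASH = "\u2014"


def tex_dashes_to_unicode(text: str) -> str:
    """Map '---' to an em dash and '--' to an en dash; leave '-' alone."""
    # Replace the longer run first so that '---' is not consumed
    # as '--' followed by a stray hyphen-minus.
    return text.replace("---", EM_DASH).replace("--", EN_DASH)


print(tex_dashes_to_unicode("pages 10--12"))  # en dash between the numbers
print(tex_dashes_to_unicode("well-known"))    # single hyphen-minus unchanged
```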
The hyphen-minus character is also often used when specifying command-line options. The character is usually followed by one or more letters that indicate specific actions. Typically it is called a dash or switch in this context. Various implementations of the getopt function to parse command-line options additionally allow the use of two hyphen-minus characters, --, to specify long option names that are more descriptive than their single-letter equivalents. Another use of hyphens is that employed by programs written with pipelining in mind: a single hyphen may be recognized in lieu of a filename, with the hyphen then serving as an indicator that a standard stream, instead of a file, is to be worked with.
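These conventions can be sketched with Python's standard `argparse` module, used here in place of `getopt` for brevity. The program name `wordcount`, the option names, and the helper `open_input` are all hypothetical; the sketch only shows a short option, its long equivalent, and a lone hyphen standing in for a filename.

```python
import argparse
import sys


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="wordcount")  # hypothetical tool
    # "-v" is the single-letter form, "--verbose" the descriptive long form.
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="print extra detail")
    # A lone "-" is accepted as an ordinary positional value.
    parser.add_argument("input", nargs="?", default="-",
                        help="file to read, or - for standard input")
    return parser


def open_input(name: str):
    """Return standard input for '-', otherwise open the named file."""
    return sys.stdin if name == "-" else open(name, encoding="utf-8")


args = build_parser().parse_args(["--verbose", "-"])
print(args.verbose, args.input)  # True -
```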
Although software (hyphenation algorithms) can often automatically make decisions on when to hyphenate a word at a line break, it is also sometimes useful for the user to be able to insert cues for those decisions (which are dynamic in the online medium, given that text can be reflowed). For this purpose, the concept of a soft hyphen (discretionary hyphen, optional hyphen) was introduced, allowing such manual specification of a place where a hyphenated break is allowed but not forced. That is, it does not force a line break in an inconvenient place when the text is later reflowed.
Soft hyphens are inserted into the text at the positions where hyphenation may occur. It can be a tedious task to insert the soft hyphens by hand, and tools using hyphenation algorithms are available that do this automatically. Current modules[which?] of the Cascading Style Sheets (CSS) standard provide language-specific hyphenation dictionaries.
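Automatic insertion of soft hyphens can be sketched as a dictionary lookup. The soft hyphen is the Unicode character U+00AD; the function name `add_soft_hyphens` and the one-entry `HYPOTHETICAL_BREAKS` table are assumptions for illustration, standing in for a real hyphenation dictionary or algorithm.

```python
SOFT_HYPHEN = "\u00AD"  # invisible unless a line is actually broken there

# Hypothetical break table: each word with its permitted break points
# marked by ordinary hyphens.
HYPOTHETICAL_BREAKS = {"hyphenation": "hy-phen-ation"}


def add_soft_hyphens(text: str, breaks: dict) -> str:
    """Insert soft hyphens into every word that has an entry in `breaks`."""
    words = text.split(" ")
    marked = [
        breaks[w].replace("-", SOFT_HYPHEN) if w in breaks else w
        for w in words
    ]
    return " ".join(marked)


marked = add_soft_hyphens("hyphenation rules", HYPOTHETICAL_BREAKS)
print(len("hyphenation rules"), len(marked))  # 17 19
```

The two extra characters are the inserted soft hyphens; removing them restores the original string exactly, which is why the marker can be added without altering the visible text.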
In contrast, a hyphen that is always displayed and printed is called a "hard hyphen". This can be a Unicode hyphen, a hyphen-minus, or a nonbreaking hyphen (see below). Confusingly, the term is sometimes limited to nonbreaking hyphens.[citation needed]
The word segmentation rules of most text systems consider a hyphen to be a word boundary and a valid point at which to break a line when flowing text. This is not always desirable: it can lead to ambiguity (e.g., retreat and re‑treat would be indistinguishable with a line break after re), it can split off an ending as in "n‑th" (though nth or "nth" could be used), and it is inappropriate in some languages other than English (e.g., a line break at the hyphen in Irish an t‑athair or Romanian s‑a would be undesirable). The non-breaking hyphen, nonbreaking hyphen, or no-break hyphen looks identical to the regular hyphen, but word processors do not break words at it. The nonbreaking space exists for similar reasons.
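In Unicode the non-breaking hyphen is U+2011 (NON-BREAKING HYPHEN). A minimal sketch of protecting selected terms from line breaks follows; the function name `protect_hyphens` and the list of terms are assumptions, and the sketch assumes the terms are typed with the hyphen-minus.

```python
NON_BREAKING_HYPHEN = "\u2011"


def protect_hyphens(text: str, terms: list) -> str:
    """Replace the hyphen-minus inside each listed term with U+2011."""
    for term in terms:
        text = text.replace(term, term.replace("-", NON_BREAKING_HYPHEN))
    return text


protected = protect_hyphens("the n-th re-treat was well-known",
                            ["n-th", "re-treat"])
# Only the listed terms are changed; "well-known" keeps its hyphen-minus
# and may still be broken at the hyphen.
print(protected.count(NON_BREAKING_HYPHEN))  # 2
```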
Because the conventional hyphen-minus mark on keyboards is ambiguous (it can be interpreted – sometimes unexpectedly – as a hyphen or a minus, depending on context), the Unicode consortium additionally allocated code points for an unambiguous minus and an unambiguous hyphen. The Unicode hyphen (U+2010 ‐ HYPHEN) is seldom used. Even the Unicode Standard uses U+002D instead of U+2010 in its text.[36]
Use of hyphens to delineate the parts of a written date (rather than the slashes used conventionally in Anglophone countries) is specified in the international standard ISO 8601. Thus, for example, 1789-07-14 is the standard way of writing the date of Bastille Day. This standard has been transposed as European Standard EN 28601 and has been incorporated into various national typographic style guides (e.g., DIN 5008 in Germany). Now all official European Union (and many member state) documents use this style. This is also the typical date format used in large parts of Europe and Asia, although sometimes with other separators than the hyphen.
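Many programming languages can produce and parse this hyphen-separated form directly. As a brief illustration using Python's standard `datetime` module (the variable name `bastille_day` is chosen here for the example):

```python
from datetime import date

bastille_day = date(1789, 7, 14)

# isoformat() writes year, month and day separated by hyphen-minus
# characters, with zero-padded month and day.
print(bastille_day.isoformat())  # 1789-07-14

# The same notation can be read back into a date object.
print(date.fromisoformat("1789-07-14") == bastille_day)  # True
```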
This method has gained influence within North America, as most common computer file systems make the use of slashes in file names difficult or impossible. MS-DOS, OS/2 and Windows use / to introduce and separate switches to shell commands, and on both Windows and Unix-like systems slashes in a filename introduce subdirectories, which may not be desirable. Besides encouraging use of hyphens, the Y-M-D order and zero-padding of numbers less than 10 are also copied from ISO 8601 to make the filenames sort by date order.
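The sorting benefit can be shown in a few lines: because the year comes first and every field is zero-padded, plain alphabetical sorting of the names matches chronological order. The helper `dated_filename` and the file stem `report` are hypothetical.

```python
from datetime import date


def dated_filename(day: date, stem: str, ext: str = "txt") -> str:
    """Build a name such as '2024-02-15_report.txt' from a date."""
    return f"{day.isoformat()}_{stem}.{ext}"


names = [
    dated_filename(date(2024, 10, 3), "report"),
    dated_filename(date(2024, 2, 15), "report"),
    dated_filename(date(2023, 12, 31), "report"),
]

# Ordinary string sorting yields chronological order.
for name in sorted(names):
    print(name)
# 2023-12-31_report.txt
# 2024-02-15_report.txt
# 2024-10-03_report.txt
```

Without zero-padding, a name beginning 2024-10-3 would sort before one beginning 2024-2-15, which is why the padding matters as much as the field order.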
U+1B60 ᭠ BALINESE PAMENENG (used only as a line-breaking hyphen)
U+2E17 ⸗ DOUBLE OBLIQUE HYPHEN (used in ancient Near-Eastern linguistics and in blackletter typefaces)
U+30FB ・ KATAKANA MIDDLE DOT (has the Unicode property of "Hyphen" despite its name)
U+FE63 ﹣ SMALL HYPHEN-MINUS (compatibility character for a small hyphen-minus, used in East Asian typography)
U+FF0D - FULLWIDTH HYPHEN-MINUS (compatibility character for a wide hyphen-minus, used in East Asian typography)
U+FF65 ・ HALFWIDTH KATAKANA MIDDLE DOT (compatibility character for a wide katakana middle dot, has the Unicode property of "Hyphen" despite its name)
Unicode distinguishes the hyphen from the general interpunct. The characters below do not have the Unicode property of "Hyphen" despite their names:[37]
^ a b With numbers, where a plural noun would normally be used in an unhyphenated predicative position, the singular form of the noun is generally used in the hyphenated form used attributively. Thus a woman who is 28 years old becomes a 28-year-old woman. There are occasional exceptions to this general rule, for instance with fractions (a two-thirds majority) and irregular plurals (a two-criteria review, a two-teeth bridge).
^ The soft hyphen serves as an invisible marker that is used to specify a place in text where a hyphenated line break is preferred should one be needed. This avoids forcing a line break in an inconvenient place, should the text be reflowed. It becomes visible only if word wrapping occurs at the end of a line.
^ Wroe, Ann, ed. (2015). The Economist Style Guide (11th ed.). London / New York: Profile Books / PublicAffairs. p. 74. "hyphens: There is no firm rule to help you decide which words are run together, hyphenated or left separate."
^ a b Wroe, Ann, ed. (2015). The Economist Style Guide (11th ed.). London / New York: Profile Books / PublicAffairs. pp. 77–78. "hyphens ... 12. Adverbs: Adverbs do not need to be linked to participles or adjectives by hyphens in simple constructions [examples elided]. But if the adverb is one of two words together being used adjectivally, a hyphen may be needed [examples elided]. The hyphen is especially likely to be needed if the adverb is short and common, such as ill, little, much and well. Less common adverbs, including all those that end -ly, are less likely to need hyphens [example elided]."
^ Gunner, Jennifer (22 February 2010). "When and How To Use a Hyphen ( - )". grammar.yourdictionary.com. Retrieved 15 April 2023. "Many people confuse hyphens and dashes because they look similar in printing."
^ Haralambous, Yannis (2007). "ASCII". Fonts & Encodings. O'Reilly Media. p. 29. ISBN 978-0596102425.
^ "3.1 General scripts" (PDF). Unicode Version 1.0 · Character Blocks. p. 30. "Loose vs. Precise Semantics. Some ASCII characters have multiple uses, either through ambiguity in the original standards or through accumulated reinterpretations of a limited codeset. For example, 27 hex is defined in ANSI X3.4 as apostrophe (closing single quotation mark; acute accent), and 2D hex as hyphen minus."
^ Bringhurst, Robert (2004). The elements of typographic style (third ed.). Hartley & Marks, Publishers. p. 80. ISBN 978-0-88179-206-5. Retrieved 10 November 2020. "In typescript, a double hyphen (--) is often used for a long dash. Double hyphens in a typeset document are a sure sign that the type was set by a typist, not a typographer. A typographer will use an em dash, three-quarter em, or en dash, depending on context or personal style. The em dash is the nineteenth-century standard, still prescribed in many editorial style books, but the em dash is too long for use with the best text faces. Like the oversized space between sentences, it belongs to the padded and corseted aesthetic of Victorian typography."