
In computing and typesetting, a soft hyphen (U+00AD SOFT HYPHEN, abbreviated SHY), syllable hyphen, or discretionary hyphen is a code point reserved in some coded character sets for the purpose of breaking words across lines: it is rendered as a visible hyphen if it falls at the end of a line but remains invisible within the line.
Two alternative ways of using the soft hyphen character for this purpose have emerged, depending on whether the encoded text will be broken into lines by its recipient, or has already been preformatted by its originator.[1][2][3]
The use of SHY characters in text that will be broken into lines by the recipient is the application context considered by the post-1999 HTML and Unicode specifications, as well as some word-processing file formats. In this context, the soft hyphen may also be called a discretionary hyphen or optional hyphen. It serves as an invisible marker used to specify a place in text where a hyphenated break is allowed without forcing a line break in an inconvenient place if the text is re-flowed. It becomes visible only after word wrapping at the end of a line.[4] The soft hyphen's Unicode semantics and HTML implementation are in many ways similar to Unicode's zero-width space, with the exception that the soft hyphen will preserve the kerning of the characters on either side when not visible. The zero-width space, on the other hand, will not, as it is considered a visible character even if not rendered, thus having its own kerning metrics.
To show the effect of a soft hyphen in HTML, the words of the following text (from the poem Spring and Fall by Gerard Manley Hopkins) have been separated with soft hyphens:
MargaretAreYouGrievingOverGoldengroveUnleavingLeavesLikeTheThingsOfManYouWithYourFreshThoughtsCareForCanYouAhAsTheHeartGrowsOlderItWillComeToSuchSightsColderByAndByNorSpareASighThoughWorldsOfWanwoodLeafmealLieAndYetYouWillWeepAndKnowWhyNowNoMatterChildTheNameSorrowsSpringsAreTheSameNorMouthHadNoNorMindExpressedWhatHeartHeardOfGhostGuessedItIsTheBlightManWasBornForItIsMargaretYouMournFor
In web browsers that support soft hyphens, resizing the window will re-break the above text only at word boundaries and insert a hyphen at the end of each broken line.
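The recipient-side behaviour described above can be approximated in a few lines of Python. This is a minimal sketch, not how any browser actually lays out text: the function name `break_at_soft_hyphens` and the fixed `width` in character cells are assumptions made for illustration, and the input is taken to be a single run of text whose only break opportunities are soft hyphens.

```python
SHY = "\u00ad"  # U+00AD SOFT HYPHEN


def break_at_soft_hyphens(text: str, width: int) -> list[str]:
    """Greedily break `text` at soft hyphens (hypothetical helper).

    A soft hyphen that ends a line is shown as a visible "-";
    soft hyphens inside a line are dropped, i.e. stay invisible.
    """
    lines = []
    current = ""
    for fragment in text.split(SHY):
        # Reserve one column for the hyphen that would become visible
        # if the line had to be broken right after this fragment.
        if current and len(current) + len(fragment) + 1 > width:
            lines.append(current + "-")
            current = fragment
        else:
            current += fragment
    if current:
        lines.append(current)
    return lines


words = ["Margaret", "Are", "You", "Grieving", "Over", "Goldengrove", "Unleaving"]
run = SHY.join(words)

for line in break_at_soft_hyphens(run, 20):
    print(line)
```

With a width of 20 the run is split only between words, and each broken line ends in a visible hyphen; with a width large enough for the whole run, no hyphen appears at all. A fragment longer than `width` is simply allowed to overflow in this sketch.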
The SHY character is also used in text where paragraphs have already been broken into lines, such as certain plain text files, text sent to VT100-style terminal emulators or printers, or pages represented in page description languages. This is the application context originally considered by the EBCDIC and ISO 8859-1 standards and implemented in many VT100 terminal emulators.[1][2]
Here, SHY is a visible hyphen that is usually visually indistinguishable from a regular hyphen, but has been inserted solely for the purpose of line breaking. The purpose of the soft hyphen here is to distinguish it from any regular hyphen that might have been part of the original spelling of the word. This distinction helps re-use of already formatted text, when line breaks and soft hyphens inserted during word wrapping have to be removed to convert the text back into its unformatted form. For example, the copy or paste function of a terminal emulator can offer to replace line breaks with a space character, and remove any soft hyphens including any immediately following whitespace characters.
An example application that outputs soft hyphens for this reason is the groff text formatter as used on many Unix/Linux systems to display man pages.
Soft hyphen (SHY) characters in coded character sets, roughly in chronological order:
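The unformatting step described above (dropping soft hyphens together with any whitespace that follows them, then turning the remaining line breaks into spaces) can be sketched in Python as follows. The function name `unwrap` is an assumption made for illustration and does not correspond to any particular terminal emulator's implementation.

```python
import re

SHY = "\u00ad"  # U+00AD SOFT HYPHEN


def unwrap(text: str) -> str:
    """Convert preformatted text back to unformatted form (hypothetical helper)."""
    # Remove each soft hyphen and any whitespace immediately after it,
    # which includes the line break it was inserted for.
    joined = re.sub(SHY + r"\s*", "", text)
    # Replace every remaining line break with a single space.
    return re.sub(r"\r?\n", " ", joined)


formatted = "line break\u00ad\ning keeps the well-known\nhyphen"
print(unwrap(formatted))
```

Because the regular hyphen in "well-known" is a different character from the soft hyphen, it survives the conversion, which is exactly the distinction the preformatted use of SHY is meant to preserve.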
In HTML, the named character entity &shy; stands for the ISO 8859-1 soft hyphen.
Other commands for marking hyphenation opportunities in text formatting languages (similar to the HTML 4 and Unicode 4.0 interpretation of SHY):
Soft hyphens, like other invisible characters, have been used to obscure malicious domains or URLs in e-mail spam.[10][11]
They are also used in e-mails to try to defeat spam-prevention systems. For example, the phrase "I need your assistance discreetly" can be sent with a soft hyphen inside the word "assistance", so that a mail system searching the e-mail body for the literal phrase may fail to detect it.[citation needed]
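One way a filter might counter this kind of obfuscation is to strip invisible format characters before matching. The sketch below is an assumption about how such a normalisation step could look, not a description of any real mail system; the helper name `strip_format_characters` is hypothetical. It relies on the fact that U+00AD belongs to the Unicode general category Cf (format), as reported by Python's standard `unicodedata` module.

```python
import unicodedata


def strip_format_characters(text: str) -> str:
    """Drop all characters of Unicode category Cf (hypothetical helper).

    Intended only for building a matching key: Cf also covers characters
    such as zero-width joiners and bidirectional marks that are legitimate
    in some scripts, so the result should not replace the displayed text.
    """
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cf")


body = "I need your assis\u00adtance discreetly"
phrase = "I need your assistance discreetly"

print(phrase in body)                           # naive match on the raw body
print(phrase in strip_format_characters(body))  # match after normalisation
```

The naive substring test fails on the raw body because of the embedded soft hyphen, while the same test succeeds on the normalised copy.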