Version | 4 |
Authors | Ken Whistler, Rick McGowan |
Date | 2024-08-13 |
This Version | https://www.unicode.org/notes/tn33/tn33-4.html |
Previous Version | https://www.unicode.org/notes/tn33/tn33-3.html |
Latest Version | https://www.unicode.org/notes/tn33/ |
This document provides a list of danda characters in the Unicode Standard.
This document is a Unicode Technical Note. Sole responsibility for its contents rests with the author(s). Publication does not imply any endorsement by the Unicode Consortium.
For information on Unicode Technical Notes, including criteria for acceptance, see https://www.unicode.org/notes/.
Dandas are punctuation characters commonly seen in the typographic traditions of writing systems of South and Southeast Asia. While they occur in many scripts, they are primarily found in traditional materials written in scripts historically derived from the Brahmi script.
The typical appearance of a danda is simply a vertical bar. Two vertical bars may also be paired together in a corresponding punctuation mark known as a double danda. Tripled forms may also occur, but are much less common. Although forms based on a simple vertical bar are typical, in some scripts more elaborate forms have developed, and in some cases the danda mark may accrue additional adornments. Tibetan, in which the danda is termed a shad, is one such case.
Dandas generally delimit phrase-, sentence-, or section-level divisions in text. When both a single and a double danda occur, the double danda is used to demarcate larger units of text than the single danda. This usage is roughly comparable to the use of commas and full stops in Western typography, although dandas typically mark larger phrasal units than what might be separated by commas in Western typography. In many traditional materials, dandas and double dandas delimit what might be best termed verses or sections, and do not map easily onto concepts such as "sentence". Usage may also vary by script, by language, and by corpus.
Many South and Southeast Asian scripts in modern usage have adopted Western typographic practice to varying degrees. In such contexts dandas are often supplanted by common-use Western punctuation marks.
Many of the danda characters encoded in the Unicode Standard have the word "DANDA" in their name, but many other punctuation marks that are historically and functionally dandas are encoded with distinct names specific to a particular script. For example, in Tibetan and scripts influenced by Tibetan, these marks have "SHAD" rather than "DANDA" in their names. Also, because danda characters do not all have simple, vertical bar shapes, they are not always easy to find when searching the code charts.
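The naming issue can be seen with a few code points from the list in this note. The following is a minimal sketch, assuming Python's standard `unicodedata` module; the helper names `name_mentions_danda` and `describe` and the selection of sample code points are illustrative and not part of any Unicode data file. The names and General_Category values reported depend on the version of the Unicode Character Database bundled with the Python interpreter in use.

```python
import unicodedata

# A handful of code points taken from the list in this note (illustrative selection).
SAMPLE_CODE_POINTS = [0x0964, 0x0965, 0x0E5A, 0x0F0D, 0x104A, 0x17D4]


def name_mentions_danda(cp: int) -> bool:
    """Return True if the character's Unicode name contains the word DANDA."""
    return "DANDA" in unicodedata.name(chr(cp), "")


def describe(cp: int) -> str:
    """Format one code point as 'U+XXXX  General_Category  NAME'."""
    ch = chr(cp)
    name = unicodedata.name(ch, "<no name in this database>")
    return f"U+{cp:04X}  {unicodedata.category(ch)}  {name}"


if __name__ == "__main__":
    for cp in SAMPLE_CODE_POINTS:
        # Flag the characters that a plain search for "DANDA" would miss.
        marker = "" if name_mentions_danda(cp) else "  (no DANDA in name)"
        print(describe(cp) + marker)
```

A search restricted to names containing "DANDA" would find only the two Devanagari characters in this sample, which is why an explicit list is useful.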
To make it easier to identify danda characters in the Unicode Standard, this Technical Note includes a specific list of known danda characters as of Unicode 16.0. This list may be periodically updated in the future, if further danda characters are added to the Unicode Standard.
The table below is in the usual Unicode Data File format of semicolon-delimited fields optionally followed by "#" and a comment. The table contains a list of characters in the Unicode Standard that are dandas. The first field is a code point or code point range. The second field is the General_Category property value of the character. The third field is a comment giving the name of a single character or, for a code point range, the number of characters in the range in square brackets followed by the names of the first and last characters.
# Dandas
# [Not derivable]
0964..0965    ; Po #   [2] DEVANAGARI DANDA..DEVANAGARI DOUBLE DANDA
0E5A          ; Po #       THAI CHARACTER ANGKHANKHU
0F08          ; Po #       TIBETAN MARK SBRUL SHAD
0F0D..0F12    ; Po #   [6] TIBETAN MARK SHAD..TIBETAN MARK RGYA GRAM SHAD
104A..104B    ; Po #   [2] MYANMAR SIGN LITTLE SECTION..MYANMAR SIGN SECTION
1735..1736    ; Po #   [2] PHILIPPINE SINGLE PUNCTUATION..PHILIPPINE DOUBLE PUNCTUATION
17D4..17D5    ; Po #   [2] KHMER SIGN KHAN..KHMER SIGN BARIYOOSAN
1AA8..1AAB    ; Po #   [4] TAI THAM SIGN KAAN..TAI THAM SIGN SATKAANKUU
1B5E..1B5F    ; Po #   [2] BALINESE CARIK SIKI..BALINESE CARIK PAREREN
1C3B..1C3C    ; Po #   [2] LEPCHA PUNCTUATION TA-ROL..LEPCHA PUNCTUATION NYET THYOOM TA-ROL
1C7E..1C7F    ; Po #   [2] OL CHIKI PUNCTUATION MUCAAD..OL CHIKI PUNCTUATION DOUBLE MUCAAD
A876..A877    ; Po #   [2] PHAGS-PA MARK SHAD..PHAGS-PA MARK DOUBLE SHAD
A8CE..A8CF    ; Po #   [2] SAURASHTRA DANDA..SAURASHTRA DOUBLE DANDA
A92F          ; Po #       KAYAH LI SIGN SHYA
A9C8..A9C9    ; Po #   [2] JAVANESE PADA LINGSA..JAVANESE PADA LUNGSI
AA5D..AA5F    ; Po #   [3] CHAM PUNCTUATION DANDA..CHAM PUNCTUATION TRIPLE DANDA
AAF0          ; Po #       MEETEI MAYEK CHEIKHAN
ABEB          ; Po #       MEETEI MAYEK CHEIKHEI
10A56..10A57  ; Po #   [2] KHAROSHTHI PUNCTUATION DANDA..KHAROSHTHI PUNCTUATION DOUBLE DANDA
11047..11048  ; Po #   [2] BRAHMI DANDA..BRAHMI DOUBLE DANDA
110C0..110C1  ; Po #   [2] KAITHI DANDA..KAITHI DOUBLE DANDA
11141..11142  ; Po #   [2] CHAKMA DANDA..CHAKMA DOUBLE DANDA
11175         ; Po #       MAHAJANI SECTION MARK
111C5..111C6  ; Po #   [2] SHARADA DANDA..SHARADA DOUBLE DANDA
11238..11239  ; Po #   [2] KHOJKI DANDA..KHOJKI DOUBLE DANDA
112A9         ; Po #       MULTANI SECTION MARK
113D4..113D5  ; Po #   [2] TULU-TIGALARI DANDA..TULU-TIGALARI DOUBLE DANDA
1144B..1144C  ; Po #   [2] NEWA DANDA..NEWA DOUBLE DANDA
115C2..115C3  ; Po #   [2] SIDDHAM DANDA..SIDDHAM DOUBLE DANDA
11641..11642  ; Po #   [2] MODI DANDA..MODI DOUBLE DANDA
1173C..1173D  ; Po #   [2] AHOM SIGN SMALL SECTION..AHOM SIGN SECTION
11944         ; Po #       DIVES AKURU DOUBLE DANDA
11A42..11A43  ; Po #   [2] ZANABAZAR SQUARE MARK SHAD..ZANABAZAR SQUARE MARK DOUBLE SHAD
11A9B..11A9C  ; Po #   [2] SOYOMBO MARK SHAD..SOYOMBO MARK DOUBLE SHAD
11C41..11C42  ; Po #   [2] BHAIKSUKI DANDA..BHAIKSUKI DOUBLE DANDA
11F43..11F46  ; Po #   [4] KAWI DANDA..KAWI PUNCTUATION ALTERNATE SECTION MARKER
16A6E..16A6F  ; Po #   [2] MRO DANDA..MRO DOUBLE DANDA
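The field layout described above can be read mechanically. The following is a minimal sketch of a parser for data in this format, not an official Unicode tool: the names `Entry`, `LINE_RE`, `parse_entries`, and `danda_code_points` are hypothetical, and `SAMPLE` is a three-line excerpt of the table rather than the full list. It assumes one entry per line, with lines beginning with "#" treated as comments.

```python
import re
from typing import Iterator, NamedTuple, Set


class Entry(NamedTuple):
    first: int      # first code point of the range
    last: int       # last code point (equal to first for a single code point)
    category: str   # General_Category value from the second field, e.g. "Po"
    comment: str    # text after "#", without the "#" itself


# One data line: code point or range, ";", property value, optional "# comment".
LINE_RE = re.compile(
    r"^\s*([0-9A-Fa-f]{4,6})(?:\.\.([0-9A-Fa-f]{4,6}))?\s*;\s*(\w+)\s*(?:#\s*(.*))?$"
)

# Excerpt of the table in this note, used only to exercise the parser.
SAMPLE = """\
# Dandas
0964..0965    ; Po #   [2] DEVANAGARI DANDA..DEVANAGARI DOUBLE DANDA
0E5A          ; Po #       THAI CHARACTER ANGKHANKHU
AA5D..AA5F    ; Po #   [3] CHAM PUNCTUATION DANDA..CHAM PUNCTUATION TRIPLE DANDA
"""


def parse_entries(text: str) -> Iterator[Entry]:
    """Yield one Entry per data line, skipping blank and comment-only lines."""
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        m = LINE_RE.match(line)
        if m is None:
            raise ValueError(f"unrecognized line: {raw!r}")
        first = int(m.group(1), 16)
        last = int(m.group(2), 16) if m.group(2) else first
        yield Entry(first, last, m.group(3), (m.group(4) or "").strip())


def danda_code_points(text: str) -> Set[int]:
    """Expand every range in the data into a flat set of code points."""
    result: Set[int] = set()
    for entry in parse_entries(text):
        result.update(range(entry.first, entry.last + 1))
    return result


if __name__ == "__main__":
    dandas = danda_code_points(SAMPLE)
    text = "\u0915\u0964 \u0916\u0965"  # Devanagari letters followed by danda marks
    print([f"U+{ord(ch):04X}" for ch in text if ord(ch) in dandas])
```

Expanding ranges into a set keeps membership tests simple. Because the list is short, the memory cost is negligible, and the bracketed count in each range comment can be checked against `last - first + 1` as a sanity test.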
[Glossary] | Unicode Glossary https://www.unicode.org/glossary/ For explanations of terminology used in this and other documents. |
[UCD] | Unicode Character Database https://www.unicode.org/ucd/ For detailed documentation about the Unicode Character Database, see Unicode Standard Annex #44: Unicode Character Database https://www.unicode.org/reports/tr44/ |
[Unicode] | The Unicode Standard For the latest version, see: https://www.unicode.org/versions/latest/ |
The following summarizes modifications from the previous version of this document.
4
3
2
1
© 2010–2024 Ken Whistler, Rick McGowan. This publication is protected by copyright, and permission must be obtained from the author and Unicode, Inc. prior to any reproduction, modification, or other use not permitted by the Terms of Use.
Use of this publication is governed by the Unicode Terms of Use. The authors, contributors, and publishers have taken care in the preparation of this publication, but make no express or implied representation or warranty of any kind and assume no responsibility or liability for errors or omissions or for consequential or incidental damages that may arise therefrom. This publication is provided “AS-IS” without charge as a convenience to users.
Unicode and the Unicode Logo are registered trademarks of Unicode, Inc., in the United States and other countries.