| Copyright | (c) The University of Glasgow 2001 |
|---|---|
| License | BSD-style (see the file libraries/base/LICENSE) |
| Maintainer | libraries@haskell.org |
| Stability | stable |
| Portability | portable |
| Safe Haskell | Trustworthy |
| Language | Haskell2010 |
Data.Char
Description
The Char type and associated operations.
The character type Char is an enumeration whose values represent Unicode (or equivalently ISO/IEC 10646) code points (i.e. characters, see http://www.unicode.org/ for details). This set extends the ISO 8859-1 (Latin-1) character set (the first 256 characters), which is itself an extension of the ASCII character set (the first 128 characters). A character literal in Haskell has type Char.
To convert a Char to or from the corresponding Int value defined by Unicode, use fromEnum and toEnum from the Enum class respectively (or equivalently ord and chr).
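The conversions can be tried out directly. The following is a minimal sketch rather than part of the module: `shiftLower` is a hypothetical helper introduced here only to show `ord` and `chr` working together.

```haskell
import Data.Char (chr, ord)

-- Hypothetical helper: shift a lower-case ASCII letter by n positions,
-- wrapping around after 'z'. Any other character is returned unchanged.
shiftLower :: Int -> Char -> Char
shiftLower n c
  | c >= 'a' && c <= 'z' = chr (ord 'a' + (ord c - ord 'a' + n) `mod` 26)
  | otherwise            = c

main :: IO ()
main = do
  print (ord 'A')                          -- Char to code point
  print (chr 955)                          -- code point to Char
  print (fromEnum 'A' == ord 'A')          -- fromEnum agrees with ord
  print ((toEnum 955 :: Char) == chr 955)  -- toEnum agrees with chr
  putStrLn (map (shiftLower 3) "xyz, abc")
```

Note that `chr` fails on an Int outside the valid code point range, so untrusted input should be range-checked first.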
Instances
| Instance | Defined in | Notes |
|---|---|---|
| Bounded Char | | Since: 2.1 |
| Enum Char | GHC.Enum | Since: 2.1 |
| Eq Char | | |
| Data Char | Data.Data | Since: 4.0.0.0 |
| Ord Char | | |
| Read Char | | Since: 2.1 |
| Show Char | | Since: 2.1 |
| Ix Char | | Since: 2.1 |
| Storable Char | | Since: 2.1 |
| IsChar Char | | Since: 2.1 |
| PrintfArg Char | Text.Printf | Since: 2.1 |
| Generic1 (URec Char :: k -> Type) | | |
| Functor (URec Char :: Type -> Type) | | Since: 4.9.0.0 |
| Foldable (URec Char :: Type -> Type) | Data.Foldable | Since: 4.9.0.0 |
| Traversable (URec Char :: Type -> Type) | Data.Traversable | Since: 4.9.0.0 |
| Eq (URec Char p) | | Since: 4.9.0.0 |
| Ord (URec Char p) | GHC.Generics | Since: 4.9.0.0 |
| Show (URec Char p) | | Since: 4.9.0.0 |
| Generic (URec Char p) | | |
| data URec Char (p :: k) | | Used for marking occurrences of Char#. Since: 4.9.0.0 |
| type Rep1 (URec Char :: k -> Type) | GHC.Generics | Since: 4.9.0.0 |
| type Rep (URec Char p) | GHC.Generics | Since: 4.9.0.0 |

Methods of the Data Char instance:
gfoldl :: (forall d b. Data d => c (d -> b) -> d -> c b) -> (forall g. g -> c g) -> Char -> c Char
gunfold :: (forall b r. Data b => c (b -> r) -> c r) -> (forall r. r -> c r) -> Constr -> c Char
toConstr :: Char -> Constr
dataTypeOf :: Char -> DataType
dataCast1 :: Typeable t => (forall d. Data d => c (t d)) -> Maybe (c Char)
dataCast2 :: Typeable t => (forall d e. (Data d, Data e) => c (t d e)) -> Maybe (c Char)
gmapT :: (forall b. Data b => b -> b) -> Char -> Char
gmapQl :: (r -> r' -> r) -> r -> (forall d. Data d => d -> r') -> Char -> r
gmapQr :: (r' -> r -> r) -> r -> (forall d. Data d => d -> r') -> Char -> r
gmapQ :: (forall d. Data d => d -> u) -> Char -> [u]
gmapQi :: Int -> (forall d. Data d => d -> u) -> Char -> u
gmapM :: Monad m => (forall d. Data d => d -> m d) -> Char -> m Char
gmapMp :: MonadPlus m => (forall d. Data d => d -> m d) -> Char -> m Char
gmapMo :: MonadPlus m => (forall d. Data d => d -> m d) -> Char -> m Char

Methods of the Foldable (URec Char) instance:
fold :: Monoid m => URec Char m -> m
foldMap :: Monoid m => (a -> m) -> URec Char a -> m
foldr :: (a -> b -> b) -> b -> URec Char a -> b
foldr' :: (a -> b -> b) -> b -> URec Char a -> b
foldl :: (b -> a -> b) -> b -> URec Char a -> b
foldl' :: (b -> a -> b) -> b -> URec Char a -> b
foldr1 :: (a -> a -> a) -> URec Char a -> a
foldl1 :: (a -> a -> a) -> URec Char a -> a
toList :: URec Char a -> [a]
null :: URec Char a -> Bool
length :: URec Char a -> Int
elem :: Eq a => a -> URec Char a -> Bool
maximum :: Ord a => URec Char a -> a
minimum :: Ord a => URec Char a -> a
Unicode characters are divided into letters, numbers, marks, punctuation, symbols, separators (including spaces) and others (including control characters).
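As an illustration of this division, the sketch below maps a character to one of the seven major classes using the classification predicates exported by this module. The `MajorClass` type and the `majorClass` function are hypothetical names made up for this example; they are not part of Data.Char.

```haskell
import Data.Char
  (isLetter, isMark, isNumber, isPunctuation, isSymbol, isSeparator)

-- Hypothetical type naming the seven major classes described above.
data MajorClass
  = Letter | Mark | Number | Punctuation | Symbol | Separator | Other
  deriving (Eq, Show)

-- Classify a character by trying each predicate in turn.
-- Anything not matched (control characters, for instance) is 'Other'.
majorClass :: Char -> MajorClass
majorClass c
  | isLetter c      = Letter
  | isMark c        = Mark
  | isNumber c      = Number
  | isPunctuation c = Punctuation
  | isSymbol c      = Symbol
  | isSeparator c   = Separator
  | otherwise       = Other

main :: IO ()
main = mapM_ (print . majorClass) "a7 ,+\n"
```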
isControl :: Char -> Bool
Selects control characters, which are the non-printing characters of the Latin-1 subset of Unicode.
isSpace :: Char -> Bool
Returns True for any Unicode space character, and the control characters \t, \n, \r, \f, \v.
isUpper :: Char -> Bool
Selects upper-case or title-case alphabetic Unicode characters (letters). Title case is used by a small number of letter ligatures like the single-character form of Lj.
isAlpha :: Char -> Bool
Selects alphabetic Unicode characters (lower-case, upper-case and title-case letters, plus letters of caseless scripts and modifier letters). This function is equivalent to isLetter.
isAlphaNum :: Char -> Bool
Selects alphabetic or numeric Unicode characters.
Note that numeric digits outside the ASCII range, as well as numeric characters which aren't digits, are selected by this function but not by isDigit. Such characters may be part of identifiers but are not used by the printer and reader to represent numbers.
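The difference is easiest to see on a non-ASCII decimal digit. The sketch below uses U+0663 (ARABIC-INDIC DIGIT THREE) as the sample character; `asciiDigits` is a hypothetical helper, not part of this module.

```haskell
import Data.Char (isAlphaNum, isDigit, isNumber)

-- Hypothetical helper: keep only the ASCII digits of a string,
-- for example before handing it to a numeric parser.
asciiDigits :: String -> String
asciiDigits = filter isDigit

main :: IO ()
main = do
  let arabicThree = '\x0663'  -- a decimal digit outside the ASCII range
  print (isDigit '3', isAlphaNum '3')
  print (isDigit arabicThree, isNumber arabicThree, isAlphaNum arabicThree)
  putStrLn (asciiDigits ['1', arabicThree, '2'])
```

Filtering with isDigit rather than isAlphaNum or isNumber is the safer choice when the result must be readable as a number.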
isPrint :: Char -> Bool
Selects printable Unicode characters (letters, numbers, marks, punctuation, symbols and spaces).
isOctDigit :: Char -> Bool
Selects ASCII octal digits, i.e. '0'..'7'.
isHexDigit :: Char -> Bool
Selects ASCII hexadecimal digits, i.e. '0'..'9', 'a'..'f', 'A'..'F'.
isLetter :: Char -> Bool
Selects alphabetic Unicode characters (lower-case, upper-case and title-case letters, plus letters of caseless scripts and modifier letters). This function is equivalent to isAlpha.
This function returns True if its argument has one of the following GeneralCategorys, or False otherwise:
- UppercaseLetter
- LowercaseLetter
- TitlecaseLetter
- ModifierLetter
- OtherLetter

These classes are defined in the Unicode Character Database, part of the Unicode standard. The same document defines what is and is not a "Letter".
Basic usage:
>>> isLetter 'a'
True
>>> isLetter 'A'
True
>>> isLetter 'λ'
True
>>> isLetter '0'
False
>>> isLetter '%'
False
>>> isLetter '♥'
False
>>> isLetter '\31'
False
Ensure that isLetter and isAlpha are equivalent.
>>> let chars = [(chr 0)..]
>>> let letters = map isLetter chars
>>> let alphas = map isAlpha chars
>>> letters == alphas
True
isMark :: Char -> Bool
Selects Unicode mark characters, for example accents and the like, which combine with preceding characters.
This function returns True if its argument has one of the following GeneralCategorys, or False otherwise:
- NonSpacingMark
- SpacingCombiningMark
- EnclosingMark

These classes are defined in the Unicode Character Database, part of the Unicode standard. The same document defines what is and is not a "Mark".
Basic usage:
>>> isMark 'a'
False
>>> isMark '0'
False
Combining marks such as accent characters usually need to follow another character before they become printable:
>>> map isMark "ò"
[False,True]
Puns are not necessarily supported:
>>> isMark '✓'
False
isNumber :: Char -> Bool
Selects Unicode numeric characters, including digits from various scripts, Roman numerals, et cetera.
This function returns True if its argument has one of the following GeneralCategorys, or False otherwise:
- DecimalNumber
- LetterNumber
- OtherNumber

These classes are defined in the Unicode Character Database, part of the Unicode standard. The same document defines what is and is not a "Number".
Basic usage:
>>> isNumber 'a'
False
>>> isNumber '%'
False
>>> isNumber '3'
True
ASCII '0' through '9' are all numbers:
>>> and $ map isNumber ['0'..'9']
True
Unicode Roman numerals are "numbers" as well:
>>> isNumber 'Ⅸ'
True
isPunctuation :: Char -> Bool
Selects Unicode punctuation characters, including various kinds of connectors, brackets and quotes.
This function returns True if its argument has one of the following GeneralCategorys, or False otherwise:
- ConnectorPunctuation
- DashPunctuation
- OpenPunctuation
- ClosePunctuation
- InitialQuote
- FinalQuote
- OtherPunctuation

These classes are defined in the Unicode Character Database, part of the Unicode standard. The same document defines what is and is not a "Punctuation".
Basic usage:
>>> isPunctuation 'a'
False
>>> isPunctuation '7'
False
>>> isPunctuation '♥'
False
>>> isPunctuation '"'
True
>>> isPunctuation '?'
True
>>> isPunctuation '—'
True
isSymbol :: Char -> Bool
Selects Unicode symbol characters, including mathematical and currency symbols.
This function returns True if its argument has one of the following GeneralCategorys, or False otherwise:
- MathSymbol
- CurrencySymbol
- ModifierSymbol
- OtherSymbol

These classes are defined in the Unicode Character Database, part of the Unicode standard. The same document defines what is and is not a "Symbol".
Basic usage:
>>> isSymbol 'a'
False
>>> isSymbol '6'
False
>>> isSymbol '='
True
The definition of "math symbol" may be a little counter-intuitive depending on one's background:
>>> isSymbol '+'
True
>>> isSymbol '-'
False
isSeparator :: Char -> Bool
Selects Unicode space and separator characters.
This function returns True if its argument has one of the following GeneralCategorys, or False otherwise:
- Space
- LineSeparator
- ParagraphSeparator

These classes are defined in the Unicode Character Database, part of the Unicode standard. The same document defines what is and is not a "Separator".
Basic usage:
>>> isSeparator 'a'
False
>>> isSeparator '6'
False
>>> isSeparator ' '
True
Warning: newlines and tab characters are not considered separators.
>>> isSeparator '\n'
False
>>> isSeparator '\t'
False
But some more exotic characters are (like HTML's &nbsp;):
>>> isSeparator '\160'
True
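Because isSeparator and isSpace disagree on tabs and newlines, it can help to compare them side by side. The sketch below does that for four sample characters; `isBlank` is a hypothetical helper that accepts a character if either predicate does.

```haskell
import Data.Char (isSeparator, isSpace)

-- Hypothetical helper: treat a character as blank if either predicate accepts it.
isBlank :: Char -> Bool
isBlank c = isSpace c || isSeparator c

-- Print both predicates for one character.
report :: Char -> IO ()
report c =
  putStrLn (show c ++ ": isSpace=" ++ show (isSpace c)
                   ++ ", isSeparator=" ++ show (isSeparator c))

main :: IO ()
main = mapM_ report " \t\n\160"
```

For splitting text on whitespace, isSpace is usually the predicate to reach for, since it also covers the control characters listed in its description.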
isAscii :: Char -> Bool
Selects the first 128 characters of the Unicode character set, corresponding to the ASCII character set.
isLatin1 :: Char -> Bool
Selects the first 256 characters of the Unicode character set, corresponding to the ISO 8859-1 (Latin-1) character set.
data GeneralCategory
Unicode General Categories (column 2 of the UnicodeData table) in the order they are listed in the Unicode standard (the Unicode Character Database, in particular).
Basic usage:
>>> :t OtherLetter
OtherLetter :: GeneralCategory
Eq instance:
>>> UppercaseLetter == UppercaseLetter
True
>>> UppercaseLetter == LowercaseLetter
False
Ord instance:
>>> NonSpacingMark <= MathSymbol
True
Enum instance:
>>> enumFromTo ModifierLetter SpacingCombiningMark
[ModifierLetter,OtherLetter,NonSpacingMark,SpacingCombiningMark]
Read instance:
>>> read "DashPunctuation" :: GeneralCategory
DashPunctuation
>>> read "17" :: GeneralCategory
*** Exception: Prelude.read: no parse
Show instance:
>>> show EnclosingMark
"EnclosingMark"
Bounded instance:
>>> minBound :: GeneralCategory
UppercaseLetter
>>> maxBound :: GeneralCategory
NotAssigned
Ix instance:
>>> import Data.Ix ( index )
>>> index (OtherLetter,Control) FinalQuote
12
>>> index (OtherLetter,Control) Format
*** Exception: Error in array index
Constructors
| Constructor | Unicode category |
|---|---|
| UppercaseLetter | Lu: Letter, Uppercase |
| LowercaseLetter | Ll: Letter, Lowercase |
| TitlecaseLetter | Lt: Letter, Titlecase |
| ModifierLetter | Lm: Letter, Modifier |
| OtherLetter | Lo: Letter, Other |
| NonSpacingMark | Mn: Mark, Non-Spacing |
| SpacingCombiningMark | Mc: Mark, Spacing Combining |
| EnclosingMark | Me: Mark, Enclosing |
| DecimalNumber | Nd: Number, Decimal |
| LetterNumber | Nl: Number, Letter |
| OtherNumber | No: Number, Other |
| ConnectorPunctuation | Pc: Punctuation, Connector |
| DashPunctuation | Pd: Punctuation, Dash |
| OpenPunctuation | Ps: Punctuation, Open |
| ClosePunctuation | Pe: Punctuation, Close |
| InitialQuote | Pi: Punctuation, Initial quote |
| FinalQuote | Pf: Punctuation, Final quote |
| OtherPunctuation | Po: Punctuation, Other |
| MathSymbol | Sm: Symbol, Math |
| CurrencySymbol | Sc: Symbol, Currency |
| ModifierSymbol | Sk: Symbol, Modifier |
| OtherSymbol | So: Symbol, Other |
| Space | Zs: Separator, Space |
| LineSeparator | Zl: Separator, Line |
| ParagraphSeparator | Zp: Separator, Paragraph |
| Control | Cc: Other, Control |
| Format | Cf: Other, Format |
| Surrogate | Cs: Other, Surrogate |
| PrivateUse | Co: Other, Private Use |
| NotAssigned | Cn: Other, Not Assigned |
generalCategory :: Char -> GeneralCategory
The Unicode general category of the character. This relies on the Enum instance of GeneralCategory, which must remain in the same order as the categories are presented in the Unicode standard.
Basic usage:
>>> generalCategory 'a'
LowercaseLetter
>>> generalCategory 'A'
UppercaseLetter
>>> generalCategory '0'
DecimalNumber
>>> generalCategory '%'
OtherPunctuation
>>> generalCategory '♥'
OtherSymbol
>>> generalCategory '\31'
Control
>>> generalCategory ' '
Space
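Since GeneralCategory has an Enum instance in the standard's order, a whole major class can be written as a constructor range. The sketch below rebuilds a punctuation test that way and compares it with isPunctuation over a sample of code points. `isPunctuation'` is a hypothetical name for this illustration, not how the library necessarily implements it.

```haskell
import Data.Char (GeneralCategory (..), generalCategory, isPunctuation)

-- Hypothetical re-implementation for illustration: a character is
-- punctuation if its category lies in the Pc..Po constructor range.
isPunctuation' :: Char -> Bool
isPunctuation' c =
  generalCategory c `elem` [ConnectorPunctuation .. OtherPunctuation]

main :: IO ()
main = do
  print (map isPunctuation' "a?-(")
  -- Compare against the library predicate on a sample of code points.
  print (all (\c -> isPunctuation' c == isPunctuation c) ['\0' .. '\x2FFF'])
```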
toUpper :: Char -> Char
Convert a letter to the corresponding upper-case letter, if any. Any other character is returned unchanged.
toLower :: Char -> Char
Convert a letter to the corresponding lower-case letter, if any. Any other character is returned unchanged.
toTitle :: Char -> Char
Convert a letter to the corresponding title-case or upper-case letter, if any. (Title case differs from upper case only for a small number of ligature letters.) Any other character is returned unchanged.
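A short sketch of the three conversions follows. `capitalize` is a hypothetical helper, and U+01C6 (the single-character form of "dž") is used as a sample ligature whose title-case and upper-case forms differ.

```haskell
import Data.Char (toLower, toTitle, toUpper)

-- Hypothetical helper: title-case the first character of a word
-- and lower-case the rest.
capitalize :: String -> String
capitalize []       = []
capitalize (c : cs) = toTitle c : map toLower cs

main :: IO ()
main = do
  putStrLn (capitalize "hASKELL")
  -- For most letters toTitle and toUpper agree.
  print (toTitle 'a' == toUpper 'a')
  -- For the ligature U+01C6 they differ: U+01C4 is the upper-case
  -- form and U+01C5 the title-case form.
  print (toUpper '\x01C6' == '\x01C4', toTitle '\x01C6' == '\x01C5')
  -- Non-letters pass through unchanged.
  print (map toUpper "a1-b")
```

These functions work one character at a time, so case mappings that change the length of a string are outside their scope.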
digitToInt :: Char -> Int
Convert a single digit Char to the corresponding Int. This function fails unless its argument satisfies isHexDigit, but recognises both upper- and lower-case hexadecimal digits (that is, '0'..'9', 'a'..'f', 'A'..'F').
Characters '0' through '9' are converted properly to 0..9:
>>> map digitToInt ['0'..'9']
[0,1,2,3,4,5,6,7,8,9]
Both upper- and lower-case 'A' through 'F' are converted as well, to 10..15.
>>> map digitToInt ['a'..'f']
[10,11,12,13,14,15]
>>> map digitToInt ['A'..'F']
[10,11,12,13,14,15]
Anything else throws an exception:
>>> digitToInt 'G'
*** Exception: Char.digitToInt: not a digit 'G'
>>> digitToInt '♥'
*** Exception: Char.digitToInt: not a digit '\9829'
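Because digitToInt throws on invalid input, it is usually guarded with isHexDigit. The sketch below shows one way to do that; `parseHex` is a hypothetical helper that returns Nothing instead of raising an exception, and it makes no attempt to detect Int overflow on very long inputs.

```haskell
import Data.Char (digitToInt, isHexDigit)

-- Hypothetical helper: parse a non-empty string of hexadecimal digits.
-- Every character is checked with isHexDigit before digitToInt is applied,
-- so the exception described above cannot be triggered.
parseHex :: String -> Maybe Int
parseHex "" = Nothing
parseHex s
  | all isHexDigit s = Just (foldl (\acc d -> acc * 16 + digitToInt d) 0 s)
  | otherwise        = Nothing

main :: IO ()
main = mapM_ (print . parseHex) ["ff", "1A", "xyz", ""]
```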
intToDigit :: Int -> Char
showLitChar :: Char -> ShowS
Convert a character to a string using only printable characters, using Haskell source-language escape conventions. For example:
showLitChar '\n' s = "\\n" ++ s
lexLitChar :: ReadS String
Read a string representation of a character, using Haskell source-language escape conventions. For example:
lexLitChar "\\nHello" = [("\\n", "Hello")]
readLitChar :: ReadS Char
Read a string representation of a character, using Haskell source-language escape conventions, and convert it to the character that it encodes. For example:
readLitChar "\\nHello" = [('\n', "Hello")]
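The escape functions compose naturally with ordinary list functions. In the sketch below, `escape` and `readOneChar` are hypothetical helpers: the first folds showLitChar over a string, the second wraps readLitChar so that a failed parse becomes Nothing.

```haskell
import Data.Char (readLitChar, showLitChar)

-- Hypothetical helper: escape every character of a string using
-- Haskell source-language conventions.
escape :: String -> String
escape s = foldr showLitChar "" s

-- Hypothetical helper: decode one (possibly escaped) character from the
-- front of a string, returning it together with the remaining input.
readOneChar :: String -> Maybe (Char, String)
readOneChar s = case readLitChar s of
  [(c, rest)] -> Just (c, rest)
  _           -> Nothing

main :: IO ()
main = do
  putStrLn (escape "tab\there")
  print (readOneChar "\\nHello")
  print (readOneChar "")
```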