Escape sequences are used to represent certain special characters within string literals and character literals.
The following escape sequences are available:
Escape sequence | Description | Representation |
---|---|---|
Simple escape sequences | ||
\' | single quote | byte 0x27 in ASCII encoding |
\" | double quote | byte 0x22 in ASCII encoding |
\? | question mark | byte 0x3f in ASCII encoding |
\\ | backslash | byte 0x5c in ASCII encoding |
\a | audible bell | byte 0x07 in ASCII encoding |
\b | backspace | byte 0x08 in ASCII encoding |
\f | form feed - new page | byte 0x0c in ASCII encoding |
\n | line feed - new line | byte 0x0a in ASCII encoding |
\r | carriage return | byte 0x0d in ASCII encoding |
\t | horizontal tab | byte 0x09 in ASCII encoding |
\v | vertical tab | byte 0x0b in ASCII encoding |
Numeric escape sequences | ||
\nnn | arbitrary octal value | code unit nnn (1 to 3 octal digits) |
\o{n...} (since C++23) | arbitrary octal value | code unit n... (arbitrary number of octal digits) |
\xn... | arbitrary hexadecimal value | code unit n... (arbitrary number of hexadecimal digits) |
\x{n...} (since C++23) | arbitrary hexadecimal value | code unit n... (arbitrary number of hexadecimal digits) |
Conditional escape sequences[1] | ||
\c | Implementation-defined | Implementation-defined |
Universal character names | ||
\unnnn | arbitrary Unicode value; may result in several code units | code point U+nnnn (4 hexadecimal digits) |
\u{n...} (since C++23) | arbitrary Unicode value; may result in several code units | code point U+n... (arbitrary number of hexadecimal digits) |
\Unnnnnnnn | arbitrary Unicode value; may result in several code units | code point U+nnnnnnnn (8 hexadecimal digits) |
\N{NAME} (since C++23) | arbitrary Unicode character | character named by NAME (see below) |
[1] In each conditional escape sequence, c is a member of the basic source character set (until C++23) or the basic character set (since C++23) that is not the character following the \ in any other escape sequence.
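The byte values in the table can be checked at compile time. The following is a minimal sketch that assumes an ASCII-compatible execution character set, which the language does not guarantee; `escape_value` is a hypothetical helper introduced only for this illustration.

```cpp
#include <cassert>

// Assumption: the execution character set is ASCII-compatible.
// On such a platform each simple escape sequence maps to the byte
// listed in the table above.
static_assert('\'' == 0x27, "single quote");
static_assert('\"' == 0x22, "double quote");
static_assert('\?' == 0x3f, "question mark");
static_assert('\?' == '?',  "the escaped and plain question mark are the same character");
static_assert('\\' == 0x5c, "backslash");
static_assert('\a' == 0x07, "audible bell");
static_assert('\b' == 0x08, "backspace");
static_assert('\f' == 0x0c, "form feed");
static_assert('\n' == 0x0a, "line feed");
static_assert('\r' == 0x0d, "carriage return");
static_assert('\t' == 0x09, "horizontal tab");
static_assert('\v' == 0x0b, "vertical tab");

// Hypothetical helper: returns the numeric value of a character
// as an unsigned byte, regardless of whether char is signed.
constexpr int escape_value(char c)
{
    return static_cast<unsigned char>(c);
}
```

If the translation unit compiles, every assertion holds for the implementation in use; on a non-ASCII platform some of them may legitimately fail.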
If a universal character name corresponds to a code point that is not 0x24 ( | (until C++11) |
If a universal character name corresponding to a code point of a member of the basic source character set or a control character appears outside a character or string literal, the program is ill-formed. If a universal character name corresponds to a surrogate code point (the range 0xD800-0xDFFF, inclusive), the program is ill-formed. If a universal character name used in a UTF-16/32 string literal does not correspond to a code point in ISO/IEC 10646 (the range 0x0-0x10FFFF, inclusive), the program is ill-formed. | (since C++11) (until C++20) |
If a universal character name corresponding to a code point of a member of the basic source character set or a control character appears outside a character or string literal, the program is ill-formed. If a universal character name does not correspond to a code point in ISO/IEC 10646 (the range 0x0-0x10FFFF, inclusive) or corresponds to a surrogate code point (the range 0xD800-0xDFFF, inclusive), the program is ill-formed. | (since C++20) (until C++23) |
If a universal character name corresponding to a scalar value of a character in the basic character set or a control character appears outside a character or string literal, the program is ill-formed. If a universal character name does not correspond to a scalar value of a character in the translation character set, the program is ill-formed. | (since C++23) |
Named universal character escapes
A universal character name of the form \N{n-char-sequence} is a named universal character. It designates the corresponding character in the Unicode Standard (chapter 4.8 Name) if the n-char-sequence is equal to its character name or to one of its character name aliases of type “control”, “correction”, or “alternate”; otherwise, the program is ill-formed. These aliases are listed in the Unicode Character Database’s NameAliases.txt. None of these names or aliases have leading or trailing spaces. A valid n-char-sequence must contain only uppercase Latin letters A through Z, digits, space, and hyphen-minus. Other characters never occur in a Unicode character name, and thus their appearance in an n-char-sequence always renders the program ill-formed. | (since C++23) |
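As an illustration, a named escape can be guarded with the feature-test macro `__cpp_named_character_escapes` so that the same source also builds where the feature is unavailable. This is a sketch only: the constant name `banana` is hypothetical, and the example relies on U+1F34C carrying the Unicode character name BANANA.

```cpp
#include <cassert>

// Use the named escape when the implementation advertises support,
// and fall back to the equivalent \U form otherwise.
#if defined(__cpp_named_character_escapes)
constexpr char32_t banana = U'\N{BANANA}';
#else
constexpr char32_t banana = U'\U0001F34C';
#endif

// Both spellings designate the same code point.
static_assert(banana == 0x1F34C, "U+1F34C BANANA");
```

Because the name must match exactly, a spelling such as `\N{banana}` (lowercase) would be ill-formed rather than silently accepted.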
\0 is the most commonly used octal escape sequence, because it represents the terminating null character in null-terminated strings.
The new-line character \n has special meaning when used in text mode I/O: it is converted to the OS-specific newline representation, usually a byte or byte sequence. Some systems mark their lines with length fields instead.
Octal escape sequences have a limit of three octal digits, but terminate at the first character that is not a valid octal digit if encountered sooner.
Hexadecimal escape sequences have no length limit and terminate at the first character that is not a valid hexadecimal digit. If the value represented by a single hexadecimal escape sequence does not fit the range of values represented by the character type used in this string literal (char, char8_t (since C++20), char16_t, char32_t (since C++11), or wchar_t), the result is unspecified.
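The termination rules for octal and hexadecimal escapes can be observed through the sizes of string literals. The sketch below assumes nothing beyond the rules stated above; `visible_length` is a hypothetical helper, and `sizeof` counts the terminating null character.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// An octal escape consumes at most three digits, so "\1234" holds
// the code unit 0123, the character '4', and the terminating null.
static_assert(sizeof("\1234") == 3, "octal escape stops after three digits");

// '8' is not an octal digit, so the escape ends right after the 0.
static_assert(sizeof("\08") == 3, "octal escape stops at a non-octal digit");

// A hexadecimal escape has no length limit: written as one literal,
// the digit 3 would be absorbed into the escape after \x12.
// Splitting the literal ends the escape; adjacent literals are
// concatenated after their escape sequences have been processed.
static_assert(sizeof("\x12" "3") == 3, "split literal ends the hex escape");

// Hypothetical helper: length as seen by C string functions, which
// stop at the first null character (written here as \0).
inline std::size_t visible_length(const char* s)
{
    return std::strlen(s);
}
```

For instance, `visible_length("ab\0cd")` is 2 although the array itself holds six elements, because the embedded `\0` ends the string as far as `std::strlen` is concerned.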
A universal character name in a narrow string literal or a 16-bit string literal may map to more than one code unit, e.g. \U0001f34c is 4 char code units in UTF-8 (\xF0\x9F\x8D\x8C) and 2 char16_t code units in UTF-16 (\xD83C\xDF4C). | (since C++11) |
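The code unit counts quoted above can be verified with prefixed literals, whose encodings are fixed as UTF-8, UTF-16, and UTF-32. This is a minimal sketch; `code_units` is a hypothetical helper that reports the array length of a literal minus its terminating null.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: number of code units in a string literal,
// excluding the terminating null character.
template <typename CharT, std::size_t N>
constexpr std::size_t code_units(const CharT (&)[N])
{
    return N - 1;
}

// U+1F34C needs four UTF-8 code units, two UTF-16 code units
// (a surrogate pair), and a single UTF-32 code unit.
static_assert(code_units(u8"\U0001f34c") == 4, "UTF-8");
static_assert(code_units(u"\U0001f34c") == 2, "UTF-16");
static_assert(code_units(U"\U0001f34c") == 1, "UTF-32");

// The individual code units match the values given in the text.
static_assert(static_cast<unsigned char>(u8"\U0001f34c"[0]) == 0xF0, "first UTF-8 byte");
static_assert(static_cast<unsigned char>(u8"\U0001f34c"[3]) == 0x8C, "last UTF-8 byte");
static_assert(u"\U0001f34c"[0] == 0xD83C, "high surrogate");
static_assert(u"\U0001f34c"[1] == 0xDF4C, "low surrogate");
static_assert(U"\U0001f34c"[0] == 0x1F34C, "code point");
```

The casts to `unsigned char` keep the comparisons valid both where `u8` literals have type `char` (before C++20) and where they have type `char8_t`.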
The question mark escape sequence \? is used to prevent trigraphs from being interpreted inside string literals: a string such as "??/" is compiled as "\", but if the second question mark is escaped, as in "?\?/", it becomes "??/". As trigraphs have been removed from C++, the question mark escape sequence is no longer necessary. It is preserved for compatibility with C++14 (and earlier revisions) and C. (since C++17)
Feature-test macro | Value | Std | Feature |
---|---|---|---|
__cpp_named_character_escapes | 202207L | (C++23) | Named universal character escapes |
Output:
This
is
a
test

She said, "Sells she seashells on the seashore?"
The following behavior-changing defect reports were applied retroactively to previously published C++ standards.
DR | Applied to | Behavior as published | Correct behavior |
---|---|---|---|
CWG 505 | C++98 | the behavior was undefined if the character following a backslash was not one of those specified in the table | made conditionally supported (semantics are implementation-defined) |
C documentation for Escape sequence