std::text_encoding

From cppreference.com

Defined in header `<text_encoding>`
struct text_encoding;		(since C++26)

The class `text_encoding` provides a mechanism for identifying character encodings. It is used to determine the ordinary character literal encoding of the translation environment at compile time and the character encoding of the execution environment at runtime.

Each `text_encoding` object encapsulates a character encoding scheme, uniquely identified by an enumerator in `text_encoding::id` and a corresponding name represented by a null-terminated byte string. These can be accessed through the `mib()` and `name()` member functions, respectively. Whether an object represents a character encoding scheme implemented in the translation or execution environment is implementation-defined.

The class `text_encoding` is a TriviallyCopyable type. The array object representing the corresponding name of the character encoding scheme is nested within the `text_encoding` object itself. The stored name is limited to a maximum of `max_name_length` characters, excluding the null character `'\0'`.
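The storage properties described above can be checked directly. The following is a minimal sketch, assuming a standard library that already ships `<text_encoding>`; the feature-test guard keeps it compiling elsewhere, and `bitwise_copy_compares_equal` is a hypothetical helper name used only for illustration.

```cpp
#include <version>  // defines __cpp_lib_text_encoding when the feature is available

#ifdef __cpp_lib_text_encoding
#include <cstring>
#include <text_encoding>
#include <type_traits>

// The name is stored inside the object, so a copy never allocates.
static_assert(std::is_trivially_copyable_v<std::text_encoding>);

// At most 63 characters, not counting the terminating '\0'.
static_assert(std::text_encoding::max_name_length == 63);

// Hypothetical helper: a byte-wise copy of a trivially copyable object
// is a valid object that compares equal to the original.
inline bool bitwise_copy_compares_equal()
{
    const std::text_encoding original(std::text_encoding::UTF8);
    std::text_encoding copy;  // default-constructed: id::unknown, empty name
    std::memcpy(&copy, &original, sizeof copy);
    return copy == original;
}
#endif
```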

The class supports both registered and non-registered character encodings. Registered encodings are those found in the IANA Character Sets Registry, excluding the following character encodings:

NATS-DANO (33)
NATS-DANO-ADD (34)

For registered character encodings, the class additionally provides access to:

Primary name: The official name specified in the registry.
Aliases: An implementation-defined superset of aliases from the registry.
MIBenum value: A unique identifier for use in identifying coded character encodings.

Non-registered encodings can be represented with an enumerator `id::other` or `id::unknown` and a custom name.
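As an illustration of the difference between registered and non-registered encodings, the sketch below constructs one of each. It assumes library support for `<text_encoding>`; the name `x-example-codec` is made up for this example and is assumed not to match any alias the implementation knows.

```cpp
#include <version>  // defines __cpp_lib_text_encoding when the feature is available

#ifdef __cpp_lib_text_encoding
#include <string_view>
#include <text_encoding>

// A registered encoding can be looked up by name; the spelling is matched loosely.
constexpr std::text_encoding utf8("utf-8");
static_assert(utf8.mib() == std::text_encoding::id::UTF8);

// A name that is not in the registry (hypothetical) yields id::other
// and keeps the name exactly as it was given.
constexpr std::text_encoding custom("x-example-codec");
static_assert(custom.mib() == std::text_encoding::id::other);
static_assert(std::string_view(custom.name()) == "x-example-codec");

// Constructing from id::unknown (or id::other) leaves the name empty.
constexpr std::text_encoding unknown(std::text_encoding::id::unknown);
static_assert(*unknown.name() == '\0');
#endif
```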

A `text_encoding` object `e` whose MIBenum value is neither `id::other` nor `id::unknown` maintains the following invariants:

`*e.name() != '\0'` is true, and
`e.mib() == std::text_encoding(e.name()).mib()` is true.
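The two invariants can be written as a small predicate. This is a sketch under the assumption that `<text_encoding>` is available; `holds_invariants` is a hypothetical name, not part of the library.

```cpp
#include <version>  // defines __cpp_lib_text_encoding when the feature is available

#ifdef __cpp_lib_text_encoding
#include <text_encoding>

// Hypothetical helper: checks both invariants for a registered encoding.
constexpr bool holds_invariants(const std::text_encoding& e)
{
    using id = std::text_encoding::id;
    if (e.mib() == id::other || e.mib() == id::unknown)
        return true;  // the invariants only apply to registered encodings

    return *e.name() != '\0'                                // the name is not empty
        && e.mib() == std::text_encoding(e.name()).mib();   // the name maps back to the same id
}

static_assert(holds_invariants(std::text_encoding(std::text_encoding::id::ASCII)));
static_assert(holds_invariants(std::text_encoding("UTF-16LE")));
static_assert(holds_invariants(std::text_encoding::literal()));
#endif
```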

Member types

id	represents the MIBenum value of the character encoding (public member enum)
aliases_view	a `view` over aliases of the character encoding (public member class)

Member constant

Name	Value
constexpr std::size_t max_name_length [static]	63 (public static member constant)

Data members

Member	Description
std::text_encoding::id `mib_` (private)	a MIBenum value with `id::unknown` as the default value (exposition-only member object*)
char[max_name_length + 1] `name_` (private)	a stored primary name (exposition-only member object*)

Member functions

Creation
(constructor)	constructs a new `text_encoding` object (public member function)
literal [static]	constructs a new `text_encoding` representing the ordinary character literal encoding (public static member function)
environment [static]	constructs a new `text_encoding` representing the implementation-defined character encoding scheme of the execution environment (public static member function)
Observers
mib	returns the MIBenum value of the current character encoding (public member function)
name	returns the primary name of the current character encoding (public member function)
aliases	returns a `view` over aliases of the current character encoding (public member function)
environment_is [static]	checks whether the character encoding scheme of the execution environment has the specified MIBenum value (public static member function)
Helpers
comp-name [static] (private)	compares two alias names using Charset Alias Matching (exposition-only static member function*)
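The creation functions and `environment_is` from the table above might be combined as follows. This is only a sketch, assuming `<text_encoding>` is available; the wrapper names `environment_is_utf8` and `literal_matches_environment` are hypothetical.

```cpp
#include <version>  // defines __cpp_lib_text_encoding when the feature is available

#ifdef __cpp_lib_text_encoding
#include <text_encoding>

// The literal encoding is a compile-time property of the translation unit.
inline constexpr bool literal_is_utf8 =
    std::text_encoding::literal() == std::text_encoding::UTF8;

// The environment encoding can only be queried at runtime.
inline bool environment_is_utf8()
{
    return std::text_encoding::environment_is<std::text_encoding::UTF8>();
}

// Hypothetical check: do string literals use the encoding the environment expects?
inline bool literal_matches_environment()
{
    return std::text_encoding::literal() == std::text_encoding::environment();
}
#endif
```

A program could call `literal_matches_environment()` once at startup to decide whether narrow string literals can be written to the terminal without conversion.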

Non-member functions

operator==(std::text_encoding) (C++26)	compares two `text_encoding` objects (function)

Helper classes

std::hash<std::text_encoding> (C++26)	hash support for `std::text_encoding` (class template specialization)
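Hash support makes `text_encoding` usable as a key in unordered containers. The sketch below assumes `<text_encoding>` is available; `count_distinct_encodings` is a hypothetical helper.

```cpp
#include <version>  // defines __cpp_lib_text_encoding when the feature is available

#ifdef __cpp_lib_text_encoding
#include <cstddef>
#include <text_encoding>
#include <unordered_set>

// Hypothetical helper: inserts three objects, two of which name the same encoding.
inline std::size_t count_distinct_encodings()
{
    std::unordered_set<std::text_encoding> seen;
    seen.insert(std::text_encoding("UTF-8"));
    seen.insert(std::text_encoding("utf8"));  // same registered encoding, different spelling
    seen.insert(std::text_encoding(std::text_encoding::ASCII));
    return seen.size();  // the two UTF-8 spellings compare equal, so they share one slot
}
#endif
```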

Notes

When working with character encodings, note that the primary names or aliases of two distinct registered character encodings are not equivalent when compared using Charset Alias Matching as described by the Unicode Technical Standard.
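To make the matching rule concrete, here is a library-independent sketch of it: two names are compared while ignoring case, every character that is not a letter or digit, and any `0` that is not preceded by a digit. The function names are hypothetical, and the code assumes an ASCII-compatible ordinary literal encoding; it is not the implementation of the exposition-only `comp-name`.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper: reduces a name to the form that the matching rule compares.
// Assumes an ASCII-compatible encoding (letters and digits are contiguous).
inline std::string normalize_charset_name(std::string_view name)
{
    std::string out;
    bool prev_kept_is_digit = false;
    for (char c : name)
    {
        const bool digit = c >= '0' && c <= '9';
        const bool upper = c >= 'A' && c <= 'Z';
        const bool lower = c >= 'a' && c <= 'z';

        if (!digit && !upper && !lower)
            continue;  // ignore anything that is not a letter or digit
        if (c == '0' && !prev_kept_is_digit)
            continue;  // ignore zeros that do not follow a digit

        out += upper ? static_cast<char>(c - 'A' + 'a') : c;  // fold to lower case
        prev_kept_is_digit = digit;
    }
    return out;
}

// Hypothetical helper: true when both names normalize to the same string.
inline bool charset_names_match(std::string_view a, std::string_view b)
{
    return normalize_charset_name(a) == normalize_charset_name(b);
}
```

With this sketch, `charset_names_match("UTF-8", "utf8")` and `charset_names_match("u.t.f-008", "utf8")` are true, while `charset_names_match("utf-80", "utf8")` and `charset_names_match("ut8", "utf8")` are false.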

For convenience, the enumerators of `text_encoding::id` are introduced as members of `text_encoding` and can be accessed directly. This means that `text_encoding::ASCII` and `text_encoding::id::ASCII` refer to the same entity.
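A short sketch of the two spellings, assuming `<text_encoding>` is available (the variable name `ascii` is chosen for this example):

```cpp
#include <version>  // defines __cpp_lib_text_encoding when the feature is available

#ifdef __cpp_lib_text_encoding
#include <text_encoding>

// Both spellings name the same enumerator.
static_assert(std::text_encoding::ASCII == std::text_encoding::id::ASCII);
static_assert(std::text_encoding::UTF8 == std::text_encoding::id::UTF8);

// The shorter spelling also works when constructing or comparing objects.
constexpr std::text_encoding ascii = std::text_encoding::ASCII;
static_assert(ascii.mib() == std::text_encoding::id::ASCII);
static_assert(ascii == std::text_encoding::ASCII);
#endif
```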

Implementations are encouraged to treat registered encodings as not interchangeable. Additionally, the primary name of a registered encoding should not be used to describe a similar but different non-registered encoding, unless there is a clear precedent for doing so.

Feature-test macro	Value	Std	Feature
`__cpp_lib_text_encoding`	`202306L`	(C++26)	`std::text_encoding`
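The feature-test macro can be used to fall back gracefully on toolchains that do not yet provide the class. The sketch below is one possible arrangement; `has_text_encoding`, `literal_encoding_name`, and the `"(unavailable)"` placeholder are hypothetical names chosen for this example.

```cpp
#include <version>  // defines __cpp_lib_text_encoding when the feature is available

#if defined(__cpp_lib_text_encoding) && __cpp_lib_text_encoding >= 202306L
#include <text_encoding>

inline constexpr bool has_text_encoding = true;

// Returns the name of the ordinary character literal encoding.
inline const char* literal_encoding_name()
{
    // Static storage keeps the returned pointer valid after the call.
    static constexpr std::text_encoding enc = std::text_encoding::literal();
    return enc.name();
}
#else
inline constexpr bool has_text_encoding = false;

// Fallback for standard libraries without std::text_encoding.
inline const char* literal_encoding_name()
{
    return "(unavailable)";
}
#endif
```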

Example

#include <locale>
#include <print>
#include <text_encoding>

int main()
{
    // literal encoding is known at compile-time
    constexpr std::text_encoding literal_encoding = std::text_encoding::literal();

    // check for literal encoding
    static_assert(literal_encoding.mib() != std::text_encoding::other &&
                  literal_encoding.mib() != std::text_encoding::unknown);

    // environment encoding is only known at runtime
    std::text_encoding env_encoding = std::text_encoding::environment();

    // associated encoding of the default locale
    std::text_encoding locale_encoding = std::locale("").encoding();

    std::println("The literal encoding is {}", literal_encoding.name());
    std::println("The aliases of literal encoding:");
    for (const char* alias_name : literal_encoding.aliases())
        std::println(" -> {}", alias_name);

    if (env_encoding == locale_encoding)
        std::println("Both environment and locale encodings are the same");

    std::println("The environment encoding is {}", env_encoding.name());
    std::println("The aliases of environment encoding:");
    for (const char* alias_name : env_encoding.aliases())
        std::println(" -> {}", alias_name);
}

Possible output:

The literal encoding is UTF-8
The aliases of literal encoding:
 -> UTF-8
 -> csUTF8
Both environment and locale encodings are the same
The environment encoding is ANSI_X3.4-1968
The aliases of environment encoding:
 -> US-ASCII
 -> iso-ir-6
 -> ANSI_X3.4-1968
 -> ANSI_X3.4-1986
 -> ISO_646.irv:1991
 -> ISO646-US
 -> us
 -> IBM367
 -> cp367
 -> csASCII
 -> ASCII