
      std::codecvt

      From cppreference.com
       
      Defined in header <locale>

      template<
          class InternT,
          class ExternT,
          class StateT
      > class codecvt;

      Class template std::codecvt encapsulates conversion of character strings, including wide and multibyte, from one encoding to another. All file I/O operations performed through std::basic_fstream<CharT> use the std::codecvt<CharT, char, std::mbstate_t> facet of the locale imbued in the stream.

      [Figure: std::codecvt inheritance diagram]


      Specializations

      The standard library is guaranteed to provide the following specializations (they are required to be implemented by any locale object):

      Defined in header <locale>

      Specialization                                      Description
      std::codecvt<char, char, std::mbstate_t>            identity conversion
      std::codecvt<char16_t, char, std::mbstate_t>        conversion between UTF-16 and UTF-8 (since C++11) (deprecated in C++20)
      std::codecvt<char16_t, char8_t, std::mbstate_t>     conversion between UTF-16 and UTF-8 (since C++20) (deprecated)
      std::codecvt<char32_t, char, std::mbstate_t>        conversion between UTF-32 and UTF-8 (since C++11) (deprecated in C++20)
      std::codecvt<char32_t, char8_t, std::mbstate_t>     conversion between UTF-32 and UTF-8 (since C++20) (deprecated)
      std::codecvt<wchar_t, char, std::mbstate_t>         conversion between the system's native wide and the single-byte narrow character sets
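
      Because these specializations are required in every locale, they can be looked up with std::use_facet without first installing anything. The following is a minimal sketch, not part of the original page; facet_summary and summarize are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <cwchar>   // std::mbstate_t
#include <locale>

// Hypothetical helper type: what a codecvt facet reports about itself.
struct facet_summary
{
    bool always_noconv; // true if the facet never needs to convert
    int encoding;       // external chars per internal char, 0 if variable,
                        // -1 if the external encoding is state-dependent
    int max_length;     // upper bound on external chars per internal char
};

// Hypothetical helper: query std::codecvt<Intern, char, std::mbstate_t> in loc.
template<class Intern>
facet_summary summarize(const std::locale& loc)
{
    using facet_t = std::codecvt<Intern, char, std::mbstate_t>;
    // The specialization is required to be present in every locale object.
    assert(std::has_facet<facet_t>(loc));
    const facet_t& f = std::use_facet<facet_t>(loc);
    return {f.always_noconv(), f.encoding(), f.max_length()};
}
```

      Calling summarize<char>(std::locale::classic()) queries the identity conversion, which should report always_noconv as true with encoding and max_length both equal to 1. The values for summarize<wchar_t> depend on the locale and the implementation.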

      Nested types

      Type           Definition
      intern_type    InternT
      extern_type    ExternT
      state_type     StateT

      Data members

      Member                       Description
      std::locale::id id [static]  the identifier of the facet

      Member functions

      (constructor)    constructs a new codecvt facet (public member function)
      out              invokes do_out (public member function)
      in               invokes do_in (public member function)
      unshift          invokes do_unshift (public member function)
      encoding         invokes do_encoding (public member function)
      always_noconv    invokes do_always_noconv (public member function)
      length           invokes do_length (public member function)
      max_length       invokes do_max_length (public member function)
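
      The public members all share the same calling pattern: a conversion state, an input range, an output range, and two "next" pointers that report how far the conversion got. The following sketch illustrates this with in(); widen_with_codecvt is a hypothetical helper, and the code assumes plain ASCII input so that the facet of the classic "C" locale can convert it.

```cpp
#include <cassert>
#include <cwchar>   // std::mbstate_t
#include <locale>
#include <string>

// Hypothetical helper: widen a narrow string with the wchar_t facet of the
// classic "C" locale. Assumes the input is plain ASCII.
std::wstring widen_with_codecvt(const std::string& narrow)
{
    using facet_t = std::codecvt<wchar_t, char, std::mbstate_t>;
    const facet_t& cvt = std::use_facet<facet_t>(std::locale::classic());

    std::mbstate_t state{};                   // initial conversion state
    std::wstring wide(narrow.size(), L'\0');  // one wchar_t per input byte suffices for ASCII
    const char* from_next = nullptr;
    wchar_t* to_next = nullptr;

    // in() forwards to the protected virtual do_in()
    const facet_t::result r = cvt.in(state,
        narrow.data(), narrow.data() + narrow.size(), from_next,
        &wide[0], &wide[0] + wide.size(), to_next);

    assert(r != facet_t::error);
    wide.resize(to_next - &wide[0]);          // keep only what was produced
    return wide;
}
```

      For example, widen_with_codecvt("abc") is expected to yield L"abc". Non-ASCII input would need a locale whose facet understands the external encoding.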

      Protected member functions

      (destructor)                  destructs a codecvt facet (protected member function)
      do_out [virtual]              converts a string from InternT to ExternT, such as when writing to file (virtual protected member function)
      do_in [virtual]               converts a string from ExternT to InternT, such as when reading from file (virtual protected member function)
      do_unshift [virtual]          generates the termination character sequence of ExternT characters for incomplete conversion (virtual protected member function)
      do_encoding [virtual]         returns the number of ExternT characters necessary to produce one InternT character, if constant (virtual protected member function)
      do_always_noconv [virtual]    tests if the facet encodes an identity conversion for all valid argument values (virtual protected member function)
      do_length [virtual]           calculates the length of the ExternT string that would be consumed by conversion into given InternT buffer (virtual protected member function)
      do_max_length [virtual]       returns the maximum number of ExternT characters that could be converted into a single InternT character (virtual protected member function)
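
      User-defined conversions are written by deriving from a specialization and overriding these virtual functions. The following is a minimal sketch under stated assumptions: upper_on_write and write_through_facet are hypothetical names, only do_out and do_always_noconv are overridden, and the input is assumed to be ASCII.

```cpp
#include <cassert>
#include <cctype>
#include <cwchar>   // std::mbstate_t
#include <locale>
#include <string>

// Hypothetical facet (not part of the standard library): upper-cases ASCII
// letters on the internal-to-external path, the direction used for output.
class upper_on_write : public std::codecvt<char, char, std::mbstate_t>
{
protected:
    // Report that a conversion is needed so that callers invoke do_out().
    bool do_always_noconv() const noexcept override { return false; }

    result do_out(std::mbstate_t&,
                  const char* from, const char* from_end, const char*& from_next,
                  char* to, char* to_end, char*& to_next) const override
    {
        from_next = from;
        to_next = to;
        while (from_next != from_end && to_next != to_end)
            *to_next++ = static_cast<char>(
                std::toupper(static_cast<unsigned char>(*from_next++)));
        // partial: the output buffer filled up before the input was exhausted
        return from_next == from_end ? ok : partial;
    }
};

// Hypothetical helper: push text through the facet's public out() member.
std::string write_through_facet(const std::string& text)
{
    // The locale takes ownership of the facet (default reference count 0).
    const std::locale loc(std::locale::classic(), new upper_on_write);
    const auto& cvt = std::use_facet<std::codecvt<char, char, std::mbstate_t>>(loc);

    std::mbstate_t state{};
    std::string out(text.size(), '\0');
    const char* from_next = nullptr;
    char* to_next = nullptr;
    const auto r = cvt.out(state, text.data(), text.data() + text.size(), from_next,
                           &out[0], &out[0] + out.size(), to_next);
    assert(r == std::codecvt_base::ok);
    out.resize(to_next - &out[0]);
    return out;
}
```

      With this sketch, write_through_facet("abc xyz 1") returns "ABC XYZ 1". Installing the facet in a locale replaces the std::codecvt<char, char, std::mbstate_t> slot, because the derived class inherits that specialization's id.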

      Inherited from std::codecvt_base

      Nested type                                     Definition
      enum result { ok, partial, error, noconv };     Unscoped enumeration type

      Enumeration constant    Definition
      ok                      conversion was completed with no error
      partial                 not all source characters were converted
      error                   encountered an invalid character
      noconv                  no conversion required, input and output types are the same
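
      Callers normally branch on the returned result value. The sketch below maps each enumerator to a printable name and shows the identity specialization reporting noconv; result_name and identity_in_result are hypothetical helpers introduced for illustration.

```cpp
#include <cassert>
#include <cwchar>   // std::mbstate_t
#include <locale>
#include <string>

// Hypothetical helper: printable name for each codecvt_base::result value.
std::string result_name(std::codecvt_base::result r)
{
    switch (r)
    {
        case std::codecvt_base::ok:      return "ok";
        case std::codecvt_base::partial: return "partial";
        case std::codecvt_base::error:   return "error";
        case std::codecvt_base::noconv:  return "noconv";
    }
    return "unknown";
}

// Hypothetical helper: the identity facet std::codecvt<char, char, std::mbstate_t>
// performs no conversion, so in() is expected to report noconv.
std::codecvt_base::result identity_in_result()
{
    using facet_t = std::codecvt<char, char, std::mbstate_t>;
    const facet_t& cvt = std::use_facet<facet_t>(std::locale::classic());

    std::mbstate_t state{};
    const char src[] = "abc";
    char dst[4] = {};
    const char* from_next = nullptr;
    char* to_next = nullptr;
    const auto r = cvt.in(state, src, src + 3, from_next, dst, dst + 4, to_next);
    assert(r != std::codecvt_base::error);
    return r;
}
```

      When noconv is returned, nothing has been written to the output buffer, so the caller should use the input sequence directly.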

      Example

      The following example reads a UTF-8 file using a locale which implements UTF-8 conversion in codecvt<wchar_t, char, std::mbstate_t> and converts a UTF-8 string to UTF-16 using one of the standard specializations of std::codecvt.

      #include <codecvt>
      #include <cstdint>
      #include <fstream>
      #include <iomanip>
      #include <iostream>
      #include <locale>
      #include <string>
      #include <utility> // std::forward

      // utility wrapper to adapt locale-bound facets for wstring/wbuffer convert
      template<class Facet>
      struct deletable_facet : Facet
      {
          template<class... Args>
          deletable_facet(Args&&... args) : Facet(std::forward<Args>(args)...) {}
          ~deletable_facet() {}
      };

      int main()
      {
          // UTF-8 narrow multibyte encoding
          std::string data = reinterpret_cast<const char*>(+u8"z\u00df\u6c34\U0001f34c");
                             // or reinterpret_cast<const char*>(+u8"zß水🍌")
                             // or "\x7a\xc3\x9f\xe6\xb0\xb4\xf0\x9f\x8d\x8c"

          std::ofstream("text.txt") << data;

          // using system-supplied locale's codecvt facet
          std::wifstream fin("text.txt");
          // reading from wifstream will use codecvt<wchar_t, char, std::mbstate_t>
          // this locale's codecvt converts UTF-8 to UCS4 (on systems such as Linux)
          fin.imbue(std::locale("en_US.UTF-8"));
          std::cout << "The UTF-8 file contains the following UCS4 code units:\n" << std::hex;
          for (wchar_t c; fin >> c;)
              std::cout << "U+" << std::setw(4) << std::setfill('0')
                        << static_cast<uint32_t>(c) << ' ';

          // using standard (locale-independent) codecvt facet
          std::wstring_convert<
              deletable_facet<std::codecvt<char16_t, char, std::mbstate_t>>, char16_t> conv16;
          std::u16string str16 = conv16.from_bytes(data);

          std::cout << "\n\nThe UTF-8 file contains the following UTF-16 code units:\n"
                    << std::hex;
          for (char16_t c : str16)
              std::cout << "U+" << std::setw(4) << std::setfill('0')
                        << static_cast<uint16_t>(c) << ' ';
          std::cout << '\n';
      }

      Output:

      The UTF-8 file contains the following UCS4 code units:
      U+007a U+00df U+6c34 U+1f34c

      The UTF-8 file contains the following UTF-16 code units:
      U+007a U+00df U+6c34 U+d83c U+df4c
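
      The same UTF-8 to UTF-16 conversion can also be sketched without std::wstring_convert by calling in() on the facet directly. This is an illustrative variant, not the page's own example: utf16_facet and utf8_to_utf16 are hypothetical names, the input is assumed to be well-formed and complete UTF-8, and the std::codecvt<char16_t, char, std::mbstate_t> specialization used here is deprecated in C++20, so a compiler may warn.

```cpp
#include <cassert>
#include <cwchar>   // std::mbstate_t
#include <locale>
#include <string>

// Hypothetical wrapper with a public destructor, so that the facet can be
// created as a local object (codecvt's own destructor is protected).
struct utf16_facet : std::codecvt<char16_t, char, std::mbstate_t>
{
    ~utf16_facet() {}
};

// Hypothetical helper: convert UTF-8 bytes to UTF-16 code units via in().
// Assumes well-formed, complete UTF-8 input.
std::u16string utf8_to_utf16(const std::string& utf8)
{
    utf16_facet cvt;
    std::mbstate_t state{};
    // The UTF-16 form never has more code units than the UTF-8 form has bytes.
    std::u16string out(utf8.size(), u'\0');
    const char* from_next = nullptr;
    char16_t* to_next = nullptr;

    const auto r = cvt.in(state, utf8.data(), utf8.data() + utf8.size(), from_next,
                          &out[0], &out[0] + out.size(), to_next);
    assert(r == std::codecvt_base::ok);
    out.resize(to_next - &out[0]);   // drop the unused tail of the buffer
    return out;
}
```

      Passing the byte string "\x7a\xc3\x9f\xe6\xb0\xb4\xf0\x9f\x8d\x8c" from the example should produce the five UTF-16 code units listed in the output above, with the last character encoded as a surrogate pair.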

      Defect reports

      The following behavior-changing defect reports were applied retroactively to previously published C++ standards.

      DR:                     LWG 3767
      Applied to:             C++20
      Behavior as published:  std::codecvt<char16_t, char8_t, std::mbstate_t> and std::codecvt<char32_t, char8_t, std::mbstate_t> are locale-independent
      Correct behavior:       deprecated them

      See also

      Character conversions

      UTF-16
          locale-defined multibyte (UTF-8, GB18030):  mbrtoc16 / c16rtomb (with C11's DR488)
          UTF-8:   codecvt<char16_t,char,mbstate_t>, codecvt_utf8_utf16<char16_t>, codecvt_utf8_utf16<char32_t>, codecvt_utf8_utf16<wchar_t>
          UTF-16:  N/A

      UCS-2
          locale-defined multibyte (UTF-8, GB18030):  c16rtomb (without C11's DR488)
          UTF-8:   codecvt_utf8<char16_t>
          UTF-16:  codecvt_utf16<char16_t>

      UTF-32
          locale-defined multibyte (UTF-8, GB18030):  mbrtoc32 / c32rtomb
          UTF-8:   codecvt<char32_t,char,mbstate_t>, codecvt_utf8<char32_t>
          UTF-16:  codecvt_utf16<char32_t>

      system wchar_t: UTF-32 (non-Windows), UCS-2 (Windows)
          locale-defined multibyte (UTF-8, GB18030):  mbsrtowcs / wcsrtombs, use_facet<codecvt<wchar_t,char,mbstate_t>>(locale)
          UTF-8:   codecvt_utf8<wchar_t>
          UTF-16:  codecvt_utf16<wchar_t>

      codecvt_base          defines character conversion errors (class)
      codecvt_byname        represents the system-supplied std::codecvt for the named locale (class template)
      codecvt_utf8          converts between UTF-8 and UCS-2/UCS-4 (class template) (C++11) (deprecated in C++17) (removed in C++26)
      codecvt_utf16         converts between UTF-16 and UCS-2/UCS-4 (class template) (C++11) (deprecated in C++17) (removed in C++26)
      codecvt_utf8_utf16    converts between UTF-8 and UTF-16 (class template) (C++11) (deprecated in C++17) (removed in C++26)

      Retrieved from "https://en.cppreference.com/mwiki/index.php?title=cpp/locale/codecvt&oldid=177703"
