
      mbrtoc32

      From cppreference.com
       
      Defined in header <uchar.h>
      size_t mbrtoc32( char32_t* restrict pc32, const char* restrict s,
                       size_t n, mbstate_t* restrict ps );
      (since C11)

      Converts a single code point from its narrow multibyte character representation to its variable-length 32-bit wide character representation (but typically, UTF-32).

      If s is not a null pointer, inspects at most n bytes of the multibyte character string, beginning with the byte pointed to by s, to determine the number of bytes necessary to complete the next multibyte character (including any shift sequences, and taking into account the current multibyte conversion state *ps). If the function determines that the next multibyte character in s is complete and valid, converts it to the corresponding 32-bit wide character and stores it in *pc32 (if pc32 is not null).
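As a minimal sketch of the behavior described above, the helper below decodes only the first multibyte character of a buffer, starting from a zero-initialized conversion state. The name `first_code_point` is hypothetical and not part of the library. The result depends on the currently active C locale.

```c
#include <assert.h>
#include <stddef.h>
#include <uchar.h>

/* Hypothetical helper (not a library function): decodes the first multibyte
   character of s, looking at no more than n bytes, stores it in *out and
   returns whatever mbrtoc32 returned. */
size_t first_code_point(const char *s, size_t n, char32_t *out)
{
    assert(s != NULL && out != NULL);
    mbstate_t state = {0};              /* zero-initialized: initial conversion state */
    return mbrtoc32(out, s, n, &state);
}
```

With plain ASCII input, which the default "C" locale can also decode, `first_code_point("A", 2, &c)` should return 1 and store `U'A'` in `c`. An empty string should return 0 and store the null character.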

      If the multibyte character in *s corresponds to a multi-char32_t sequence (not possible with UTF-32), then after the first call to this function, *ps is updated in such a way that the next calls to mbrtoc32 will write out the additional char32_t, without considering *s.

      If s is a null pointer, the values of n and pc32 are ignored and the call is equivalent to mbrtoc32(NULL, "", 1, ps).

      If the wide character produced is the null character, the conversion state *ps represents the initial shift state.
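The two state-related rules above can be sketched with two small helpers. Both names (`probe_state`, `initial_after_null`) are hypothetical. The second helper uses `mbsinit` from `<wchar.h>` to check for the initial state, which is an addition for illustration and not something this page requires.

```c
#include <assert.h>
#include <stddef.h>
#include <uchar.h>
#include <wchar.h>

/* Hypothetical helper: the null-s form of the call. n and pc32 are ignored,
   so this behaves like mbrtoc32(NULL, "", 1, ps). */
size_t probe_state(mbstate_t *ps)
{
    assert(ps != NULL);
    return mbrtoc32(NULL, NULL, 0, ps);
}

/* Hypothetical helper: converts the null character of an empty string and
   reports whether *ps is in the initial state afterwards. */
int initial_after_null(mbstate_t *ps)
{
    assert(ps != NULL);
    char32_t c = 0xFFFF;                    /* sentinel, overwritten by the call */
    size_t rc = mbrtoc32(&c, "", 1, ps);    /* converts the null character */
    return rc == 0 && c == 0 && mbsinit(ps) != 0;
}
```

Starting from `mbstate_t st = {0};`, `probe_state(&st)` should return 0 and `initial_after_null(&st)` should return nonzero.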

      If the macro __STDC_UTF_32__ is defined, the 32-bit encoding used by this function is UTF-32; otherwise, it is implementation-defined. Since C23, the macro is always defined and the encoding is always UTF-32. In any case, the multibyte character encoding used by this function is specified by the currently active C locale.
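A short sketch of how code might react to the macro: one helper reports whether UTF-32 is promised, and a second one refuses (via `assert`) to label a value as a Unicode code point when it is not. Both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <uchar.h>

/* Hypothetical helper: true if the implementation promises UTF-32 for char32_t. */
bool char32_is_utf32(void)
{
#ifdef __STDC_UTF_32__
    return true;
#else
    return false;   /* implementation-defined encoding (possible before C23) */
#endif
}

/* Hypothetical helper: writes c as "U+XXXX" into buf. The text is only a
   meaningful Unicode code point when char32_t holds UTF-32, hence the assert. */
int format_code_point(char *buf, size_t size, char32_t c)
{
    assert(char32_is_utf32());
    return snprintf(buf, size, "U+%04lX", (unsigned long)c);
}
```

On an implementation that defines the macro, `format_code_point(buf, sizeof buf, U'A')` should write `U+0041`.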


      Parameters

      pc32 - pointer to the location where the resulting 32-bit wide character will be written
      s - pointer to the multibyte character string used as input
      n - limit on the number of bytes in s that can be examined
      ps - pointer to the conversion state object used when interpreting the multibyte string

      Return value

      The first of the following that applies:

      • 0 if the character converted from s (and stored in *pc32 if non-null) was the null character.
      • The number of bytes [1, n] of the multibyte character successfully converted from s.
      • (size_t)-3 if the next char32_t from a multi-char32_t character has now been written to *pc32. No bytes are processed from the input in this case.
      • (size_t)-2 if the next n bytes constitute an incomplete, but so far valid, multibyte character. Nothing is written to *pc32.
      • (size_t)-1 if an encoding error occurs. Nothing is written to *pc32, the value EILSEQ is stored in errno, and the value of *ps is unspecified.
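The return values listed above can be mapped onto an enumeration so that callers can `switch` on the outcome. This is only a sketch: the `mb_result` type and the `classify_mbrtoc32` function are hypothetical names. The special values are tested before the byte count, which is equivalent to the listed order for any realistic n.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification of the value returned by mbrtoc32. */
typedef enum {
    MB_NULL_CHAR,    /* 0: the null character was converted */
    MB_CONVERTED,    /* 1..n: that many bytes were consumed */
    MB_MORE_OUTPUT,  /* (size_t)-3: another char32_t was written, no input used */
    MB_INCOMPLETE,   /* (size_t)-2: the n bytes are a valid but unfinished character */
    MB_INVALID       /* (size_t)-1: encoding error, errno is EILSEQ */
} mb_result;

/* rc is the value returned by mbrtoc32, n is the limit passed to that call. */
mb_result classify_mbrtoc32(size_t rc, size_t n)
{
    if (rc == 0)          return MB_NULL_CHAR;
    if (rc == (size_t)-3) return MB_MORE_OUTPUT;
    if (rc == (size_t)-2) return MB_INCOMPLETE;
    if (rc == (size_t)-1) return MB_INVALID;
    assert(rc <= n);      /* anything else must be a byte count in [1, n] */
    return MB_CONVERTED;
}
```

For example, `classify_mbrtoc32(3, 4)` yields `MB_CONVERTED` and `classify_mbrtoc32((size_t)-2, 4)` yields `MB_INCOMPLETE`.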

      Example

      On MSVC you will need the /utf-8 compiler flag for UTF-8 to work properly.

      #include <assert.h>
      #include <locale.h>
      #include <stdio.h>
      #include <uchar.h>

      int main(void)
      {
          setlocale(LC_ALL, "en_US.utf8");
          char in[] = u8"zß水🍌"; // or "z\u00df\u6c34\U0001F34C"
          enum { in_size = sizeof in / sizeof *in };

          printf("Processing %d UTF-8 code units: [", in_size);
          for (int i = 0; i < in_size; ++i)
              printf("%s%02X", i ? " " : "", (unsigned char)in[i]);
          puts("]");

          char32_t out[in_size];
          char32_t* p_out = out;
          char* p_in = in;
          char* end = in + in_size;
          mbstate_t state = {0};
          size_t rc;
          while ((rc = mbrtoc32(p_out, p_in, end - p_in, &state)))
          {
              assert(rc != (size_t)-3); // no surrogate pairs in UTF-32
              if (rc == (size_t)-1)
                  break; // invalid input
              if (rc == (size_t)-2)
                  break; // truncated input
              p_in += rc;
              ++p_out;
          }

          size_t out_size = p_out + 1 - out;
          printf("into %zu UTF-32 code units: [", out_size);
          for (size_t i = 0; i < out_size; ++i)
              printf("%s%08X", i ? " " : "", out[i]);
          puts("]");
      }

      Output:

      Processing 11 UTF-8 code units: [7A C3 9F E6 B0 B4 F0 9F 8D 8C 00]
      into 5 UTF-32 code units: [0000007A 000000DF 00006C34 0001F34C 00000000]
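The example stops as soon as it sees (size_t)-2. As a hedged sketch of how truncated input can be resumed instead, the hypothetical helper `decode_bytewise` passes one byte per call and relies on the conversion state to carry partial characters between calls. It assumes a 32-bit encoding in which (size_t)-3 never occurs, such as UTF-32.

```c
#include <assert.h>
#include <stddef.h>
#include <uchar.h>

/* Hypothetical helper: feeds len bytes to mbrtoc32 one byte at a time and
   stores up to cap decoded characters in out. Returns the number of
   characters stored, or (size_t)-1 on an encoding error. */
size_t decode_bytewise(const char *bytes, size_t len, char32_t *out, size_t cap)
{
    mbstate_t state = {0};              /* initial conversion state */
    size_t count = 0;
    for (size_t i = 0; i < len; ++i) {
        char32_t c;
        size_t rc = mbrtoc32(&c, bytes + i, 1, &state);
        if (rc == (size_t)-1)
            return (size_t)-1;          /* invalid input */
        if (rc == (size_t)-2)
            continue;                   /* byte absorbed into state, need more */
        assert(rc != (size_t)-3);       /* assumption: no multi-char32_t output */
        if (count == cap)
            return count;               /* output buffer is full */
        out[count++] = c;               /* rc is 0 or 1: a character was completed */
    }
    return count;
}
```

With ASCII input every call completes a character, so `decode_bytewise("abc", 3, out, 8)` should return 3. In a UTF-8 locale, the leading bytes of a longer sequence are expected to return (size_t)-2 and the final byte to complete the character.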

      References

      • C23 standard (ISO/IEC 9899:2024):
      • 7.30.1.5 The mbrtoc32 function (p: 410)
      • C17 standard (ISO/IEC 9899:2018):
      • 7.28.1.3 The mbrtoc32 function (p: 293-294)
      • C11 standard (ISO/IEC 9899:2011):
      • 7.28.1.3 The mbrtoc32 function (p: 400-401)

      See also

      c32rtomb (C11)
      converts a UTF-32 character to narrow multibyte encoding
      (function)
      C++ documentation for mbrtoc32
      Retrieved from "https://en.cppreference.com/mwiki/index.php?title=c/string/multibyte/mbrtoc32&oldid=181022"
