Commit 3726c1c

Move is_valid_ascii() to ascii.h.
This function requires simd.h, which is a rather large dependency
for a widely-used header file like pg_wchar.h. Furthermore, there
is a report of a third-party tool that is struggling to use
pg_wchar.h due to its dependence on simd.h (presumably because
simd.h uses several intrinsics). Moving the function to the much
less popular ascii.h resolves these issues for now.

This commit is back-patched for the benefit of the aforementioned
third-party tool. The simd.h dependency was only added in v16,
but we've opted to back-patch to v15 so that is_valid_ascii() lives
in the same file for all versions where it exists. This could
break existing third-party code that uses the function, but we
couldn't find any examples of such code. It should be possible to
fix any code that this commit breaks by including ascii.h in the
file that uses is_valid_ascii().

Author: Jubilee Young
Reviewed-by: Tom Lane, John Naylor, Andres Freund, Eric Ridge
Discussion: https://postgr.es/m/CAPNHn3oKJJxMsYq%2BqLYzVJOFrUcOr4OF1EC-KtFT-qh8nOOOtQ%40mail.gmail.com
Backpatch-through: 15
1 parent 3f8ac13, commit 3726c1c

3 files changed: +53 -53 lines changed


‎src/common/wchar.c

Lines changed: 1 addition & 0 deletions

@@ -13,6 +13,7 @@
 #include "c.h"
 
 #include "mb/pg_wchar.h"
+#include "utils/ascii.h"
 
 
 /*

‎src/include/mb/pg_wchar.h

Lines changed: 0 additions & 53 deletions

@@ -699,57 +699,4 @@ extern int mic2latin_with_table(const unsigned char *mic, unsigned char *p,
 extern WCHAR *pgwin32_message_to_UTF16(const char *str, int len, int *utf16len);
 #endif
 
-
-/*
- * Verify a chunk of bytes for valid ASCII.
- *
- * Returns false if the input contains any zero bytes or bytes with the
- * high-bit set. Input len must be a multiple of 8.
- */
-static inline bool
-is_valid_ascii(const unsigned char *s, int len)
-{
-	uint64		chunk,
-				highbit_cum = UINT64CONST(0),
-				zero_cum = UINT64CONST(0x8080808080808080);
-
-	Assert(len % sizeof(chunk) == 0);
-
-	while (len > 0)
-	{
-		memcpy(&chunk, s, sizeof(chunk));
-
-		/*
-		 * Capture any zero bytes in this chunk.
-		 *
-		 * First, add 0x7f to each byte. This sets the high bit in each byte,
-		 * unless it was a zero. If any resulting high bits are zero, the
-		 * corresponding high bits in the zero accumulator will be cleared.
-		 *
-		 * If none of the bytes in the chunk had the high bit set, the max
-		 * value each byte can have after the addition is 0x7f + 0x7f = 0xfe,
-		 * and we don't need to worry about carrying over to the next byte. If
-		 * any input bytes did have the high bit set, it doesn't matter
-		 * because we check for those separately.
-		 */
-		zero_cum &= (chunk + UINT64CONST(0x7f7f7f7f7f7f7f7f));
-
-		/* Capture any set bits in this chunk. */
-		highbit_cum |= chunk;
-
-		s += sizeof(chunk);
-		len -= sizeof(chunk);
-	}
-
-	/* Check if any high bits in the high bit accumulator got set. */
-	if (highbit_cum & UINT64CONST(0x8080808080808080))
-		return false;
-
-	/* Check if any high bits in the zero accumulator got cleared. */
-	if (zero_cum != UINT64CONST(0x8080808080808080))
-		return false;
-
-	return true;
-}
-
 #endif							/* PG_WCHAR_H */
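
The two accumulators described in the function's comments can be tried outside the PostgreSQL tree. What follows is a minimal sketch, assuming plain C99: `uint64_t`, `UINT64_C` and `assert` stand in for PostgreSQL's `uint64`, `UINT64CONST` and `Assert`, and the name `sketch_is_valid_ascii` is hypothetical (it is not a PostgreSQL function). The logic is meant to mirror the moved function, not to replace it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical standalone rendering of the check, for illustration only.
 * Returns false if the input contains a zero byte or a byte with the high
 * bit set.  len must be a multiple of 8, as in the original.
 */
static inline bool
sketch_is_valid_ascii(const unsigned char *s, int len)
{
	uint64_t	chunk;
	uint64_t	highbit_cum = 0;
	uint64_t	zero_cum = UINT64_C(0x8080808080808080);

	assert(len % (int) sizeof(chunk) == 0);

	while (len > 0)
	{
		/* memcpy avoids unaligned or aliasing-unsafe loads. */
		memcpy(&chunk, s, sizeof(chunk));

		/*
		 * Adding 0x7f to a byte in 0x01..0x7f yields 0x80..0xfe, so its
		 * high bit ends up set and nothing carries into the next byte.  A
		 * zero byte yields 0x7f, which clears that byte's bit in zero_cum.
		 */
		zero_cum &= chunk + UINT64_C(0x7f7f7f7f7f7f7f7f);

		/* Any byte >= 0x80 leaves its high bit in this accumulator. */
		highbit_cum |= chunk;

		s += sizeof(chunk);
		len -= (int) sizeof(chunk);
	}

	if (highbit_cum & UINT64_C(0x8080808080808080))
		return false;
	if (zero_cum != UINT64_C(0x8080808080808080))
		return false;
	return true;
}
```

A byte with the high bit set can carry into its neighbour and hide a zero byte from `zero_cum`, but such input is already rejected through `highbit_cum`, which is why the original comment says the carry "doesn't matter".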

‎src/include/utils/ascii.h

Lines changed: 52 additions & 0 deletions

@@ -13,4 +13,56 @@
 
 extern void ascii_safe_strlcpy(char *dest, const char *src, size_t destsiz);
 
+/*
+ * Verify a chunk of bytes for valid ASCII.
+ *
+ * Returns false if the input contains any zero bytes or bytes with the
+ * high-bit set. Input len must be a multiple of 8.
+ */
+static inline bool
+is_valid_ascii(const unsigned char *s, int len)
+{
+	uint64		chunk,
+				highbit_cum = UINT64CONST(0),
+				zero_cum = UINT64CONST(0x8080808080808080);
+
+	Assert(len % sizeof(chunk) == 0);
+
+	while (len > 0)
+	{
+		memcpy(&chunk, s, sizeof(chunk));
+
+		/*
+		 * Capture any zero bytes in this chunk.
+		 *
+		 * First, add 0x7f to each byte. This sets the high bit in each byte,
+		 * unless it was a zero. If any resulting high bits are zero, the
+		 * corresponding high bits in the zero accumulator will be cleared.
+		 *
+		 * If none of the bytes in the chunk had the high bit set, the max
+		 * value each byte can have after the addition is 0x7f + 0x7f = 0xfe,
+		 * and we don't need to worry about carrying over to the next byte. If
+		 * any input bytes did have the high bit set, it doesn't matter
+		 * because we check for those separately.
+		 */
+		zero_cum &= (chunk + UINT64CONST(0x7f7f7f7f7f7f7f7f));
+
+		/* Capture any set bits in this chunk. */
+		highbit_cum |= chunk;
+
+		s += sizeof(chunk);
+		len -= sizeof(chunk);
+	}
+
+	/* Check if any high bits in the high bit accumulator got set. */
+	if (highbit_cum & UINT64CONST(0x8080808080808080))
+		return false;
+
+	/* Check if any high bits in the zero accumulator got cleared. */
+	if (zero_cum != UINT64CONST(0x8080808080808080))
+		return false;
+
+	return true;
+}
+
 #endif							/* _ASCII_H_ */
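
Because the header comment requires len to be a multiple of 8, a caller holding a buffer of arbitrary length has to deal with the remainder itself. The sketch below shows one way that might look, assuming plain C99: whole 8-byte strides go through a word-at-a-time check and the leftover bytes are checked one by one. The names `SKETCH_STRIDE`, `sketch_stride_is_ascii` and `sketch_buffer_is_ascii` are hypothetical, and this is not a description of how any PostgreSQL caller is written.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_STRIDE 8			/* hypothetical; equals sizeof(uint64_t) */

/* Word-at-a-time check over exactly one 8-byte stride. */
static bool
sketch_stride_is_ascii(const unsigned char *s)
{
	const uint64_t highbits = UINT64_C(0x8080808080808080);
	uint64_t	chunk;

	memcpy(&chunk, s, sizeof(chunk));

	/* Reject bytes >= 0x80 first, so the addition below cannot carry. */
	if (chunk & highbits)
		return false;

	/* A zero byte becomes 0x7f and loses its high bit. */
	return ((chunk + UINT64_C(0x7f7f7f7f7f7f7f7f)) & highbits) == highbits;
}

/* Returns true if all len bytes are in the range 0x01..0x7f. */
static bool
sketch_buffer_is_ascii(const unsigned char *s, size_t len)
{
	assert(s != NULL || len == 0);

	/* Fast path: whole strides. */
	while (len >= SKETCH_STRIDE)
	{
		if (!sketch_stride_is_ascii(s))
			return false;
		s += SKETCH_STRIDE;
		len -= SKETCH_STRIDE;
	}

	/* Slow path: the remaining 0..7 bytes, one at a time. */
	while (len > 0)
	{
		if (*s == 0 || (*s & 0x80))
			return false;
		s++;
		len--;
	}

	return true;
}
```

Unlike the moved function, this sketch checks each stride separately and returns early instead of accumulating results across the whole input.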
