Description
Feature or enhancement
There is a public `PyUnicode_CompareWithASCIIString()` function. Despite its name, it compares a Python string object with an ISO-8859-1 encoded C string. It returns -1, 0 or 1 and never sets an error.
There is a private `_PyUnicode_EqualToASCIIString()` function. It only works with an ASCII-encoded C string and crashes in a debug build if the string is not ASCII. It returns 0 or 1 and never sets an error.
`_PyUnicode_EqualToASCIIString()` is more efficient than `PyUnicode_CompareWithASCIIString()`, because if the arguments are not equal it can simply return false instead of determining which one is larger. This was the main reason for introducing it. It is also more convenient, because you do not need to add `== 0` or `!= 0` after the call (and if the comparison is left out, the code is difficult to read).
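The difference between the two return conventions can be sketched in plain C. This is a simplified model on byte buffers, not CPython's implementation; `model_compare` and `model_equal` are hypothetical names introduced only for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical three-way comparison: returns -1, 0 or 1.
   It must locate the first differing byte to decide the ordering. */
static int model_compare(const char *a, size_t alen,
                         const char *b, size_t blen)
{
    size_t n = alen < blen ? alen : blen;
    int c = memcmp(a, b, n);
    if (c != 0) {
        return c < 0 ? -1 : 1;
    }
    if (alen == blen) {
        return 0;
    }
    return alen < blen ? -1 : 1;
}

/* Hypothetical equality test: returns 0 or 1.
   A length mismatch is enough to answer without reading any data. */
static int model_equal(const char *a, size_t alen,
                       const char *b, size_t blen)
{
    if (alen != blen) {
        return 0;
    }
    return memcmp(a, b, alen) == 0;
}
```

At the call site, the equality form reads as a plain condition, `if (model_equal(...))`, while the three-way form needs an explicit `== 0` to express the same intent.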
I propose adding the latter function to the public C API, and also extending it to support UTF-8 encoded C strings. While most use cases are ASCII-only, formally almost all C strings in the C API are UTF-8 encoded. `PyUnicode_FromString()` and `PyUnicode_AsUTF8AndSize()`, which are used to convert between Python and C strings, use the UTF-8 encoding. `PyTypeObject.tp_name`, `PyMethodDef.ml_name` and `PyDescrObject.d_name` are all UTF-8 encoded. `PyUnicode_CompareWithASCIIString()` cannot be used to compare a Python string with such names.
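To illustrate why the encoding matters, here is a minimal sketch that models a Python string as an array of code points and compares it with a C string under both interpretations. It does not use the CPython API; `model_equal_to_utf8` and `model_equal_to_latin1` are hypothetical names, and rejecting surrogates and embedded NUL code points is an assumption of this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: compare code points with a NUL-terminated UTF-8
   string. Returns 1 if equal, 0 otherwise. */
static int model_equal_to_utf8(const uint32_t *cps, size_t n, const char *utf8)
{
    const unsigned char *p = (const unsigned char *)utf8;
    for (size_t i = 0; i < n; i++) {
        uint32_t c = cps[i];
        unsigned char buf[4];
        size_t len;
        if (c == 0) {
            return 0;  /* a NUL-terminated string cannot contain U+0000 */
        }
        else if (c < 0x80) {
            buf[0] = (unsigned char)c;
            len = 1;
        }
        else if (c < 0x800) {
            buf[0] = (unsigned char)(0xC0 | (c >> 6));
            buf[1] = (unsigned char)(0x80 | (c & 0x3F));
            len = 2;
        }
        else if (c < 0x10000) {
            if (c >= 0xD800 && c <= 0xDFFF) {
                return 0;  /* assumption: surrogates never match */
            }
            buf[0] = (unsigned char)(0xE0 | (c >> 12));
            buf[1] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
            buf[2] = (unsigned char)(0x80 | (c & 0x3F));
            len = 3;
        }
        else if (c <= 0x10FFFF) {
            buf[0] = (unsigned char)(0xF0 | (c >> 18));
            buf[1] = (unsigned char)(0x80 | ((c >> 12) & 0x3F));
            buf[2] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
            buf[3] = (unsigned char)(0x80 | (c & 0x3F));
            len = 4;
        }
        else {
            return 0;  /* outside the Unicode range */
        }
        /* Every encoded byte is non-zero, so hitting the terminator is a
           mismatch and we never read past the end of the C string. */
        for (size_t j = 0; j < len; j++) {
            if (p[j] != buf[j]) {
                return 0;
            }
        }
        p += len;
    }
    return *p == '\0';
}

/* Hypothetical model: the same test, but each byte of the C string is
   taken as one ISO-8859-1 character. */
static int model_equal_to_latin1(const uint32_t *cps, size_t n, const char *s)
{
    const unsigned char *p = (const unsigned char *)s;
    for (size_t i = 0; i < n; i++) {
        if (p[i] == '\0' || cps[i] != p[i]) {
            return 0;
        }
    }
    return p[n] == '\0';
}
```

With the code points of "café" (the last one being U+00E9), the UTF-8 bytes `"caf\xC3\xA9"` match under the UTF-8 model but not under the ISO-8859-1 model, which sees five characters instead of four.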
For `PyASCIIObject` objects the new function will be as fast as `_PyUnicode_EqualToASCIIString()`.
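One way to see why no speed is lost for ASCII-only strings: ASCII text has the same byte representation in UTF-8, so the comparison can reduce to a length check plus a byte comparison. The sketch below is a plain C model under that assumption, not the actual implementation; `model_ascii_equal_to_utf8` is a hypothetical name.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical fast path: `ascii` holds `len` ASCII bytes (one per
   character), `utf8` is a NUL-terminated UTF-8 string. Any non-ASCII
   byte in `utf8` simply fails the byte comparison. */
static int model_ascii_equal_to_utf8(const char *ascii, size_t len,
                                     const char *utf8)
{
    return strlen(utf8) == len && memcmp(ascii, utf8, len) == 0;
}
```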