
This issue trackerhas been migrated toGitHub, and is currentlyread-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.
Created on2016-11-28 16:29 byxiang.zhang, last changed2022-04-11 14:58 byadmin. This issue is nowclosed.
| Files | ||||
|---|---|---|---|---|
| File name | Uploaded | Description | Edit | |
| PyUnicode_FindChar.patch | xiang.zhang,2016-11-28 16:29 | review | ||
| PyUnicode_FindChar-v2.patch | xiang.zhang,2016-11-29 15:51 | review | ||
| Pull Requests | |||
|---|---|---|---|
| URL | Status | Linked | Edit |
| PR 552 | closed | dstufft,2017-03-31 16:36 | |
| Messages (13) | |||
|---|---|---|---|
| msg281883 -(view) | Author: Xiang Zhang (xiang.zhang)*![]() | Date: 2016-11-28 16:29 | |
PyUnicode_FindChar declares in the doc it treats its *start* and *end* parameters as str[start:end], same as other APIs like PyUnicode_Find, PyUnicode_Count. But it doesn't allow negative indices like others so violates the doc. | |||
| msg281889 -(view) | Author: STINNER Victor (vstinner)*![]() | Date: 2016-11-28 16:56 | |
PyUnicode_FindChar.patch is a new feature, it cannot be applied to stable branches (py < 3.7).I'm not sure that it's worth it to support negative indexes for end. Why not simply documenting that end must be positive? | |||
| msg281938 -(view) | Author: Xiang Zhang (xiang.zhang)*![]() | Date: 2016-11-29 03:09 | |
Other APIs like PyUnicode_Find and PyUnicode_Count support it. Their docs are almost the same so I think PyUnicode_FindChar does not need to be the special one. After change, its behaviour and implementation are more consistent with other APIs. | |||
| msg281962 -(view) | Author: STINNER Victor (vstinner)*![]() | Date: 2016-11-29 08:02 | |
Serhiy: I don't think that it's worth it to add a new function to _testcapi to test PyUnicode_FindChar. The implementation of the function seems simple.At least, I would prefer to only see a few unit tests, not 17 test for this simple function!I mean "character in str" is already tested by a *lot* of unit tests. | |||
| msg281967 -(view) | Author: Serhiy Storchaka (serhiy.storchaka)*![]() | Date: 2016-11-29 08:20 | |
I think it is nice to add tests for C API. Especially if there is no direct mapping between Python and C API ("character in str" don't call PyUnicode_FindChar()). Tests should cover all corner cases, otherwise we can miss bugs. Some C API can be not used in CPython at all, just in third-party extensions, and special tests is the only way to test them. The implementation of PyUnicode_FindChar() is not so simple (for example seeissue24821).I don't have an opinion about supporting negative indices. | |||
| msg281972 -(view) | Author: Serhiy Storchaka (serhiy.storchaka)*![]() | Date: 2016-11-29 08:38 | |
Would be nice to test corner cases:1. Search UCS2 or UCS4 character with zero lower 8 bits: U+XX00.2. Search UCS2 or UCS4 character with lower 8 bits that match high bits of string characters. For example search U+0404 in the string that consists of U+04XX (Ukrainian text). I think you can find similar Chinese example. | |||
| msg282000 -(view) | Author: Xiang Zhang (xiang.zhang)*![]() | Date: 2016-11-29 15:51 | |
Thanks for your reviews. :-)v2 updated the test codes. | |||
| msg282016 -(view) | Author: STINNER Victor (vstinner)*![]() | Date: 2016-11-29 17:15 | |
PyUnicode_FindChar-v2.patch LGTM with a minor comment on the review, but I would prefer that Serhiy also reviews it ;-)Remaining question: what is the behaviour for direction=0, direction=100 or direction=-2? Maybe we can add a few unit tests for strange values of direction? (Not sure if it's worth it.) | |||
| msg282021 -(view) | Author: Xiang Zhang (xiang.zhang)*![]() | Date: 2016-11-29 17:29 | |
> Remaining question: what is the behaviour for direction=0, direction=100 or direction=-2? Maybe we can add a few unit tests for strange values of direction? (Not sure if it's worth it.)It's not documented so I also doubt it. Expect Serhiy's comment. | |||
| msg283682 -(view) | Author: STINNER Victor (vstinner)*![]() | Date: 2016-12-20 11:25 | |
Ignore my request about special direction values. It's not worth it to writ tests for that.PyUnicode_FindChar-v2.patch LGTM. | |||
| msg283695 -(view) | Author: Roundup Robot (python-dev)![]() | Date: 2016-12-20 14:55 | |
New changesetce6a6cc3765d by Xiang Zhang in branch 'default':Issue#28822: Adjust indices handling of PyUnicode_FindChar().https://hg.python.org/cpython/rev/ce6a6cc3765d | |||
| msg283705 -(view) | Author: Xiang Zhang (xiang.zhang)*![]() | Date: 2016-12-20 16:41 | |
Thanks Victor and Serhiy! | |||
| msg286459 -(view) | Author: Roundup Robot (python-dev)![]() | Date: 2017-01-29 23:48 | |
New changeset2b5e5a3a805e by Martin Panter in branch 'default':Issue#28822: Add susp-ignored entry for NEWS; fix grammarhttps://hg.python.org/cpython/rev/2b5e5a3a805e | |||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:58:40 | admin | set | github: 73008 |
| 2017-03-31 16:36:12 | dstufft | set | pull_requests: +pull_request875 |
| 2017-01-29 23:48:18 | python-dev | set | messages: +msg286459 |
| 2016-12-20 16:41:01 | xiang.zhang | set | status: open -> closed resolution: fixed messages: +msg283705 stage: patch review -> resolved |
| 2016-12-20 14:55:53 | python-dev | set | nosy: +python-dev messages: +msg283695 |
| 2016-12-20 11:25:47 | vstinner | set | messages: +msg283682 |
| 2016-11-29 17:29:52 | xiang.zhang | set | messages: +msg282021 |
| 2016-11-29 17:15:27 | vstinner | set | messages: +msg282016 |
| 2016-11-29 15:51:28 | xiang.zhang | set | files: +PyUnicode_FindChar-v2.patch messages: +msg282000 |
| 2016-11-29 08:38:26 | serhiy.storchaka | set | messages: +msg281972 |
| 2016-11-29 08:20:24 | serhiy.storchaka | set | messages: +msg281967 |
| 2016-11-29 08:02:01 | vstinner | set | messages: +msg281962 |
| 2016-11-29 03:09:51 | xiang.zhang | set | messages: +msg281938 versions: - Python 3.5, Python 3.6 |
| 2016-11-28 16:56:35 | vstinner | set | messages: +msg281889 |
| 2016-11-28 16:29:18 | xiang.zhang | create | |