
This issue trackerhas been migrated toGitHub, and is currentlyread-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.
Created on2016-11-22 14:29 byxiang.zhang, last changed2022-04-11 14:58 byadmin. This issue is nowclosed.
| Files | ||||
|---|---|---|---|---|
| File name | Uploaded | Description | Edit | |
| unicode_encode_ucs1_error_pos.patch | xiang.zhang,2016-11-22 14:29 | review | ||
| unicode_encode_ucs1_error_pos2.patch | serhiy.storchaka,2016-11-22 15:52 | review | ||
| Pull Requests | |||
|---|---|---|---|
| URL | Status | Linked | Edit |
| PR 552 | closed | dstufft,2017-03-31 16:36 | |
| Messages (10) | |||
|---|---|---|---|
| msg281482 -(view) | Author: Xiang Zhang (xiang.zhang)*![]() | Date: 2016-11-22 14:29 | |
unicode_encode_ucs1 now recognizes as many characters as it can one time instead of one character a time. But the unicodeerror positions still only count 1(the second time). A similar problem reported in#28561. | |||
| msg281489 -(view) | Author: STINNER Victor (vstinner)*![]() | Date: 2016-11-22 15:01 | |
If I understood correctly, the patch fix the ASCII encoder to handle correctly error handlers which return non-ASCII text replacement strings. Right?I am not aware of such error handler, so I guess that it's a more a theorical fix?I really hate the code (in each encoder) which handles non-ASCII replacement strings. The code in the charmap encoder is just a mess: it uses a reentrant call to the encoder... I never understood this crazy behaviour. I guess that nobody relies on the behaviour. I hesitate to simply raise an error instead of using different rules depending on the code. Ah yes, by the way, each codec behaves differently on non-ASCII replacement strings... | |||
| msg281490 -(view) | Author: Serhiy Storchaka (serhiy.storchaka)*![]() | Date: 2016-11-22 15:52 | |
LGTM. But it is too late for beta 4. I'll commit the patch either after releasing 3.6.0 or in the 3.7 branch only.And while we are here I noticed that handling non-ASCII replacement string could be simpler. | |||
| msg281517 -(view) | Author: STINNER Victor (vstinner)*![]() | Date: 2016-11-22 21:01 | |
> LGTM. But it is too late for beta 4. I'll commit the patch either after releasing 3.6.0 or in the 3.7 branch only.Right now, I suggest to only commit into 3.7. Such minor bug can wait for Python 3.6.1.> And while we are here I noticed that handling non-ASCII replacement string could be simplerI also suggest to first commit unicode_encode_ucs1_error_pos.patch and then commit the other part of unicode_encode_ucs1_error_pos2.patch in a separated commit. I will be easy to backport the fix to the 3.6 branch later.Serhiy: Xiang became a core developer, are you ok if he push himself unicode_encode_ucs1_error_pos.patch to default tomorrow, and later you rebase your patch on top of that?I'm not super confident because the fix doesn't come with an unit test, but it's ok if Serhiy reviewed it :-) | |||
| msg281519 -(view) | Author: Serhiy Storchaka (serhiy.storchaka)*![]() | Date: 2016-11-22 21:12 | |
This is not a bug itself. It seems to me that at worst case the current code is less efficient with non-standard error handler than it can be. I would commit the path to the 3.6 branch before beta 4 as it is nice and simple additional to already added optimization. But it is too late now, at last beta.Xiang can commit his patch to 3.7. No need to backport it to 3.6 (if I didn't miss something). | |||
| msg281520 -(view) | Author: STINNER Victor (vstinner)*![]() | Date: 2016-11-22 21:16 | |
> No need to backport it to 3.6 (if I didn't miss something).Sorry, I misunderstood the issue. It's an enhancement, so for 3.7 only.Right, 3.6 is now almost frozen, only major bug fixes blocking the release are accepted now (in short). Regular bugfixes should wait for 3.6.1. | |||
| msg281557 -(view) | Author: Roundup Robot (python-dev)![]() | Date: 2016-11-23 12:19 | |
New changeset3d660ed2a60e by Xiang Zhang in branch 'default':Issue#28774: Fix start/end pos in unicode_encode_ucs1().https://hg.python.org/cpython/rev/3d660ed2a60e | |||
| msg281558 -(view) | Author: Xiang Zhang (xiang.zhang)*![]() | Date: 2016-11-23 12:22 | |
Thanks Serhiy and Victor. Finished my first commit. :-)Now assign back to Serhiy and pos2 LGTM. | |||
| msg281560 -(view) | Author: Serhiy Storchaka (serhiy.storchaka)*![]() | Date: 2016-11-23 13:07 | |
Congratulations, Xiang! | |||
| msg281562 -(view) | Author: Roundup Robot (python-dev)![]() | Date: 2016-11-23 13:17 | |
New changeset3addf93f4111 by Serhiy Storchaka in branch 'default':Issue#28774: Simplified encoding a str result of an error handler in ASCIIhttps://hg.python.org/cpython/rev/3addf93f4111 | |||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:58:39 | admin | set | github: 72960 |
| 2017-03-31 16:36:29 | dstufft | set | pull_requests: +pull_request1024 |
| 2016-11-23 13:19:32 | serhiy.storchaka | set | status: open -> closed resolution: fixed stage: commit review -> resolved |
| 2016-11-23 13:17:31 | python-dev | set | messages: +msg281562 |
| 2016-11-23 13:07:24 | serhiy.storchaka | set | messages: +msg281560 |
| 2016-11-23 12:22:11 | xiang.zhang | set | assignee:xiang.zhang ->serhiy.storchaka messages: +msg281558 |
| 2016-11-23 12:19:27 | python-dev | set | nosy: +python-dev messages: +msg281557 |
| 2016-11-22 21:16:05 | vstinner | set | messages: +msg281520 |
| 2016-11-22 21:12:32 | serhiy.storchaka | set | assignee:serhiy.storchaka ->xiang.zhang messages: +msg281519 stage: patch review -> commit review |
| 2016-11-22 21:01:33 | vstinner | set | messages: +msg281517 |
| 2016-11-22 15:52:10 | serhiy.storchaka | set | files: +unicode_encode_ucs1_error_pos2.patch messages: +msg281490 |
| 2016-11-22 15:01:34 | vstinner | set | versions: - Python 3.6 |
| 2016-11-22 15:01:00 | vstinner | set | messages: +msg281489 |
| 2016-11-22 14:54:16 | serhiy.storchaka | set | priority: normal -> low assignee:serhiy.storchaka |
| 2016-11-22 14:29:22 | xiang.zhang | create | |