
This issue trackerhas been migrated toGitHub, and is currentlyread-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.
Created on2018-04-05 12:34 bymethane, last changed2022-04-11 14:58 byadmin. This issue is nowclosed.
| Pull Requests | |||
|---|---|---|---|
| URL | Status | Linked | Edit |
| PR 6386 | merged | methane,2018-04-05 13:21 | |
| PR 6393 | merged | miss-islington,2018-04-06 06:51 | |
| PR 6394 | merged | miss-islington,2018-04-06 06:52 | |
| Messages (5) | |||
|---|---|---|---|
| msg314986 -(view) | Author: Josh Rosenberg (josh.r)*![]() | Date: 2018-04-05 14:15 | |
Patch is good, but while we're at it, is there any reason why this multi-allocation design was even used? It PyMem_Mallocs a buffer, makes a C-style string in it, then uses PyUnicode_FromString to convert C-style string to Python str.Seems like the correct approach would be to just use PyUnicode_New to preallocate the final string buffer up front, then pull out the internal buffer with PyUnicode_1BYTE_DATA and populate that directly, saving a pointless allocation/deallocation, which also means the failure case means no cleanup needed at all, while barely changing the code (aside from removing the need to explicitly NUL terminate).Only reason I can see to avoid this would be if the codec names could contain arbitrary Unicode encoded as UTF-8 (and therefore strlen wouldn't tell you the final length in Unicode ordinals), but I'm pretty sure that's not the case (if it is, we're not normalizing properly, since we only lower case ASCII). If Unicode codec names need to be handled, there are other options, though the easy savings go away. | |||
| msg315010 -(view) | Author: Inada Naoki (methane)*![]() | Date: 2018-04-06 06:46 | |
> Only reason I can see to avoid this would be if the codec names could contain arbitrary Unicode encoded as UTF-8 (and therefore strlen wouldn't tell you the final length in Unicode ordinals), but I'm pretty sure that's not the case (if it is, we're not normalizing properly, since we only lower case ASCII). If Unicode codec names need to be handled, there are other options, though the easy savings go away.Maybe, we can add "encoding name must be ascii" restriction in future version (3.8+).But for now, I want to avoid any potential backward incompatibility. | |||
| msg315011 -(view) | Author: Inada Naoki (methane)*![]() | Date: 2018-04-06 06:51 | |
New changeset0c1c4563a65ac451021d927058e4f25013934eb2 by INADA Naoki in branch 'master':bpo-33231: Fix potential leak in normalizestring() (GH-6386)https://github.com/python/cpython/commit/0c1c4563a65ac451021d927058e4f25013934eb2 | |||
| msg315012 -(view) | Author: miss-islington (miss-islington) | Date: 2018-04-06 07:12 | |
New changeset64421d9237e33725e3c2916cdf2b6d6da1751c2a by Miss Islington (bot) in branch '3.7':bpo-33231: Fix potential leak in normalizestring() (GH-6386)https://github.com/python/cpython/commit/64421d9237e33725e3c2916cdf2b6d6da1751c2a | |||
| msg315013 -(view) | Author: miss-islington (miss-islington) | Date: 2018-04-06 07:37 | |
New changeset2350a4765265158072bf7ad9f04402406d3d1ada by Miss Islington (bot) in branch '3.6':bpo-33231: Fix potential leak in normalizestring() (GH-6386)https://github.com/python/cpython/commit/2350a4765265158072bf7ad9f04402406d3d1ada | |||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:58:59 | admin | set | github: 77412 |
| 2018-04-06 07:37:49 | methane | set | status: open -> closed stage: patch review -> resolved |
| 2018-04-06 07:37:29 | methane | set | resolution: fixed |
| 2018-04-06 07:37:06 | miss-islington | set | messages: +msg315013 |
| 2018-04-06 07:12:40 | miss-islington | set | nosy: +miss-islington messages: +msg315012 |
| 2018-04-06 06:52:34 | miss-islington | set | pull_requests: +pull_request6102 |
| 2018-04-06 06:51:48 | miss-islington | set | pull_requests: +pull_request6101 |
| 2018-04-06 06:51:29 | methane | set | messages: +msg315011 |
| 2018-04-06 06:46:10 | methane | set | messages: +msg315010 |
| 2018-04-05 14:15:19 | josh.r | set | nosy: +josh.r messages: +msg314986 |
| 2018-04-05 13:21:58 | methane | set | keywords: +patch stage: patch review pull_requests: +pull_request6095 |
| 2018-04-05 12:34:19 | methane | create | |