
This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.
Created on 2013-04-13 11:41 by serhiy.storchaka, last changed 2022-04-11 14:57 by admin. This issue is now closed.
Files

| File name | Uploaded | Description |
|---|---|---|
| fix_bad_persid.patch | alexandre.vassalotti, 2013-04-14 04:01 | |
| fix_bad_persid_2.patch | serhiy.storchaka, 2015-02-13 08:32 | |
Messages (9)
| msg186705 | Author: Serhiy Storchaka (serhiy.storchaka) | Date: 2013-04-13 11:41 |
Python 2 allows pickling and unpickling non-ASCII persistent IDs. In Python 3, the C implementation of pickle saves persistent IDs with protocol version 0 as UTF-8-encoded strings and loads them as bytes.

    >>> import pickle, io
    >>> class MyPickler(pickle.Pickler):
    ...     def persistent_id(self, obj):
    ...         if isinstance(obj, str):
    ...             return obj
    ...         return None
    ...
    >>> class MyUnpickler(pickle.Unpickler):
    ...     def persistent_load(self, pid):
    ...         return pid
    ...
    >>> f = io.BytesIO(); MyPickler(f).dump('\u20ac'); data = f.getvalue()
    >>> MyUnpickler(io.BytesIO(data)).load()
    '€'
    >>> f = io.BytesIO(); MyPickler(f, 0).dump('\u20ac'); data = f.getvalue()
    >>> MyUnpickler(io.BytesIO(data)).load()
    b'\xe2\x82\xac'
    >>> f = io.BytesIO(); MyPickler(f, 0).dump('a'); data = f.getvalue()
    >>> MyUnpickler(io.BytesIO(data)).load()
    b'a'

The Python implementation in Python 3 doesn't work with non-ASCII persistent IDs at all.
| msg186789 | Author: Alexandre Vassalotti (alexandre.vassalotti) | Date: 2013-04-13 18:35 |
In protocol 0, the persistent ID is restricted to alphanumeric strings because of the problems that arise when the persistent ID contains newline characters. _pickle should probably be changed to use the ASCII decoder. Perhaps we should check for embedded newline characters too.
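To illustrate the newline concern raised in this message, here is a minimal sketch of how a protocol 0 stream frames a persistent ID: the `P` opcode is followed by the ID text and a terminating newline. The `RecordingUnpickler` class and the hand-written byte strings are assumptions made for illustration, not part of either patch, and the `str` result assumes a CPython release in which this issue is fixed.

```python
import io
import pickle


class RecordingUnpickler(pickle.Unpickler):
    """Hypothetical helper: records every persistent ID it is asked to load."""

    def __init__(self, file):
        super().__init__(file)
        self.seen = []

    def persistent_load(self, pid):
        self.seen.append(pid)
        return pid


# Well-formed protocol 0 stream: PERSID opcode, ID text, newline, STOP.
ok = RecordingUnpickler(io.BytesIO(b"Pabc\n."))
ok_result = ok.load()

# Hand-written stream that tries to carry the ID "a\nb".
# The reader stops at the first newline, so only "a" is taken as the ID
# and the remaining bytes are interpreted as pickle opcodes.
bad = RecordingUnpickler(io.BytesIO(b"Pa\nb\n."))
try:
    bad.load()
except Exception:
    # The leftover bytes do not form a valid pickle in this sketch.
    pass

print(ok_result, ok.seen, bad.seen)
```

Because the ID is delimited by a newline, an embedded newline cannot survive a protocol 0 round trip, which is the motivation for restricting what such IDs may contain.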
| msg186816 | Author: Serhiy Storchaka (serhiy.storchaka) | Date: 2013-04-13 20:07 |
Even for alphanumeric strings Python 3 has a bug: it saves strings and loads bytes objects.
| msg186881 | Author: Alexandre Vassalotti (alexandre.vassalotti) | Date: 2013-04-14 04:01 |
Here's a patch that fixes the bug.
| msg186894 | Author: Serhiy Storchaka (serhiy.storchaka) | Date: 2013-04-14 08:33 |
I think a string with character codes < 256 would be better for test_protocol0_is_ascii_only(). It can be latin1-encoded (Python 2 allows any 8-bit strings).

PyUnicode_AsASCIIString() can be slower than _PyUnicode_AsStringAndSize() (actually PyUnicode_AsUTF8AndSize()) because the latter can use a cached value. You can check whether the persistent ID contains only ASCII characters by checking PyUnicode_GET_LENGTH(pid_str) == size.

And what are you going to do about the fact that in Python 2 you can pickle non-ASCII persistent IDs, which cannot then be unpickled in Python 3?
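The length comparison suggested in this message can be sketched in Python. This is only an analogue of the C-level check, not code from the patch: `is_ascii_by_size` is a hypothetical name, and `len(text)` stands in for `PyUnicode_GET_LENGTH` while the encoded byte count stands in for the `size` returned alongside the UTF-8 buffer.

```python
def is_ascii_by_size(text: str) -> bool:
    """Return True if text contains only ASCII characters.

    UTF-8 encodes each ASCII character as exactly one byte and every
    other code point as two or more bytes, so the character count and
    the encoded byte count are equal only for pure-ASCII text.
    """
    # "surrogatepass" keeps lone surrogates from raising; they encode to
    # three bytes each and are therefore reported as non-ASCII.
    encoded = text.encode("utf-8", errors="surrogatepass")
    return len(text) == len(encoded)


print(is_ascii_by_size("abc"), is_ascii_by_size("\u20ac"))
```

The appeal of this check in C is that the UTF-8 representation may already be cached on the string object, so no separate ASCII encoding pass is needed.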
| msg235881 | Author: Serhiy Storchaka (serhiy.storchaka) | Date: 2015-02-13 08:32 |
The patch is updated to current sources. It also optimizes writing ASCII strings and fixes the tests.
| msg268851 | Author: Serhiy Storchaka (serhiy.storchaka) | Date: 2016-06-19 12:03 |
Ping.
| msg269874 | Author: Serhiy Storchaka (serhiy.storchaka) | Date: 2016-07-06 09:31 |
Ping again.
| msg270619 | Author: Roundup Robot (python-dev) | Date: 2016-07-17 08:36 |
New changeset f6a41552a312 by Serhiy Storchaka in branch '3.5':
Issue #17711: Fixed unpickling by the persistent ID with protocol 0.
https://hg.python.org/cpython/rev/f6a41552a312

New changeset df8857c6f3eb by Serhiy Storchaka in branch 'default':
Issue #17711: Fixed unpickling by the persistent ID with protocol 0.
https://hg.python.org/cpython/rev/df8857c6f3eb
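The behaviour after the fix can be checked with a short script. This is a minimal sketch, assuming a CPython release that contains the changesets above: the classes mirror those in the original report, `roundtrip` is a hypothetical helper, and the `PicklingError` for non-ASCII IDs under protocol 0 reflects the behaviour of current CPython rather than anything stated in this thread.

```python
import io
import pickle


class MyPickler(pickle.Pickler):
    def persistent_id(self, obj):
        # Treat every str as an external object identified by itself.
        return obj if isinstance(obj, str) else None


class MyUnpickler(pickle.Unpickler):
    def persistent_load(self, pid):
        # Return the persistent ID unchanged so its type can be inspected.
        return pid


def roundtrip(obj, protocol):
    """Pickle obj with MyPickler and load it back with MyUnpickler."""
    buf = io.BytesIO()
    MyPickler(buf, protocol).dump(obj)
    return MyUnpickler(io.BytesIO(buf.getvalue())).load()


# ASCII persistent ID, protocol 0: comes back as str rather than bytes.
ascii_result = roundtrip("a", 0)

# Non-ASCII persistent ID, binary protocol: round-trips as str.
binary_result = roundtrip("\u20ac", 2)

# Non-ASCII persistent ID, protocol 0: rejected at pickling time.
try:
    roundtrip("\u20ac", 0)
    rejected = False
except pickle.PicklingError:
    rejected = True

print(type(ascii_result).__name__, binary_result, rejected)
```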
History

| Date | User | Action | Args |
|---|---|---|---|
| 2022-04-11 14:57:44 | admin | set | github: 61911 |
| 2016-10-25 18:41:54 | serhiy.storchaka | set | status: open -> closed; resolution: fixed; stage: patch review -> resolved |
| 2016-07-17 08:36:13 | python-dev | set | nosy: +python-dev; messages: +msg270619 |
| 2016-07-07 10:46:23 | pitrou | set | nosy: -pitrou |
| 2016-07-06 09:31:09 | serhiy.storchaka | set | messages: +msg269874; versions: + Python 3.6, - Python 3.4 |
| 2016-06-19 12:03:07 | serhiy.storchaka | set | messages: +msg268851 |
| 2015-02-13 08:32:15 | serhiy.storchaka | set | files: +fix_bad_persid_2.patch; messages: +msg235881; versions: + Python 3.5, - Python 3.3 |
| 2013-04-14 08:33:29 | serhiy.storchaka | set | messages: +msg186894 |
| 2013-04-14 04:01:45 | alexandre.vassalotti | set | files: +fix_bad_persid.patch; messages: +msg186881; assignee: alexandre.vassalotti; keywords: +patch; stage: needs patch -> patch review |
| 2013-04-13 20:07:20 | serhiy.storchaka | set | messages: +msg186816 |
| 2013-04-13 18:35:18 | alexandre.vassalotti | set | messages: +msg186789 |
| 2013-04-13 11:41:49 | serhiy.storchaka | create | |