Issue 5670

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: Speed up pickling of dicts in cPickle
Type: performance
Stage: commit review
Components: Extension Modules
Versions: Python 3.1, Python 2.7

process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: collinwinter
Nosy List: alexandre.vassalotti, amaury.forgeotdarc, collinwinter, feisan, pitrou
Priority: normal
Keywords: needs review, patch

Created on 2009-04-02 18:53 by collinwinter, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name                             Uploaded                                 Description
cpickle_dict.patch                    collinwinter, 2009-04-02 18:53           Patch against trunk, r71058
pickle_batch_dict_exact_py3k-5.diff   alexandre.vassalotti, 2009-04-03 14:42
Messages (20)
msg85239 - Author: Collin Winter (collinwinter) * (Python committer), Date: 2009-04-02 18:53
The attached patch adds another version of cPickle.c's batch_dict(), batch_dict_exact(), which is specialized for "type(x) is dict". This provides a nice performance boost when pickling objects that use dictionaries:

Pickle:
Min: 2.216 -> 1.858: 19.24% faster
Avg: 2.238 -> 1.889: 18.50% faster
Significant (t=106.874099, a=0.95)

Benchmark is at http://code.google.com/p/unladen-swallow/source/browse/tests/performance/macro_pickle.py (driver is ../perf.py; perf.py was run with "--rigorous -b pickle").

This patch passes all the tests added in issue 5665. I would recommend reviewing that patch first. I'll port to py3k once this is reviewed for trunk.
msg85245 - Author: Antoine Pitrou (pitrou) * (Python committer), Date: 2009-04-02 19:14
Without taking a very detailed look, the patch looks good.
Are there already tests for pickling of dict subclasses? Otherwise, they should be added.
msg85248 - Author: Antoine Pitrou (pitrou) * (Python committer), Date: 2009-04-02 19:20
By the way, could the same approach be applied to lists and sets as well?
msg85253 - Author: Collin Winter (collinwinter) * (Python committer), Date: 2009-04-02 19:39
On Thu, Apr 2, 2009 at 12:20 PM, Antoine Pitrou <report@bugs.python.org> wrote:
>
> Antoine Pitrou <pitrou@free.fr> added the comment:
>
> By the way, could the same approach be applied to lists and sets as well?

Certainly; see http://bugs.python.org/issue5671 for the list version. It doesn't make as big an impact on the benchmark, though.
msg85257 - Author: Antoine Pitrou (pitrou) * (Python committer), Date: 2009-04-02 19:44
> Certainly; see http://bugs.python.org/issue5671 for the list version.
> It doesn't make as big an impact on the benchmark, though.

How about splitting the benchmark in parts:
- (un)pickling lists
- (un)pickling dicts
- (un)pickling sets
(etc.)
msg85272 - Author: Collin Winter (collinwinter) * (Python committer), Date: 2009-04-02 22:10
Antoine: pickletester.py:test_newobj_generic() appears to test dict subclasses, though in a roundabout-ish way. I don't know of any tests for dict subclasses in the C level sense (i.e., PyDict_Check() vs PyDict_CheckExact()). I can add more explicit tests for Python-level dict subclasses, if you want.
msg85276 - Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer), Date: 2009-04-02 22:56
The patch produces different output for an empty dict: a sequence "MARK SETITEMS" is written, which is useless and wastes 2 bytes.
msg85277 - Author: Antoine Pitrou (pitrou) * (Python committer), Date: 2009-04-02 22:58
> Antoine: pickletester.py:test_newobj_generic() appears to test dict
> subclasses, though in a roundabout-ish way. I don't know of any tests
> for dict subclasses in the C level sense (ie, PyDict_Check() vs
> PyDict_CheckExact()). I can add more explicit tests for Python-level
> dict subclasses, if you want.

Well, Python-level dict subclasses are also C-level subclasses (in the PyDict_Check() sense), or am I mistaken?
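
The distinction under discussion can be sketched at the Python level. This is only an illustration: `MyDict`, `is_exact_dict` and `is_dict_or_subclass` are hypothetical names, the two helpers are Python-level analogues of the C macros rather than the macros themselves, and the round trip uses Python 3's pickle module with `collections.OrderedDict` standing in for a dict subclass.

```python
import pickle
from collections import OrderedDict


class MyDict(dict):
    """Hypothetical Python-level dict subclass, for illustration only."""


def is_exact_dict(obj):
    # Analogue of PyDict_CheckExact(): the type must be dict itself.
    return type(obj) is dict


def is_dict_or_subclass(obj):
    # Analogue of PyDict_Check(): dict or any subclass of dict.
    return isinstance(obj, dict)


plain = {"a": 1, "b": 2}
sub = MyDict(a=1, b=2)

# A Python-level subclass passes the subclass check but not the exact check,
# so a fast path guarded by the exact check would not be taken for it.
print(is_exact_dict(plain), is_dict_or_subclass(plain))
print(is_exact_dict(sub), is_dict_or_subclass(sub))

# A dict subclass from the standard library still round-trips through pickle.
ordered = OrderedDict([("a", 1), ("b", 2)])
restored = pickle.loads(pickle.dumps(ordered, protocol=2))
print(type(restored).__name__, restored == ordered)
```

A test for the slower, generic path therefore only needs an object for which the exact check is false, which any Python-level subclass provides.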
msg85293 - Author: Alexandre Vassalotti (alexandre.vassalotti) * (Python committer), Date: 2009-04-03 05:20
I ported the patch to py3k. In addition, I added a special-case when the dict contains only one item; you probably want this special-case in the trunk version as well.
msg85294 - Author: Alexandre Vassalotti (alexandre.vassalotti) * (Python committer), Date: 2009-04-03 05:23
Oops, I forgot to add the comment on top of batch_dict_exact in the patch. Here is a better patch.
msg85296 - Author: Alexandre Vassalotti (alexandre.vassalotti) * (Python committer), Date: 2009-04-03 05:51
Oops again, I just noticed that the comment for batch_dict_exact refers to batch_dict as being above, but I copied batch_dict_exact before batch_dict. Here's a good patch (hopefully) that puts batch_dict_exact at the right place.
msg85306 - Author: Alexandre Vassalotti (alexandre.vassalotti) * (Python committer), Date: 2009-04-03 14:37
Silly me, I had changed the PyDict_Size call in the outer loop to Py_SIZE and this is of course totally wrong. Here's a good patch (I am pretty sure now! ;-) I ran the whole test suite and I saw no failures.

Collin, you can go ahead and commit both patches. Nice work!
msg85307 - Author: Alexandre Vassalotti (alexandre.vassalotti) * (Python committer), Date: 2009-04-03 14:42
Sigh... silly me again. There is some other junk in my last patch.
msg85333 - Author: Collin Winter (collinwinter) * (Python committer), Date: 2009-04-03 21:22
FYI, I just added a pickle_dict microbenchmark to perf.py. Using this new microbenchmark, I see these results (perf.py -r -b pickle_dict):

pickle_dict:
Min: 2.092 -> 1.341: 56.04% faster
Avg: 2.126 -> 1.360: 56.37% faster
Significant (t=216.895643, a=0.95)

I still need to address the comment about pickling empty dicts.
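
For readers without perf.py at hand, a dict-pickling microbenchmark can be approximated with the standard library. The sketch below is not the pickle_dict benchmark from perf.py. The helper names, the data shape (100 dicts of 50 string keys) and the repeat counts are assumptions chosen for illustration, and it uses Python 3's pickle module instead of cPickle.

```python
import pickle
import timeit


def make_dicts(count=100, size=50):
    # Hypothetical workload: a list of exact dicts with string keys.
    return [{"key%d" % i: i for i in range(size)} for _ in range(count)]


def time_pickle_dicts(repeat=5, number=20):
    # Time pickle.dumps() on the workload and return the best (minimum)
    # total time over `repeat` runs of `number` calls each.
    data = make_dicts()
    timer = timeit.Timer(lambda: pickle.dumps(data, protocol=2))
    return min(timer.repeat(repeat=repeat, number=number))


best = time_pickle_dicts()
print("best of 5 runs: %.4f s for 20 dumps" % best)
```

Taking the minimum over several runs mirrors the "Min" row in the results above. Absolute timings depend on the machine and the interpreter version, so they are only comparable between two builds measured on the same setup.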
msg85335 - Author: Collin Winter (collinwinter) * (Python committer), Date: 2009-04-03 21:48
Amaury, I can't reproduce the issue you're seeing with empty dicts. Here's what I'm doing:

dhcp-172-19-19-199:trunk collinwinter$ ./python.exe
Python 2.7a0 (trunk:71100M, Apr  3 2009, 14:40:49)
[GCC 4.0.1 (Apple Inc. build 5490)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import cPickle, pickletools
>>> data = cPickle.dumps({}, protocol=2)
>>> pickletools.dis(data)
    0: \x80 PROTO      2
    2: }    EMPTY_DICT
    3: .    STOP
highest protocol among opcodes = 2
>>> data
'\x80\x02}.'
>>>

What are you doing to produce the MARK SETITEMS sequence?
msg85433 - Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer), Date: 2009-04-04 21:56
Sorry, I was wrong. I think I noticed that the case size==1 was handled differently, and incorrectly inferred the same for size==0.
(btw, the patch for trunk was not updated)
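
The size-dependent behaviour discussed here can be inspected by listing the opcodes emitted for dicts of size 0, 1 and 2. This sketch assumes Python 3's pickle module (cPickle does not exist there), so memo opcodes such as BINPUT may appear that the Python 2 session above does not show. `opcode_names` is a hypothetical helper.

```python
import pickle
import pickletools


def opcode_names(obj, protocol=2):
    # Pickle the object and return the names of the opcodes in the stream.
    data = pickle.dumps(obj, protocol=protocol)
    return [op.name for op, arg, pos in pickletools.genops(data)]


for d in ({}, {"a": 1}, {"a": 1, "b": 2}):
    print(len(d), opcode_names(d))
```

An empty dict produces EMPTY_DICT with no MARK or SETITEMS, a one-item dict uses a single SETITEM, and larger dicts use a MARK ... SETITEMS batch.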
msg86188 - Author: Kelvin Liang (feisan), Date: 2009-04-20 03:45
Can this patch be used or ported to 2.5.x?
msg86194 - Author: Antoine Pitrou (pitrou) * (Python committer), Date: 2009-04-20 11:03
Sorry, it won't even be integrated in 2.6 actually. It's a new feature, not a bug fix.
msg88303 - Author: Collin Winter (collinwinter) * (Python committer), Date: 2009-05-25 05:44
Fixed the len(d) == 1 size regression. Final performance of the patch relative to trunk:

Using Unladen Swallow's perf.py -b pickle,pickle_dict on trunk:

pickle:
Min: 2.238 -> 1.895: 18.08% faster
Avg: 2.241 -> 1.898: 18.04% faster
Significant (t=282.066701, a=0.95)

pickle_dict:
Min: 2.163 -> 1.375: 57.36% faster
Avg: 2.168 -> 1.376: 57.50% faster
Significant (t=527.668441, a=0.95)

Performance for py3k:

pickle:
Min: 2.849 -> 2.790: 2.10% faster
Avg: 2.854 -> 2.796: 2.09% faster
Significant (t=27.624303, a=0.95)

pickle_dict:
Min: 2.121 -> 1.512: 40.27% faster
Avg: 2.128 -> 1.519: 40.13% faster
Significant (t=283.406572, a=0.95)

regrtest.py -uall test_xpickle passes all backwards-compatibility tests for trunk, and all other tests run by regrtest.py on Linux pass.

Committed as r72909 (trunk), r72910 (py3k).
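
The "faster" percentages in these results appear consistent with (old / new - 1) * 100 applied to the timings. That formula is inferred from the numbers and is not stated in the thread. The reported figures were presumably computed from unrounded timings, so recomputing from the rounded values shown only matches to within about 0.1 percentage points. `percent_faster` is a hypothetical helper.

```python
def percent_faster(old, new):
    # Assumed formula: how much larger the old time is than the new time,
    # expressed as a percentage of the new time.
    return (old / new - 1.0) * 100.0


# (old seconds, new seconds, percentage reported in the thread) for the Min rows.
min_rows = {
    "trunk pickle": (2.238, 1.895, 18.08),
    "trunk pickle_dict": (2.163, 1.375, 57.36),
    "py3k pickle": (2.849, 2.790, 2.10),
    "py3k pickle_dict": (2.121, 1.512, 40.27),
}

for name, (old, new, reported) in min_rows.items():
    computed = percent_faster(old, new)
    print("%-18s computed %.2f%%, reported %.2f%%" % (name, computed, reported))
```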
msg88314 - Author: Antoine Pitrou (pitrou) * (Python committer), Date: 2009-05-25 09:35
Thanks!

> Committed as r72909 (trunk), r72910 (py3k).
>
> ----------
> resolution: accepted -> fixed
> status: open -> closed
History
Date                 User                  Action  Args
2022-04-11 14:56:47  admin                 set     github: 49920
2009-05-25 09:35:39  pitrou                set     messages: +msg88314
2009-05-25 05:44:08  collinwinter          set     status: open -> closed; resolution: accepted -> fixed; messages: +msg88303
2009-04-20 11:03:49  pitrou                set     messages: +msg86194
2009-04-20 03:45:06  feisan                set     nosy: +feisan; messages: +msg86188
2009-04-04 21:56:30  amaury.forgeotdarc    set     messages: +msg85433
2009-04-03 21:48:36  collinwinter          set     messages: +msg85335
2009-04-03 21:22:08  collinwinter          set     messages: +msg85333
2009-04-03 14:42:29  alexandre.vassalotti  set     files: -pickle_batch_dict_exact_py3k-4.diff
2009-04-03 14:42:24  alexandre.vassalotti  set     files: -pickle_batch_dict_exact_py3k-3.diff
2009-04-03 14:42:16  alexandre.vassalotti  set     files: +pickle_batch_dict_exact_py3k-5.diff; messages: +msg85307
2009-04-03 14:37:45  alexandre.vassalotti  set     files: +pickle_batch_dict_exact_py3k-4.diff; messages: +msg85306; assignee: collinwinter; keywords: +patch; resolution: accepted; stage: commit review
2009-04-03 05:52:05  alexandre.vassalotti  set     files: -pickle_batch_dict_exact_py3k-2.diff
2009-04-03 05:52:00  alexandre.vassalotti  set     files: -pickle_batch_dict_exact_py3k.diff
2009-04-03 05:51:51  alexandre.vassalotti  set     keywords: -patch; files: +pickle_batch_dict_exact_py3k-3.diff; messages: +msg85296; versions: + Python 3.1
2009-04-03 05:23:38  alexandre.vassalotti  set     files: +pickle_batch_dict_exact_py3k-2.diff; messages: +msg85294
2009-04-03 05:21:03  alexandre.vassalotti  set     files: +pickle_batch_dict_exact_py3k.diff; nosy: +alexandre.vassalotti; messages: +msg85293
2009-04-02 22:58:56  pitrou                set     messages: +msg85277
2009-04-02 22:56:01  amaury.forgeotdarc    set     nosy: +amaury.forgeotdarc; messages: +msg85276
2009-04-02 22:10:22  collinwinter          set     messages: +msg85272
2009-04-02 19:44:44  pitrou                set     messages: +msg85257
2009-04-02 19:39:47  collinwinter          set     messages: +msg85253
2009-04-02 19:20:20  pitrou                set     messages: +msg85248
2009-04-02 19:14:36  pitrou                set     nosy: +pitrou; messages: +msg85245
2009-04-02 18:53:50  collinwinter          create
