Movatterモバイル変換


[0]ホーム

URL:


homepage

Issue34151

This issue trackerhas been migrated toGitHub, and is currentlyread-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title:use malloc() for better performance of some list operations
Type:performanceStage:resolved
Components:Interpreter CoreVersions:Python 3.8
process
Status:closedResolution:fixed
Dependencies:Superseder:
Assigned To:Nosy List: rhettinger, scoder, sir-sigurd, xiang.zhang, xtreak
Priority:normalKeywords:patch

Created on2018-07-18 17:31 bysir-sigurd, last changed2022-04-11 14:59 byadmin. This issue is nowclosed.

Pull Requests
URLStatusLinkedEdit
PR 8332mergedsir-sigurd,2018-07-18 17:33
Messages (4)
msg321905 -(view)Author: Sergey Fedoseev (sir-sigurd)*Date: 2018-07-18 17:31
Currently list concatenation, slicing and repeating operations are using PyList_New() which allocates memory for the items by calloc(). malloc() could be used instead, since the allocated memory is overwritten by mentioned operations.I made benchmarks with this script:NAME=list-malloc-master.jsonpython -m perf timeit --name slice0 -s "l = [None]*1000000" "l[:0]" --duplicate=2048 --append $NAMEpython -m perf timeit --name slice1 -s "l = [None]*1000000" "l[:1]" --duplicate=1024 --append $NAMEpython -m perf timeit --name slice2 -s "l = [None]*1000000" "l[:2]" --duplicate=1024 --append $NAMEpython -m perf timeit --name slice3 -s "l = [None]*1000000" "l[:3]" --duplicate=1024 --append $NAMEpython -m perf timeit --name slice1000000 -s "l = [None]*1000000" "l[:1000000]" --append $NAMEpython -m perf timeit --name cat0 -s "l = [None]*0" "l + l" --duplicate=1024 --append $NAMEpython -m perf timeit --name cat1 -s "l = [None]*1" "l * 1" --duplicate=1024 --append $NAMEpython -m perf timeit --name cat2 -s "l = [None]*2" "l * 1" --duplicate=1024 --append $NAMEpython -m perf timeit --name cat3 -s "l = [None]*3" "l * 1" --duplicate=1024 --append $NAMEpython -m perf timeit --name cat1000000 -s "l = [None]*1000000" "l * 1" --append $NAMEpython -m perf timeit --name 1x0 -s "l = [None]" "l * 0" --duplicate=1024 --append $NAMEpython -m perf timeit --name 1x1 -s "l = [None]" "l * 1" --duplicate=1024 --append $NAMEpython -m perf timeit --name 1x2 -s "l = [None]" "l * 2" --duplicate=1024 --append $NAMEpython -m perf timeit --name 1x3 -s "l = [None]" "l * 3" --duplicate=1024 --append $NAMEpython -m perf timeit --name 1x1000000 -s "l = [None]" "l * 1000000" --append $NAME Here's comparison table:+--------------+--------------------+------------------------------+| Benchmark    | list-malloc-master | list-malloc                  |+==============+====================+==============================+| slice1       | 84.5 ns            | 59.6 ns: 1.42x faster (-30%) |+--------------+--------------------+------------------------------+| slice2       | 71.6 ns            | 61.8 ns: 1.16x faster (-14%) |+--------------+--------------------+------------------------------+| slice3       | 74.4 ns            | 63.6 ns: 1.17x faster (-15%) |+--------------+--------------------+------------------------------+| slice1000000 | 4.39 ms            | 4.08 ms: 1.08x faster (-7%)  |+--------------+--------------------+------------------------------+| cat0         | 23.9 ns            | 24.9 ns: 1.04x slower (+4%)  |+--------------+--------------------+------------------------------+| cat1         | 73.2 ns            | 51.9 ns: 1.41x faster (-29%) |+--------------+--------------------+------------------------------+| cat2         | 61.6 ns            | 53.1 ns: 1.16x faster (-14%) |+--------------+--------------------+------------------------------+| cat3         | 63.0 ns            | 54.3 ns: 1.16x faster (-14%) |+--------------+--------------------+------------------------------+| cat1000000   | 4.38 ms            | 4.08 ms: 1.07x faster (-7%)  |+--------------+--------------------+------------------------------+| 1x0          | 27.1 ns            | 27.7 ns: 1.02x slower (+2%)  |+--------------+--------------------+------------------------------+| 1x1          | 72.9 ns            | 51.9 ns: 1.41x faster (-29%) |+--------------+--------------------+------------------------------+| 1x2          | 60.9 ns            | 52.9 ns: 1.15x faster (-13%) |+--------------+--------------------+------------------------------+| 1x3          | 62.5 ns            | 54.8 ns: 1.14x faster (-12%) |+--------------+--------------------+------------------------------+| 1x1000000    | 2.67 ms            | 2.34 ms: 1.14x faster (-12%) |+--------------+--------------------+------------------------------+Not significant (1): slice0
msg321914 -(view)Author: Stefan Behnel (scoder)*(Python committer)Date: 2018-07-18 20:07
Nice! Patch looks good to me, minus the usual naming nit-pick.
msg321925 -(view)Author: Raymond Hettinger (rhettinger)*(Python committer)Date: 2018-07-19 05:27
+1 This looks like a reasonable improvement.
msg323415 -(view)Author: Xiang Zhang (xiang.zhang)*(Python committer)Date: 2018-08-11 13:12
New changeset2fc46979b8c802675ca7fd51c6f2108a305001c8 by Xiang Zhang (Sergey Fedoseev) in branch 'master':bpo-34151: Improve performance of some list operations (GH-8332)https://github.com/python/cpython/commit/2fc46979b8c802675ca7fd51c6f2108a305001c8
History
DateUserActionArgs
2022-04-11 14:59:03adminsetgithub: 78332
2018-08-11 13:14:26xiang.zhangsetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2018-08-11 13:12:11xiang.zhangsetnosy: +xiang.zhang
messages: +msg323415
2018-07-19 06:19:13xtreaksetnosy: +xtreak
2018-07-19 05:27:15rhettingersetnosy: +rhettinger
messages: +msg321925
2018-07-18 20:07:04scodersetnosy: +scoder

messages: +msg321914
versions: + Python 3.8
2018-07-18 17:33:45sir-sigurdsetkeywords: +patch
stage: patch review
pull_requests: +pull_request7869
2018-07-18 17:31:49sir-sigurdcreate
Supported byThe Python Software Foundation,
Powered byRoundup
Copyright © 1990-2022,Python Software Foundation
Legal Statements

[8]ページ先頭

©2009-2026 Movatter.jp