Movatterモバイル変換

Issue34151

➜

This issue trackerhas been migrated toGitHub, and is currentlyread-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/78332

classification

Title:	use malloc() for better performance of some list operations
Type:	performance	Stage:	resolved
Components:	Interpreter Core	Versions:	Python 3.8

process

Status:	closed	Resolution:	fixed
Dependencies:		Superseder:
Assigned To:		Nosy List:	rhettinger, scoder, sir-sigurd, xiang.zhang, xtreak
Priority:	normal	Keywords:	patch

Created on2018-07-18 17:31 bysir-sigurd, last changed2022-04-11 14:59 byadmin. This issue is nowclosed.

Pull Requests
URL	Status	Linked	Edit
PR 8332	merged	sir-sigurd,2018-07-18 17:33

Messages (4)
msg321905 -(view)	Author: Sergey Fedoseev (sir-sigurd)*	Date: 2018-07-18 17:31
Currently list concatenation, slicing and repeating operations are using PyList_New() which allocates memory for the items by calloc(). malloc() could be used instead, since the allocated memory is overwritten by mentioned operations.I made benchmarks with this script:NAME=list-malloc-master.jsonpython -m perf timeit --name slice0 -s "l = [None]1000000" "l[:0]" --duplicate=2048 --append $NAMEpython -m perf timeit --name slice1 -s "l = [None]1000000" "l[:1]" --duplicate=1024 --append $NAMEpython -m perf timeit --name slice2 -s "l = [None]1000000" "l[:2]" --duplicate=1024 --append $NAMEpython -m perf timeit --name slice3 -s "l = [None]1000000" "l[:3]" --duplicate=1024 --append $NAMEpython -m perf timeit --name slice1000000 -s "l = [None]1000000" "l[:1000000]" --append $NAMEpython -m perf timeit --name cat0 -s "l = [None]0" "l + l" --duplicate=1024 --append $NAMEpython -m perf timeit --name cat1 -s "l = [None]1" "l 1" --duplicate=1024 --append $NAMEpython -m perf timeit --name cat2 -s "l = [None]2" "l 1" --duplicate=1024 --append $NAMEpython -m perf timeit --name cat3 -s "l = [None]3" "l 1" --duplicate=1024 --append $NAMEpython -m perf timeit --name cat1000000 -s "l = [None]1000000" "l 1" --append $NAMEpython -m perf timeit --name 1x0 -s "l = [None]" "l * 0" --duplicate=1024 --append $NAMEpython -m perf timeit --name 1x1 -s "l = [None]" "l * 1" --duplicate=1024 --append $NAMEpython -m perf timeit --name 1x2 -s "l = [None]" "l * 2" --duplicate=1024 --append $NAMEpython -m perf timeit --name 1x3 -s "l = [None]" "l * 3" --duplicate=1024 --append $NAMEpython -m perf timeit --name 1x1000000 -s "l = [None]" "l * 1000000" --append $NAME Here's comparison table:+--------------+--------------------+------------------------------+\| Benchmark \| list-malloc-master \| list-malloc \|+==============+====================+==============================+\| slice1 \| 84.5 ns \| 59.6 ns: 1.42x faster (-30%) \|+--------------+--------------------+------------------------------+\| slice2 \| 71.6 ns \| 61.8 ns: 1.16x faster (-14%) \|+--------------+--------------------+------------------------------+\| slice3 \| 74.4 ns \| 63.6 ns: 1.17x faster (-15%) \|+--------------+--------------------+------------------------------+\| slice1000000 \| 4.39 ms \| 4.08 ms: 1.08x faster (-7%) \|+--------------+--------------------+------------------------------+\| cat0 \| 23.9 ns \| 24.9 ns: 1.04x slower (+4%) \|+--------------+--------------------+------------------------------+\| cat1 \| 73.2 ns \| 51.9 ns: 1.41x faster (-29%) \|+--------------+--------------------+------------------------------+\| cat2 \| 61.6 ns \| 53.1 ns: 1.16x faster (-14%) \|+--------------+--------------------+------------------------------+\| cat3 \| 63.0 ns \| 54.3 ns: 1.16x faster (-14%) \|+--------------+--------------------+------------------------------+\| cat1000000 \| 4.38 ms \| 4.08 ms: 1.07x faster (-7%) \|+--------------+--------------------+------------------------------+\| 1x0 \| 27.1 ns \| 27.7 ns: 1.02x slower (+2%) \|+--------------+--------------------+------------------------------+\| 1x1 \| 72.9 ns \| 51.9 ns: 1.41x faster (-29%) \|+--------------+--------------------+------------------------------+\| 1x2 \| 60.9 ns \| 52.9 ns: 1.15x faster (-13%) \|+--------------+--------------------+------------------------------+\| 1x3 \| 62.5 ns \| 54.8 ns: 1.14x faster (-12%) \|+--------------+--------------------+------------------------------+\| 1x1000000 \| 2.67 ms \| 2.34 ms: 1.14x faster (-12%) \|+--------------+--------------------+------------------------------+Not significant (1): slice0
msg321914 -(view)	Author: Stefan Behnel (scoder)*	Date: 2018-07-18 20:07
Nice! Patch looks good to me, minus the usual naming nit-pick.
msg321925 -(view)	Author: Raymond Hettinger (rhettinger)*	Date: 2018-07-19 05:27
+1 This looks like a reasonable improvement.
msg323415 -(view)	Author: Xiang Zhang (xiang.zhang)*	Date: 2018-08-11 13:12
New changeset2fc46979b8c802675ca7fd51c6f2108a305001c8 by Xiang Zhang (Sergey Fedoseev) in branch 'master':bpo-34151: Improve performance of some list operations (GH-8332)https://github.com/python/cpython/commit/2fc46979b8c802675ca7fd51c6f2108a305001c8

History
Date	User	Action	Args
2022-04-11 14:59:03	admin	set	github: 78332
2018-08-11 13:14:26	xiang.zhang	set	status: open -> closed resolution: fixed stage: patch review -> resolved
2018-08-11 13:12:11	xiang.zhang	set	nosy: +xiang.zhang messages: +msg323415
2018-07-19 06:19:13	xtreak	set	nosy: +xtreak
2018-07-19 05:27:15	rhettinger	set	nosy: +rhettinger messages: +msg321925
2018-07-18 20:07:04	scoder	set	nosy: +scoder messages: +msg321914 versions: + Python 3.8
2018-07-18 17:33:45	sir-sigurd	set	keywords: +patch stage: patch review pull_requests: +pull_request7869
2018-07-18 17:31:49	sir-sigurd	create

Supported byThe Python Software Foundation,
Powered byRoundup