
This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer's Guide.
Created on 2017-12-24 05:19 by methane, last changed 2022-04-11 14:58 by admin. This issue is now closed.
**Files**

| File name | Uploaded | Description | Edit |
|---|---|---|---|
| lru_bench.py | methane, 2017-12-24 17:24 | | |
**Pull Requests**

| URL | Status | Linked | Edit |
|---|---|---|---|
| PR 5008 | merged | methane, 2017-12-25 07:55 | |
**Messages (13)**

**msg308980** - Author: Inada Naoki (methane) - Date: 2017-12-24 05:19
Currently, functools.lru_cache implements its own doubly-linked list. But it is less efficient than OrderedDict because each link node is a GC object, so it may use more memory and increase GC time.

I added two private C APIs for OrderedDict and made lru_cache use them:

* _PyODict_LRUGetItem(): similar to PyDict_GetItemWithHash() + od.move_to_end(last=True).
* _PyODict_PopItem(): similar to odict.popitem().

The reason I didn't implement a C version of move_to_end() is to reduce lookups: _PyODict_LRUGetItem(key) looks the key up once, while `od[key]; od.move_to_end(key)` looks it up twice.

I'll benchmark it and report the results here.
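For illustration, here is a minimal pure-Python sketch of the OrderedDict-based LRU pattern the proposed helpers would accelerate (the decorator name `od_lru_cache` is made up for this sketch, not code from the patch). Note that the hit path performs exactly the two lookups that _PyODict_LRUGetItem() is intended to fuse into one:

```python
from collections import OrderedDict

def od_lru_cache(maxsize=128):
    """Illustrative LRU cache built on OrderedDict (not the actual patch)."""
    def decorating_function(func):
        cache = OrderedDict()
        def wrapper(*args):
            try:
                result = cache[args]         # lookup #1
                cache.move_to_end(args)      # lookup #2: mark as most recently used
                return result
            except KeyError:
                pass
            result = func(*args)
            cache[args] = result
            if len(cache) > maxsize:
                cache.popitem(last=False)    # evict the least recently used entry
            return result
        return wrapper
    return decorating_function
```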
**msg308981** - Author: Inada Naoki (methane) - Date: 2017-12-24 05:32

Current implementation (no news entry yet): https://github.com/methane/cpython/pull/10/files
**msg308985** - Author: Serhiy Storchaka (serhiy.storchaka) - Date: 2017-12-24 08:05

This is a duplicate of issue28239.
**msg308986** - Author: Serhiy Storchaka (serhiy.storchaka) - Date: 2017-12-24 08:11

Ah, sorry, you use OrderedDict instead of just an ordered dict. It should have different timing and memory consumption.
**msg309003** - Author: Inada Naoki (methane) - Date: 2017-12-24 17:24

Hmm, it seems my implementation is about 30% slower in the miss-heavy scenarios. Maybe dict is faster than OrderedDict for massive insert/discard, but I need to profile it. On the other hand, GC speed looks about 2x faster, as expected.

$ ./python -m perf compare_to master.json patched.json -G

Slower (5):
- lru_1000_100: 217 ns +- 6 ns -> 302 ns +- 6 ns: 1.39x slower (+39%)
- lru_10000_1000: 225 ns +- 4 ns -> 309 ns +- 2 ns: 1.37x slower (+37%)
- lru_100_1000: 114 ns +- 5 ns -> 119 ns +- 1 ns: 1.05x slower (+5%)
- lru_100_100: 115 ns +- 6 ns -> 119 ns +- 1 ns: 1.03x slower (+3%)
- lru_1000_1000: 134 ns +- 6 ns -> 136 ns +- 1 ns: 1.02x slower (+2%)

Faster (4):
- gc(1000000): 98.3 ms +- 0.3 ms -> 37.9 ms +- 0.2 ms: 2.59x faster (-61%)
- gc(100000): 11.7 ms +- 0.0 ms -> 5.10 ms +- 0.02 ms: 2.29x faster (-56%)
- gc(10000): 1.48 ms +- 0.02 ms -> 1.04 ms +- 0.01 ms: 1.41x faster (-29%)
- lru_10_100: 149 ns +- 6 ns -> 147 ns +- 2 ns: 1.02x faster (-2%)
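The uploaded lru_bench.py is not reproduced here, so the following is only a hypothetical reconstruction of what such a micro-benchmark might look like, written with pyperf (the module was still named perf when this issue was filed). Judging by which rows got slower, the benchmark names appear to follow lru_<nkeys>_<maxsize>, where nkeys > maxsize gives the miss-heavy cases:

```python
import functools
import pyperf  # the module was named "perf" in 2017

def make_bench(nkeys, maxsize):
    @functools.lru_cache(maxsize=maxsize)
    def f(x):
        return x
    keys = list(range(nkeys))
    def bench():
        # nkeys > maxsize forces constant eviction (miss-heavy);
        # nkeys <= maxsize means every call after warmup is a hit.
        for k in keys:
            f(k)
    return bench

runner = pyperf.Runner()
for nkeys, maxsize in [(10, 100), (100, 100), (100, 1000),
                       (1000, 100), (1000, 1000), (10000, 1000)]:
    runner.bench_func('lru_%d_%d' % (nkeys, maxsize), make_bench(nkeys, maxsize))
```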
**msg309006** - Author: Inada Naoki (methane) - Date: 2017-12-24 18:14

I found that odict.pop() and odict.popitem() are very inefficient because they look up the key multiple times. odict doesn't seem well optimized and is much slower than dict in some areas... I'll try to optimize it during the holidays.
**msg309007** - Author: Raymond Hettinger (rhettinger) - Date: 2017-12-24 18:28

Please stop revising every single thing you look at. The traditional design of LRU caches used doubly linked lists for a reason. In particular, when there is a high hit rate, the links can be updated without churning the underlying dictionary.
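For context, this is the hit path Raymond is describing, sketched after the pure-Python functools.lru_cache implementation (simplified; the function name record_hit is illustrative): a hit rewires four link pointers and never touches the dictionary itself.

```python
# Link layout: [prev, next, key, result], circularly linked around a
# sentinel `root` node; root[PREV] is the most recently used entry.
PREV, NEXT, KEY, RESULT = 0, 1, 2, 3

def record_hit(link, root):
    """Move `link` to the most-recently-used position and return its result."""
    link_prev, link_next, _key, result = link
    # Unlink the node from its current position...
    link_prev[NEXT] = link_next
    link_next[PREV] = link_prev
    # ...and splice it back in just before root (the MRU position).
    last = root[PREV]
    last[NEXT] = root[PREV] = link
    link[PREV] = last
    link[NEXT] = root
    return result
```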
**msg309013** - Author: Raymond Hettinger (rhettinger) - Date: 2017-12-24 19:16

FWIW, I'm the original author and designer of this code, so it would have been appropriate to assign this to me for sign-off on any proposed changes.
**msg309016** - Author: Serhiy Storchaka (serhiy.storchaka) - Date: 2017-12-24 19:49

> FWIW, I'm the original author and designer of this code, so it would have been appropriate to assign this to me for sign-off on any proposed changes.

Not me (the C implementation)? ;-)
**msg309029** - Author: Inada Naoki (methane) - Date: 2017-12-25 06:53

> Please stop revising every single thing you look at. The traditional design of LRU caches used doubly linked lists for a reason. In particular, when there is a high hit rate, the links can be updated without churning the underlying dictionary.

I'm not proposing to remove the doubly linked list; OrderedDict uses a doubly-linked list too, and I found no problem in the mostly-hit scenario. On the other hand, I found a problem with OrderedDict in the mostly-miss scenario.

Now I think lru_cache's implementation is better than OrderedDict. PyODict is slower than lru_cache's dict + linked list for historical reasons (compatibility with the pure Python implementation). So I'll stop trying to replace lru_cache's own implementation.

Instead, I'll try to reduce lru_cache's overhead by removing the GC header from each link node.
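A rough Python-level illustration of what the gc(N) rows measure (an assumed reconstruction, not the actual benchmark script): every GC-tracked link node adds to the work of a full collection, so making the nodes untracked should shrink collection time roughly in proportion to the cache size.

```python
import functools
import gc
import time

N = 1_000_000

@functools.lru_cache(maxsize=N)
def f(x):
    return x

for i in range(N):
    f(i)  # one link node per cached key

start = time.perf_counter()
gc.collect()  # a full collection must visit every GC-tracked object
elapsed_ms = (time.perf_counter() - start) * 1e3
print('gc.collect() with %d cache entries: %.1f ms' % (N, elapsed_ms))
```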
**msg309031** - Author: Inada Naoki (methane) - Date: 2017-12-25 07:56

PR-5008 benchmark:

$ ./python -m perf compare_to master.json patched2.json -G

Faster (9):
- gc(1000000): 98.3 ms +- 0.3 ms -> 29.9 ms +- 0.4 ms: 3.29x faster (-70%)
- gc(100000): 11.7 ms +- 0.0 ms -> 3.71 ms +- 0.03 ms: 3.14x faster (-68%)
- gc(10000): 1.48 ms +- 0.02 ms -> 940 us +- 6 us: 1.57x faster (-36%)
- lru_10_100: 149 ns +- 6 ns -> 138 ns +- 1 ns: 1.08x faster (-8%)
- lru_100_100: 115 ns +- 6 ns -> 108 ns +- 1 ns: 1.07x faster (-6%)
- lru_1000_1000: 134 ns +- 6 ns -> 127 ns +- 1 ns: 1.05x faster (-5%)
- lru_100_1000: 114 ns +- 5 ns -> 108 ns +- 1 ns: 1.05x faster (-5%)
- lru_1000_100: 217 ns +- 6 ns -> 212 ns +- 4 ns: 1.03x faster (-2%)
- lru_10000_1000: 225 ns +- 4 ns -> 221 ns +- 5 ns: 1.02x faster (-2%)
**msg309032** - Author: Serhiy Storchaka (serhiy.storchaka) - Date: 2017-12-25 08:58

LGTM.
**msg309039** - Author: Inada Naoki (methane) - Date: 2017-12-25 17:03

New changeset 3070b71e5eedf62e49b8e7dedab75742a5f67ece by INADA Naoki in branch 'master':
bpo-32422: Reduce lru_cache memory usage (GH-5008)
https://github.com/python/cpython/commit/3070b71e5eedf62e49b8e7dedab75742a5f67ece
**History**

| Date | User | Action | Args |
|---|---|---|---|
| 2022-04-11 14:58:56 | admin | set | github: 76603 |
| 2017-12-25 17:03:57 | methane | set | status: open -> closed; resolution: fixed; stage: patch review -> resolved |
| 2017-12-25 17:03:26 | methane | set | messages: +msg309039 |
| 2017-12-25 08:59:01 | serhiy.storchaka | set | type: resource usage |
| 2017-12-25 08:58:42 | serhiy.storchaka | set | messages: +msg309032 |
| 2017-12-25 07:56:15 | methane | set | messages: +msg309031 |
| 2017-12-25 07:55:25 | methane | set | keywords: +patch; stage: patch review; pull_requests: +pull_request4897 |
| 2017-12-25 06:53:12 | methane | set | messages: +msg309029; title: Make OrderedDict can be used for LRU from C -> Reduce lru_cache memory overhead. |
| 2017-12-24 19:49:25 | serhiy.storchaka | set | messages: +msg309016 |
| 2017-12-24 19:16:30 | rhettinger | set | assignee: rhettinger; messages: +msg309013 |
| 2017-12-24 18:28:58 | rhettinger | set | nosy: +rhettinger; messages: +msg309007 |
| 2017-12-24 18:14:24 | methane | set | messages: +msg309006 |
| 2017-12-24 17:24:14 | methane | set | files: +lru_bench.py; messages: +msg309003 |
| 2017-12-24 08:11:19 | serhiy.storchaka | set | status: closed -> open; superseder: Implement functools.lru_cache() using ordered dict ->; messages: +msg308986; resolution: duplicate -> (no value); stage: resolved -> (no value) |
| 2017-12-24 08:07:29 | serhiy.storchaka | set | status: open -> closed; stage: resolved |
| 2017-12-24 08:05:28 | serhiy.storchaka | set | nosy: +serhiy.storchaka; messages: +msg308985; resolution: duplicate; superseder: Implement functools.lru_cache() using ordered dict |
| 2017-12-24 05:32:45 | methane | set | messages: +msg308981 |
| 2017-12-24 05:19:02 | methane | create | |