Movatterモバイル変換


[0]ホーム

URL:


homepage

Issue36012

This issue trackerhas been migrated toGitHub, and is currentlyread-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title:Investigate slow writes to class variables
Type:performanceStage:resolved
Components:Interpreter CoreVersions:Python 3.8
process
Status:closedResolution:fixed
Dependencies:Superseder:
Assigned To:Nosy List: Yasser Alshalaan, nascheme, pablogsal, rhettinger, scoder
Priority:lowKeywords:patch

Created on2019-02-16 20:15 byrhettinger, last changed2022-04-11 14:59 byadmin. This issue is nowclosed.

Files
File nameUploadedDescriptionEdit
perf-screenshot.pngnascheme,2019-02-19 18:35
Pull Requests
URLStatusLinkedEdit
PR 11907mergedscoder,2019-02-17 14:31
Messages (11)
msg335714 -(view)Author: Raymond Hettinger (rhettinger)*(Python committer)Date: 2019-02-16 20:15
Benchmark show what writes to class variables are anomalously slow.    class A(object):        pass    A.x = 1   # This write is 3 to 5 times slower than other writes.FWIW, the same operation for old-style classes in Python 2.7 was several times faster.We should investigate to understand why the writes are so slow.  There might be a good reason or there might be an opportunity for optimization.-------------------------------------------------$ python3.8Tools/scripts/var_access_benchmark.pyVariable and attribute read access:   4.3 nsread_local   4.6 nsread_nonlocal  14.5 nsread_global  19.0 nsread_builtin  18.4 nsread_classvar_from_class  16.2 nsread_classvar_from_instance  24.7 nsread_instancevar  19.7 nsread_instancevar_slots  19.5 nsread_namedtuple  26.4 nsread_boundmethodVariable and attribute write access:   4.4 nswrite_local   5.1 nswrite_nonlocal  18.2 nswrite_global 103.9 nswrite_classvar             <== Outlier  35.4 nswrite_instancevar  25.6 nswrite_instancevar_slots
msg335733 -(view)Author: Pablo Galindo Salgado (pablogsal)*(Python committer)Date: 2019-02-16 23:34
It seems 50% of the overhead (50ns) is due to two reasons:- 30-40% is due to the call to update_slot(type, name) after the item is set in the class dictionary.- 70-60% is due to all the extra work from _PyObject_GenericSetAttrWithDict until it ends calling  _PyObjectDict_SetItem with the slot. If in the code of typeobject.c:type_setattro you change:res = _PyObject_GenericSetAttrWithDict((PyObject *)type, name, value, NULL);by PyObject* dictptr = _PyObject_GetDictPtr(type);res = _PyObjectDict_SetItem(type, dictptr, name, value);and delete the update_slot(type, name) call afterwards, the times are reduced to 50ns.
msg335747 -(view)Author: Yasser Alshalaan (Yasser Alshalaan)*Date: 2019-02-17 02:53
can you include your python 2.7 runs? for me it looks similar
msg335763 -(view)Author: Raymond Hettinger (rhettinger)*(Python committer)Date: 2019-02-17 08:50
> can you include your python 2.7 runs? for me it looks similarIt will give similar results unless you switch to old-style classes (edit out the inheritance from object).    class A:        pass    A.x = 1---------------------------------------$ python2.7 var_access_benchmark.pyVariable and attribute read access:   6.7 nsread_local  10.9 nsread_global  18.9 nsread_builtin  24.4 nsread_classvar_from_class  30.2 nsread_classvar_from_instance  22.7 nsread_instancevar  25.5 nsread_instancevar_slots  99.3 nsread_namedtuple  40.9 nsread_boundmethodVariable and attribute write access:   8.2 nswrite_local  18.7 nswrite_global  32.9 nswrite_classvar         <== Not formerly an outlier  32.5 nswrite_instancevar  31.2 nswrite_instancevar_slots
msg335776 -(view)Author: Stefan Behnel (scoder)*(Python committer)Date: 2019-02-17 14:36
It turns out that "update_slot()" is always called, even when we are not updating a slot name (which is always a special dunder-name). The linear search for names in "update_slots()" is a huge waste of time here, and short-circuiting out of it when the name does not start with "_" cuts the overall update time by 50%. I pushed a PR.Another improvement would be a sub-linear algorithm for searching the slot name, but that's a bigger change.
msg335794 -(view)Author: Pablo Galindo Salgado (pablogsal)*(Python committer)Date: 2019-02-18 00:59
This are the timings that I am measuring withPR 11907:Variable and attribute read access:   5.7 ns       read_local   5.9 ns       read_nonlocal  16.2 ns       read_global  24.5 ns       read_builtin  20.9 ns       read_classvar_from_class  20.0 ns       read_classvar_from_instance  29.7 ns       read_instancevar  24.7 ns       read_instancevar_slots  22.9 ns       read_namedtuple  36.8 ns       read_boundmethodVariable and attribute write access:   6.9 ns       write_local   6.9 ns       write_nonlocal  26.7 ns       write_global  65.4 ns       write_classvar <----- Down from 120.6 ns  49.5 ns       write_instancevar  34.5 ns       write_instancevar_slots
msg335795 -(view)Author: Raymond Hettinger (rhettinger)*(Python committer)Date: 2019-02-18 01:32
Wow, I didn't expect to get an immediate win this of this magnitude :-)
msg335992 -(view)Author: Neil Schemenauer (nascheme)*(Python committer)Date: 2019-02-19 18:25
Some profiling using 'perf'.  This is for cpython63fa1cfece4912110ce3a0ff11fb3ade3ff5e756.  children      self  [...]+   97.27%     0.00%  run_mod (inlined)+   88.53%     6.33%  PyObject_SetAttr+   79.34%     6.80%  type_setattro+   43.92%    43.92%  update_slot+   26.87%     5.84%  _PyObject_GenericSetAttrWithDict+   14.62%     6.52%  insertdict+    8.80%     8.80%  lookdict_unicode_nodummy+    6.57%     0.00%  _Py_DECREF (inlined)+    5.19%     5.19%  PyUnicode_InternInPlace+    3.57%     3.57%  _PyObjectDict_SetItem+    3.38%     3.38%  _PyType_Lookup+    1.71%     0.00%  _Py_INCREF (inlined)+    1.42%     0.00%  _Py_XDECREF (inlined)  [...]After applyingPR 11907:  children      self  [...]+   94.76%     0.00%  run_mod (inlined)+   75.22%     6.77%  PyObject_SetAttr+   64.65%    13.08%  type_setattro+   47.51%    11.96%  _PyObject_GenericSetAttrWithDict+   22.99%    13.95%  insertdict+   10.10%    10.10%  lookdict_unicode_nodummy+    9.47%     9.47%  PyUnicode_InternInPlace+    7.10%     7.10%  _PyObjectDict_SetItem+    7.02%     7.02%  _PyType_Lookup+    6.52%     0.00%  _Py_DECREF (inlined)+    3.17%     0.00%  _Py_INCREF (inlined)+    3.16%     0.00%  _Py_XDECREF (inlined)+    2.59%     0.00%  PyDict_SetItem (inlined)+    1.50%     0.00%  is_dunder_name (inlined)  [...]The PyUnicode_InternInPlace() can mostly be eliminated by testing PyUnicode_CHECK_INTERNED() first (we already have called PyUnicode_Check() on it).  That only gives a 7% speedup on my machine though.  The is_dunder_name() is a much bigger optimization.
msg335995 -(view)Author: Neil Schemenauer (nascheme)*(Python committer)Date: 2019-02-19 18:35
BTW, 'perf report [...]' has a really neat annotated assembly view.  Scroll to the function you are interested in and press 'a'.  Press 't' to toggle the time units (left side numbers).  I'm attaching a screenshot of the disassembly of the lookdict function.  The time units are sample accounts.  Each count is a point where the profiler woke up on that specific instruction.
msg336031 -(view)Author: Raymond Hettinger (rhettinger)*(Python committer)Date: 2019-02-20 00:45
Thanks Neil.  The tooling does indeed look nice.I added your PyUnicode_InternInPlace() suggestion to the PR.At this point, the PR looks ready-to-go unless any of you think we've missed some low-hanging fruit.
msg336129 -(view)Author: Raymond Hettinger (rhettinger)*(Python committer)Date: 2019-02-20 17:29
New changesetd8b9e1fc2e45d2bc3f4a9737c375f2adb8a8c7de by Raymond Hettinger (Stefan Behnel) in branch 'master':bpo-36012: Avoid linear slot search for non-dunder methods (GH-11907)https://github.com/python/cpython/commit/d8b9e1fc2e45d2bc3f4a9737c375f2adb8a8c7de
History
DateUserActionArgs
2022-04-11 14:59:11adminsetgithub: 80193
2019-02-20 17:29:55rhettingersetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2019-02-20 17:29:26rhettingersetmessages: +msg336129
2019-02-20 00:45:18rhettingersetmessages: +msg336031
2019-02-19 18:35:45naschemesetfiles: +perf-screenshot.png

messages: +msg335995
2019-02-19 18:25:57naschemesetmessages: +msg335992
2019-02-19 11:27:22vstinnersetnosy: -vstinner
2019-02-18 01:32:29rhettingersetmessages: +msg335795
2019-02-18 00:59:21pablogsalsetmessages: +msg335794
2019-02-17 14:36:15scodersetnosy: +scoder
messages: +msg335776
2019-02-17 14:31:39scodersetkeywords: +patch
stage: patch review
pull_requests: +pull_request11932
2019-02-17 08:50:24rhettingersetmessages: +msg335763
2019-02-17 02:53:16Yasser Alshalaansetnosy: +Yasser Alshalaan
messages: +msg335747
2019-02-16 23:34:44pablogsalsetmessages: +msg335733
2019-02-16 20:15:18rhettingercreate
Supported byThe Python Software Foundation,
Powered byRoundup
Copyright © 1990-2022,Python Software Foundation
Legal Statements

[8]ページ先頭

©2009-2026 Movatter.jp