Movatterモバイル変換


[0]ホーム

URL:


[Python-Dev] PEP for new dictionary implementation

"Martin v. Löwis"martin at v.loewis.de
Thu Feb 16 22:34:00 CET 2012


Am 16.02.2012 19:24, schrieb Jim J. Jewett:>>> PEP author Mark Shannon wrote> (inhttp://mail.python.org/pipermail/python-dev/attachments/20120208/05be469a/attachment.txt):>>> ... allows ... (the ``__dict__`` attribute of an object) to share>> keys with other attribute dictionaries of instances of the same class.>> Is "the same class" a deliberate restriction, or just a convenience> of implementation?It's about the implementation: the class keeps a pointer to the key set.A subclass has a separate pointer for that.> I have often created subclasses (or even families> of subclasses) where instances (as opposed to the type) aren't likely> to have additional attributes.  These would benefit from key-sharing> across classes, but I grant that it is a minority use case that isn't> worth optimizing if it complicates the implementation.In particular, the potential savings are small: the instances of thesubclass will share the key sets per-class. So if you have S subclasses,you could save up to S keysets, whereas you are already saving N-S-1keysets (assuming you have a total of N objects across all classes).> Have you timed not storing the hash (in the dict) at all, at least for> (unicode) str-only dicts?  Going to the string for its own cached hash> breaks locality a bit more, but saves 1/3 of the memory for combined> tables, and may make a big difference for classes that have relatively> few instances.I'd be in favor of that, but it is actually an unrelated change: whetheror not you share key sets is unrelated to whether or not str-only dictsdrop the cached hash. Given a dict, it may be tricky to determinewhether or not it is str-only, i.e. what layout to use.>> Reduction in memory use is directly related to the number of dictionaries>> with shared keys in existence at any time. These dictionaries are typically>> half the size of the current dictionary implementation.>> How do you measure that?  The limit for huge N across huge numbers> of dicts should be 1/3 (because both hashes and keys are shared); I> assume that gets swamped by object overhead in typical small dicts.It's more difficult than that. He also drops the smalltable (which Ithink is a good idea), so accounting how this all plays together is tricky.>> If a table is split the values in the keys table are ignored,>> instead the values are held in a separate array.>> If they're just dead weight, then why not use them to hold indices> into the array, so that values arrays only have to be as long as> the number of keys, rather than rounding them up to a large-enough> power-of-two?  (On average, this should save half the slots.)Good idea. However, how do you track per-dict how large the table is?Regards,Martin


More information about the Python-Devmailing list

[8]ページ先頭

©2009-2025 Movatter.jp