Dataclasses - Improve the performance of asdict/astuple for common types and default values #103000

Closed
Labels: 3.12, performance, stdlib, type-feature
@DavidCEllis

Feature or enhancement

Improve the performance of asdict/astuple in common cases by adding a shortcut in the inner loop for common types that are unaffected by deepcopy. Also special-case the default dict_factory=dict to construct the dictionary directly.

The goal here is to improve performance in common cases without significantly impacting less common cases, while not changing the API or output in any way.

Pitch

In cases where a dataclass contains a lot of data of common Python types (e.g. bool/str/int/float), the inner loops for asdict and astuple currently check whether each value is a dataclass, a namedtuple, a list, a tuple, and then a dictionary before passing it to deepcopy. This proposes to special-case and shortcut objects of types where deepcopy returns the object unchanged.

It is much faster for these cases to instead check for them at the first opportunity and shortcut their return, skipping the recursive call and all of the other comparisons. Where this is being used to prepare an object for serialization to JSON, the gain can be quite significant, as this covers most of the remaining types handled by the stdlib json module.

Note: Anything that skips deepcopy with this alteration is already unchanged, as deepcopy(obj) is obj is always True for these types.

Currently when constructing the dict for a dataclass, a list of tuples is created and passed to the dict_factory constructor. In the case where the dict_factory constructor is the default (dict), it is faster to construct the dictionary directly.
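A rough sketch of the two construction paths, for a flat dataclass only. The names Point, to_dict_generic and to_dict_fast are hypothetical, and recursion into field values is left out; this is not the code in dataclasses.py.

```python
from collections import OrderedDict
from dataclasses import dataclass, fields


@dataclass
class Point:
    x: int
    y: int


def to_dict_generic(obj, dict_factory=dict):
    # Path described as the current behaviour: build a list of
    # (name, value) tuples, then hand it to the factory.
    pairs = []
    for f in fields(obj):
        pairs.append((f.name, getattr(obj, f.name)))
    return dict_factory(pairs)


def to_dict_fast(obj, dict_factory=dict):
    # Proposed special case: when the factory is exactly dict,
    # build the dictionary directly and skip the intermediate list.
    if dict_factory is dict:
        return {f.name: getattr(obj, f.name) for f in fields(obj)}
    return to_dict_generic(obj, dict_factory)


p = Point(1, 2)
print(to_dict_fast(p))
print(to_dict_fast(p, dict_factory=OrderedDict))
```

Any other factory still receives the list of tuples, so custom dict_factory callables see the same input as before.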

Previous discussion

Discussed here with a few more details and earlier examples: https://discuss.python.org/t/dataclasses-make-asdict-astuple-faster-by-skipping-deepcopy-for-objects-where-deepcopy-obj-is-obj/24662

Code Details

Types to skip deepcopy

This is the current set of types to be checked for and shortcut returned, ordered in a way that I think makes more sense for dataclasses than the original ordering copied from the copy module. These are known to be safe to skip as they are all sent to _deepcopy_atomic (which returns the original object) in the copy module.

    # Types for which deepcopy(obj) is known to return obj unmodified
    # Used to skip deepcopy in asdict and astuple for performance
    _ATOMIC_TYPES = {
        # Common JSON Serializable types
        types.NoneType,
        bool,
        int,
        float,
        complex,
        bytes,
        str,
        # Other types that are also unaffected by deepcopy
        types.EllipsisType,
        types.NotImplementedType,
        types.CodeType,
        types.BuiltinFunctionType,
        types.FunctionType,
        type,
        range,
        property,
        # weakref.ref,  # weakref is not currently imported by dataclasses directly
    }

Function changes

With that added the change is essentially replacing each instance of

    _asdict_inner(v, dict_factory)

inside _asdict_inner, with

    v if type(v) in _ATOMIC_TYPES else _asdict_inner(v, dict_factory)

Instances of subclasses of these types are not guaranteed to have deepcopy(obj) is obj, so this checks specifically for instances of the base types.

Performance tests

Test file: https://gist.github.com/DavidCEllis/a2c2ceeeeda2d1ac509fb8877e5fb60d

Results on my development machine (not a perfectly stable test machine, but these differences are large enough to be meaningful).

Main

Current Main python branch:

    Dataclasses asdict/astuple speed tests
    --------------------------------------
    Python v3.12.0alpha6
    GIT branch: main
    Test Iterations: 10000
    List of Int case asdict: 5.80s
    Test Iterations: 1000
    List of Decimal case asdict: 0.65s
    Test Iterations: 1000000
    Basic types case asdict: 3.76s
    Basic types astuple: 3.48s
    Test Iterations: 100000
    Opaque types asdict: 2.15s
    Opaque types astuple: 2.11s
    Test Iterations: 100
    Mixed containers asdict: 3.66s
    Mixed containers astuple: 3.28s

Modified

Modified Branch:

    Dataclasses asdict/astuple speed tests
    --------------------------------------
    Python v3.12.0alpha6
    GIT branch: faster_dataclasses_serialize
    Test Iterations: 10000
    List of Int case asdict: 0.53s
    Test Iterations: 1000
    List of Decimal case asdict: 0.68s
    Test Iterations: 1000000
    Basic types case asdict: 1.33s
    Basic types astuple: 1.28s
    Test Iterations: 100000
    Opaque types asdict: 2.14s
    Opaque types astuple: 2.13s
    Test Iterations: 100
    Mixed containers asdict: 1.99s
    Mixed containers astuple: 1.84s
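For a quick local feel of where the "List of Int" difference comes from, the per-element cost can be compared with timeit. This is not the linked test file: the function names, the reduced atomic set and the iteration counts are assumptions chosen for illustration, and it measures only the deepcopy call versus the type check, not asdict itself.

```python
import copy
import timeit

# Reduced atomic set, enough for a list of ints (assumption for this sketch).
_ATOMIC = frozenset({type(None), bool, int, float, str})

values = list(range(1_000))


def copy_each(items):
    # Every element goes through deepcopy, even though ints come back unchanged.
    return [copy.deepcopy(v) for v in items]


def copy_with_shortcut(items):
    # Elements of an atomic type are returned directly.
    return [v if type(v) in _ATOMIC else copy.deepcopy(v) for v in items]


baseline = timeit.timeit(lambda: copy_each(values), number=50)
shortcut = timeit.timeit(lambda: copy_with_shortcut(values), number=50)

print(f"deepcopy every element: {baseline:.4f}s")
print(f"shortcut atomic types:  {shortcut:.4f}s")
```

Absolute numbers will vary by machine and Python version; both functions return equal lists, so only the time spent differs.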
