Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

gh-121192: Add fast path todeepcopy() for empty list/tuple/dict/set#121193

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Open
lgeiger wants to merge5 commits intopython:main
base:main
Choose a base branch
Loading
fromlgeiger:deepcopy-empty-list-dict

Conversation

lgeiger
Copy link
Contributor

@lgeigerlgeiger commentedJul 1, 2024
edited
Loading

deepcopy() can be surprisingly slow when called with empty containers like lists, tuples, dicts, sets or frozensets.

This adds a fast path for this case similar to#114266 which speeds up these cases by about 4x - 21x while having a small impact on the default path.

Here are some benchmarks comparing this PR againstmain. Let me know if there are more benchmarks that I can run to verify that there a no performance regressions for the common case:

importpyperfrunner=pyperf.Runner()setup="""import copya = {"list": [1, 2 ,3 ,4], "t": (1, 2, 3), "str": "hello", "subdict": {"a": True}}t = ()fs = frozenset()l = []s = set()d = {}deep = [[], (), {}, set(), frozenset()]"""runner.timeit(name="deepcopy dict",stmt=f"b = copy.deepcopy(a)",setup=setup)runner.timeit(name="deepcopy empty tuple",stmt=f"b = copy.deepcopy(t)",setup=setup)runner.timeit(name="deepcopy empty frozenset",stmt=f"b = copy.deepcopy(fs)",setup=setup)runner.timeit(name="deepcopy empty list",stmt=f"b = copy.deepcopy(l)",setup=setup)runner.timeit(name="deepcopy empty set",stmt=f"b = copy.deepcopy(s)",setup=setup)runner.timeit(name="deepcopy empty dict",stmt=f"b = copy.deepcopy(d)",setup=setup)runner.timeit(name="deepcopy multiple empty containers",stmt=f"b = copy.deepcopy(deep)",setup=setup)
deepcopy empty tuple: Mean +- std dev: [baseline] 287 ns +- 4 ns -> [opt-empty] 49.8 ns +- 1.7 ns: 5.77x fasterdeepcopy empty frozenset: Mean +- std dev: [baseline] 1.46 us +- 0.02 us -> [opt-empty] 67.1 ns +- 1.8 ns: 21.70x fasterdeepcopy empty list: Mean +- std dev: [baseline] 327 ns +- 6 ns -> [opt-empty] 67.1 ns +- 8.2 ns: 4.88x fasterdeepcopy empty set: Mean +- std dev: [baseline] 1.44 us +- 0.02 us -> [opt-empty] 67.9 ns +- 3.8 ns: 21.26x fasterdeepcopy empty dict: Mean +- std dev: [baseline] 329 ns +- 4 ns -> [opt-empty] 68.1 ns +- 3.0 ns: 4.83x fasterBenchmark hidden because not significant (2): deepcopy dict, deepcopy multiple empty containersGeometric mean: 4.85x faster

@lgeiger
Copy link
ContributorAuthor

lgeiger commentedJul 1, 2024
edited
Loading

I ran the relevant pyperformance deepcopy benchmark which does seem to suggest that this change makes the non empty case slightly slower:EDIT: I re-ran the pyperformance benchmarks with the latest changes ind87901e and I now can't see any performance regression for the common case.

Performance version: 1.11.0Python version: 3.14.0a0 (64-bit) revision 1a84bdc237Report on macOS-14.5-arm64-arm-64bit-Mach-ONumber of logical CPUs: 8Start date: 2024-07-02 01:23:14.690599End date: 2024-07-02 01:23:51.844329### deepcopy ###Mean +- std dev: 123 us +- 12 us -> 120 us +- 1 us: 1.03x fasterSignificant (t=2.25)### deepcopy_memo ###Mean +- std dev: 14.1 us +- 0.2 us -> 14.1 us +- 0.3 us: 1.00x slowerNot significant### deepcopy_reduce ###Mean +- std dev: 1.22 us +- 0.06 us -> 1.21 us +- 0.01 us: 1.00x fasterNot significant

@lgeigerlgeigerforce-pushed thedeepcopy-empty-list-dict branch fromd7b07eb tod87901eCompareJuly 2, 2024 19:38
@eendebakpt
Copy link
Contributor

The optimization applied in the latest iteration of the PR is only applied when no memo is passed todeepcopy. The advantage is that any recursive structures like{'a': [1, 2, 3, 4, ,5]} will have no (or near zero) overhead. So I agree that performance regression for many common cases is not a big concern.

The question then is whether adeepcopy directly on an empty builtin like like and set (and not adeepcopy on a recursive structure containing empty builtins, e.g,( {}, [], ())) is a common enough operation to apply this PR. I am not convinced on this, In the corresponding issue an example is given for pydantic basemodels. But there the deepcopy overhead can be avoided by using a default factory. Also the corresponding situation for plain dataclasses cannot occur:

from dataclasses import dataclass@dataclassclass DC:    a : list = []    x=DC()

raises aValueError: mutable default <class 'list'> for field a is not allowed: use default_factory. Withattrs an empty list is allowed, but no deepcopy is performed on instantiation. For example

import attrs@attrs.defineclass DC:    a : list = []    x=DC()y=DC()x.a.append(1)print(x)print(y)

results in

DC(a=[1])DC(a=[1])

@eendebakpt
Copy link
Contributor

For reference: I performed a benchmark of main, this PR and the deepcopy C implementation from#91610. On benchmark

import pyperfrunner = pyperf.Runner()setup="""import copyfrom dataclasses import dataclass@dataclassclass A:    a : list    b : str    c : bool    d: tuple    e : set    dc=A([1,2,3], 'hello', True, (), {})empty_tuple = tuple()empty_list = []"""runner.timeit(name="deepcopy empty tuple", stmt="b=copy.deepcopy(empty_tuple)", setup=setup)runner.timeit(name="deepcopy empty list", stmt="b=copy.deepcopy(empty_list)", setup=setup)runner.timeit(name="deepcopy dataclass", stmt="b=copy.deepcopy(dc)", setup=setup)

Results are

deepcopy empty tuple: Mean +- std dev: [dc_main] 827 ns +- 37 ns -> [dc_fastpath_empty] 203 ns +- 7 ns: 4.06x fasterdeepcopy empty list: Mean +- std dev: [dc_main] 1.00 us +- 0.09 us -> [dc_fastpath_empty] 233 ns +- 24 ns: 4.29x fasterdeepcopy dataclass: Mean +- std dev: [dc_main] 8.02 us +- 0.12 us -> [dc_fastpath_empty] 8.22 us +- 0.07 us: 1.02x slowerGeometric mean: 2.57x faster
(env312) c:\projects\misc\cpython\PCbuild>python -m pyperf compare_to dc_main.json dc_c.jsondeepcopy empty tuple: Mean +- std dev: [dc_main] 827 ns +- 37 ns -> [dc_c] 90.6 ns +- 9.8 ns: 9.13x fasterdeepcopy empty list: Mean +- std dev: [dc_main] 1.00 us +- 0.09 us -> [dc_c] 216 ns +- 24 ns: 4.63x fasterdeepcopy dataclass: Mean +- std dev: [dc_main] 8.02 us +- 0.12 us -> [dc_c] 4.06 us +- 0.43 us: 1.97x fasterGeometric mean: 4.37x faster

@sobolevn
Copy link
Member

@eendebakpt it can happen for dataclass, example:

fromdataclassesimportdataclass,as_dict@dataclassclassSome:field:set[int]as_dict(Some(set()))

@eendebakpt
Copy link
Contributor

@eendebakpt it can happen for dataclass, example:

Indeed. Here a more complete example, including the cases where this PR helps

from dataclasses import dataclass, asdict@dataclass class Some:    a: str    t: tuple[int]    field: set[int] # benefits    l : list[int]    d : dict[str, str]    fs: frozenset[int] # benefits    nonempty : set[int]s=Some('x', tuple(), set(), [], d={}, fs=frozenset(), nonempty={1, 2})asdict(s)

@lgeiger
Copy link
ContributorAuthor

Indeed. Here a more complete example, including the cases where this PR helps

Thanks for the example! This PR indeed shows a decent improvement here:

importpyperfrunner=pyperf.Runner()setup="""from dataclasses import dataclass, asdict@dataclassclass Some:    a: str    t: tuple[int]    field: set[int] # benefits    l : list[int]    d : dict[str, str]    fs: frozenset[int] # benefits    nonempty : set[int]s = Some('x', tuple(), set(), [], d={}, fs=frozenset(), nonempty={1, 2})"""runner.timeit(name="dataclass asdict",stmt=f"b = asdict(s)",setup=setup)
Mean +- std dev: [asdict-baseline] 6.26 us +- 0.07 us -> [asdict-opt] 3.39 us +- 0.18 us: 1.85x faster

@sobolevn
Copy link
Member

Other use-cases that we can find in our own code:

Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers

@sobolevnsobolevnsobolevn left review comments

@pochmann3pochmann3pochmann3 left review comments

Assignees
No one assigned
Projects
None yet
Milestone
No milestone
Development

Successfully merging this pull request may close these issues.

4 participants
@lgeiger@eendebakpt@sobolevn@pochmann3

[8]ページ先頭

©2009-2025 Movatter.jp