Use orjson instead of json, when available #17955
Conversation
For `mypy -c 'import torch'`, the cache load time goes from 0.44s to 0.25s as measured by manager's data_json_load_time. If I time dump times specifically, I see a saving of 0.65s to 0.07s. Overall, a pretty reasonable perf win -- should we make it a required dependency?

I don't know if the sqlite cache path is used at all (what's the status?), but let me know if I need a cleverer migration than renaming the table
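For context, here is a minimal micro-benchmark of the kind of difference being described. This is not the PR's actual measurement: the payload shape and sizes are illustrative assumptions, roughly mimicking a nested cache-file structure.

```python
# Hypothetical micro-benchmark: stdlib json vs. orjson for serializing a
# nested dict. The data shape and size here are arbitrary assumptions.
import json
import time

try:
    import orjson  # optional dependency; pip install orjson
except ImportError:
    orjson = None

data = {f"node{i}": {"type": "Instance", "args": list(range(10))} for i in range(10_000)}

t0 = time.perf_counter()
blob = json.dumps(data).encode()
print(f"json.dumps:   {time.perf_counter() - t0:.4f}s, {len(blob)} bytes")

if orjson is not None:
    t0 = time.perf_counter()
    blob2 = orjson.dumps(data)
    print(f"orjson.dumps: {time.perf_counter() - t0:.4f}s, {len(blob2)} bytes")
    # Both serializers should round-trip to the same Python object
    assert orjson.loads(blob2) == json.loads(blob)
```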
JukkaL commented Oct 15, 2024
Sounds very promising! I can also perform some measurements.
I wonder how well maintained orjson is, and whether it ships binary wheels for all the platforms we care about. We might be adding ARM Linux wheels at some point, and it would be nice if all our dependencies shipped binary wheels (though it's perhaps not essential for Linux, as long as there are x86-64 wheels).
Sqlite caching is very much used, and I'm thinking of enabling it by default in the future. In certain use cases it's significantly faster than a file-per-module cache, and we use it at work.
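As a rough illustration of what the sqlite-cache migration touches: the table names `files`/`files2` come from this discussion, but `list_cache_tables` below is a hypothetical helper, not mypy code, and the database layout beyond the table name is an assumption.

```python
# Hypothetical helper: list the tables in a mypy sqlite cache database,
# e.g. to check whether it contains the old `files` or new `files2` table.
import sqlite3

def list_cache_tables(db_path: str) -> list[str]:
    with sqlite3.connect(db_path) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        ).fetchall()
    return [name for (name,) in rows]
```

With `--sqlite-cache`, mypy writes a single database under the cache directory instead of a pair of JSON files per module, which is why a table rename is the natural migration point.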
```python
return orjson.dumps(obj, option=orjson.OPT_INDENT_2 | orjson.OPT_SORT_KEYS)  # type: ignore[no-any-return]
else:
    # TODO: If we don't sort keys here, testIncrementalInternalScramble fails
    # We should document exactly what is going on there
```
hauntsaninja Oct 15, 2024 • edited
Lmk if you know off the top of your head why sorting keys is important!
It might be just so that tests produce the keys in a predictable order on older Python 3 versions where dict didn't preserve insertion order.
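A quick stdlib demonstration of why deterministic key order matters when cache output is compared byte-for-byte (a generic example, not mypy's actual test):

```python
import json

# Two equal dicts with different insertion order serialize differently...
a = {"x": 1, "y": 2}
b = {"y": 2, "x": 1}
assert a == b
assert json.dumps(a) != json.dumps(b)

# ...unless keys are sorted, which makes the output deterministic.
assert json.dumps(a, sort_keys=True) == json.dumps(b, sort_keys=True)
```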
hauntsaninja commented Oct 15, 2024 • edited
I think orjson does ship wheels for all platforms we care about. It would be nice if Python packaging had a concept of a "default" extra for this kind of thing, though
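Since packaging has no "default" extra, one opt-in pattern would be an ordinary extra in `pyproject.toml`. A sketch, with the caveat that the extra name `faster-cache` is illustrative here, not necessarily what mypy ships:

```toml
[project.optional-dependencies]
# Users opt in with: pip install "mypy[faster-cache]"
faster-cache = ["orjson"]
```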
JukkaL commented Oct 15, 2024
I'm seeing a 10-15% improvement to the performance of …
hauntsaninja left a comment
Okay, I think this PR should be good to go.
Questions to resolve now:
- Is just using `files2` in sqlite a sufficient migration?

Open questions that we can resolve later:
- Documenting why sort_keys is important
- Adding an extra that includes orjson (or relying on it by default)
- Adding test coverage for the optional feature
JukkaL commented Oct 16, 2024
This seems fine. It's an internal implementation detail, and caches aren't compatible between mypy versions. Can you check if …
hauntsaninja commented Oct 16, 2024 • edited
Thanks, there was a missing spot where I'd forgotten to change to `files2`. I'm a little confused at how cache-convert is meant to work; I think it might be a little broken on master? Looking...
According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅
hauntsaninja commented Oct 16, 2024
Okay, fixed the cache convert problem on master in #17974
c1f2db3 into python:master
For `mypy -c 'import torch'`, the cache load time goes from 0.44s to 0.25s as measured by manager's data_json_load_time. If I time dump times specifically, I see a saving of 0.65s to 0.07s. Overall, a pretty reasonable perf win -- should we make it a required dependency?

See also #3456
```python
def json_dumps(obj: object, debug: bool = False) -> bytes:
    if orjson is not None:
        if debug:
            return orjson.dumps(obj, option=orjson.OPT_INDENT_2 | orjson.OPT_SORT_KEYS)  # type: ignore[no-any-return]
```
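For readers skimming the diff, here is a self-contained sketch of the dual-path helper. The orjson branch mirrors the snippet above; the stdlib fallback branch is reconstructed as an assumption rather than copied from mypy.

```python
import json

try:
    import orjson  # optional fast path
except ImportError:
    orjson = None

def json_dumps(obj: object, debug: bool = False) -> bytes:
    """Serialize obj to bytes, preferring orjson when installed (sketch)."""
    if orjson is not None:
        if debug:
            return orjson.dumps(obj, option=orjson.OPT_INDENT_2 | orjson.OPT_SORT_KEYS)
        return orjson.dumps(obj)
    if debug:
        return json.dumps(obj, indent=2, sort_keys=True).encode("utf-8")
    # Sorting keys keeps stdlib output deterministic (see the TODO above)
    return json.dumps(obj, sort_keys=True).encode("utf-8")
```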
simon-liebehenschel Nov 30, 2024 • edited
@hauntsaninja @JukkaL I think that Mypy will not use too much memory, because the memory will be released somewhere sooner or later (and, obviously, the cache is not so large as to eat a lot of memory), but keep in mind that

```shell
python -c 'import orjson; [orjson.dumps(i) for i in range(30000000)]'
python -c 'import orjson; [orjson.dumps(i).decode() for i in range(30000000)]'
```

are very different things in terms of how the memory is used and when it is released. The first command keeps all dumped objects in memory, plus a crazy memory overhead: it needs +20 GiB of memory to run, while the second command eats only ~2 GiB.

As I said, Mypy should not be affected because the memory will be freed (I hope, ha-ha), so this PR should be great. Actually, this problem can happen only if we `dumps` everything in a single function. Just a friendly heads up to check the memory usage in other applications if you need to `dumps` lots of objects in a `for` loop in a single function :)
Interesting, that's probably from ijl/orjson#483 (comment). Looks like that's not resolved; someone should open a PR against orjson fixing the thing godlygeek points out.