Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Speed up ProhibitSurrogateCharactersValidator#9492

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Open
SpecLad wants to merge1 commit intoencode:master
base:master
Choose a base branch
Loading
fromSpecLad:speed-up-psc-validator

Conversation

SpecLad
Copy link

Description

I've noticed that this validator is using a per-character loop. Replacing it with a regex results in a pretty significant speedup. Here are results from my benchmark:

String lengthOld implementation time (sec)New implementation time (sec)
12.833e-071.765e-07
105.885e-072.030e-07
1003.598e-064.144e-07
10003.329e-052.463e-06
100000.00033382.449e-05
1000000.0033380.0002284
10000000.033330.002278
100000000.33890.02377
1000000003.2500.2365

For large strings, the speedups are more than an order of magnitude.

onegreyonewhite and sevdog reacted with thumbs up emojiulgens reacted with eyes emoji
@SpecLad
Copy link
Author

For the record, here's the benchmark:

importtimeitfromrest_framework.validatorsimportProhibitSurrogateCharactersValidatorvalidator=ProhibitSurrogateCharactersValidator()foriinrange(0,9):length=10**istring="a"*lengthtimer=timeit.Timer("validator(string)",globals=globals())loops,duration=timer.autorange()print(f"{length}\t{duration/loops}")

@auvipyauvipy self-requested a reviewAugust 8, 2024 08:52
kevin-brown
kevin-brown previously approved these changesAug 10, 2024
@tomchristie
Copy link
Member

Okay, so this is a good exercise case...

It's a neat and properly defined little PR that'sclearly an improvement.
We'd need to be wilfully obtuse to say no. Why might we choose to reject even this properly targeted improvement?

Well... because having a steadfast "no" policy is really uncomplicated, and means we don't have continual PR pressure & creep, whichon balance just results in unnecessary busy-work and risk.

I'd suggest we need to be more clear in our language inCONTRIBUTING.md, and just issue a really clear no on essentially all code PRs here. If we're unambiguous about this then we don't have to second guess ourselves.

Suggested language might. be...

REST framework is considered complete. We may accept...

  • Pull requests improving the documentation, or adding third party packages to the documentation.
  • Pull requests that resolve a CVE'd security issue.
  • Pull requests that are strictly required if updated Django versions don't pass some part of our existing test suite.

All other pull requests and issues will be closed.

Yes, there's clearly areas where there's potential performance impacts. There's alsoplenty of areas where there's loosely defined behaviour. However being really clear that this is essentiallya finished project (excepting the points above) would probably be a net benefit.

(The alternative is similar to above, but we allow ourselvessome leeway with a "We'll only consider pull requests that meet an unambiguously high bar of quality". 🤔)

ulgens, onegreyonewhite, SleipRecx, noamk-hl, thinkwelltwd, and sevdog reacted with thumbs down emojiakx reacted with confused emoji

@staleStale
Copy link

stalebot commentedApr 26, 2025

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stalestalebot added the stale labelApr 26, 2025
I've noticed that this validator is using a per-character loop. Replacing itwith a regex results in a pretty significant speedup. Here are results frommy benchmark:    String length      Old implementation    New implementation                           time (sec)            time (sec)              1          2.833e-07             1.765e-07             10          5.885e-07             2.030e-07            100          3.598e-06             4.144e-07           1000          3.329e-05             2.463e-06          10000          0.0003338             2.449e-05         100000          0.003338              0.0002284        1000000          0.03333               0.002278       10000000          0.3389                0.02377      100000000          3.250                 0.2365For large strings, the speedups are more than an order of magnitude.
@SpecLad
Copy link
Author

I have rebased this in case you change your mind. If not, just close the PR.

FWIW, thecurrent wording inCONTRIBUTING.md and elsewhere says that

We may accept pull requests that track the continued development of Django versions, but would prefer not to accept new features or code formatting changes.

Since this is neither a new feature nor a code formatting change, it seems like it should be in scope. Of course, it's your project, so I understand if you want to reject it anyway.

q0w and noamk-hl reacted with thumbs up emojiulgens reacted with heart emoji

@stalestalebot removed the stale labelMay 12, 2025
Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers

@ISMAIL0112358ISMAIL0112358ISMAIL0112358 approved these changes

@kevin-brownkevin-brownkevin-brown left review comments

@auvipyauvipyAwaiting requested review from auvipy

At least 1 approving review is required to merge this pull request.

Assignees
No one assigned
Labels
None yet
Projects
None yet
Milestone
No milestone
Development

Successfully merging this pull request may close these issues.

4 participants
@SpecLad@tomchristie@kevin-brown@ISMAIL0112358

[8]ページ先頭

©2009-2025 Movatter.jp