Implement a safe_url based on all standards #221


Draft
Gallaecio wants to merge 36 commits into scrapy:master from Gallaecio:safer-url

@Gallaecio (Member) commented Feb 13, 2024 (edited)

Continuation of #201.

Fixes #193.

Tasks:

  • Provide an alternative implementation for safe_url_string based on the URL living standard
  • Add missing typing information
  • Provide comprehensive test coverage for both implementations, highlighting the differences, and in the process make the new implementation also support RFC 2396 (used by java.net.URI) and RFC 3986 (the previous standard, still popular).
  • Ensure the following issues are addressed by the new implementation:
  • Have the URL living standard parse implementation mostly pass the upstream tests.
  • Improve performance
    The new implementation is currently around 3-4 times slower than the previous implementation for test cases where both have the same, non-error outcome.
  • Address to-dos in test_parse_url.
  • Clean up the implementation: cleaner code, docstrings…
  • Discuss what to do API-wise (deprecate safe_url_string in favor of safe_url? Remove safe_url and make safe_url_string use the new implementation?)
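
For readers unfamiliar with the linked issue (safe_url_string handling IPv6 URLs), here is a rough, standard-library-only sketch of the kind of behavior at stake: percent-encoding unsafe characters while keeping a bracketed IPv6 host intact. This is not the implementation in this PR. The name `sketch_safe_url` is hypothetical, the set of "safe" characters is an assumption, and userinfo and IDNA handling are deliberately left out.

```python
from urllib.parse import quote, urlsplit, urlunsplit

# Characters left untouched in path/query/fragment. "%" is kept so that
# existing percent-escapes are not encoded twice. This set is an assumption
# made for illustration, not what any particular standard mandates.
_SAFE = "/%:@!$&'()*+,;=?~"


def sketch_safe_url(url: str) -> str:
    """Hypothetical helper: escape unsafe characters, preserve IPv6 hosts."""
    parts = urlsplit(url)
    host = parts.hostname or ""  # lowercased, brackets stripped by urlsplit
    if ":" in host:
        # An IPv6 literal must be re-wrapped in brackets in the authority.
        host = f"[{host}]"
    netloc = host if parts.port is None else f"{host}:{parts.port}"
    return urlunsplit(
        (
            parts.scheme,
            netloc,  # note: userinfo (user:pass@) is dropped in this sketch
            quote(parts.path, safe=_SAFE),
            quote(parts.query, safe=_SAFE),
            quote(parts.fragment, safe=_SAFE),
        )
    )


print(sketch_safe_url("http://[2001:DB8::1]:8080/a b?q=é"))
# → http://[2001:db8::1]:8080/a%20b?q=%C3%A9
```

A real implementation also has to decide how to treat userinfo, non-ASCII hosts, and per-component encoding sets, which is where the standards differ.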

@Gallaecio (Member, Author)

I am having a hard time narrowing the performance gap: the new implementation is still 3-4 times slower than the previous one 😞
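
One way such a slowdown figure could be measured is sketched below with `timeit`. The helper name `slowdown_ratio` and its parameters are hypothetical and not part of this PR; it assumes both implementations are callables taking a URL string, and it only times inputs where both return the same result, mirroring the "same, non-error outcome" condition above.

```python
import timeit
from typing import Callable


def slowdown_ratio(
    old: Callable[[str], str],
    new: Callable[[str], str],
    url: str,
    number: int = 10_000,
    repeat: int = 5,
) -> float:
    """Return how many times slower `new` is than `old` for one input."""
    # Only compare timings when both implementations agree on the output.
    if old(url) != new(url):
        raise ValueError(f"implementations disagree for {url!r}")
    # Take the minimum of several runs to reduce noise from other processes.
    old_time = min(timeit.repeat(lambda: old(url), number=number, repeat=repeat))
    new_time = min(timeit.repeat(lambda: new(url), number=number, repeat=repeat))
    return new_time / old_time
```

For example, `slowdown_ratio(old_impl, new_impl, "http://example.com/a b")` would return roughly 3.0 to 4.0 if the gap described here holds for that input, where `old_impl` and `new_impl` stand in for the two implementations.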

@Gallaecio (Member, Author)

Looking at scrapy/scrapy#1306, maybe we could try going for https://chromium.googlesource.com/chromium/src/+/HEAD/url/


@kmike (Member)

An interesting project: https://github.com/TkTech/can_ada


Development

Successfully merging this pull request may close these issues.

safe_url_string handling IPv6 URLs

2 participants

@Gallaecio @kmike
