Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

[Routing] Support UTF-8 in paths and parameters#19562

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Closed
c960657 wants to merge1 commit intosymfony:masterfromc960657:utf8-routes

Conversation

@c960657
Copy link
Contributor

QA
Branch?master
Bug fix?no
New feature?yes
BC breaks?yes
Deprecations?no
Tests pass?yes
Fixed tickets#5236
LicenseMIT
Doc PR

This PR is a new take on a previous proposal in#19054.

@c960657
Copy link
ContributorAuthor

c960657 commentedAug 7, 2016
edited
Loading

Some of the tests for FrameworkBundle are failing on PHP 7, because the tests assumes the changes made to symfony/routing by this PR, but Travis installs latest released version of symfony/routing. I can bump the requirement in composer.json, or I can make the code and tests compatible with both versions. How do we usually deal with this?

* @param LoggerInterface|null $logger A logger instance
*/
publicfunction__construct(RouteCollection$routes,RequestContext$context,LoggerInterface$logger =null)
publicfunction__construct(RouteCollection$routes,RequestContext$context,LoggerInterface$logger =null,$charset ='UTF-8')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

forgot to upgrade the docblock with new arg

@c960657
Copy link
ContributorAuthor

@MacDada Thanks for the review. I have addresses all your comments.

I still need some advice on the FrameworkBundle issue mentioned above.

@nicolas-grekas
Copy link
Member

nicolas-grekas commentedAug 9, 2016
edited
Loading

I think there is a mismatch is the definition of the$charset params:
URLs have no encoding. HTML5 says that they should be encoded in UTF-8, and that when an non-valid UTF-8 URL is found, it should be converted from CP1252 by default.

-> we need a parameter to configure this fallback input charset. This can't be kernel.charset by default, but CP1252.

Then there is the output encoding of the URLs we generate. These should match the output encoding the app is using.

-> this matches the definition of the kernel.charset parameter so it can be injected here.

  • Let's say I build an UTF-8 app in Russia => kernel.charset is UTF-8 and fallback input enc is some cyrillic enc.
  • Let's now say I build a cyrillic app in Russia => kernel.charset is this cyrillic enc, and the fallback enc should also be kernel.charset (and we should still handle UTF-8 URLs as input of course).

@c960657
Copy link
ContributorAuthor

c960657 commentedAug 9, 2016
edited
Loading

URLs have no encoding.

True, a final URL has no encoding. The charset is used when we want to represent a non-ASCII character. For example, the Wikipedia article about the letter É has the URLhttps://en.wikipedia.org/wiki/%C3%89.%C3%89 is the percent-encoding of the two bytes representing É in UTF-8.

If ISO-8859-1 was used note (that was very common years ago), the URL would have beenhttps://en.wikipedia.org/wiki/%C9 (that URL currently redirects to the UTF-8-encoded one). Nowadays UTF-8 is the standard, and when accessing a UTF-8-encoded URL, the browser displayshttps://en.wikipedia.org/wiki/É in the URL field.

HTML5 says that they should be encoded in UTF-8, and that when an non-valid UTF-8 URL is found, it should be converted from CP1252 by default.

That rule only comes into play if the HTML contains URLs containing characters that are not properly percent-encoded. Such URLs are not valid and should never occur in HTML generated by Symfony functions.

However, since browsers prefer UTF-8 in the URL, I guess you would probably prefer to use UTF-8 as$router->charset even when kernel.charset is cyrillic. So I guess that default the router charset tokernel.charset is a bad idea.

@nicolas-grekas
Copy link
Member

That rule only comes into play if the HTML contains URLs containing characters that are not properly percent-encoded.

Not at all in my understanding, and in fact you cited what I meant: typehttps://en.wikipedia.org/wiki/%C9 in the browser, and get redirected to UTF-8,in the same way than HTML5 says (not the redirection but the charset conversion).

I guess that default the router charset to kernel.charset is a bad idea.

Yep, we need a fallbak input charset

@c960657
Copy link
ContributorAuthor

Which part of the HTML5 spec are you referring to?

I guess we could implement a fallback encoding that automatically redirects like Wikipedia does, but I think that is a new feature that is outside the scope of this PR.

@nicolas-grekas
Copy link
Member

nicolas-grekas commentedAug 10, 2016
edited
Loading

Which part of the HTML5 spec are you referring to?

an older draft one looks like, I can't find the reference :)

@nicolas-grekas
Copy link
Member

nicolas-grekas commentedAug 12, 2016
edited
Loading

I tried an alternative approach in#19604
If#19604 looks good, it means we will have to work on another PR for charset reconciliation between incoming URLs and the charset used in the matcher. To me, this other PR doesn't necessarily belong to the router.We could just have a listener that comes before the router, checks the charset of the URL and either redirects, 404 or converts (the 3 strategies are valid to me, choice should belong to the dev). Unicode normalization could be also done in this listener (if we want it: it's costly, and we lived without for years, could be opt-in only).

fabpot added a commit that referenced this pull requestAug 16, 2016
This PR was merged into the 2.7 branch.Discussion----------[Routing] Add missing options in docblock| Q             | A| ------------- | ---| Branch?       | 2.7| Bug fix?      | no| New feature?  | no| BC breaks?    | no| Deprecations? | no| Tests pass?   | yes| Fixed tickets | -| License       | MIT| Doc PR        | -Courtesy of@c960657 in#19562Commits-------f45da32 [Routing] Add missing options in docblock
@nicolas-grekas
Copy link
Member

nicolas-grekas commentedAug 23, 2016
edited
Loading

@c960657 OK to close this one in favor of#19604?

@c960657
Copy link
ContributorAuthor

Yes, let’s continue in#19604.

fabpot added a commit that referenced this pull requestAug 25, 2016
…s (nicolas-grekas)This PR was merged into the 3.2-dev branch.Discussion----------[Routing] Add seamless support for unicode requirements| Q             | A| ------------- | ---| Branch?       | master| Bug fix?      | no| New feature?  | yes| BC breaks?    | no| Deprecations? | yes| Tests pass?   | yes| Fixed tickets |#3629,#5236,#19562| License       | MIT| Doc PR        |symfony/symfony-docs#6890This PR adds unicode support to route matching and generation by automatically adding the `u` modifier to regexps that use either unicode characters or unicode enabled character classes (e.g. `\p...` `\x{...}` `\X`).As a side note, if one wants to match a single unicode character (vs a single byte), one should use `\PM` or `\X` instead of `.` *or* set the `unicode` parameter to true.Commits-------a829d34 [Routing] Add seamless support for unicode requirements
Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment

Reviewers

No reviews

Assignees

No one assigned

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

4 participants

@c960657@nicolas-grekas@MacDada@carsonbot

[8]ページ先頭

©2009-2025 Movatter.jp