[Routing] Fix matching of utf8 params #42159

Closed
mazanax wants to merge 2 commits into symfony:5.3 from mazanax:5.3

@mazanax

| Q             | A          |
| ------------- | ---------- |
| Branch?       | 5.3        |
| Bug fix?      | yes        |
| New feature?  | no         |
| Deprecations? | no         |
| Tickets       | Fix #41909 |
| License       | MIT        |
| Doc PR        |            |

Change the regexp used for routes with UTF-8 params.

@carsonbot

Hey!

I see that this is your first PR. That is great! Welcome!

Symfony has a contribution guide which I suggest you read.

In short:

  • Always add tests
  • Keep backward compatibility (see https://symfony.com/bc).
  • Bug fixes must be submitted against the lowest maintained branch where they apply (see https://symfony.com/releases).
  • Features and deprecations must be submitted against the 5.4 branch.

Review the GitHub status checks of your pull request and try to solve the reported issues. If some tests are failing, try to see if they are failing because of this change.

When two Symfony core team members approve this change, it will be merged and you will become an official Symfony contributor!
If this PR is merged in a lower version branch, it will be merged up to all maintained branches within a few days.

I am going to sit back now and wait for the reviews.

Cheers!

Carsonbot

@carsonbot changed the title from "[Router] Fix matching of utf8 params" to "[Routing] Fix matching of utf8 params" on Jul 16, 2021
  // Match all variables enclosed in "{}" and iterate over them. But we only want to match the innermost variable
  // in case of nested "{}", e.g. {foo{bar}}. This in ensured because \w does not match "{" or "}" itself.
- preg_match_all('#\{(!)?(\w+)\}#', $pattern, $matches, \PREG_OFFSET_CAPTURE | \PREG_SET_ORDER);
+ $routeParamsPattern = $needsUtf8 ? '#\{(!)?([\p{L}_]+)\}#u' : '#\{(!)?(\w+)\}#';
Contributor

Numbers?

Author

Fixed, thanks

Contributor @Foxprodev, Jul 16, 2021 (edited)

I am not 100% sure, but \w with the u flag should be enough. Am I wrong?

Author

\w doesn't support Unicode characters.
Here is an example:

Contributor

There is no Unicode flag (the u modifier) in your snippet.
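The point being debated here can be checked with a few lines of plain PHP. This is only a minimal sketch, assuming a PHP build with the bundled PCRE (Unicode support enabled) and the default "C" locale; the variable names are illustrative, and the subject is written as explicit UTF-8 bytes so the result does not depend on the file encoding.

```php
<?php
// "bär" as UTF-8 bytes: b, 0xC3 0xA4 (ä), r.
$name = "b\xC3\xA4r";

// Without the u modifier, \w is applied byte by byte and, in the default
// C locale, accepts only ASCII word characters, so the full name fails.
$withoutU = preg_match('/^\w+$/', $name);

// With the u modifier, pattern and subject are treated as UTF-8 and \w
// also accepts Unicode letters such as "ä".
$withU = preg_match('/^\w+$/u', $name);

var_dump($withoutU, $withU); // int(0) then int(1)
```

In other words, \w alone is not the deciding factor; whether the u modifier is present is.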

} catch (ResourceNotFoundException $e) {
}

$this->assertEquals(['_route' => 'foo', 'bär' => 'baz'], $matcher->match('/foo/baz'));
Contributor

Use assertSame whenever possible.

Author @mazanax, Jul 19, 2021 (edited)

I'm not sure it's possible to use assertSame here: $matcher->match() doesn't guarantee the order of elements in the array, and assertEquals ignores order.
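The ordering concern can be illustrated without PHPUnit, using PHP's own comparison operators. This is a rough sketch: assertSame compares with ===, while assertEquals behaves comparably to == for associative arrays as far as key order is concerned. The two arrays below are made-up stand-ins for an expected value and a match() result.

```php
<?php
// Same key/value pairs, different insertion order.
$expected = ['_route' => 'foo', 'bär' => 'baz'];
$actual   = ['bär' => 'baz', '_route' => 'foo'];

// Loose array comparison: same key/value pairs, order ignored.
$looselyEqual = ($expected == $actual);

// Strict array comparison: same pairs, same order, same types.
$identical = ($expected === $actual);

var_dump($looselyEqual, $identical); // bool(true) then bool(false)
```

If the test should still use assertSame, one option would be to sort both arrays by key (for example with ksort) before comparing.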

  // Match all variables enclosed in "{}" and iterate over them. But we only want to match the innermost variable
  // in case of nested "{}", e.g. {foo{bar}}. This in ensured because \w does not match "{" or "}" itself.
- preg_match_all('#\{(!)?(\w+)\}#', $pattern, $matches, \PREG_OFFSET_CAPTURE | \PREG_SET_ORDER);
+ $routeParamsPattern = $needsUtf8 ? '#\{(!)?([\p{L}\d_]+)\}#u' : '#\{(!)?(\w+)\}#';
Member @nicolas-grekas, Jul 19, 2021 (edited)

There is no need to check for Unicode support: \pL always works.
This means that the regexp can unconditionally be: '#\{(!)?([\w\pL]++)\}#'

Note that there are other occurrences of \w in this very file.

Author @mazanax, Jul 19, 2021 (edited)

Yes, I agree. I also found that I need to make some fixes to support UTF-8 characters in the regex of compiled routes (for PHP 7.2).

Contributor

@nicolas-grekas Maybe it's not so bad to split the logic.

"Matching characters by Unicode property is not fast, because PCRE has to do a multistage table lookup in order to find a character's property. That is why the traditional escape sequences such as \d and \w do not use Unicode properties in PCRE by default."

Reply

In non-unicode mode, PCRE doesn't use unicode tables.

Contributor @Foxprodev, Jul 19, 2021 (edited)

@nicolas-grekas In non-Unicode mode, \pL does not fully handle Unicode characters either. https://www.phpliveregex.com/p/BbM
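A small sketch of the behaviour described in this comment, using the unconditional pattern proposed earlier in the thread. It assumes a PHP build with the bundled PCRE and Unicode support, and the default "C" locale; the route path is a made-up example and is not taken from the linked snippet.

```php
<?php
// Hypothetical route path; "{bär}" is written as explicit UTF-8 bytes.
$path = "/foo/{b\xC3\xA4r}";
$pattern = '#\{(!)?([\w\pL]++)\}#';

// Non-UTF mode: every byte is tested on its own. 0xC3 counts as a letter
// (U+00C3), but 0xA4 is U+00A4 (a currency sign), so the name is cut short
// and the closing brace is never reached.
$byteMode = preg_match($pattern, $path, $m1);

// UTF-8 mode: 0xC3 0xA4 is decoded as the single letter U+00E4.
$utf8Mode = preg_match($pattern.'u', $path, $m2);

var_dump($byteMode);         // int(0)
var_dump($utf8Mode, $m2[2]); // int(1) and the 4-byte string "bär"
```

So \pL compiles without the u modifier, but it only sees individual bytes, which is not enough for multi-byte UTF-8 names.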

Reply

When the u modifier is set (aka when $needsUtf8 is true, aka when the utf8 option is set), PCRE will use Unicode tables. It will use ASCII tables otherwise.

Contributor

So we still need to conditionally set the u modifier, right?

Reply

That's already done:

Author @mazanax, Jul 19, 2021 (edited)

But it doesn't work. We have to use the u modifier in this regex as well; otherwise the route params won't be parsed and the route will be treated as static instead of dynamic.
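The static-versus-dynamic effect mentioned here can be sketched as follows. findRouteVariables() is a hypothetical helper written for illustration only; it is not Symfony's actual compiler code, and it simply applies the pattern discussed in this thread with the u modifier added conditionally.

```php
<?php
/**
 * Hypothetical helper (not Symfony's implementation): returns the variable
 * names found in a route path. The u modifier is added only when requested.
 */
function findRouteVariables(string $path, bool $needsUtf8): array
{
    $regex = '#\{(!)?([\w\pL]++)\}#'.($needsUtf8 ? 'u' : '');
    preg_match_all($regex, $path, $matches, \PREG_SET_ORDER);

    // Capture group 2 holds the variable name of each match.
    return array_column($matches, 2);
}

// "{bär}" written as explicit UTF-8 bytes.
$path = "/foo/{b\xC3\xA4r}";

$withoutU = findRouteVariables($path, false); // no variables found
$withU    = findRouteVariables($path, true);  // one variable: "bär"

// A path with no detected variables would be handled as a static route.
var_dump([] === $withoutU, [] === $withU); // bool(true) then bool(false)
```

Under these assumptions, the same path is seen as static without the modifier and as dynamic with it.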

Contributor @Foxprodev, Jul 19, 2021 (edited)

Oh, that's the point. But it doesn't work in my test cases, and I am still sure that we need to add u to the string currently being discussed.
Anyway, I will wait for the full PR then. Thanks for your time!

Member @nicolas-grekas left a comment

See grep '\\w' src/Symfony/Component/Routing/ -r

Member @nicolas-grekas

Closing in favor of #45054
Could you please have a look @mazanax?
Thanks for pushing this forward!

Reviewers: @nicolas-grekas (requested changes), @OskarStark (left review comments), @Foxprodev (left review comments)

Milestone: 5.3

5 participants: @mazanax, @carsonbot, @nicolas-grekas, @OskarStark, @Foxprodev
