Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings
This repository was archived by the owner on Jan 23, 2023. It is now read-only.
/corefxPublic archive

Addresses Regex Perf Issue 32764#32899

Merged
stephentoub merged 5 commits intodotnet:masterfromAlois-xx:Issue_32764
Oct 25, 2018

Conversation

@Alois-xx
Copy link
Contributor

This fix should save ca 40% for the common case when many Char.ToLower calls are emitted which access CultureInfo.CurrentCulture for compiled Regex Queries.

iSazonov and Calkines reacted with thumbs up emoji
…xOptions.IgnoreCase or RegexOptions.CultureInvariant.This saves over 40% in these cases.
……xOptions.IgnoreCase or RegexOptions.CultureInvariant.This saves over 40% in these cases.
…ith RegexOptions.IgnoreCase or RegexOptions.CultureInvariant."This reverts commite151e93.

privatevoidCallToLower()
{
Ldloc(_cultureV);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

What is the lifetime of this? e.g. does this cache end up spanning multiple regex calls and user code such that user code could change the current culture and end up with different behavior than before?

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

_cultureV is a local variable. The lifetime of the cache is therefore method local in FindFirstChar and Go methods of the compiled regular expression.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Ah, I see. Thanks.

privatevoidCallToLower()
{
Ldloc(_cultureV);
Call(s_chartolowerM);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

@ViktorHofer, am I misremembering, or did you do some ToLower-related optimizations/changes when you were cleaning up the regex code recently? If yes, should we instead just port the appropriate changes to the compiled version? If no, is there a similar optimization to be done on the interpreted side of things?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Right, but only minor allocation optimizations. I wouldn't do that as part of this PR but revisit the compiled code paths later to make sure that we don't diverge and bring optimizations over.

@stephentoub
Copy link
Member

Do we have regex tests (interpreted and compiled) that rely on the current culture being something specific? If not, it'd be good to add as part of this.

@ViktorHofer
Copy link
Member

Do we have regex tests (interpreted and compiled) that rely on the current culture being something specific? If not, it'd be good to add as part of this.

Only very few for Unicode characters. We should definitely add a bunch.

Copy link
Member

@ViktorHoferViktorHofer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Can you please post performance numbers with BenchmarkDotNet baselined on master including allocation?

@Alois-xx
Copy link
ContributorAuthor

Alois-xx commentedOct 19, 2018
edited
Loading

@ViktorHofer
Benchmark.NET was a little funny to get it to use the compiled dlls and not the ones from the framework sdk. Here are the patched Results:

                        Method | N |         Mean |        Error |       StdDev | Allocated |------------------------------ |-- |-------------:|-------------:|-------------:|----------:|                      Original | 1 | 438,212.4 ns | 4,647.930 ns | 4,347.676 ns |       0 B |                 CaseSensitive | 1 | 227,563.0 ns | 5,142.693 ns | 5,281.167 ns |       0 B | CaseSensitiveCultureInvariant | 1 | 215,541.1 ns | 1,187.442 ns | 1,110.734 ns |       0 B |          SimplifiedExpression | 1 |   1,525.5 ns |    11.969 ns |    11.195 ns |       0 B |      Case_Sensitive_Substring | 1 |     691.5 ns |     4.897 ns |     4.581 ns |       0 B |

And here the baseline version where I did revert the changes of my commit to really measure the impact of only my change:

                        Method | N |           Mean |         Error |        StdDev | Allocated |------------------------------ |-- |---------------:|--------------:|--------------:|----------:|                      Original | 1 | 1,055,537.2 ns | 31,271.169 ns | 36,011.922 ns |       0 B |                 CaseSensitive | 1 |   216,579.6 ns |  1,827.153 ns |  1,709.119 ns |       0 B | CaseSensitiveCultureInvariant | 1 |   217,116.7 ns |  2,212.376 ns |  1,961.213 ns |       0 B |          SimplifiedExpression | 1 |     1,523.0 ns |     13.191 ns |     12.339 ns |       0 B |      Case_Sensitive_Substring | 1 |       671.3 ns |      8.552 ns |      8.000 ns |       0 B |

Just in case you did not see here is the ETW chart:
grafik

Actually the Benchmark.NET numbers are even better. From the numbers it looks like the slow code gen path is only triggerd when RegexOptions.IgnoreCase is used. But anyway this is certainly a widely used option.

@Alois-xx
Copy link
ContributorAuthor

@stephentoub: I could think of some Turkish I tests which behave differently in different locales. That would be a good test to verify that the right culture was used.

@ViktorHofer
Copy link
Member

ViktorHofer commentedOct 22, 2018
edited
Loading

That sounds good and should suffice. As mentioned before, we currently don't have a comprehensive set of inputs with different cultures. We should definitely fix that.

@Alois-xx
Copy link
ContributorAuthor

@ViktorHofer: Added some locale tests which really show different behavior under the turkish locale.

@Alois-xx
Copy link
ContributorAuthor

@dotnet-bot test this please

stringinput="Iıİi";

varcultInvariantRegex=Create(input,CultureInfo.InvariantCulture,RegexOptions.IgnoreCase|RegexOptions.CultureInvariant);
varturkishRegex=Create(input,turkish,RegexOptions.IgnoreCase);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Nit: please change var to be the actual type here... we only use var when the type is obvious from the right-hand side, namely a ctor or an explicit cast.

@Alois-xx
Copy link
ContributorAuthor

@dotnet-bot test this please

@stephentoub
Copy link
Member

Thanks!

@stephentoubstephentoub merged commit58e2b4c intodotnet:masterOct 25, 2018
@danmoseley
Copy link
Member

Nice win here@Alois-xx many thanks! Any interest in more work on regex?

@Alois-xx
Copy link
ContributorAuthor

@danmosemsft: If I find new issues or ideas I will definitely file an issue. But currently my free time is quite limited.

@Alois-xxAlois-xx deleted the Issue_32764 branchOctober 29, 2018 11:28
@karelzkarelz added this to the3.0 milestoneNov 15, 2018
picenka21 pushed a commit to picenka21/runtime that referenced this pull requestFeb 18, 2022
* Fix for Regex performance issue when compiled Regex is used with RegexOptions.IgnoreCase or RegexOptions.CultureInvariant.This saves over 40% in these cases.* Fix for Regex performance issue when compiled Regex is used with Rege…xOptions.IgnoreCase or RegexOptions.CultureInvariant.This saves over 40% in these cases.* Revert "Fix for Regex performance issue when compiled Regex is used with RegexOptions.IgnoreCase or RegexOptions.CultureInvariant."This reverts commitdotnet/corefx@e151e93.* Added TurkishI tests which check compiled and interpreted regular expressions.* Removed var of testCommit migrated fromdotnet/corefx@58e2b4c
Sign up for freeto subscribe to this conversation on GitHub. Already have an account?Sign in.

Reviewers

2 more reviewers

@stephentoubstephentoubstephentoub left review comments

@ViktorHoferViktorHoferViktorHofer left review comments

Reviewers whose approvals may not affect merge requirements

Assignees

No one assigned

Labels

None yet

Projects

None yet

Milestone

3.0

Development

Successfully merging this pull request may close these issues.

5 participants

@Alois-xx@stephentoub@ViktorHofer@danmoseley@karelz

[8]ページ先頭

©2009-2025 Movatter.jp