`<regex>`: Improve search performance for regexes with initial `+` quantifiers #5509


@muellerj2 (Contributor) commented on May 15, 2025 (edited):

Towards #5468. This is a small change that greatly speeds up searches for regexes like `a+` that start with a letter, string, or character class followed by a `+` quantifier (or any other quantifier requiring at least one repetition). Because this loop must be matched at least once, we can enter the repeated subpattern and look for the first position where a letter, string, or character class in the subpattern can match.
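
The idea can be illustrated in user code with a minimal sketch. This is not the STL's internal implementation: `prefiltered_search` and its `first_literal` parameter are hypothetical names, and the caller is assumed to know a literal that every match must begin with (for `(bibe)+`, the literal `bibe`). The sketch jumps to each candidate position with `std::string::find` and only then runs the regex matcher anchored at that position.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper (not part of the STL): returns the offset of the first
// match of `re` in `text`, or std::string::npos if there is none.
// Assumption: every match of `re` must start with `first_literal`, which holds
// when the regex begins with a literal inside a loop that runs at least once.
std::size_t prefiltered_search(
    const std::string& text, const std::regex& re, const std::string& first_literal) {
    std::size_t pos = 0;
    // Skip directly to the next position where the leading literal occurs.
    while ((pos = text.find(first_literal, pos)) != std::string::npos) {
        std::smatch match;
        // match_continuous forces the match to begin exactly at the candidate position.
        if (std::regex_search(text.begin() + static_cast<std::ptrdiff_t>(pos), text.end(), match, re,
                std::regex_constants::match_continuous)) {
            return pos;
        }
        ++pos; // The literal occurred here, but the full regex did not match; keep looking.
    }
    return std::string::npos;
}
```

Calling `prefiltered_search("lorem bibebibe ipsum", std::regex("(bibe)+"), "bibe")` only invokes the matcher at offsets where `bibe` actually occurs, instead of attempting a match at every character of the input.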

While working on this, I noticed that I hadn't thought the implementation of `text_regex::should_search_match_capture_groups()` through well enough: I designed it to use relative coordinates for expected submatches, but this isn't helpful when one wants to ensure that the whole match is at a particular position. So I changed the implementation to use absolute coordinates from the start of the input string. Luckily, no test seems to have relied on the previous behavior, meaning all of them just matched at the start of the input string anyway.
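
The difference between the two coordinate systems can be sketched with `std::smatch::position()`, which reports offsets from the start of the searched range. The `group_span` struct and `locate_group` function below are hypothetical illustrations, not the test helper itself, and they assume the requested capture group took part in the match.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical illustration of the two ways to express a submatch position.
struct group_span {
    std::ptrdiff_t absolute_begin; // offset from the start of the input string
    std::ptrdiff_t relative_begin; // offset from the start of the whole match
};

// Assumption: `group` names a capture group that participated in the match.
// Returns {-1, -1} if the regex does not match anywhere in `text`.
group_span locate_group(const std::string& text, const std::regex& re, std::size_t group) {
    std::smatch match;
    if (!std::regex_search(text, match, re)) {
        return {-1, -1};
    }
    const std::ptrdiff_t absolute = match.position(group);
    return {absolute, absolute - match.position(0)};
}
```

For the regex `a(b)`, the inputs `"ab"` and `"xxab"` both give a relative offset of 1 for group 1, so a check written in relative coordinates cannot tell where the whole match landed. The absolute offsets (1 and 3) differ, which is what makes a position-sensitive test possible.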

Benchmark

Running on my machine:

| Benchmark | Before | After | Speedup |
|---|---|---|---|
| `bm_lorem_search/"bibe"/2` | 38504 ns | 39237 ns | 0.98 |
| `bm_lorem_search/"bibe"/3` | 76730 ns | 76730 ns | 1.00 |
| `bm_lorem_search/"bibe"/4` | 153460 ns | 153460 ns | 1.00 |
| `bm_lorem_search/"(bibe)+"/2` | 4814680 ns | 92076 ns | 52.29 |
| `bm_lorem_search/"(bibe)+"/3` | 9521484 ns | 204041 ns | 46.66 |
| `bm_lorem_search/"(bibe)+"/4` | 18158784 ns | 401088 ns | 45.27 |
| `bm_lorem_search/"(?:bibe)+"/2` | 4743304 ns | 97656 ns | 48.57 |
| `bm_lorem_search/"(?:bibe)+"/3` | 9521484 ns | 192540 ns | 49.45 |
| `bm_lorem_search/"(?:bibe)+"/4` | 19531250 ns | 384976 ns | 50.73 |
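
A rough way to try a comparable measurement locally is sketched below. It is a plain wall-clock timer, not the benchmark harness that produced the numbers above, and `count_matches` and `time_search_ns` are hypothetical helper names. Results will depend on the machine, compiler, and standard library version.

```cpp
#include <chrono>
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>

// Hypothetical helper: counts the non-overlapping matches of `re` in `text`.
std::size_t count_matches(const std::string& text, const std::regex& re) {
    const std::sregex_iterator first(text.begin(), text.end(), re);
    const std::sregex_iterator last;
    return static_cast<std::size_t>(std::distance(first, last));
}

// Hypothetical helper: average wall-clock time in nanoseconds for one full
// scan of `text`, measured over `iterations` runs (assumed to be positive).
double time_search_ns(const std::string& text, const std::regex& re, int iterations) {
    std::size_t total = 0;
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        total += count_matches(text, re);
    }
    const auto stop = std::chrono::steady_clock::now();
    volatile std::size_t sink = total; // keep the work from being optimized away
    (void) sink;
    return std::chrono::duration<double, std::nano>(stop - start).count() / iterations;
}
```

Comparing `time_search_ns` for `std::regex("bibe")` and `std::regex("(bibe)+")` on the same long input gives a feel for how much the leading `+` loop costs on a given standard library build.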

@AlexGuteniev reacted with a thumbs up emoji; @StephanTLavavej reacted with a heart emoji.
@muellerj2 requested a review from a team as a code owner on May 15, 2025, 20:31.
@StephanTLavavej added the "performance" (Must go faster) and "regex" (meow is a substring of homeowner) labels on May 15, 2025.
@StephanTLavavej self-assigned this on May 15, 2025.
@StephanTLavavej removed their assignment on May 16, 2025.
@StephanTLavavej moved this from Initial Review to Ready To Merge in STL Code Reviews on May 16, 2025.
@StephanTLavavej moved this from Ready To Merge to Merging in STL Code Reviews on May 16, 2025.

@StephanTLavavej (Member) commented:

I'm mirroring this to the MSVC-internal repo - please notify me if any further changes are pushed.

@StephanTLavavej (Member) commented:

I resolved a trivial adjacent-add conflict with #5494 in `VSO_0000000_regex_use`.

@StephanTLavavej merged commit 2391e5e into microsoft:main on May 17, 2025. 40 checks passed.
@github-project-automation (bot) moved this from Merging to Done in STL Code Reviews on May 17, 2025.

@StephanTLavavej (Member) commented:

➕ 🚀 ⏱️

@muellerj2 deleted the regex-improve-regex_search-performance branch on May 31, 2025, 21:44.

Reviewers

@StephanTLavavej approved these changes
