Commit 8a29ed0

Fix misoptimization of "{1,1}" quantifiers in regular expressions.
A bounded quantifier with m = n = 1 might be thought a no-op. But
according to our documentation (which traces back to Henry Spencer's
original man page) it still imposes greediness, or non-greediness in the
case of the non-greedy variant "{1,1}?", on whatever it's attached to.

This turns out not to work though, because parseqatom() optimizes away
the m = n = 1 case without regard for whether it's supposed to change
the greediness of the argument RE.

We can fix this by just not applying the optimization when the greediness
needs to change; the subsequent general cases handle it fine.

The three cases in which we can still apply the optimization are
(a) no quantifier, or quantifier does not impose a preference;
(b) atom has no greediness property, implying it cannot match a
variable amount of text anyway; or
(c) quantifier's greediness is same as atom's.
Note that in most cases where one of these applies, we'd have exited
earlier in the "not a messy case" fast path. I think it's now only
possible to get to the optimization when the atom involves capturing
parentheses or a non-top-level backref.

Back-patch to all supported branches. I'd ordinarily be hesitant to
put a subtle behavioral change into back branches, but in this case
it's very hard to see a reason why somebody would write "{1,1}?" unless
they're trying to get the documented change-of-greediness behavior.

Discussion: https://postgr.es/m/5bb27a41-350d-37bf-901e-9d26f5592dd0@charter.net
1 parent d02768d, commit 8a29ed0

3 files changed: +63 -1 lines changed

src/backend/regex/regcomp.c

Lines changed: 4 additions & 1 deletion

@@ -1155,7 +1155,10 @@ parseqatom(struct vars *v,
 		/* rest of branch can be strung starting from atom->end */
 		s2 = atom->end;
 	}
-	else if (m == 1 && n == 1)
+	else if (m == 1 && n == 1 &&
+			 (qprefer == 0 ||
+			  (atom->flags & (LONGER | SHORTER | MIXED)) == 0 ||
+			  qprefer == (atom->flags & (LONGER | SHORTER | MIXED))))
 	{
 		/* no/vacuous quantifier: done */
 		EMPTYARC(s, atom->begin);	/* empty prefix */
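
The new condition can be read in isolation as a small predicate. What follows is a minimal standalone sketch, not code from the patch: the helper names `can_skip_unit_quantifier` and `demo_unit_quantifier` are hypothetical, and the numeric values given to `LONGER`, `SHORTER` and `MIXED` are assumptions made only so the example compiles on its own. It assumes `qprefer` is 0 when the quantifier imposes no preference, and otherwise `LONGER` or `SHORTER`.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag bits; the numeric values are assumptions of this sketch. */
enum
{
	LONGER = 01,				/* prefers the longest match */
	SHORTER = 02,				/* prefers the shortest match */
	MIXED = 04					/* mixed preferences inside the atom */
};

/*
 * Hypothetical helper mirroring the patched condition: returns true when
 * a quantifier with bounds m..n can be treated as vacuous.
 *
 * qprefer is 0 (no preference), LONGER or SHORTER; atom_flags holds the
 * atom's flag bits, of which only the greediness bits are examined.
 */
static bool
can_skip_unit_quantifier(int m, int n, int qprefer, int atom_flags)
{
	int			atom_pref = atom_flags & (LONGER | SHORTER | MIXED);

	if (m != 1 || n != 1)
		return false;			/* not a {1,1} quantifier at all */

	return qprefer == 0 ||		/* (a) quantifier imposes no preference */
		atom_pref == 0 ||		/* (b) atom has no greediness property */
		qprefer == atom_pref;	/* (c) preferences already agree */
}

/* Walks through the three cases from the commit message and two overrides. */
static void
demo_unit_quantifier(void)
{
	assert(can_skip_unit_quantifier(1, 1, 0, LONGER));			/* case (a) */
	assert(can_skip_unit_quantifier(1, 1, SHORTER, 0));			/* case (b) */
	assert(can_skip_unit_quantifier(1, 1, LONGER, LONGER));		/* case (c) */
	assert(!can_skip_unit_quantifier(1, 1, SHORTER, LONGER));	/* like (l*){1,1}? */
	assert(!can_skip_unit_quantifier(1, 1, LONGER, SHORTER));	/* like (l*?){1,1} */
}
```

When the predicate is false, the patched code falls through to the later general cases, which is where the greediness override takes effect.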

src/test/regress/expected/regex.out

Lines changed: 49 additions & 0 deletions

@@ -492,6 +492,55 @@ select regexp_matches('foo/bar/baz',
  {foo,bar,baz}
 (1 row)
 
+-- Test that greediness can be overridden by outer quantifier
+select regexp_matches('llmmmfff', '^(l*)(.*)(f*)$');
+ regexp_matches 
+----------------
+ {ll,mmmfff,""}
+(1 row)
+
+select regexp_matches('llmmmfff', '^(l*){1,1}(.*)(f*)$');
+ regexp_matches 
+----------------
+ {ll,mmmfff,""}
+(1 row)
+
+select regexp_matches('llmmmfff', '^(l*){1,1}?(.*)(f*)$');
+  regexp_matches  
+------------------
+ {"",llmmmfff,""}
+(1 row)
+
+select regexp_matches('llmmmfff', '^(l*){1,1}?(.*){1,1}?(f*)$');
+ regexp_matches 
+----------------
+ {"",llmmm,fff}
+(1 row)
+
+select regexp_matches('llmmmfff', '^(l*?)(.*)(f*)$');
+  regexp_matches  
+------------------
+ {"",llmmmfff,""}
+(1 row)
+
+select regexp_matches('llmmmfff', '^(l*?){1,1}(.*)(f*)$');
+ regexp_matches 
+----------------
+ {ll,mmmfff,""}
+(1 row)
+
+select regexp_matches('llmmmfff', '^(l*?){1,1}?(.*)(f*)$');
+  regexp_matches  
+------------------
+ {"",llmmmfff,""}
+(1 row)
+
+select regexp_matches('llmmmfff', '^(l*?){1,1}?(.*){1,1}?(f*)$');
+ regexp_matches 
+----------------
+ {"",llmmm,fff}
+(1 row)
+
 -- Test for infinite loop in cfindloop with zero-length possible match
 -- but no actual match (can only happen in the presence of backrefs)
 select 'a' ~ '$()|^\1';

src/test/regress/sql/regex.sql

Lines changed: 10 additions & 0 deletions

@@ -118,6 +118,16 @@ select regexp_matches('Programmer', '(\w)(.*?\1)', 'g');
 select regexp_matches('foo/bar/baz',
                       '^([^/]+?)(?:/([^/]+?))(?:/([^/]+?))?$', '');
 
+-- Test that greediness can be overridden by outer quantifier
+select regexp_matches('llmmmfff', '^(l*)(.*)(f*)$');
+select regexp_matches('llmmmfff', '^(l*){1,1}(.*)(f*)$');
+select regexp_matches('llmmmfff', '^(l*){1,1}?(.*)(f*)$');
+select regexp_matches('llmmmfff', '^(l*){1,1}?(.*){1,1}?(f*)$');
+select regexp_matches('llmmmfff', '^(l*?)(.*)(f*)$');
+select regexp_matches('llmmmfff', '^(l*?){1,1}(.*)(f*)$');
+select regexp_matches('llmmmfff', '^(l*?){1,1}?(.*)(f*)$');
+select regexp_matches('llmmmfff', '^(l*?){1,1}?(.*){1,1}?(f*)$');
+
 -- Test for infinite loop in cfindloop with zero-length possible match
 -- but no actual match (can only happen in the presence of backrefs)
 select 'a' ~ '$()|^\1';
