Commit b7f6bcb
Repair bug in regexp split performance improvements.
Commit c8ea87e introduced a temporary conversion buffer for substrings extracted during regexp splits. Unfortunately the code that sized it was failing to ignore the effects of ignored degenerate regexp matches, so for regexp_split_* calls it could under-size the buffer in such cases.

Fix, and add some regression test cases (though those will only catch the bug if run in a multibyte encoding).

Backpatch to 9.3 as the faulty code was.

Thanks to the PostGIS project, Regina Obe and Paul Ramsey for the report (via IRC) and assistance in analysis. Patch by me.

1 parent ba37349, commit b7f6bcb

3 files changed: +31 -6 lines changed

src/backend/utils/adt/regexp.c

Lines changed: 10 additions & 6 deletions

@@ -982,6 +982,7 @@ setup_regexp_matches(text *orig_str, text *pattern, pg_re_flags *re_flags,
     int         array_len;
     int         array_idx;
     int         prev_match_end;
+    int         prev_valid_match_end;
     int         start_search;
     int         maxlen = 0;     /* largest fetch length in characters */
 
@@ -1024,6 +1025,7 @@ setup_regexp_matches(text *orig_str, text *pattern, pg_re_flags *re_flags,
 
     /* search for the pattern, perhaps repeatedly */
     prev_match_end = 0;
+    prev_valid_match_end = 0;
     start_search = 0;
     while (RE_wchar_execute(cpattern, wide_str, wide_len, start_search,
                             pmatch_len, pmatch))
@@ -1076,13 +1078,15 @@ setup_regexp_matches(text *orig_str, text *pattern, pg_re_flags *re_flags,
             matchctx->nmatches++;
 
             /*
-             * check length of unmatched portion between end of previous match
-             * and start of current one
+             * check length of unmatched portion between end of previous valid
+             * (nondegenerate, or degenerate but not ignored) match and start
+             * of current one
              */
             if (fetching_unmatched &&
                 pmatch[0].rm_so >= 0 &&
-                (pmatch[0].rm_so - prev_match_end) > maxlen)
-                maxlen = (pmatch[0].rm_so - prev_match_end);
+                (pmatch[0].rm_so - prev_valid_match_end) > maxlen)
+                maxlen = (pmatch[0].rm_so - prev_valid_match_end);
+            prev_valid_match_end = pmatch[0].rm_eo;
         }
         prev_match_end = pmatch[0].rm_eo;
 
@@ -1108,8 +1112,8 @@ setup_regexp_matches(text *orig_str, text *pattern, pg_re_flags *re_flags,
      * input string
      */
     if (fetching_unmatched &&
-        (wide_len - prev_match_end) > maxlen)
-        maxlen = (wide_len - prev_match_end);
+        (wide_len - prev_valid_match_end) > maxlen)
+        maxlen = (wide_len - prev_valid_match_end);
 
     /*
      * Keep a note of the end position of the string for the benefit of
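
The effect of switching from prev_match_end to prev_valid_match_end can be sketched outside the backend. The following is a minimal standalone model, not the PostgreSQL code: the Match struct and max_piece_len() are hypothetical names, the match list is supplied by the caller instead of coming from RE_wchar_execute, and the test that decides whether a match is ignored is an assumption about the surrounding code, which this diff does not show. The rm_so >= 0 guard from the diff is left out because the model only receives real matches.

```c
/* Hypothetical stand-in for one regex match: start and end offsets in characters. */
typedef struct
{
    int so;
    int eo;
} Match;

/*
 * Return the longest unmatched piece length, following the sizing arithmetic
 * in the diff. use_valid_end = 0 models the pre-fix code (prev_match_end),
 * use_valid_end = 1 models the post-fix code (prev_valid_match_end).
 */
static int
max_piece_len(const Match *m, int nmatches, int wide_len, int use_valid_end)
{
    int prev_match_end = 0;       /* end of the last match, kept or ignored */
    int prev_valid_match_end = 0; /* end of the last match that was kept */
    int maxlen = 0;
    int tail_base;

    for (int i = 0; i < nmatches; i++)
    {
        /*
         * ASSUMPTION (not shown in the diff): a match is kept only if it
         * starts before the end of the string and ends past the previous
         * match end; anything else is an ignored degenerate match.
         */
        int kept = (m[i].so < wide_len && m[i].eo > prev_match_end);

        if (kept)
        {
            int base = use_valid_end ? prev_valid_match_end : prev_match_end;

            if (m[i].so - base > maxlen)
                maxlen = m[i].so - base;
            prev_valid_match_end = m[i].eo;
        }
        /* updated for every match, including ignored ones */
        prev_match_end = m[i].eo;
    }

    /* length of the unmatched tail after the last match */
    tail_base = use_valid_end ? prev_valid_match_end : prev_match_end;
    if (wide_len - tail_base > maxlen)
        maxlen = wide_len - tail_base;

    return maxlen;
}
```

Under these assumptions, splitting the one-character string '1' on an empty pattern produces two zero-length matches, at offsets 0 and 1, and both are ignored. The pre-fix arithmetic measures the tail from the ignored match at offset 1 and reports a longest piece of 0 characters, while the post-fix arithmetic measures from offset 0 and reports 1, which is the length of the piece that is actually returned.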

src/test/regress/expected/strings.out

Lines changed: 18 additions & 0 deletions

@@ -674,6 +674,24 @@ SELECT regexp_split_to_array('123456','.');
  {"","","","","","",""}
 (1 row)
 
+SELECT regexp_split_to_array('123456','');
+ regexp_split_to_array
+-----------------------
+ {1,2,3,4,5,6}
+(1 row)
+
+SELECT regexp_split_to_array('123456','(?:)');
+ regexp_split_to_array
+-----------------------
+ {1,2,3,4,5,6}
+(1 row)
+
+SELECT regexp_split_to_array('1','');
+ regexp_split_to_array
+-----------------------
+ {1}
+(1 row)
+
 -- errors
 SELECT foo, length(foo) FROM regexp_split_to_table('thE QUick bROWn FOx jUMPs ovEr The lazy dOG', 'e', 'zippy') AS foo;
 ERROR: invalid regexp option: "z"
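
The expected results above follow from which zero-length matches are kept. As a rough illustration only, the sketch below splits a single-byte string on an empty pattern and records each piece. split_on_empty() is a hypothetical helper, not a backend function, and its rule for dropping degenerate matches (at the string start, at the string end, or exactly at the previous match end) is an assumption about code outside this diff.

```c
#include <string.h>

/*
 * Hypothetical helper: split s at every kept zero-length match of an empty
 * pattern. Stores each piece's start offset and length and returns the
 * number of pieces. The caller must provide room for strlen(s) + 1 entries.
 */
static int
split_on_empty(const char *s, int starts[], int lens[])
{
    int len = (int) strlen(s);
    int prev_match_end = 0;       /* end of the last match, kept or ignored */
    int prev_valid_match_end = 0; /* end of the last kept match */
    int n = 0;

    /* an empty pattern matches with zero length at every offset 0..len */
    for (int pos = 0; pos <= len; pos++)
    {
        /* ASSUMPTION: this is the rule for keeping a zero-length match */
        if (pos < len && pos > prev_match_end)
        {
            starts[n] = prev_valid_match_end;
            lens[n] = pos - prev_valid_match_end;
            n++;
            prev_valid_match_end = pos;
        }
        prev_match_end = pos;
    }

    /* the final piece runs from the last kept match to the end of the string */
    starts[n] = prev_valid_match_end;
    lens[n] = len - prev_valid_match_end;
    n++;

    return n;
}
```

For '123456' this model yields six one-character pieces, and for '1' it yields a single piece of length 1, in line with the {1,2,3,4,5,6} and {1} rows in the expected output. In both cases the final piece starts at the last kept match, not at the last match seen, which is the distinction the fix in regexp.c relies on.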

src/test/regress/sql/strings.sql

Lines changed: 3 additions & 0 deletions

@@ -188,6 +188,9 @@ SELECT regexp_split_to_array('the quick brown fox jumps over the lazy dog', 'nom
 SELECT regexp_split_to_array('123456','1');
 SELECT regexp_split_to_array('123456','6');
 SELECT regexp_split_to_array('123456','.');
+SELECT regexp_split_to_array('123456','');
+SELECT regexp_split_to_array('123456','(?:)');
+SELECT regexp_split_to_array('1','');
 -- errors
 SELECT foo, length(foo) FROM regexp_split_to_table('thE QUick bROWn FOx jUMPs ovEr The lazy dOG', 'e', 'zippy') AS foo;
 SELECT regexp_split_to_array('thE QUick bROWn FOx jUMPs ovEr The lazy dOG', 'e', 'iz');
