Commit 1dffabe

Further fix pg_trgm's extraction of trigrams from regular expressions.
Commit 9e43e87 turns out to have been insufficient: not only is it necessary to track tentative parent links while considering a set of arc removals, but it's necessary to track tentative flag additions as well. This is because we always merge arc target states into arc source states; therefore, when considering a merge of the final state with some other, it is the other state that will acquire a new TSTATE_FIN bit. If there's another arc for the same color trigram that would cause merging of that state with the initial state, we failed to recognize the problem. The test cases for the prior commit evidently only exercised situations where a tentative merge with the initial state occurs before one with the final state. If it goes the other way around, we'll happily merge the initial and final states, either producing a broken final graph that would never match anything, or triggering the Assert added by the prior commit.

It's tempting to consider switching the merge direction when the merge involves the final state, but I lack the time to analyze that idea in detail. Instead just keep track of the flag changes that would result from proposed merges, in the same way that the prior commit tracked proposed parent links.

Along the way, add some more debugging support, because I'm not entirely confident that this is the last bug here. And tweak matters so that the transformed.dot file uses small integers rather than pointer values to identify states; that makes it more readable if you're just eyeballing it rather than fooling with Graphviz. And rename a couple of identically named struct fields to reduce confusion.

Per report from Corey Csuhta. Add a test case based on his example. (Note: this case does not trigger the bug under 9.3, apparently because its different measurement of costs causes it to stop merging states before it hits the failure. I spent some time trying to find a variant that would fail in 9.3, without success; but I'm sure such cases exist.)

Like the previous patch, back-patch to 9.3 where this code was added.

Report: https://postgr.es/m/E2B01A4B-4530-406B-8D17-2F67CF9A16BA@csuhta.com
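The ordering problem described in the commit message can be illustrated with a small standalone sketch. This is not the code from trgm_regexp.c: the `State` and `Arc` types, the `tentParent` and `tentFlags` field names, the `TSTATE_INIT` value, and the functions `can_remove_arcs` and `demo_merge_order` are all hypothetical and chosen only for illustration. The `track_flags` switch exists purely so that the old and new behaviour can be compared side by side.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Flag bits; the values and the TSTATE_INIT name are assumptions. */
#define TSTATE_INIT 0x01
#define TSTATE_FIN  0x02

/* Hypothetical, simplified state: real flags plus tentative bookkeeping. */
typedef struct State
{
    int           flags;      /* flags the state really has */
    struct State *tentParent; /* state it would be merged into, if any */
    int           tentFlags;  /* flags it would gain from proposed merges */
} State;

/* An arc whose removal would merge target into source. */
typedef struct Arc
{
    State *source;
    State *target;
} Arc;

/*
 * Decide whether all arcs of one trigram can be removed without ever
 * merging the initial and final states.  With track_flags = false the
 * check only follows tentative parent links; with track_flags = true it
 * also records the flags a merge root would acquire.
 */
static bool
can_remove_arcs(const Arc *arcs, size_t narcs, bool track_flags)
{
    const int both = TSTATE_INIT | TSTATE_FIN;
    bool      ok = true;

    assert(arcs != NULL || narcs == 0);

    for (size_t i = 0; i < narcs && ok; i++)
    {
        State *source = arcs[i].source;
        State *target = arcs[i].target;
        int    source_flags = source->flags | source->tentFlags;
        int    target_flags = target->flags | target->tentFlags;

        /* Follow the merges already proposed in this pass. */
        while (source->tentParent)
        {
            source = source->tentParent;
            source_flags |= source->flags | source->tentFlags;
        }
        while (target->tentParent)
        {
            target = target->tentParent;
            target_flags |= target->flags | target->tentFlags;
        }

        if (((source_flags | target_flags) & both) == both)
            ok = false;         /* would fuse initial and final states */
        else if (source != target)
        {
            target->tentParent = source;
            if (track_flags)
                source->tentFlags |= target_flags;
        }
    }

    /* Tentative data only ever lands on arc endpoints, so reset those. */
    for (size_t i = 0; i < narcs; i++)
    {
        arcs[i].source->tentParent = arcs[i].target->tentParent = NULL;
        arcs[i].source->tentFlags = arcs[i].target->tentFlags = 0;
    }
    return ok;
}

/* Three states and two arcs, considered in either order. */
bool
demo_merge_order(bool final_first, bool track_flags)
{
    State init = {TSTATE_INIT, NULL, 0};
    State fin = {TSTATE_FIN, NULL, 0};
    State other = {0, NULL, 0};
    Arc   to_final = {&other, &fin};   /* would merge fin into other */
    Arc   from_init = {&init, &other}; /* would merge other into init */
    Arc   arcs[2];

    arcs[0] = final_first ? to_final : from_init;
    arcs[1] = final_first ? from_init : to_final;
    return can_remove_arcs(arcs, 2, track_flags);
}
```

In this sketch, `demo_merge_order(true, false)` returns true, meaning the merge that fuses the initial and final states is wrongly accepted when the final-state arc is considered first and only parent links are tracked. With `track_flags` set to true, or with the initial-state arc considered first, the function returns false and the removal is rejected.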
1 parent 139eb96, commit 1dffabe

3 files changed, +136 -42 lines changed

‎contrib/pg_trgm/expected/pg_trgm.out

Lines changed: 17 additions & 2 deletions
@@ -3497,6 +3497,7 @@ create table test2(t text COLLATE "C");
 insert into test2 values ('abcdef');
 insert into test2 values ('quark');
 insert into test2 values (' z foo bar');
+insert into test2 values ('/123/-45/');
 create index test2_idx_gin on test2 using gin (t gin_trgm_ops);
 set enable_seqscan=off;
 explain (costs off)
@@ -3598,7 +3599,8 @@ select * from test2 where t ~ '(abc)*$';
 abcdef
 quark
 z foo bar
-(3 rows)
+/123/-45/
+(4 rows)

 select * from test2 where t ~* 'DEF';
 t
@@ -3690,6 +3692,12 @@ select * from test2 where t ~ 'qua(?!foo)';
 quark
 (1 row)

+select * from test2 where t ~ '/\d+/-\d';
+t
+-----------
+/123/-45/
+(1 row)
+
 drop index test2_idx_gin;
 create index test2_idx_gist on test2 using gist (t gist_trgm_ops);
 set enable_seqscan=off;
@@ -3784,7 +3792,8 @@ select * from test2 where t ~ '(abc)*$';
 abcdef
 quark
 z foo bar
-(3 rows)
+/123/-45/
+(4 rows)

 select * from test2 where t ~* 'DEF';
 t
@@ -3876,6 +3885,12 @@ select * from test2 where t ~ 'qua(?!foo)';
 quark
 (1 row)

+select * from test2 where t ~ '/\d+/-\d';
+t
+-----------
+/123/-45/
+(1 row)
+
 -- Check similarity threshold (bug #14202)
 CREATE TEMP TABLE restaurants (city text);
 INSERT INTO restaurants SELECT 'Warsaw' FROM generate_series(1, 10000);

‎contrib/pg_trgm/sql/pg_trgm.sql

Lines changed: 3 additions & 0 deletions
@@ -52,6 +52,7 @@ create table test2(t text COLLATE "C");
 insert into test2 values ('abcdef');
 insert into test2 values ('quark');
 insert into test2 values (' z foo bar');
+insert into test2 values ('/123/-45/');
 create index test2_idx_gin on test2 using gin (t gin_trgm_ops);
 set enable_seqscan=off;
 explain (costs off)
@@ -87,6 +88,7 @@ select * from test2 where t ~ ' z foo bar';
 select * from test2 where t ~ ' z foo bar';
 select * from test2 where t ~ ' z foo';
 select * from test2 where t ~ 'qua(?!foo)';
+select * from test2 where t ~ '/\d+/-\d';
 drop index test2_idx_gin;

 create index test2_idx_gist on test2 using gist (t gist_trgm_ops);
@@ -124,6 +126,7 @@ select * from test2 where t ~ ' z foo bar';
 select * from test2 where t ~ ' z foo bar';
 select * from test2 where t ~ ' z foo';
 select * from test2 where t ~ 'qua(?!foo)';
+select * from test2 where t ~ '/\d+/-\d';

 -- Check similarity threshold (bug #14202)
