Commit 938236a

The fti.pl supplied with the fulltextindex module generates ALL possible
substrings of two characters or greater, and is case-sensitive.

This patch makes it work correctly. It generates only the suffixes of each
word, plus lowercases them - as specified by the README file.

This brings it into line with the fti.c function, makes it case-insensitive
properly, removes the problem with duplicate rows being returned from an fti
search and greatly reduces the size of the generated index table.

It was written by my co-worker, Brett Toolin.

Christopher Kings-Lynne
1 parent 8c6761a, commit 938236a
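To make the commit message concrete, here is a minimal sketch of the behaviour it describes: lowercase the input, split it into words, and emit only the suffixes of each word. The helper `word_suffixes` and its parameter names are hypothetical and not part of fti.pl; the two length limits mirror the `$MIN_WORD_LENGTH` and `$MIN_SUBSTRING_LENGTH` settings the patch introduces. Because the patch compares with `<=`, a word of exactly two characters is skipped when the word limit is 2.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-alone helper (not part of fti.pl): returns the
# lowercased suffixes of every word in $text.
sub word_suffixes {
    my ($text, $min_word_len, $min_suffix_len) = @_;
    my @suffixes;
    foreach my $word (split /\W+/, lc $text) {
        my $len = length $word;
        # Same comparison as the patch: words of $min_word_len
        # characters or fewer are skipped.
        next if $len <= $min_word_len;
        # Emit every suffix that is at least $min_suffix_len long.
        for (my $i = 0; $i <= $len - $min_suffix_len; $i++) {
            push @suffixes, substr($word, $i);
        }
    }
    return @suffixes;
}

print join(",", word_suffixes("Full Text", 2, 2)), "\n";
# prints: full,ull,ll,text,ext,xt
```

Each word of length n contributes at most n - 1 entries here, which is where the smaller index table comes from.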

contrib/fulltextindex/fti.pl

Lines changed: 13 additions & 12 deletions
@@ -1,6 +1,6 @@
 #!/usr/bin/perl
 #
-# This script substracts all substrings out of a specific column in a table
+# This script substracts all suffixes of all words in a specific column in a table
 # and generates output that can be loaded into a new table with the
 # psql '\copy' command. The new table should have the following structure:
 #
@@ -52,27 +52,28 @@
 $PGRES_NONFATAL_ERROR = 6 ;
 $PGRES_FATAL_ERROR = 7 ;
 
+# the minimum length of word to include in the full text index
+$MIN_WORD_LENGTH = 2;
+
+# the minimum length of the substrings in the full text index
+$MIN_SUBSTRING_LENGTH = 2;
+
 $[ = 0; # make sure string offsets start at 0
 
 sub break_up {
     my $string = pop @_;
 
+    # convert strings to lower case
+    $string = lc($string);
     @strings = split(/\W+/, $string);
     @subs = ();
 
     foreach $s (@strings) {
         $len = length($s);
-        next if ($len < 4);
-
-        $lpos = $len-1;
-        while ($lpos >= 3) {
-            $fpos = $lpos - 3;
-            while ($fpos >= 0) {
-                $sub = substr($s, $fpos, $lpos - $fpos + 1);
-                push(@subs, $sub);
-                $fpos = $fpos - 1;
-            }
-            $lpos = $lpos - 1;
+        next if ($len <= $MIN_WORD_LENGTH);
+        for ($i = 0; $i <= $len - $MIN_SUBSTRING_LENGTH; $i++) {
+            $tmp = substr($s, $i);
+            push(@subs, $tmp);
         }
     }
 
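The effect of this hunk on index size can be seen by replaying both loop shapes on a single word. The sketch below is an illustration only: `old_substrings` and `new_suffixes` are hypothetical names, the removed loops are copied as they appear in the diff (they emit substrings of four or more characters), and the word-length check from the patch is left out because the sample word is long enough to pass it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Replays the removed nested loops: every substring of $s that is at
# least four characters long.
sub old_substrings {
    my ($s) = @_;
    my @subs;
    my $len = length $s;
    return @subs if $len < 4;
    for (my $lpos = $len - 1; $lpos >= 3; $lpos--) {
        for (my $fpos = $lpos - 3; $fpos >= 0; $fpos--) {
            push @subs, substr($s, $fpos, $lpos - $fpos + 1);
        }
    }
    return @subs;
}

# Replays the added loop: every suffix of $s that is at least $min
# characters long (the $MIN_WORD_LENGTH check is omitted here).
sub new_suffixes {
    my ($s, $min) = @_;
    my @subs;
    for (my $i = 0; $i <= length($s) - $min; $i++) {
        push @subs, substr($s, $i);
    }
    return @subs;
}

my $word = "postgres";
my @old  = old_substrings($word);
my @new  = new_suffixes($word, 2);
printf "%d old rows, %d new rows\n", scalar(@old), scalar(@new);
# prints: 15 old rows, 7 new rows
```

The old row count grows quadratically with word length, while the new count grows linearly, so the gap widens for longer words.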
