Commit d8783c5

Per this discussion, here's a patch to implement both levenshtein() and
metaphone() in a contrib. There seem to be a fair number of different
approaches to both of these algorithms. I used the simplest case for
levenshtein, which has a cost of 1 for any character insertion, deletion, or
substitution. For metaphone, I adapted the same code from CPAN that the PHP
folks did.

A couple of questions:

1. Does it make sense to fold the soundex contrib together with this one?
2. I was debating trying to add multibyte support to levenshtein (it would
   make no sense at all for metaphone), but a quick search through the contrib
   directory found no hits on the word MULTIBYTE. Should I worry about adding
   multibyte support to levenshtein()?

Joe Conway
1 parent 0bc291e commit d8783c5
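
The commit message describes the simplest Levenshtein variant, where every insertion, deletion, or substitution costs 1. The following is a minimal sketch of that unit-cost recurrence in plain C, not the code from this patch: the names `levenshtein_sketch` and `min3` are hypothetical, it uses `malloc` rather than PostgreSQL's `palloc`, it keeps only two rows of the distance matrix, and it compares single bytes, so it has no multibyte support.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from the patch): smallest of three ints. */
static int min3(int a, int b, int c)
{
    int m = a < b ? a : b;
    return m < c ? m : c;
}

/*
 * Unit-cost Levenshtein distance between s and t, compared byte by byte.
 * Returns -1 if memory allocation fails.
 */
int levenshtein_sketch(const char *s, const char *t)
{
    size_t m = strlen(s);
    size_t n = strlen(t);
    int *prev, *curr, *tmp;
    size_t i, j;
    int result;

    prev = malloc((n + 1) * sizeof(int));
    curr = malloc((n + 1) * sizeof(int));
    if (prev == NULL || curr == NULL)
    {
        free(prev);
        free(curr);
        return -1;
    }

    /* Row 0: building t[0..j) from an empty prefix of s takes j insertions. */
    for (j = 0; j <= n; j++)
        prev[j] = (int) j;

    for (i = 1; i <= m; i++)
    {
        /* Column 0: reducing s[0..i) to an empty string takes i deletions. */
        curr[0] = (int) i;

        for (j = 1; j <= n; j++)
        {
            int cost = (s[i - 1] == t[j - 1]) ? 0 : 1;

            curr[j] = min3(prev[j] + 1,          /* deletion */
                           curr[j - 1] + 1,      /* insertion */
                           prev[j - 1] + cost);  /* substitution or match */
        }

        /* The row just filled becomes the previous row for the next pass. */
        tmp = prev;
        prev = curr;
        curr = tmp;
    }

    result = prev[n];
    free(prev);
    free(curr);
    return result;
}
```

Under this sketch, `levenshtein_sketch("GUMBO", "GAMBOL")` evaluates to 2: one substitution (U to A) plus one insertion (L). Keeping two rows instead of the full matrix is a design choice of the sketch that reduces memory to O(n); it says nothing about how the patch allocates its matrix.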

6 files changed, +963 -0 lines changed

contrib/README

Lines changed: 4 additions & 0 deletions
@@ -55,6 +55,10 @@ fulltextindex -
 Full text indexing using triggers
 by Maarten Boekhold <maartenb@dutepp0.et.tudelft.nl>
 
+fuzzystrmatch -
+Levenshtein and Metaphone fuzzy string matching
+by Joe Conway <joseph.conway@home.com>
+
 intarray -
 Index support for arrays of int4, using GiST
 by Teodor Sigaev <teodor@stack.net> and Oleg Bartunov

contrib/fuzzystrmatch/Makefile

Lines changed: 41 additions & 0 deletions
@@ -0,0 +1,41 @@
+subdir = contrib/fuzzystrmatch
+top_builddir = ../..
+include $(top_builddir)/src/Makefile.global
+
+# override libdir to install shlib in contrib not main directory
+libdir := $(libdir)/contrib
+
+# shared library parameters
+NAME= fuzzystrmatch
+SO_MAJOR_VERSION= 0
+SO_MINOR_VERSION= 1
+
+override CPPFLAGS := -I$(srcdir)/src/include $(CPPFLAGS)
+
+OBJS= fuzzystrmatch.o
+
+all: all-lib $(NAME).sql
+
+# Shared library stuff
+include $(top_srcdir)/src/Makefile.shlib
+
+
+$(NAME).sql: $(NAME).sql.in
+	sed -e 's:MODULE_PATHNAME:$(libdir)/$(shlib):g' < $< > $@
+
+install: all installdirs install-lib
+
+installdirs:
+	$(mkinstalldirs) $(DESTDIR)$(libdir)
+
+uninstall: uninstall-lib
+
+clean distclean maintainer-clean: clean-lib
+	rm -f $(OBJS) $(NAME).sql
+
+depend dep:
+	$(CC) -MM $(CFLAGS) *.c > depend
+
+ifeq (depend,$(wildcard depend))
+include depend
+endif
Lines changed: 121 additions & 0 deletions
@@ -0,0 +1,121 @@
+/*
+ * fuzzystrmatch.c
+ *
+ * Functions for "fuzzy" comparison of strings
+ *
+ * Copyright (c) Joseph Conway <joseph.conway@home.com>, 2001;
+ *
+ * levenshtein()
+ * -------------
+ * Written based on a description of the algorithm by Michael Gilleland
+ * found at http://www.merriampark.com/ld.htm
+ * Also looked at levenshtein.c in the PHP 4.0.6 distribution for
+ * inspiration.
+ *
+ * metaphone()
+ * -----------
+ * Modified for PostgreSQL by Joe Conway.
+ * Based on CPAN's "Text-Metaphone-1.96" by Michael G Schwern <schwern@pobox.com>
+ * Code slightly modified for use as PostgreSQL function (palloc, elog, etc).
+ * Metaphone was originally created by Lawrence Philips and presented in article
+ * in "Computer Language" December 1990 issue.
+ *
+ * Permission to use, copy, modify, and distribute this software and its
+ * documentation for any purpose, without fee, and without a written agreement
+ * is hereby granted, provided that the above copyright notice and this
+ * paragraph and the following two paragraphs appear in all copies.
+ *
+ * IN NO EVENT SHALL THE AUTHORS OR DISTRIBUTORS BE LIABLE TO ANY PARTY FOR
+ * DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES, INCLUDING
+ * LOST PROFITS, ARISING OUT OF THE USE OF THIS SOFTWARE AND ITS
+ * DOCUMENTATION, EVEN IF THE AUTHOR OR DISTRIBUTORS HAVE BEEN ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ * THE AUTHORS AND DISTRIBUTORS SPECIFICALLY DISCLAIM ANY WARRANTIES,
+ * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY
+ * AND FITNESS FOR A PARTICULAR PURPOSE. THE SOFTWARE PROVIDED HEREUNDER IS
+ * ON AN "AS IS" BASIS, AND THE AUTHOR AND DISTRIBUTORS HAS NO OBLIGATIONS TO
+ * PROVIDE MAINTENANCE, SUPPORT, UPDATES, ENHANCEMENTS, OR MODIFICATIONS.
+ *
+ */
+
+
+Version 0.1 (3 August, 2001):
+Functions to calculate the degree to which two strings match in a "fuzzy" way
+Tested under Linux (Red Hat 6.2 and 7.0) and PostgreSQL 7.2devel
+
+Release Notes:
+
+Version 0.1
+- initial release
+
+Installation:
+Place these files in a directory called 'fuzzystrmatch' under 'contrib' in the PostgreSQL source tree. Then run:
+
+make
+make install
+
+You can use fuzzystrmatch.sql to create the functions in your database of choice, e.g.
+
+psql -U postgres template1 < fuzzystrmatch.sql
+
+installs following functions into database template1:
+
+levenshtein() - calculates the levenshtein distance between two strings
+metaphone() - calculates the metaphone code of an input string
+
+Documentation
+==================================================================
+Name
+
+levenshtein -- calculates the levenshtein distance between two strings
+
+Synopsis
+
+levenshtein(text source, text target)
+
+Inputs
+
+source
+any text string, 255 characters max, NOT NULL
+
+target
+any text string, 255 characters max, NOT NULL
+
+Outputs
+
+Returns int
+
+Example usage
+
+select levenshtein('GUMBO','GAMBOL');
+
+==================================================================
+Name
+
+metaphone -- calculates the metaphone code of an input string
+
+Synopsis
+
+metaphone(text source, int max_output_length)
+
+Inputs
+
+source
+any text string, 255 characters max, NOT NULL
+
+max_output_length
+maximum length of the output metaphone code; if longer, the output
+is truncated to this length
+
+Outputs
+
+Returns text
+
+Example usage
+
+select metaphone('GUMBO',4);
+
+==================================================================
+-- Joe Conway
+
