Commit 220f2a7

Code review for regexp_replace patch. Improve documentation and comments,
fix problems with replacement-string backslashes that aren't followed by
one of the expected characters, avoid giving the impression that
replace_text_regexp() is meant to be called directly as a SQL function,
etc.
1 parent 800af89 commit 220f2a7


4 files changed

+146 -112 lines changed


doc/src/sgml/func.sgml

Lines changed: 46 additions & 27 deletions
@@ -1,5 +1,5 @@
 <!--
-$PostgreSQL: pgsql/doc/src/sgml/func.sgml,v 1.287 2005/10/02 23:50:06 tgl Exp $
+$PostgreSQL: pgsql/doc/src/sgml/func.sgml,v 1.288 2005/10/18 20:38:57 tgl Exp $
 PostgreSQL documentation
 -->

@@ -1193,9 +1193,6 @@ PostgreSQL documentation
 <indexterm>
 <primary>quote_literal</primary>
 </indexterm>
-<indexterm>
-<primary>regexp_replace</primary>
-</indexterm>
 <indexterm>
 <primary>repeat</primary>
 </indexterm>
@@ -1419,26 +1416,6 @@ PostgreSQL documentation
 <entry><literal>'O''Reilly'</literal></entry>
 </row>
 
-<row>
-<entry><literal><function>regexp_replace</function>(<parameter>source</parameter> <type>text</type>,
-<parameter>pattern</parameter> <type>text</type>,
-<parameter>replacement</parameter> <type>text</type>
-<optional>, <parameter>flags</parameter> <type>text</type></optional>)</literal></entry>
-<entry><type>text</type></entry>
-<entry>Replace string that matches the regular expression
-<parameter>pattern</parameter> in <parameter>source</parameter> to
-<parameter>replacement</parameter>.
-<parameter>replacement</parameter> can use <literal>\1</>-<literal>\9</> and <literal>\&amp;</>.
-<literal>\1</>-<literal>\9</> is a back reference to the n'th subexpression, and
-<literal>\&amp;</> is the entire matched string.
-<parameter>flags</parameter> can use <literal>g</>(global) and <literal>i</>(ignore case).
-When flags is not specified, case sensitive matching is used, and it replaces
-only the instance.
-</entry>
-<entry><literal>regexp_replace('1112223333', '(\\d{3})(\\d{3})(\\d{4})', '(\\1) \\2-\\3')</literal></entry>
-<entry><literal>(111) 222-3333</literal></entry>
-</row>
-
 <row>
 <entry><literal><function>repeat</function>(<parameter>string</parameter> <type>text</type>, <parameter>number</parameter> <type>int</type>)</literal></entry>
 <entry><type>text</type></entry>
@@ -2821,10 +2798,12 @@ cast(-44 as bit(12)) <lineannotation>111111010100</lineannotation>
 <indexterm>
 <primary>SIMILAR TO</primary>
 </indexterm>
-
 <indexterm>
 <primary>substring</primary>
 </indexterm>
+<indexterm>
+<primary>regexp_replace</primary>
+</indexterm>
 
 <synopsis>
 <replaceable>string</replaceable> SIMILAR TO <replaceable>pattern</replaceable> <optional>ESCAPE <replaceable>escape-character</replaceable></optional>
@@ -3002,7 +2981,7 @@ substring('foobar' from '#"o_b#"%' for '#') <lineannotation>NULL</lineannotat
 <para>
 A regular expression is a character sequence that is an
 abbreviated definition of a set of strings (a <firstterm>regular
-set</firstterm>). A string is said to match a regular expression
+set</firstterm>). A string is said to match a regular expression
 if it is a member of the regular set described by the regular
 expression. As with <function>LIKE</function>, pattern characters
 match string characters exactly unless they are special characters
@@ -3027,7 +3006,8 @@ substring('foobar' from '#"o_b#"%' for '#') <lineannotation>NULL</lineannotat
 <para>
 The <function>substring</> function with two parameters,
 <function>substring(<replaceable>string</replaceable> from
-<replaceable>pattern</replaceable>)</function>, provides extraction of a substring
+<replaceable>pattern</replaceable>)</function>, provides extraction of a
+substring
 that matches a POSIX regular expression pattern. It returns null if
 there is no match, otherwise the portion of the text that matched the
 pattern. But if the pattern contains any parentheses, the portion
@@ -3048,6 +3028,45 @@ substring('foobar' from 'o(.)b') <lineannotation>o</lineannotation>
 </programlisting>
 </para>
 
+<para>
+The <function>regexp_replace</> function provides substitution of
+new text for substrings that match POSIX regular expression patterns.
+It has the syntax
+<function>regexp_replace</function>(<replaceable>source</>,
+<replaceable>pattern</>, <replaceable>replacement</>
+<optional>, <replaceable>flags</> </optional>).
+The <replaceable>source</> string is returned unchanged if
+there is no match to the <replaceable>pattern</>. If there is a
+match, the <replaceable>source</> string is returned with the
+<replaceable>replacement</> string substituted for the matching
+substring. The <replaceable>replacement</> string can contain
+<literal>\</><replaceable>n</>, where <replaceable>n</> is <literal>1</>
+through <literal>9</>, to indicate that the source substring matching the
+<replaceable>n</>'th parenthesized subexpression of the pattern should be
+inserted, and it can contain <literal>\&amp;</> to indicate that the
+substring matching the entire pattern should be inserted. Write
+<literal>\\</> if you need to put a literal backslash in the replacement
+text. (As always, remember to double backslashes written in literal
+constant strings.)
+The <replaceable>flags</> parameter is an optional text
+string containing zero or more single-letter flags that change the
+function's behavior. Flag <literal>i</> specifies case-insensitive
+matching, while flag <literal>g</> specifies replacement of each matching
+substring rather than only the first one.
+</para>
+
+<para>
+Some examples:
+<programlisting>
+regexp_replace('foobarbaz', 'b..', 'X')
+<lineannotation>fooXbaz</lineannotation>
+regexp_replace('foobarbaz', 'b..', 'X', 'g')
+<lineannotation>fooXX</lineannotation>
+regexp_replace('foobarbaz', 'b(..)', 'X\\1Y', 'g')
+<lineannotation>fooXarYXazY</lineannotation>
+</programlisting>
+</para>
+
 <para>
 <productname>PostgreSQL</productname>'s regular expressions are implemented
 using a package written by Henry Spencer. Much of
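
Note on the replacement-string rules documented in the hunk above: they can be illustrated with a minimal standalone C sketch. This is not the code from this commit (the routine that performs the substitution is in a file not shown in this excerpt), and the names span_t and expand_replacement are hypothetical. One behavior is an assumption, not something the documentation text states: a backslash that is not followed by 1-9, & or another backslash is copied through unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical match descriptor: [start, end) offsets into the source
 * string; start < 0 means the subexpression did not participate. */
typedef struct
{
    int start;
    int end;
} span_t;

/*
 * Expand the replacement string `repl` into `out`.
 * groups[0] is the whole match, groups[1..9] the parenthesized
 * subexpressions. Returns the number of bytes written (excluding the
 * terminating NUL), or -1 if `out` is too small.
 */
int
expand_replacement(const char *src, const span_t groups[10],
                   const char *repl, char *out, size_t outcap)
{
    size_t n = 0;

    assert(src != NULL && groups != NULL && repl != NULL && out != NULL);

    for (const char *p = repl; *p != '\0'; p++)
    {
        const char *chunk = p;      /* default: copy this one character */
        size_t      chunklen = 1;

        if (*p == '\\' && p[1] != '\0')
        {
            char c = p[1];

            if ((c >= '1' && c <= '9') || c == '&')
            {
                /* \n inserts subexpression n, \& inserts the whole match */
                const span_t *g = &groups[c == '&' ? 0 : c - '0'];

                chunk = src + (g->start < 0 ? 0 : g->start);
                chunklen = g->start < 0 ? 0 : (size_t) (g->end - g->start);
                p++;
            }
            else if (c == '\\')
            {
                /* "\\" yields a single literal backslash */
                p++;
            }
            /* ASSUMPTION: any other character leaves the backslash as
             * ordinary data; the next iteration copies that character. */
        }

        if (n + chunklen >= outcap)
            return -1;              /* no room for data plus NUL */
        memcpy(out + n, chunk, chunklen);
        n += chunklen;
    }
    out[n] = '\0';
    return (int) n;
}
```

For the first match of 'b(..)' in 'foobarbaz' (whole match at offsets 3..6, subexpression 1 at 4..6), the replacement X\1Y expands to XarY, which agrees with the first substitution in the documented example.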

src/backend/utils/adt/regexp.c

Lines changed: 19 additions & 29 deletions
@@ -8,7 +8,7 @@
  *
  *
  * IDENTIFICATION
- *    $PostgreSQL: pgsql/src/backend/utils/adt/regexp.c,v 1.59 2005/10/15 02:49:29 momjian Exp $
+ *    $PostgreSQL: pgsql/src/backend/utils/adt/regexp.c,v 1.60 2005/10/18 20:38:58 tgl Exp $
  *
  *      Alistair Crooks added the code for the regex caching
  *      agc - cached the regular expressions used - there's a good chance
@@ -83,15 +83,15 @@ static cached_re_str re_array[MAX_CACHED_RES];/* cached re's */
 /*
  * RE_compile_and_cache - compile a RE, caching if possible
  *
- * Returns regex_t
+ * Returns regex_t *
  *
  *  text_re --- the pattern, expressed as an *untoasted* TEXT object
  *  cflags --- compile options for the pattern
  *
  * Pattern is given in the database encoding. We internally convert to
  * array of pg_wchar which is what Spencer's regex package wants.
  */
-static regex_t
+static regex_t *
 RE_compile_and_cache(text *text_re, int cflags)
 {
     int         text_re_len = VARSIZE(text_re);
@@ -123,7 +123,7 @@ RE_compile_and_cache(text *text_re, int cflags)
                 re_array[0] = re_temp;
             }
 
-            return re_array[0].cre_re;
+            return &re_array[0].cre_re;
         }
     }
129129

@@ -188,7 +188,7 @@ RE_compile_and_cache(text *text_re, int cflags)
188188
re_array[0]=re_temp;
189189
num_res++;
190190

191-
returnre_array[0].cre_re;
191+
return&re_array[0].cre_re;
192192
}
193193

194194
/*
@@ -212,7 +212,7 @@ RE_compile_and_execute(text *text_re, char *dat, int dat_len,
     pg_wchar   *data;
     size_t      data_len;
     int         regexec_result;
-    regex_t     re;
+    regex_t    *re;
     char        errMsg[100];
 
     /* Convert data string to wide characters */
@@ -223,7 +223,7 @@ RE_compile_and_execute(text *text_re, char *dat, int dat_len,
     re = RE_compile_and_cache(text_re, cflags);
 
     /* Perform RE match and return result */
-    regexec_result = pg_regexec(&re_array[0].cre_re,
+    regexec_result = pg_regexec(re,
                                 data,
                                 data_len,
                                 0,
@@ -237,8 +237,7 @@ RE_compile_and_execute(text *text_re, char *dat, int dat_len,
     if (regexec_result != REG_OKAY && regexec_result != REG_NOMATCH)
     {
         /* re failed??? */
-        pg_regerror(regexec_result, &re_array[0].cre_re,
-                    errMsg, sizeof(errMsg));
+        pg_regerror(regexec_result, re, errMsg, sizeof(errMsg));
         ereport(ERROR,
                 (errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
                  errmsg("regular expression failed: %s", errMsg)));
@@ -442,31 +441,27 @@ textregexsubstr(PG_FUNCTION_ARGS)
 
 /*
  * textregexreplace_noopt()
- *    Return a replace string matched by a regular expression.
- *    This function is a version that doesn't specify the option of
- *    textregexreplace. This is case sensitive, replace the first
- *    instance only.
+ *    Return a string matched by a regular expression, with replacement.
+ *
+ * This version doesn't have an option argument: we default to case
+ * sensitive match, replace the first instance only.
  */
 Datum
 textregexreplace_noopt(PG_FUNCTION_ARGS)
 {
     text       *s = PG_GETARG_TEXT_P(0);
     text       *p = PG_GETARG_TEXT_P(1);
     text       *r = PG_GETARG_TEXT_P(2);
-    regex_t     re;
+    regex_t    *re;
 
     re = RE_compile_and_cache(p, regex_flavor);
 
-    return DirectFunctionCall4(replace_text_regexp,
-                               PointerGetDatum(s),
-                               PointerGetDatum(&re),
-                               PointerGetDatum(r),
-                               BoolGetDatum(false));
+    PG_RETURN_TEXT_P(replace_text_regexp(s, (void *) re, r, false));
 }
 
 /*
  * textregexreplace()
- *    Return a replace string matched by a regular expression.
+ *    Return a string matched by a regular expression, with replacement.
  */
 Datum
 textregexreplace(PG_FUNCTION_ARGS)
@@ -478,9 +473,9 @@ textregexreplace(PG_FUNCTION_ARGS)
     char       *opt_p = VARDATA(opt);
     int         opt_len = (VARSIZE(opt) - VARHDRSZ);
     int         i;
-    bool        global = false;
+    bool        glob = false;
     bool        ignorecase = false;
-    regex_t     re;
+    regex_t    *re;
 
     /* parse options */
     for (i = 0; i < opt_len; i++)
@@ -491,8 +486,7 @@ textregexreplace(PG_FUNCTION_ARGS)
                 ignorecase = true;
                 break;
             case 'g':
-                global = true;
-
+                glob = true;
                 break;
             default:
                 ereport(ERROR,
@@ -508,11 +502,7 @@ textregexreplace(PG_FUNCTION_ARGS)
     else
         re = RE_compile_and_cache(p, regex_flavor);
 
-    return DirectFunctionCall4(replace_text_regexp,
-                               PointerGetDatum(s),
-                               PointerGetDatum(&re),
-                               PointerGetDatum(r),
-                               BoolGetDatum(global));
+    PG_RETURN_TEXT_P(replace_text_regexp(s, (void *) re, r, glob));
 }
 
 /* similar_escape()
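
Note on the option parsing in textregexreplace(): the loop over the flags string (i for case-insensitive, g for global, anything else an error) can be sketched in isolation. The version below is a minimal standalone illustration, not the backend code. The helper name parse_replace_flags is hypothetical, and it reports the first invalid flag by index where the backend raises an error through ereport(ERROR, ...).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Parse a regexp_replace-style flags string of length opt_len.
 * Sets *glob for 'g' and *ignorecase for 'i'.
 * Returns -1 if every flag is valid, otherwise the index of the first
 * invalid flag (hypothetical convention, chosen for this sketch).
 */
int
parse_replace_flags(const char *opt, size_t opt_len,
                    bool *glob, bool *ignorecase)
{
    assert(opt != NULL && glob != NULL && ignorecase != NULL);

    /* Defaults match the documented behavior when no flags are given:
     * case-sensitive matching, replace the first instance only. */
    *glob = false;
    *ignorecase = false;

    for (size_t i = 0; i < opt_len; i++)
    {
        switch (opt[i])
        {
            case 'i':
                *ignorecase = true;
                break;
            case 'g':
                *glob = true;
                break;
            default:
                return (int) i;     /* unknown flag: caller reports it */
        }
    }
    return -1;
}
```

Calling parse_replace_flags("gi", 2, &glob, &ignorecase) sets both booleans and returns -1; an empty string leaves both false, which corresponds to what textregexreplace_noopt() passes along.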
