Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Commit9eb78be

Browse files
author
Neil Conway
committed
Add three new regexp functions: regexp_matches, regexp_split_to_array,
and regexp_split_to_table. These functions provide access to thecapture groups resulting from a POSIX regular expression match,and provide the ability to split a string on a POSIX regularexpression, respectively. Patch from Jeremy Drake; code review byNeil Conway, additional comments and suggestions from Tom andPeter E.This patch bumps the catversion, adds some regression tests,and updates the docs.
1 parent5e96b04 commit9eb78be

File tree

7 files changed

+1001
-70
lines changed

7 files changed

+1001
-70
lines changed

‎doc/src/sgml/func.sgml

Lines changed: 176 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
<!-- $PostgreSQL: pgsql/doc/src/sgml/func.sgml,v 1.369 2007/02/2019:59:04 momjian Exp $ -->
1+
<!-- $PostgreSQL: pgsql/doc/src/sgml/func.sgml,v 1.370 2007/03/2005:44:59 neilc Exp $ -->
22

33
<chapter id="functions">
44
<title>Functions and Operators</title>
@@ -1468,17 +1468,52 @@
14681468
</row>
14691469

14701470
<row>
1471-
<entry><literal><function>regexp_replace</function>(<parameter>string</parameter> <type>text</type>, <parameter>pattern</parameter> <type>text</type>, <parameter>replacement</parameter> <type>text</type> [,<parameter>flags</parameter> <type>text</type>])</literal></entry>
1471+
<entry><literal><function>regexp_matches</function>(<parameter>string</parameter> <type>text</type>, <parameter>pattern</parameter> <type>text</type> [, <parameter>flags</parameter> <type>text</type>])</literal></entry>
1472+
<entry><type>setof text[]</type></entry>
1473+
<entry>
1474+
Return all capture groups resulting from matching POSIX regular
1475+
expression against the <parameter>string</parameter>. See
1476+
<xref linkend="functions-posix-regexp"> for more information.
1477+
</entry>
1478+
<entry><literal>regexp_matches('foobarbequebaz', '(bar)(beque)')</literal></entry>
1479+
<entry><literal>{bar,beque}</literal></entry>
1480+
</row>
1481+
1482+
<row>
1483+
<entry><literal><function>regexp_replace</function>(<parameter>string</parameter> <type>text</type>, <parameter>pattern</parameter> <type>text</type>, <parameter>replacement</parameter> <type>text</type> [, <parameter>flags</parameter> <type>text</type>])</literal></entry>
14721484
<entry><type>text</type></entry>
14731485
<entry>
14741486
Replace substring matching POSIX regular expression. See
1475-
<xref linkend="functions-matching"> for more information on pattern
1476-
matching.
1487+
<xref linkend="functions-posix-regexp"> for more information.
14771488
</entry>
14781489
<entry><literal>regexp_replace('Thomas', '.[mN]a.', 'M')</literal></entry>
14791490
<entry><literal>ThM</literal></entry>
14801491
</row>
14811492

1493+
<row>
1494+
<entry><literal><function>regexp_split_to_array</function>(<parameter>string</parameter> <type>text</type>, <parameter>pattern</parameter> <type>text</type> [, <parameter>flags</parameter> <type>text</type> ])</literal></entry>
1495+
<entry><type>text[]</type></entry>
1496+
<entry>
1497+
Split <parameter>string</parameter> using POSIX regular expression as
1498+
the delimiter. See <xref linkend="functions-posix-regexp"> for more
1499+
information.
1500+
</entry>
1501+
<entry><literal>regexp_split_to_array('hello world', E'\\s+')</literal></entry>
1502+
<entry><literal>{hello,world}</literal></entry>
1503+
</row>
1504+
1505+
<row>
1506+
<entry><literal><function>regexp_split_to_table</function>(<parameter>string</parameter> <type>text</type>, <parameter>pattern</parameter> <type>text</type> [, <parameter>flags</parameter> <type>text</type>])</literal></entry>
1507+
<entry><type>setof text</type></entry>
1508+
<entry>
1509+
Split <parameter>string</parameter> using POSIX regular expression as
1510+
the delimiter. See <xref linkend="functions-posix-regexp"> for more
1511+
information.
1512+
</entry>
1513+
<entry><literal>regexp_split_to_table('hello world', E'\\s+')</literal></entry>
1514+
<entry><literal>hello</literal><para><literal>world</literal></para> (2 rows)</entry>
1515+
</row>
1516+
14821517
<row>
14831518
<entry><literal><function>repeat</function>(<parameter>string</parameter> <type>text</type>, <parameter>number</parameter> <type>int</type>)</literal></entry>
14841519
<entry><type>text</type></entry>
@@ -2883,9 +2918,6 @@ cast(-44 as bit(12)) <lineannotation>111111010100</lineannotation>
28832918
<indexterm>
28842919
<primary>substring</primary>
28852920
</indexterm>
2886-
<indexterm>
2887-
<primary>regexp_replace</primary>
2888-
</indexterm>
28892921

28902922
<synopsis>
28912923
<replaceable>string</replaceable> SIMILAR TO <replaceable>pattern</replaceable> <optional>ESCAPE <replaceable>escape-character</replaceable></optional>
@@ -3004,6 +3036,21 @@ substring('foobar' from '#"o_b#"%' for '#') <lineannotation>NULL</lineannotat
30043036
<primary>regular expression</primary>
30053037
<seealso>pattern matching</seealso>
30063038
</indexterm>
3039+
<indexterm>
3040+
<primary>substring</primary>
3041+
</indexterm>
3042+
<indexterm>
3043+
<primary>regexp_replace</primary>
3044+
</indexterm>
3045+
<indexterm>
3046+
<primary>regexp_matches</primary>
3047+
</indexterm>
3048+
<indexterm>
3049+
<primary>regexp_split_to_table</primary>
3050+
</indexterm>
3051+
<indexterm>
3052+
<primary>regexp_split_to_array</primary>
3053+
</indexterm>
30073054

30083055
<para>
30093056
<xref linkend="functions-posix-table"> lists the available
@@ -3134,7 +3181,10 @@ substring('foobar' from 'o(.)b') <lineannotation>o</lineannotation>
31343181
string containing zero or more single-letter flags that change the
31353182
function's behavior. Flag <literal>i</> specifies case-insensitive
31363183
matching, while flag <literal>g</> specifies replacement of each matching
3137-
substring rather than only the first one.
3184+
substring rather than only the first one. Other supported flags are
3185+
<literal>m</>, <literal>n</>, <literal>p</>, <literal>w</> and
3186+
<literal>x</>, whose meanings correspond to those shown in
3187+
<xref linkend="posix-embedded-options-table">.
31383188
</para>
31393189

31403190
<para>
@@ -3149,6 +3199,124 @@ regexp_replace('foobarbaz', 'b(..)', E'X\\1Y', 'g')
31493199
</programlisting>
31503200
</para>
31513201

3202+
<para>
3203+
The <function>regexp_matches</> function returns all of the capture
3204+
groups resulting from matching a POSIX regular expression pattern.
3205+
It has the syntax
3206+
<function>regexp_matches</function>(<replaceable>string</>, <replaceable>pattern</>
3207+
<optional>, <replaceable>flags</> </optional>).
3208+
If there is no match to the <replaceable>pattern</>, the function returns no rows.
3209+
If there is a match, the function returns the contents of all of the capture groups
3210+
in a text array, or if there were no capture groups in the pattern, it returns the
3211+
contents of the entire match as a single-element text array.
3212+
The <replaceable>flags</> parameter is an optional text
3213+
string containing zero or more single-letter flags that change the
3214+
function's behavior. Flag <literal>i</> specifies case-insensitive
3215+
matching, while flag <literal>g</> causes the return of each matching
3216+
substring rather than only the first one. Other supported
3217+
flags are <literal>m</>, <literal>n</>, <literal>p</>, <literal>w</> and
3218+
<literal>x</>, whose meanings are described in
3219+
<xref linkend="posix-embedded-options-table">.
3220+
</para>
3221+
3222+
<para>
3223+
Some examples:
3224+
<programlisting>
3225+
SELECT regexp_matches('foobarbequebaz', '(bar)(beque)');
3226+
regexp_matches
3227+
----------------
3228+
{bar,beque}
3229+
(1 row)
3230+
3231+
SELECT regexp_matches('foobarbequebazilbarfbonk', '(b[^b]+)(b[^b]+)', 'g');
3232+
regexp_matches
3233+
----------------
3234+
{bar,beque}
3235+
{bazil,barf}
3236+
(2 rows)
3237+
3238+
SELECT regexp_matches('foobarbequebaz', 'barbeque');
3239+
regexp_matches
3240+
----------------
3241+
{barbeque}
3242+
(1 row)
3243+
</programlisting>
3244+
</para>
3245+
3246+
<para>
3247+
The <function>regexp_split_to_table</> function splits a string using a POSIX
3248+
regular expression pattern as a delimiter. It has the syntax
3249+
<function>regexp_split_to_table</function>(<replaceable>string</>, <replaceable>pattern</>
3250+
<optional>, <replaceable>flags</> </optional>).
3251+
If there is no match to the <replaceable>pattern</>, the function returns the
3252+
<replaceable>string</>. If there is at least one match, for each match it returns
3253+
the text from the end of the last match (or the beginning of the string)
3254+
to the beginning of the match. When there are no more matches, it
3255+
returns the text from the end of the last match to the end of the string.
3256+
The <replaceable>flags</> parameter is an optional text string containing
3257+
zero or more single-letter flags that change the function's behavior.
3258+
<function>regexp_split_to_table</function> supports the flags <literal>i</>,
3259+
<literal>m</>, <literal>n</>, <literal>p</>, <literal>w</> and
3260+
<literal>x</>, whose meanings are described in
3261+
<xref linkend="posix-embedded-options-table">.
3262+
</para>
3263+
3264+
<para>
3265+
The <function>regexp_split_to_array</> function behaves the same as
3266+
<function>regexp_split_to_table</>, except that <function>regexp_split_to_array</>
3267+
returns its results as a <type>text[]</>. It has the syntax
3268+
<function>regexp_split_to_array</function>(<replaceable>string</>, <replaceable>pattern</>
3269+
<optional>, <replaceable>flags</> </optional>).
3270+
The parameters are the same as for <function>regexp_split_to_table</>.
3271+
</para>
3272+
3273+
<para>
3274+
Some examples:
3275+
<programlisting>
3276+
3277+
SELECT foo FROM regexp_split_to_table('the quick brown fox jumped over the lazy dog', E'\\\s+') AS foo;
3278+
foo
3279+
--------
3280+
the
3281+
quick
3282+
brown
3283+
fox
3284+
jumped
3285+
over
3286+
the
3287+
lazy
3288+
dog
3289+
(9 rows)
3290+
3291+
SELECT regexp_split_to_array('the quick brown fox jumped over the lazy dog', E'\\s+');
3292+
regexp_split_to_array
3293+
------------------------------------------------
3294+
{the,quick,brown,fox,jumped,over,the,lazy,dog}
3295+
(1 row)
3296+
3297+
SELECT foo FROM regexp_split_to_table('the quick brown fox', E'\\s*') AS foo;
3298+
foo
3299+
-----
3300+
t
3301+
h
3302+
e
3303+
q
3304+
u
3305+
i
3306+
c
3307+
k
3308+
b
3309+
r
3310+
o
3311+
w
3312+
n
3313+
f
3314+
o
3315+
x
3316+
(16 rows)
3317+
</programlisting>
3318+
</para>
3319+
31523320
<para>
31533321
<productname>PostgreSQL</productname>'s regular expressions are implemented
31543322
using a package written by Henry Spencer. Much of

0 commit comments

Comments
 (0)

[8]ページ先頭

©2009-2025 Movatter.jp