edited body

edited Aug 4, 2012 at 1:54

Bill the Lizard

Example verbose syntax (from Dive into Python):

alternative readable regex method

edited Jun 27, 2009 at 22:26

Roger Pate

>>> import re
>>> pattern = """
... ^                   # beginning of string
... M{0,4}              # thousands - 0 to 4 M's
... (CM|CD|D?C{0,3})    # hundreds - 900 (CM), 400 (CD), 0-300 (0 to 3 C's),
...                     #            or 500-800 (D, followed by 0 to 3 C's)
... (XC|XL|L?X{0,3})    # tens - 90 (XC), 40 (XL), 0-30 (0 to 3 X's),
...                     #        or 50-80 (L, followed by 0 to 3 X's)
... (IX|IV|V?I{0,3})    # ones - 9 (IX), 4 (IV), 0-3 (0 to 3 I's),
...                     #        or 5-8 (V, followed by 0 to 3 I's)
... $                   # end of string
... """
>>> re.search(pattern, 'M', re.VERBOSE)

>>> p = re.compile(r'(?P<word>\b\w+\b)')
>>> m = p.search('(((( Lots of punctuation )))')
>>> m.group('word')
'Lots'

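You can also write a regex verbosely without using re.VERBOSE, thanks to string literal concatenation: adjacent string literals are joined into a single string, so each piece can carry its own comment.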

>>> pattern = (
...     "^"                 # beginning of string
...     "M{0,4}"            # thousands - 0 to 4 M's
...     "(CM|CD|D?C{0,3})"  # hundreds - 900 (CM), 400 (CD), 0-300 (0 to 3 C's),
...                         #            or 500-800 (D, followed by 0 to 3 C's)
...     "(XC|XL|L?X{0,3})"  # tens - 90 (XC), 40 (XL), 0-30 (0 to 3 X's),
...                         #        or 50-80 (L, followed by 0 to 3 X's)
...     "(IX|IV|V?I{0,3})"  # ones - 9 (IX), 4 (IV), 0-3 (0 to 3 I's),
...                         #        or 5-8 (V, followed by 0 to 3 I's)
...     "$"                 # end of string
... )
>>> print(pattern)
^M{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})$
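
As a quick check of the concatenation approach, here is a minimal Python 3 sketch that builds the same pattern from adjacent string literals and compiles it with no flags. The names `ROMAN` and `is_roman` are made up for illustration, and the raw-string prefixes are only a precaution, since this particular pattern contains no backslashes.

```python
import re

# Adjacent string literals are joined into one string, so the comments
# never become part of the pattern and re.VERBOSE is not needed.
ROMAN = re.compile(
    r"^"                 # beginning of string
    r"M{0,4}"            # thousands
    r"(CM|CD|D?C{0,3})"  # hundreds
    r"(XC|XL|L?X{0,3})"  # tens
    r"(IX|IV|V?I{0,3})"  # ones
    r"$"                 # end of string
)


def is_roman(text):
    """Return True if text matches the pattern (hypothetical helper)."""
    return ROMAN.search(text) is not None


print(ROMAN.pattern)
print(is_roman("MCMXC"), is_roman("IIII"))
```

One trade-off to keep in mind: whitespace inside the literals stays significant here, whereas re.VERBOSE ignores unescaped whitespace in the pattern.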

Post Made Community Wiki by Community Bot

occurred Sep 21, 2008 at 15:00

answered Sep 19, 2008 at 12:44

MvdD

Readable regular expressions

In Python you can split a regular expression over multiple lines, name your matches and insert comments.

Example verbose syntax (from Dive into Python):

>>> import re
>>> pattern = """
    ^                   # beginning of string
    M{0,4}              # thousands - 0 to 4 M's
    (CM|CD|D?C{0,3})    # hundreds - 900 (CM), 400 (CD), 0-300 (0 to 3 C's),
                        #            or 500-800 (D, followed by 0 to 3 C's)
    (XC|XL|L?X{0,3})    # tens - 90 (XC), 40 (XL), 0-30 (0 to 3 X's),
                        #        or 50-80 (L, followed by 0 to 3 X's)
    (IX|IV|V?I{0,3})    # ones - 9 (IX), 4 (IV), 0-3 (0 to 3 I's),
                        #        or 5-8 (V, followed by 0 to 3 I's)
    $                   # end of string
    """
>>> re.search(pattern, 'M', re.VERBOSE)
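
To see what the flag actually changes, the following Python 3 sketch compiles a shortened copy of the verbose pattern once with re.VERBOSE and then searches with the same text without it. The name `ROMAN_VERBOSE` and the sample input are assumptions chosen for illustration.

```python
import re

ROMAN_VERBOSE = re.compile(
    r"""
    ^                   # beginning of string
    M{0,4}              # thousands - 0 to 4 M's
    (CM|CD|D?C{0,3})    # hundreds
    (XC|XL|L?X{0,3})    # tens
    (IX|IV|V?I{0,3})    # ones
    $                   # end of string
    """,
    re.VERBOSE,
)

# With the flag, whitespace and comments are ignored by the regex engine.
match = ROMAN_VERBOSE.search("MCMLXXXIX")
print(match.groups())

# Without the flag, the whitespace and comments are taken literally,
# so the same pattern text no longer matches this input.
print(re.search(ROMAN_VERBOSE.pattern, "MCMLXXXIX"))
```

Forgetting the flag is silent: the pattern still compiles, it just stops matching, which is worth remembering when a verbose pattern is passed around as a plain string.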

Example naming matches (from Regular Expression HOWTO):

>>> p = re.compile(r'(?P<word>\b\w+\b)')
>>> m = p.search('(((( Lots of punctuation )))')
>>> m.group('word')
'Lots'
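
Named groups and verbose syntax combine naturally. The sketch below (Python 3) uses a hypothetical date pattern, `DATE`, and a made-up input string that are not part of the original answer. It shows several named groups read back individually and as a dictionary.

```python
import re

# Hypothetical pattern with three named groups, written with re.VERBOSE.
DATE = re.compile(
    r"""
    (?P<year>\d{4})  -   # four-digit year
    (?P<month>\d{2}) -   # two-digit month
    (?P<day>\d{2})       # two-digit day
    """,
    re.VERBOSE,
)

m = DATE.search("released on 2008-09-19")
print(m.group("year"))   # look up a single group by name
print(m.groupdict())     # all named groups as a dict
```

Looking groups up by name keeps the calling code stable if another group is later inserted into the pattern, which would shift the numeric indices.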
