forked frompostgres/postgres
- Notifications
You must be signed in to change notification settings - Fork6
Commitea1268f
committed
Avoid generating extra subre tree nodes for capturing parentheses.
Previously, each pair of capturing parentheses gave rise to a separatesubre tree node, whose only function was to identify that we ought tocapture the match details for this particular sub-expression. Inmost cases we don't really need that, since we can perfectly wellput a "capture this" annotation on the child node that does the realmatching work. As with the two preceding commits, the main valueof this is to avoid generating and optimizing an NFA for a tree nodethat's not really pulling its weight.The chosen data representation only allows one capture annotationper subre node. In the legal-per-spec, but seemingly not very useful,case where there are multiple capturing parens around the exact samebit of the regex (i.e. "((xyz))"), wrap the child node in N-1 capturenodes that act the same as before. We could work harder at that butI'll refrain, pending some evidence that such cases are worth troublingover.In passing, improve the comments in regex.h to say what all thedifferent re_info bits mean. Some of them were pretty obviousbut others not so much, so reverse-engineer some documentation.This is part of a patch series that in total reduces the regex engine'sruntime by about a factor of four on a large corpus of real-world regexes.Patch by me, reviewed by Joel JacobsonDiscussion:https://postgr.es/m/1340281.1613018383@sss.pgh.pa.us1 parent5810430 commitea1268f
6 files changed
+91
-51
lines changedOriginal file line number | Diff line number | Diff line change | |
---|---|---|---|
| |||
130 | 130 |
| |
131 | 131 |
| |
132 | 132 |
| |
133 |
| - | |
134 |
| - | |
| 133 | + | |
| 134 | + | |
135 | 135 |
| |
136 | 136 |
| |
137 | 137 |
| |
| |||
147 | 147 |
| |
148 | 148 |
| |
149 | 149 |
| |
| 150 | + | |
| 151 | + | |
| 152 | + | |
| 153 | + | |
| 154 | + | |
| 155 | + | |
| 156 | + | |
| 157 | + | |
| 158 | + | |
| 159 | + | |
| 160 | + | |
150 | 161 |
| |
151 | 162 |
| |
152 | 163 |
| |
|
Original file line number | Diff line number | Diff line change | |
---|---|---|---|
| |||
452 | 452 |
| |
453 | 453 |
| |
454 | 454 |
| |
455 |
| - | |
| 455 | + | |
456 | 456 |
| |
457 | 457 |
| |
458 | 458 |
| |
| |||
944 | 944 |
| |
945 | 945 |
| |
946 | 946 |
| |
947 |
| - | |
| 947 | + | |
| 948 | + | |
| 949 | + | |
| 950 | + | |
| 951 | + | |
| 952 | + | |
| 953 | + | |
948 | 954 |
| |
949 | 955 |
| |
950 | 956 |
| |
| |||
959 | 965 |
| |
960 | 966 |
| |
961 | 967 |
| |
962 |
| - | |
963 |
| - | |
964 |
| - | |
965 |
| - | |
966 |
| - | |
| 968 | + | |
| 969 | + | |
| 970 | + | |
| 971 | + | |
| 972 | + | |
| 973 | + | |
| 974 | + | |
| 975 | + | |
| 976 | + | |
| 977 | + | |
| 978 | + | |
| 979 | + | |
| 980 | + | |
| 981 | + | |
| 982 | + | |
967 | 983 |
| |
968 | 984 |
| |
969 | 985 |
| |
| |||
976 | 992 |
| |
977 | 993 |
| |
978 | 994 |
| |
979 |
| - | |
| 995 | + | |
980 | 996 |
| |
981 | 997 |
| |
982 | 998 |
| |
| |||
1276 | 1292 |
| |
1277 | 1293 |
| |
1278 | 1294 |
| |
| 1295 | + | |
1279 | 1296 |
| |
1280 |
| - | |
| 1297 | + | |
| 1298 | + | |
1281 | 1299 |
| |
1282 | 1300 |
| |
1283 | 1301 |
| |
| |||
1790 | 1808 |
| |
1791 | 1809 |
| |
1792 | 1810 |
| |
| 1811 | + | |
1793 | 1812 |
| |
1794 |
| - | |
| 1813 | + | |
| 1814 | + | |
1795 | 1815 |
| |
1796 | 1816 |
| |
1797 | 1817 |
| |
| |||
1893 | 1913 |
| |
1894 | 1914 |
| |
1895 | 1915 |
| |
1896 |
| - | |
| 1916 | + | |
1897 | 1917 |
| |
1898 | 1918 |
| |
1899 | 1919 |
| |
| |||
2040 | 2060 |
| |
2041 | 2061 |
| |
2042 | 2062 |
| |
2043 |
| - | |
| 2063 | + | |
2044 | 2064 |
| |
2045 | 2065 |
| |
2046 | 2066 |
| |
| |||
2163 | 2183 |
| |
2164 | 2184 |
| |
2165 | 2185 |
| |
2166 |
| - | |
| 2186 | + | |
2167 | 2187 |
| |
2168 | 2188 |
| |
2169 | 2189 |
| |
| |||
2227 | 2247 |
| |
2228 | 2248 |
| |
2229 | 2249 |
| |
2230 |
| - | |
2231 |
| - | |
| 2250 | + | |
| 2251 | + | |
| 2252 | + | |
| 2253 | + | |
| 2254 | + | |
| 2255 | + | |
2232 | 2256 |
| |
2233 | 2257 |
| |
2234 | 2258 |
| |
|
Original file line number | Diff line number | Diff line change | |
---|---|---|---|
| |||
825 | 825 |
| |
826 | 826 |
| |
827 | 827 |
| |
828 |
| - | |
| 828 | + | |
829 | 829 |
| |
830 | 830 |
| |
831 | 831 |
| |
832 | 832 |
| |
833 |
| - | |
| 833 | + | |
834 | 834 |
| |
835 | 835 |
| |
836 | 836 |
| |
| |||
843 | 843 |
| |
844 | 844 |
| |
845 | 845 |
| |
846 |
| - | |
| 846 | + | |
847 | 847 |
| |
848 | 848 |
| |
849 | 849 |
| |
|
Original file line number | Diff line number | Diff line change | |
---|---|---|---|
| |||
640 | 640 |
| |
641 | 641 |
| |
642 | 642 |
| |
| 643 | + | |
643 | 644 |
| |
644 | 645 |
| |
645 |
| - | |
| 646 | + | |
646 | 647 |
| |
647 |
| - | |
648 |
| - | |
649 |
| - | |
650 | 648 |
| |
651 | 649 |
| |
652 | 650 |
| |
| |||
667 | 665 |
| |
668 | 666 |
| |
669 | 667 |
| |
670 |
| - | |
| 668 | + | |
671 | 669 |
| |
672 | 670 |
| |
673 | 671 |
| |
| |||
739 | 737 |
| |
740 | 738 |
| |
741 | 739 |
| |
742 |
| - | |
| 740 | + | |
743 | 741 |
| |
744 |
| - | |
| 742 | + | |
745 | 743 |
| |
746 |
| - | |
747 |
| - | |
748 | 744 |
| |
749 | 745 |
| |
750 | 746 |
| |
| |||
758 | 754 |
| |
759 | 755 |
| |
760 | 756 |
| |
| 757 | + | |
| 758 | + | |
| 759 | + | |
| 760 | + | |
| 761 | + | |
| 762 | + | |
761 | 763 |
| |
762 | 764 |
| |
763 | 765 |
| |
| |||
932 | 934 |
| |
933 | 935 |
| |
934 | 936 |
| |
935 |
| - | |
| 937 | + | |
936 | 938 |
| |
937 | 939 |
| |
938 | 940 |
| |
|
Original file line number | Diff line number | Diff line change | |
---|---|---|---|
| |||
56 | 56 |
| |
57 | 57 |
| |
58 | 58 |
| |
59 |
| - | |
60 |
| - | |
61 |
| - | |
62 |
| - | |
63 |
| - | |
64 |
| - | |
65 |
| - | |
66 |
| - | |
67 |
| - | |
68 |
| - | |
69 |
| - | |
70 |
| - | |
71 |
| - | |
72 |
| - | |
73 |
| - | |
| 59 | + | |
| 60 | + | |
| 61 | + | |
| 62 | + | |
| 63 | + | |
| 64 | + | |
| 65 | + | |
| 66 | + | |
| 67 | + | |
| 68 | + | |
| 69 | + | |
| 70 | + | |
| 71 | + | |
| 72 | + | |
| 73 | + | |
| 74 | + | |
| 75 | + | |
74 | 76 |
| |
75 | 77 |
| |
76 | 78 |
| |
|
Original file line number | Diff line number | Diff line change | |
---|---|---|---|
| |||
422 | 422 |
| |
423 | 423 |
| |
424 | 424 |
| |
425 |
| - | |
| 425 | + | |
426 | 426 |
| |
427 | 427 |
| |
428 | 428 |
| |
| |||
446 | 446 |
| |
447 | 447 |
| |
448 | 448 |
| |
449 |
| - | |
450 |
| - | |
| 449 | + | |
| 450 | + | |
451 | 451 |
| |
452 | 452 |
| |
453 | 453 |
| |
| |||
457 | 457 |
| |
458 | 458 |
| |
459 | 459 |
| |
460 |
| - | |
461 |
| - | |
462 |
| - | |
| 460 | + | |
| 461 | + | |
| 462 | + | |
| 463 | + | |
463 | 464 |
| |
464 | 465 |
| |
465 | 466 |
| |
|
0 commit comments
Comments
(0)