- Notifications
You must be signed in to change notification settings - Fork5
Commit579840c
committed
Fix O(N^2) performance problems in regular-expression compiler.
Change the singly-linked in-arc and out-arc lists to be doubly-linked,so that arc deletion is constant time rather than having worst-case timeproportional to the number of other arcs on the connected states.Modify the bulk arc transfer operations copyins(), copyouts(), moveins(),moveouts() so that they use a sort-and-merge algorithm whenever there'smore than a small number of arcs to be copied or moved. The previousmethod is O(N^2) in the number of arcs involved, because it performsduplicate checking independently for each copied arc. The new method maychange the ordering of existing arcs for the destination state, but nothingreally cares about that.Provide another bulk arc copying method mergeins(), which is unused asof this commit but is needed for the next one. It basically is likecopyins(), but the source arcs might not all come from the same state.Replace the O(N^2) bubble-sort algorithm used in carcsort() with a qsort()call.These changes greatly improve the performance of regex compilation forlarge or complex regexes, at the cost of extra space for arc storage duringcompilation. The original tradeoff was probably fine when it was made, butnow we care more about speed and less about memory consumption.Back-patch to all supported branches.1 parent48789c5 commit579840c
File tree
3 files changed
+728
-94
lines changed- src
- backend/regex
- include/regex
3 files changed
+728
-94
lines changed0 commit comments
Comments
(0)