Closed
Description
re.sub() is relatively slow because it calls Python code for every match.
Implementing it in C speeds up re.sub() by a factor of 2-3.
$ ./python -m timeit -s 'import re; s = "a"' 're.sub("(a)", r"\1", s)'
100000 loops, best of 5: 2.45 usec per loop
500000 loops, best of 5: 860 nsec per loop

$ ./python -m timeit -s 'import re; s = "a"; p = re.compile("(a)")' 'p.sub(r"\1", s)'
200000 loops, best of 5: 1.79 usec per loop
500000 loops, best of 5: 546 nsec per loop

$ ./python -m timeit -s 'import re; s = "a"*10**3' 're.sub("(a)", r"\1", s)'
500 loops, best of 5: 620 usec per loop
1000 loops, best of 5: 252 usec per loop

$ ./python -m timeit -s 'import re; s = "a"' 're.sub("(a)", r"b", s)'
500000 loops, best of 5: 711 nsec per loop
500000 loops, best of 5: 663 nsec per loop

$ ./python -m timeit -s 'import re; s = "a"' 're.sub("(a)", r"\n", s)'
200000 loops, best of 5: 1.7 usec per loop
500000 loops, best of 5: 864 nsec per loop

Initially I also implemented a public API for explicit compilation of the replacement string, but then left it for a separate issue.
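For anyone who wants to repeat these measurements from a script rather than the shell, here is a minimal sketch using the standard `timeit` module. The names `CASES`, `bench`, and `run_benchmarks` are hypothetical helpers introduced only for this illustration, and the absolute numbers will differ by machine and build.

```python
import timeit

# Each case mirrors one of the command-line runs: (statement, setup).
# The dictionary keys are illustrative labels, not part of any API.
CASES = {
    "module-level, group reference": (
        r're.sub("(a)", r"\1", s)', 'import re; s = "a"'),
    "precompiled, group reference": (
        r'p.sub(r"\1", s)', 'import re; s = "a"; p = re.compile("(a)")'),
    "long input, group reference": (
        r're.sub("(a)", r"\1", s)', 'import re; s = "a"*10**3'),
    "literal replacement": (
        r're.sub("(a)", r"b", s)', 'import re; s = "a"'),
    "escape in replacement": (
        r're.sub("(a)", r"\n", s)', 'import re; s = "a"'),
}


def bench(stmt, setup, number=10_000, repeat=5):
    """Return the best per-call time in seconds over `repeat` runs."""
    totals = timeit.repeat(stmt, setup=setup, repeat=repeat, number=number)
    return min(totals) / number


def run_benchmarks(number=10_000):
    """Time every case and return a {label: seconds_per_call} mapping."""
    return {label: bench(stmt, setup, number=number)
            for label, (stmt, setup) in CASES.items()}


if __name__ == "__main__":
    for label, seconds in run_benchmarks().items():
        print(f"{label}: {seconds * 1e9:.0f} nsec per call")
```

Running the script under two interpreter builds gives per-case figures that can be compared side by side.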