
This issue trackerhas been migrated toGitHub, and is currentlyread-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.
Created on2015-03-03 13:53 byserhiy.storchaka, last changed2022-04-11 14:58 byadmin. This issue is nowclosed.
| Files | ||||
|---|---|---|---|---|
| File name | Uploaded | Description | Edit | |
| str_find_faster.patch | serhiy.storchaka,2015-03-03 13:53 | review | ||
| issue23573_bytes_rfind_memrchr.patch | serhiy.storchaka,2015-07-18 14:38 | review | ||
| Messages (10) | |||
|---|---|---|---|
| msg237137 -(view) | Author: Serhiy Storchaka (serhiy.storchaka)*![]() | Date: 2015-03-03 13:53 | |
Currently str.find() and similar methods can make a copy of self or searched string if they have different kinds. In some cases this is redundant because the result can be known before trying to search. Longer string can't be found in shorter string and wider string can't be found in narrower string. Proposed patch avoid creating temporary widened copies in such corner cases. It also adds special cases for searching 1-character strings.Some sample microbenchmark results:$ ./python -m timeit -s "a = 'x'; b = 'x\U00012345'" -- "b.find(a)"Unpatched: 1000000 loops, best of 3: 1.92 usec per loopPatched: 1000000 loops, best of 3: 1.03 usec per loop$ ./python -m timeit -s "a = 'x'; b = 'x\U00012345'" -- "a in b"Unpatched: 1000000 loops, best of 3: 0.543 usec per loopPatched: 1000000 loops, best of 3: 0.25 usec per loop$ ./python -m timeit -s "a = '\U00012345'; b = 'x'*1000" -- "b.find(a)"Unpatched: 100000 loops, best of 3: 4.58 usec per loopPatched: 1000000 loops, best of 3: 0.969 usec per loop$ ./python -m timeit -s "a = 'x'*1000; b = '\U00012345'" -- "b.find(a)"Unpatched: 100000 loops, best of 3: 3.77 usec per loopPatched: 1000000 loops, best of 3: 0.97 usec per loop$ ./python -m timeit -s "a = 'x'*1000; b = '\U00012345'" -- "a in b"Unpatched: 100000 loops, best of 3: 2.4 usec per loopPatched: 1000000 loops, best of 3: 0.225 usec per loop | |||
| msg239171 -(view) | Author: Roundup Robot (python-dev)![]() | Date: 2015-03-24 19:58 | |
New changeset6db9d7c1be29 by Serhiy Storchaka in branch 'default':Issue#23573: Increased performance of string search operations (str.find,https://hg.python.org/cpython/rev/6db9d7c1be29 | |||
| msg239187 -(view) | Author: Serhiy Storchaka (serhiy.storchaka)*![]() | Date: 2015-03-24 22:01 | |
Looks as this patch makes buildbots crash. | |||
| msg239209 -(view) | Author: STINNER Victor (vstinner)*![]() | Date: 2015-03-25 01:47 | |
> Looks as this patch makes buildbots crash.Yep. It took me some minutes to find that the crash was caused by this issue :-phttp://buildbot.python.org/all/builders/AMD64%20Windows7%20SP1%203.x/builds/5930/steps/test/logs/stdio...[117/393/1] test_bigmemAssertion failed: 0, file c:\buildbot.python.org\3.x.kloth-win64\build\objects\stringlib/fastsearch.h, line 76Fatal Python error: AbortedCurrent thread 0x000010ec (most recent call first): File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\test_bigmem.py", line 294 in test_rfind File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\support\__init__.py", line 1641 in wrapper File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\unittest\case.py", line 577 in run File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\unittest\case.py", line 625 in __call__ File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\unittest\suite.py", line 122 in run File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\unittest\suite.py", line 84 in __call__ File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\unittest\suite.py", line 122 in run File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\unittest\suite.py", line 84 in __call__ File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\unittest\runner.py", line 176 in run File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\support\__init__.py", line 1773 in _run_suite File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\support\__init__.py", line 1807 in run_unittest File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\test_bigmem.py", line 1252 in test_main... | |||
| msg239211 -(view) | Author: STINNER Victor (vstinner)*![]() | Date: 2015-03-25 01:58 | |
The problem is that Windows has no memrchr() function, and so fastsearch_memchr_1char() only supports FAST_SEARCH on Windows. | |||
| msg239212 -(view) | Author: Roundup Robot (python-dev)![]() | Date: 2015-03-25 02:17 | |
New changeset3ac58de829ef by Victor Stinner in branch 'default':Issue#23573: Fix bytes.rfind() and bytearray.rfind() on Windowshttps://hg.python.org/cpython/rev/3ac58de829ef | |||
| msg239213 -(view) | Author: STINNER Victor (vstinner)*![]() | Date: 2015-03-25 02:19 | |
It looks like fastsearch_memchr_1char() manipulate pointers for memory alignment. It's not necessary when looking for ASCII or Latin1 characters or for bytes.I propose to add a new fastsearch_memchr_1byte() function which would be used by bytes and bytearray, but also by str for ASCII and Latin1 strings.Are you interested to implement this idea Serhiy?For Windows without memrchr(), the code can be a simple loop. | |||
| msg239220 -(view) | Author: Serhiy Storchaka (serhiy.storchaka)*![]() | Date: 2015-03-25 06:21 | |
Many thanks Victor for fixing crashes. Unfortunately I couldn't reproduce a crash on my computers, perhaps it is was 64-bit only.Yes, I'll look how the code can be optimized. | |||
| msg246901 -(view) | Author: Serhiy Storchaka (serhiy.storchaka)*![]() | Date: 2015-07-18 14:38 | |
Here is a patch that restores optimization of bytes.rfind() and bytearray.rfind() with 1-byte argument on Linux (it also revertsbc1a178b3bc8). | |||
| msg247002 -(view) | Author: Roundup Robot (python-dev)![]() | Date: 2015-07-20 19:59 | |
New changeset311a4d28631b by Serhiy Storchaka in branch '3.5':Issue#23573: Restored optimization of bytes.rfind() and bytearray.rfind()https://hg.python.org/cpython/rev/311a4d28631bNew changesetc06410c68217 by Serhiy Storchaka in branch 'default':Issue#23573: Restored optimization of bytes.rfind() and bytearray.rfind()https://hg.python.org/cpython/rev/c06410c68217 | |||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:58:13 | admin | set | github: 67761 |
| 2015-07-21 05:26:24 | serhiy.storchaka | set | status: open -> closed stage: patch review -> resolved resolution: fixed versions: + Python 3.6 |
| 2015-07-20 19:59:11 | python-dev | set | messages: +msg247002 |
| 2015-07-18 14:38:46 | serhiy.storchaka | set | files: +issue23573_bytes_rfind_memrchr.patch nosy: +christian.heimes messages: +msg246901 resolution: fixed -> (no value) stage: resolved -> patch review |
| 2015-03-25 06:21:57 | serhiy.storchaka | set | messages: +msg239220 |
| 2015-03-25 02:19:38 | vstinner | set | messages: +msg239213 |
| 2015-03-25 02:17:15 | python-dev | set | messages: +msg239212 |
| 2015-03-25 01:58:00 | vstinner | set | messages: +msg239211 |
| 2015-03-25 01:47:05 | vstinner | set | nosy: +vstinner messages: +msg239209 |
| 2015-03-24 22:01:29 | serhiy.storchaka | set | status: closed -> open messages: +msg239187 |
| 2015-03-24 19:59:49 | serhiy.storchaka | set | status: open -> closed assignee:serhiy.storchaka resolution: fixed stage: patch review -> resolved |
| 2015-03-24 19:58:17 | python-dev | set | nosy: +python-dev messages: +msg239171 |
| 2015-03-03 13:53:30 | serhiy.storchaka | create | |