forked frompostgres/postgres
- Notifications
You must be signed in to change notification settings - Fork6
Commit11dce63
committed
Fix recently-introduced performance problem in ts_headline().
The new hlCover() algorithm that I introduced in commitc9b0c67turns out to potentially take O(N^2) or worse time on long documents,if there are many occurrences of individual query words but few or nosubstrings that actually satisfy the query. (One way to hit thisbehavior is with a "common_word & rare_word" type of query.) Thisseems unavoidable given the original goal of checking every substringof the document, so we have to back off that idea. Fortunately, itseems unlikely that anyone would really want headlines spanning all ofa long document, so we can avoid the worse-than-linear behavior byimposing a maximum length of substring that we'll consider.For now, just hard-wire that maximum length as a multiple of max_wordstimes max_fragments. Perhaps at some point somebody will argue forexposing it as a ts_headline parameter, but I'm hesitant to make sucha feature addition in a back-patched bug fix.I also noted that the hlFirstIndex() function I'd added in thatcommit was unnecessarily stupid: it really only needs to check whethera HeadlineWordEntry's item pointer is null or not. This wouldn't makeall that much difference in typical cases with queries having justa few terms, but a cycle shaved is a cycle earned.In addition, add a CHECK_FOR_INTERRUPTS call in TS_execute_recurse.This ensures that hlCover's loop is cancellable if it manages to takea long time, and it may protect some other TS_execute callers as well.Back-patch to 9.6 as the previous commit was. I also chose to add theCHECK_FOR_INTERRUPTS call to 9.5. The old hlCover() algorithm seemsto avoid the O(N^2) behavior, at least on the test case I tried, butnonetheless it's not very quick on a long document.Per report from Stephen Frost.Discussion:https://postgr.es/m/20200724160535.GW12375@tamriel.snowman.net1 parent0d9b64f commit11dce63
2 files changed
+35
-25
lines changedLines changed: 32 additions & 25 deletions
Original file line number | Diff line number | Diff line change | |
---|---|---|---|
| |||
2010 | 2010 |
| |
2011 | 2011 |
| |
2012 | 2012 |
| |
2013 |
| - | |
| 2013 | + | |
2014 | 2014 |
| |
2015 | 2015 |
| |
2016 | 2016 |
| |
2017 |
| - | |
2018 | 2017 |
| |
2019 | 2018 |
| |
2020 |
| - | |
2021 |
| - | |
2022 |
| - | |
2023 |
| - | |
2024 |
| - | |
2025 |
| - | |
2026 |
| - | |
2027 |
| - | |
2028 |
| - | |
2029 |
| - | |
2030 |
| - | |
| 2019 | + | |
| 2020 | + | |
2031 | 2021 |
| |
2032 | 2022 |
| |
2033 | 2023 |
| |
2034 | 2024 |
| |
2035 | 2025 |
| |
2036 | 2026 |
| |
2037 | 2027 |
| |
2038 |
| - | |
2039 |
| - | |
| 2028 | + | |
| 2029 | + | |
| 2030 | + | |
| 2031 | + | |
| 2032 | + | |
| 2033 | + | |
| 2034 | + | |
| 2035 | + | |
2040 | 2036 |
| |
2041 | 2037 |
| |
2042 | 2038 |
| |
| |||
2045 | 2041 |
| |
2046 | 2042 |
| |
2047 | 2043 |
| |
2048 |
| - | |
| 2044 | + | |
| 2045 | + | |
2049 | 2046 |
| |
2050 | 2047 |
| |
2051 | 2048 |
| |
| |||
2059 | 2056 |
| |
2060 | 2057 |
| |
2061 | 2058 |
| |
2062 |
| - | |
| 2059 | + | |
2063 | 2060 |
| |
2064 | 2061 |
| |
2065 | 2062 |
| |
| |||
2080 | 2077 |
| |
2081 | 2078 |
| |
2082 | 2079 |
| |
2083 |
| - | |
| 2080 | + | |
2084 | 2081 |
| |
2085 | 2082 |
| |
2086 | 2083 |
| |
| |||
2091 | 2088 |
| |
2092 | 2089 |
| |
2093 | 2090 |
| |
2094 |
| - | |
| 2091 | + | |
2095 | 2092 |
| |
2096 | 2093 |
| |
2097 | 2094 |
| |
| |||
2193 | 2190 |
| |
2194 | 2191 |
| |
2195 | 2192 |
| |
2196 |
| - | |
| 2193 | + | |
2197 | 2194 |
| |
2198 | 2195 |
| |
2199 | 2196 |
| |
| |||
2220 | 2217 |
| |
2221 | 2218 |
| |
2222 | 2219 |
| |
2223 |
| - | |
| 2220 | + | |
2224 | 2221 |
| |
2225 | 2222 |
| |
2226 | 2223 |
| |
| |||
2375 | 2372 |
| |
2376 | 2373 |
| |
2377 | 2374 |
| |
2378 |
| - | |
| 2375 | + | |
2379 | 2376 |
| |
2380 | 2377 |
| |
2381 | 2378 |
| |
| |||
2393 | 2390 |
| |
2394 | 2391 |
| |
2395 | 2392 |
| |
2396 |
| - | |
| 2393 | + | |
2397 | 2394 |
| |
2398 | 2395 |
| |
2399 | 2396 |
| |
| |||
2549 | 2546 |
| |
2550 | 2547 |
| |
2551 | 2548 |
| |
| 2549 | + | |
2552 | 2550 |
| |
2553 | 2551 |
| |
2554 | 2552 |
| |
| |||
2588 | 2586 |
| |
2589 | 2587 |
| |
2590 | 2588 |
| |
| 2589 | + | |
| 2590 | + | |
| 2591 | + | |
| 2592 | + | |
| 2593 | + | |
| 2594 | + | |
| 2595 | + | |
| 2596 | + | |
| 2597 | + | |
2591 | 2598 |
| |
2592 | 2599 |
| |
2593 | 2600 |
| |
| |||
2612 | 2619 |
| |
2613 | 2620 |
| |
2614 | 2621 |
| |
2615 |
| - | |
| 2622 | + | |
2616 | 2623 |
| |
2617 | 2624 |
| |
2618 |
| - | |
| 2625 | + | |
2619 | 2626 |
| |
2620 | 2627 |
| |
2621 | 2628 |
| |
|
Lines changed: 3 additions & 0 deletions
Original file line number | Diff line number | Diff line change | |
---|---|---|---|
| |||
1868 | 1868 |
| |
1869 | 1869 |
| |
1870 | 1870 |
| |
| 1871 | + | |
| 1872 | + | |
| 1873 | + | |
1871 | 1874 |
| |
1872 | 1875 |
| |
1873 | 1876 |
| |
|
0 commit comments
Comments
(0)