Commit 428b1d6
Allow to trigger kernel writeback after a configurable number of writes.
Currently writes to the main data files of postgres all go through the OS page cache. This means that some operating systems can end up collecting a large number of dirty buffers in their respective page caches. When these dirty buffers are flushed to storage rapidly, be it because of fsync(), timeouts, or dirty ratios, latency for other reads and writes can increase massively. This is the primary reason for regular massive stalls observed in real world scenarios and artificial benchmarks; on rotating disks stalls on the order of hundreds of seconds have been observed.

On Linux it is possible to control this by reducing the global dirty limits significantly, reducing the above problem. But global configuration is rather problematic because it'll affect other applications; also PostgreSQL itself doesn't always want this behavior, e.g. for temporary files it's undesirable.

Several operating systems allow some control over the kernel page cache. Linux has sync_file_range(2), several POSIX systems have msync(2) and posix_fadvise(2). sync_file_range(2) is preferable because it requires no special setup, whereas msync() requires the to-be-flushed range to be mmap'ed. For the purpose of flushing dirty data posix_fadvise(2) is the worst alternative, as flushing dirty data is just a side-effect of POSIX_FADV_DONTNEED, which also removes the pages from the page cache. Thus the feature is enabled by default only on Linux, but can be enabled on all systems that have any of the above APIs.

While desirable and likely possible, this patch does not contain an implementation for Windows.

With the infrastructure added, writes made via checkpointer, bgwriter and normal user backends can be flushed after a configurable number of writes. Each of these sources of writes is controlled by a separate GUC, checkpointer_flush_after, bgwriter_flush_after and backend_flush_after respectively; they're separate because the number of writes after which a flush is beneficial differs between them, and because the performance considerations of controlled flushing for each of these are different.

A later patch will add checkpoint sorting - after that flushes from the checkpoint will almost always be desirable. Bgwriter flushes are most of the time going to be random, which are slow on lots of storage hardware. Flushing in backends works well if the storage and bgwriter can keep up, but if not it can have negative consequences. This patch is likely to have negative performance consequences without checkpoint sorting, but unfortunately so has sorting without flush control.

Discussion: alpine.DEB.2.10.1506011320000.28433@sto
Author: Fabien Coelho and Andres Freund
1 parent c82c92b commit 428b1d6
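The idea in the commit message, counting writes and asking the kernel to start writeback once a configurable threshold is reached, can be illustrated with a minimal sketch. This is not the code from the patch: `WritebackState`, `writeback_init`, `issue_writeback` and `counted_pwrite` are hypothetical names, and tracking a single covering range is a simplification chosen for brevity. It uses sync_file_range(2) on Linux and falls back to posix_fadvise(2) with POSIX_FADV_DONTNEED where that is available, with the drawback described above.

```c
#define _GNU_SOURCE             /* exposes sync_file_range() on Linux */
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical per-file state; not a PostgreSQL data structure. */
typedef struct WritebackState
{
    int     fd;             /* file being written */
    int     flush_after;    /* request writeback after this many writes; 0 disables */
    int     pending;        /* writes since the last writeback request */
    off_t   start;          /* lowest offset written since the last request */
    off_t   end;            /* one past the highest offset written */
    int     requests;       /* writeback requests issued so far (for illustration) */
} WritebackState;

static void
writeback_init(WritebackState *st, int fd, int flush_after)
{
    assert(fd >= 0 && flush_after >= 0);
    st->fd = fd;
    st->flush_after = flush_after;
    st->pending = 0;
    st->start = st->end = 0;
    st->requests = 0;
}

/*
 * Hint the kernel to start writing back [start, end). This is only a hint,
 * so errors are ignored. Simplification: one range covers all pending writes,
 * so with the posix_fadvise() fallback unrelated pages in between may also be
 * dropped from the page cache.
 */
static void
issue_writeback(WritebackState *st)
{
    if (st->pending == 0)
        return;
#if defined(__linux__)
    (void) sync_file_range(st->fd, st->start, st->end - st->start,
                           SYNC_FILE_RANGE_WRITE);
#elif defined(POSIX_FADV_DONTNEED)
    (void) posix_fadvise(st->fd, st->start, st->end - st->start,
                         POSIX_FADV_DONTNEED);
#endif
    st->requests++;
    st->pending = 0;
}

/* pwrite() wrapper that triggers writeback after flush_after successful writes. */
static ssize_t
counted_pwrite(WritebackState *st, const void *buf, size_t len, off_t off)
{
    ssize_t n = pwrite(st->fd, buf, len, off);

    if (n <= 0 || st->flush_after <= 0)
        return n;

    /* Extend the range covered by the pending writes. */
    if (st->pending == 0 || off < st->start)
        st->start = off;
    if (st->pending == 0 || off + n > st->end)
        st->end = off + n;

    if (++st->pending >= st->flush_after)
        issue_writeback(st);
    return n;
}
```

A caller would create one `WritebackState` per write source with its own threshold, mirroring the separate settings for checkpointer, bgwriter and backends, and call `issue_writeback` once more at the end to cover any remaining writes.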
File tree
15 files changed, +601 -31 lines changed
- doc/src/sgml
- src
  - backend
    - postmaster
    - storage
      - buffer
      - file
      - smgr
    - utils/misc
  - include/storage
  - tools/pgindent
Lines changed: 87 additions & 0 deletions
Lines changed: 11 additions & 0 deletions
Lines changed: 7 additions & 1 deletion
Lines changed: 5 additions & 0 deletions