@@ -240,3 +240,142 @@ It doesn't allow you to do several important optimizations.
240240---------------------------(end of broadcast)---------------------------
241241TIP 4: Don't 'kill -9' the postmaster
242242
243+ From pgsql-general-owner+M14300@postgresql.org Mon Aug 27 13:07:32 2001
244+ Return-path: <pgsql-general-owner+M14300@postgresql.org>
245+ Received: from server1.pgsql.org (server1.pgsql.org [64.39.15.238])
246+ by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id f7RH7VF04800
247+ for <pgman@candle.pha.pa.us>; Mon, 27 Aug 2001 13:07:31 -0400 (EDT)
248+ Received: from postgresql.org.org (webmail.postgresql.org [216.126.85.28])
249+ by server1.pgsql.org (8.11.6/8.11.6) with ESMTP id f7RH7Tq17721;
250+ Mon, 27 Aug 2001 12:07:29 -0500 (CDT)
251+ (envelope-from pgsql-general-owner+M14300@postgresql.org)
252+ Received: from svana.org (svana.org [210.9.66.30])
253+ by postgresql.org (8.11.3/8.11.4) with ESMTP id f7RFE1f13269
254+ for <pgsql-general@postgresql.org>; Mon, 27 Aug 2001 11:14:01 -0400 (EDT)
255+ (envelope-from kleptog@svana.org)
256+ Received: from kleptog by svana.org with local (Exim 3.12 #1 (Debian))
257+ id 15bO5x-0000Fd-00; Tue, 28 Aug 2001 01:14:33 +1000
258+ Date: Tue, 28 Aug 2001 01:14:33 +1000
259+ From: Martijn van Oosterhout <kleptog@svana.org>
260+ To: Andrew Snow <andrew@modulus.org>
261+ cc: pgsql-general@postgresql.org
262+ Subject: Re: [GENERAL] raw partition
263+ Message-ID: <20010828011433.E32309@svana.org>
264+ Reply-To: Martijn van Oosterhout <kleptog@svana.org>
265+ References: <20010827233815.B32309@svana.org> <000101c12f00$dc5814b0$fa01b5ca@avon>
266+ MIME-Version: 1.0
267+ Content-Type: text/plain; charset=us-ascii
268+ Content-Disposition: inline
269+ User-Agent: Mutt/1.2.5i
270+ In-Reply-To: <000101c12f00$dc5814b0$fa01b5ca@avon>; from andrew@modulus.org on Tue, Aug 28, 2001 at 12:02:08AM +1000
271+ Precedence: bulk
272+ Sender: pgsql-general-owner@postgresql.org
273+ Status: OR
274+
275+ On Tue, Aug 28, 2001 at 12:02:08AM +1000, Andrew Snow wrote:
276+ >
277+ > What I think would be better would be moving postgresql to a system of
278+ > using memory-mapped I/O. Instead of the shared buffer cache, files
279+ > would be directly memory-mapped and the OS would do the caching. I
280+ > can't see this happening though because of platform dependency, but I
281+ > think it's worth another look soon because many Unix platforms support
282+ > mmap(). I think it would improve the performance of disk-intensive
283+ > tasks noticeably.
284+
285+ Well, this has other problems. Consider tables that are larger than your
286+ system memory. You'd have to continuously map and unmap different sections.
287+ That can have odd side effects (witness Mozilla on Linux having 15,000
288+ mapped areas or so...)
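
To make the map/unmap cycle described above concrete, here is a minimal sketch
in Python (whose mmap module wraps the same system call on Unix). It is not
PostgreSQL code: scan_in_windows and its window_bytes parameter are
hypothetical names chosen for illustration. It walks a file that may be larger
than the window by mapping one section at a time and unmapping it before
moving on.

```python
import mmap
import os


def scan_in_windows(path, window_bytes):
    """Sum every byte of a file while mapping only one window at a time.

    Hypothetical helper for illustration. Returns (byte_sum, mappings_made).
    """
    # Offsets passed to mmap must be multiples of ALLOCATIONGRANULARITY,
    # so round the window down to that granularity (but never to zero).
    gran = mmap.ALLOCATIONGRANULARITY
    window_bytes = max(gran, (window_bytes // gran) * gran)

    size = os.path.getsize(path)
    total = 0
    mappings_made = 0
    with open(path, "rb") as f:
        offset = 0
        while offset < size:
            length = min(window_bytes, size - offset)
            # Map one section, use it, and unmap it when the block exits.
            with mmap.mmap(f.fileno(), length,
                           access=mmap.ACCESS_READ, offset=offset) as m:
                total += sum(m[:])
            mappings_made += 1
            offset += length
    return total, mappings_made
```

The number of mappings grows with file size divided by window size, which is
the bookkeeping cost the paragraph above is pointing at.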
289+
290+ You would still however get the advantage that you wouldn't have to copy the
291+ data from the disk buffers to user space, you simply get the disk buffer
292+ mapped into your address space.
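
The difference between copying and mapping can be sketched as follows, again
in Python and only as an illustration. PAGE, read_page_copy and
read_page_mapped are assumed names, and the 8192-byte page size is an
assumption made for the example. os.pread is available on Unix only.

```python
import mmap
import os

PAGE = 8192  # assumed page size, for illustration only


def read_page_copy(fd, page_no):
    """read()-style access: the kernel copies the page into a new buffer."""
    return os.pread(fd, PAGE, page_no * PAGE)


def read_page_mapped(mapping, page_no):
    """mmap-style access: return a view onto the mapped pages.

    Slicing a memoryview does not copy the underlying bytes; the view
    refers directly to the mapped region.
    """
    start = page_no * PAGE
    return memoryview(mapping)[start:start + PAGE]
```

Both calls yield the same bytes. The second avoids the extra copy into a
user-space buffer, but the view must be released before the mapping can be
closed.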
293+
294+ I think that for commonly used tables that are under 100K in size (most of
295+ the system tables), this is quite a workable idea, if you don't mind keeping
296+ them mapped the whole time.
297+
298+ --
299+ Martijn van Oosterhout <kleptog@svana.org>
300+ http://svana.org/kleptog/
301+ > It would be nice if someone came up with a certification system that
302+ > actually separated those who can barely regurgitate what they crammed over
303+ > the last few weeks from those who command secret ninja networking powers.
304+
305+ ---------------------------(end of broadcast)---------------------------
306+ TIP 3: if posting/reading through Usenet, please send an appropriate
307+ subscribe-nomail command to majordomo@postgresql.org so that your
308+ message can get through to the mailing list cleanly
309+
310+ From pgsql-general-owner+M14319@postgresql.org Mon Aug 27 16:57:10 2001
311+ Return-path: <pgsql-general-owner+M14319@postgresql.org>
312+ Received: from server1.pgsql.org (server1.pgsql.org [64.39.15.238])
313+ by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id f7RKv9F16849
314+ for <pgman@candle.pha.pa.us>; Mon, 27 Aug 2001 16:57:09 -0400 (EDT)
315+ Received: from postgresql.org.org (webmail.postgresql.org [216.126.85.28])
316+ by server1.pgsql.org (8.11.6/8.11.6) with ESMTP id f7RKv9q31456;
317+ Mon, 27 Aug 2001 15:57:09 -0500 (CDT)
318+ (envelope-from pgsql-general-owner+M14319@postgresql.org)
319+ Received: from sss.pgh.pa.us ([192.204.191.242])
320+ by postgresql.org (8.11.3/8.11.4) with ESMTP id f7RJrsf55472
321+ for <pgsql-general@postgresql.org>; Mon, 27 Aug 2001 15:53:54 -0400 (EDT)
322+ (envelope-from tgl@sss.pgh.pa.us)
323+ Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
324+ by sss.pgh.pa.us (8.11.4/8.11.4) with ESMTP id f7RJrGK19431;
325+ Mon, 27 Aug 2001 15:53:16 -0400 (EDT)
326+ To: Martijn van Oosterhout <kleptog@svana.org>
327+ cc: Andrew Snow <andrew@modulus.org>, pgsql-general@postgresql.org
328+ Subject: Re: [GENERAL] raw partition
329+ In-Reply-To: <20010828011433.E32309@svana.org>
330+ References: <20010827233815.B32309@svana.org> <000101c12f00$dc5814b0$fa01b5ca@avon> <20010828011433.E32309@svana.org>
331+ Comments: In-reply-to Martijn van Oosterhout <kleptog@svana.org>
332+ message dated "Tue, 28 Aug 2001 01:14:33 +1000"
333+ Date: Mon, 27 Aug 2001 15:53:15 -0400
334+ Message-ID: <19428.998941995@sss.pgh.pa.us>
335+ From: Tom Lane <tgl@sss.pgh.pa.us>
336+ Precedence: bulk
337+ Sender: pgsql-general-owner@postgresql.org
338+ Status: OR
339+
340+ Martijn van Oosterhout <kleptog@svana.org> writes:
341+ > You would still however get the advantage that you wouldn't have to copy the
342+ > data from the disk buffers to user space, you simply get the disk buffer
343+ > mapped into your address space.
344+
345+ AFAICS this would be the *only* advantage. While it's not negligible,
346+ it's quite unclear that it's worth the bookkeeping and portability
347+ headaches of managing lots of mmap'd areas, either.
348+
349+ Before I take this idea seriously at all, I'd want to see a design that
350+ addresses a couple of critical issues:
351+
352+ 1. Postgres' shared buffers are *shared*, potentially across many
353+ processes. How will you deal with buffers for files that have been
354+ mmap'd by only some of the processes? (Maybe this means that the
355+ whole concept of shared buffers goes away, and each process does its
356+ own buffer management based on its own mmaps. Not sure. That would be
357+ a pretty radical restructuring though, and would completely invalidate
358+ our present approach to page-level locking.)
359+
360+ 2. How do you deal with extending a file? My system's mmap man page
361+ says
362+ If the size of the mapped file changes after the call to mmap(), the
363+ effect of references to portions of the mapped region that correspond
364+ to added or removed portions of the file is unspecified.
365+ This suggests that the only portable way to cope is to issue a separate
366+ mmap for every disk page. Will typical Unix systems perform well with
367+ umpteen thousand small mmap requests?
368+
369+ 3. How do you persuade the other backends to drop their mmaps of a table
370+ you are deleting?
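
The "separate mmap for every disk page" approach from point 2 could look
roughly like the sketch below. It is a Python illustration under stated
assumptions, not a proposed design: PerPageMapper and its methods are
hypothetical names, and the page size is set to the platform's mapping
granularity because mmap offsets must be aligned to it. Each page gets its own
mapping, created on first use, so extending the file never touches an
existing mapping.

```python
import mmap
import os


class PerPageMapper:
    """Hypothetical sketch: one independent mmap per file page."""

    def __init__(self, fd, page_size=mmap.ALLOCATIONGRANULARITY):
        self.fd = fd                # file descriptor opened read/write
        self.page_size = page_size  # must be a multiple of the granularity
        self.pages = {}             # page number -> mmap object

    def page(self, page_no):
        """Return the mapping for one page, creating it on first access."""
        m = self.pages.get(page_no)
        if m is None:
            m = mmap.mmap(self.fd, self.page_size,
                          access=mmap.ACCESS_WRITE,
                          offset=page_no * self.page_size)
            self.pages[page_no] = m
        return m

    def extend(self, n_pages):
        """Grow the file; existing mappings are left untouched."""
        size = os.fstat(self.fd).st_size
        os.ftruncate(self.fd, size + n_pages * self.page_size)

    def close(self):
        """Unmap every page this process has mapped."""
        for m in self.pages.values():
            m.close()
        self.pages.clear()
```

The pages dictionary holds one entry per page touched, so a large table means
thousands of small mappings per process, which is exactly the performance
question raised in point 2. The sketch does nothing about points 1 and 3:
each process would have its own dictionary, with no coordination between them.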
371+
372+ There are probably other gotchas, but without an understanding of how
373+ to address these, I doubt it's worth looking further ...
374+
375+ regards, tom lane
376+
377+ ---------------------------(end of broadcast)---------------------------
378+ TIP 5: Have you checked our extensive FAQ?
379+
380+ http://www.postgresql.org/users-lounge/docs/faq.html
381+