Commit 726926a

Update pgcvslog

1 parent 127f785, commit 726926a

4 files changed: 345 additions, 146 deletions

‎doc/TODO

Lines changed: 3 additions & 1 deletion
@@ -192,7 +192,8 @@ PERFORMANCE
 
 FSYNC
 
-* Allow transaction commits with rollback with no-fsync performance [fsync](Vadim)
+* Allow transaction commits with rollback with no-fsync performance
+[fsync] (Vadim)
 
 INDEXES
 
@@ -231,6 +232,7 @@ MISC
 * Remove pg_listener index
 * Remove ANALYZE from VACUUM so it can be run separately without locks
 * Gather more accurate statistics using indexes
+* Improve statistics storage in pg_class [performance]
 
 SOURCE CODE
 -----------

‎doc/TODO.detail/performance

Lines changed: 211 additions & 0 deletions
@@ -341,3 +341,214 @@ Informix Software (No, really) 300 Lakeside Drive Oakland, CA 94612
 good, you'll have to ram them down people's throats." -- Howard Aiken
 
 
+From owner-pgsql-hackers@hub.org Tue Oct 19 10:31:10 1999
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id KAA29087
+for <maillist@candle.pha.pa.us>; Tue, 19 Oct 1999 10:31:08 -0400 (EDT)
+Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.2 $) with ESMTP id KAA27535 for <maillist@candle.pha.pa.us>; Tue, 19 Oct 1999 10:19:47 -0400 (EDT)
+Received: from localhost (majordom@localhost)
+by hub.org (8.9.3/8.9.3) with SMTP id KAA30328;
+Tue, 19 Oct 1999 10:12:10 -0400 (EDT)
+(envelope-from owner-pgsql-hackers)
+Received: by hub.org (bulk_mailer v1.5); Tue, 19 Oct 1999 10:11:55 -0400
+Received: (from majordom@localhost)
+by hub.org (8.9.3/8.9.3) id KAA30030
+for pgsql-hackers-outgoing; Tue, 19 Oct 1999 10:11:00 -0400 (EDT)
+(envelope-from owner-pgsql-hackers@postgreSQL.org)
+Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
+by hub.org (8.9.3/8.9.3) with ESMTP id KAA29914
+for <pgsql-hackers@postgreSQL.org>; Tue, 19 Oct 1999 10:10:33 -0400 (EDT)
+(envelope-from tgl@sss.pgh.pa.us)
+Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1])
+by sss.sss.pgh.pa.us (8.9.1/8.9.1) with ESMTP id KAA09038;
+Tue, 19 Oct 1999 10:09:15 -0400 (EDT)
+To: "Hiroshi Inoue" <Inoue@tpf.co.jp>
+cc: "Vadim Mikheev" <vadim@krs.ru>, pgsql-hackers@postgreSQL.org
+Subject: Re: [HACKERS] mdnblocks is an amazing time sink in huge relations
+In-reply-to: Your message of Tue, 19 Oct 1999 19:03:22 +0900
+<000801bf1a19$2d88ae20$2801007e@cadzone.tpf.co.jp>
+Date: Tue, 19 Oct 1999 10:09:15 -0400
+Message-ID: <9036.940342155@sss.pgh.pa.us>
+From: Tom Lane <tgl@sss.pgh.pa.us>
+Sender: owner-pgsql-hackers@postgreSQL.org
+Status: OR
+
+"Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
+> 1. shared cache holds committed system tuples.
+> 2. private cache holds uncommitted system tuples.
+> 3. relpages of shared cache are updated immediately by
+> phisical change and corresponding buffer pages are
+> marked dirty.
+> 4. on commit, the contents of uncommitted tuples except
+> relpages,reltuples,... are copied to correponding tuples
+> in shared cache and the combined contents are
+> committed.
+> If so,catalog cache invalidation would be no longer needed.
+> But synchronization of the step 4. may be difficult.
+
+I think the main problem is that relpages and reltuples shouldn't
+be kept in pg_class columns at all, because they need to have
+very different update behavior from the other pg_class columns.
+
+The rest of pg_class is update-on-commit, and we can lock down any one
+row in the normal MVCC way (if transaction A has modified a row and
+transaction B also wants to modify it, B waits for A to commit or abort,
+so it can know which version of the row to start from). Furthermore,
+there can legitimately be several different values of a row in use in
+different places: the latest committed, an uncommitted modification, and
+one or more old values that are still being used by active transactions
+because they were current when those transactions started. (BTW, the
+present relcache is pretty bad about maintaining pure MVCC transaction
+semantics like this, but it seems clear to me that that's the direction
+we want to go in.)
+
+relpages cannot operate this way. To be useful for avoiding lseeks,
+relpages *must* change exactly when the physical file changes. It
+matters not at all whether the particular transaction that extended the
+file ultimately commits or not. Moreover there can be only one correct
+value (per relation) across the whole system, because there is only one
+length of the relation file.
+
+If we want to take reltuples seriously and try to maintain it
+on-the-fly, then I think it needs still a third behavior. Clearly
+it cannot be updated using MVCC rules, or we lose all writer
+concurrency (if A has added tuples to a rel, B would have to wait
+for A to commit before it could update reltuples...). Furthermore
+"updating" isn't a simple matter of storing what you think the new
+value is; otherwise two transactions adding tuples in parallel would
+leave the wrong answer after B commits and overwrites A's value.
+I think it would work for each transaction to keep track of a net delta
+in reltuples for each table it's changed (total tuples added less total
+tuples deleted), and then atomically add that value to the table's
+shared reltuples counter during commit. But that still leaves the
+problem of how you use the counter during a transaction to get an
+accurate answer to the question "If I scan this table now, how many tuples
+will I see?" At the time the question is asked, the current shared
+counter value might include the effects of transactions that have
+committed since your transaction started, and therefore are not visible
+under MVCC rules. I think getting the correct answer would involve
+making an instantaneous copy of the current counter at the start of
+your xact, and then adding your own private net-uncommitted-delta to
+the saved shared counter value when asked the question. This doesn't
+look real practical --- you'd have to save the reltuples counts of
+*all* tables in the database at the start of each xact, on the off
+chance that you might need them. Ugh. Perhaps someone has a better
+idea. In any case, reltuples clearly needs different mechanisms than
+the ordinary fields in pg_class do, because updating it will be a
+performance bottleneck otherwise.
+
+If we allow reltuples to be updated only by vacuum-like events, as
+it is now, then I think keeping it in pg_class is still OK.
+
+In short, it seems clear to me that relpages should be removed from
+pg_class and kept somewhere else if we want to make it more reliable
+than it is now, and the same for reltuples (but reltuples doesn't
+behave the same as relpages, and probably ought to be handled
+differently).
+
+regards, tom lane
+
+************
+
+From owner-pgsql-hackers@hub.org Tue Oct 19 21:25:30 1999
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id VAA28130
+for <maillist@candle.pha.pa.us>; Tue, 19 Oct 1999 21:25:26 -0400 (EDT)
+Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.2 $) with ESMTP id VAA10512 for <maillist@candle.pha.pa.us>; Tue, 19 Oct 1999 21:15:28 -0400 (EDT)
+Received: from localhost (majordom@localhost)
+by hub.org (8.9.3/8.9.3) with SMTP id VAA50745;
+Tue, 19 Oct 1999 21:07:23 -0400 (EDT)
+(envelope-from owner-pgsql-hackers)
+Received: by hub.org (bulk_mailer v1.5); Tue, 19 Oct 1999 21:07:01 -0400
+Received: (from majordom@localhost)
+by hub.org (8.9.3/8.9.3) id VAA50644
+for pgsql-hackers-outgoing; Tue, 19 Oct 1999 21:06:06 -0400 (EDT)
+(envelope-from owner-pgsql-hackers@postgreSQL.org)
+Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34])
+by hub.org (8.9.3/8.9.3) with ESMTP id VAA50584
+for <pgsql-hackers@postgreSQL.org>; Tue, 19 Oct 1999 21:05:26 -0400 (EDT)
+(envelope-from Inoue@tpf.co.jp)
+Received: from cadzone ([126.0.1.40] (may be forged))
+by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
+id KAA01715; Wed, 20 Oct 1999 10:05:14 +0900
+From: "Hiroshi Inoue" <Inoue@tpf.co.jp>
+To: "Tom Lane" <tgl@sss.pgh.pa.us>
+Cc: <pgsql-hackers@postgreSQL.org>
+Subject: RE: [HACKERS] mdnblocks is an amazing time sink in huge relations
+Date: Wed, 20 Oct 1999 10:09:13 +0900
+Message-ID: <000501bf1a97$b925a860$2801007e@cadzone.tpf.co.jp>
+MIME-Version: 1.0
+Content-Type: text/plain;
+charset="iso-8859-1"
+Content-Transfer-Encoding: 7bit
+X-Priority: 3 (Normal)
+X-MSMail-Priority: Normal
+X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
+X-Mimeole: Produced By Microsoft MimeOLE V4.72.2106.4
+Importance: Normal
+Sender: owner-pgsql-hackers@postgreSQL.org
+Status: ORr
+
+> -----Original Message-----
+> From: Hiroshi Inoue [mailto:Inoue@tpf.co.jp]
+> Sent: Tuesday, October 19, 1999 6:45 PM
+> To: Tom Lane
+> Cc: pgsql-hackers@postgreSQL.org
+> Subject: RE: [HACKERS] mdnblocks is an amazing time sink in huge
+> relations
+>
+>
+> >
+> > "Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
+>
+> [snip]
+>
+> >
+> > > Deletion is necessary only not to consume disk space.
+> > >
+> > > For example vacuum could remove not deleted files.
+> >
+> > Hmm ... interesting idea ... but I can hear the complaints
+> > from users already...
+> >
+>
+> My idea is only an analogy of PostgreSQL's simple recovery
+> mechanism of tuples.
+>
+> And my main point is
+> "delete fails after commit" doesn't harm the database
+> except that not deleted files consume disk space.
+>
+> Of cource,it's preferable to delete relation files immediately
+> after(or just when) commit.
+> Useless files are visible though useless tuples are invisible.
+>
+
+Anyway I don't need "DROP TABLE inside transactions" now
+and my idea is originally for that issue.
+
+After a thought,I propose the following solution.
+
+1. mdcreate() couldn't create existent relation files.
+If the existent file is of length zero,we would overwrite
+the file.(seems the comment in md.c says so but the
+code doesn't do so).
+If the file is an Index relation file,we would overwrite
+the file.
+
+2. mdunlink() couldn't unlink non-existent relation files.
+mdunlink() doesn't call elog(ERROR) even if the file
+doesn't exist,though I couldn't find where to change
+now.
+mdopen() doesn't call elog(ERROR) even if the file
+doesn't exist and leaves the relation as CLOSED.
+
+Comments ?
+
+Regards.
+
+Hiroshi Inoue
+Inoue@tpf.co.jp
+
+************
+
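
Note on the first message added to doc/TODO.detail/performance: Tom Lane argues that relpages must change exactly when the file changes, and that reltuples could be maintained by having each transaction add a net delta to a shared counter at commit, with a start-of-transaction copy used to answer "how many tuples will I see?". The following is a minimal C11 sketch of that idea only. It is not PostgreSQL code: RelStats, XactRelDelta and every function name here are hypothetical, and the per-relation snapshot sidesteps the objection in the message that a real system would have to save the counts for all tables.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical shared per-relation statistics kept outside pg_class. */
typedef struct RelStats {
    atomic_uint_fast32_t nblocks;  /* physical length in blocks ("relpages") */
    atomic_int_fast64_t  ntuples;  /* committed tuple count ("reltuples") */
} RelStats;

/* Hypothetical per-transaction bookkeeping for one relation. */
typedef struct XactRelDelta {
    RelStats *rel;
    int64_t   snapshot_tuples;  /* shared counter as of transaction start */
    int64_t   net_tuples;       /* tuples added minus tuples deleted so far */
} XactRelDelta;

void relstats_init(RelStats *rel)
{
    atomic_init(&rel->nblocks, 0);
    atomic_init(&rel->ntuples, 0);
}

/* The file grew by one block: publish at once, commit or abort is irrelevant. */
void rel_extended(RelStats *rel)
{
    atomic_fetch_add(&rel->nblocks, 1);
}

void xact_begin(XactRelDelta *d, RelStats *rel)
{
    assert(d != NULL && rel != NULL);
    d->rel = rel;
    d->snapshot_tuples = atomic_load(&rel->ntuples);
    d->net_tuples = 0;
}

/* Record inserts (n > 0) or deletes (n < 0) privately; no shared write yet. */
void xact_add_tuples(XactRelDelta *d, int64_t n)
{
    d->net_tuples += n;
}

/* Tuples this transaction would see: start-of-xact count plus its own delta. */
int64_t xact_visible_tuples(const XactRelDelta *d)
{
    return d->snapshot_tuples + d->net_tuples;
}

/* Commit: fold the private delta into the shared counter in one atomic add. */
void xact_commit(XactRelDelta *d)
{
    atomic_fetch_add(&d->rel->ntuples, d->net_tuples);
    d->net_tuples = 0;
}

/* Abort: drop the delta; the shared tuple count never saw it. */
void xact_abort(XactRelDelta *d)
{
    d->net_tuples = 0;
}
```

Because commit adds a delta instead of storing an absolute value, two transactions inserting in parallel cannot overwrite each other's contribution, which is the failure mode the message describes for a plain store.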

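
Note on the second message: Hiroshi Inoue proposes that mdcreate() reuse an existing relation file when it is zero-length or belongs to an index, and that mdunlink() tolerate a file that is already gone. A rough POSIX sketch of those two rules follows. The names rel_create and rel_unlink and the is_index flag are hypothetical stand-ins, not the md.c functions, and the mdopen() behaviour mentioned in the message is not covered.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for mdcreate(): returns an open descriptor,
 * or -1 with errno set. An existing file is reused (truncated) only
 * when it is empty or when the caller says it is an index file.
 */
int rel_create(const char *path, int is_index)
{
    struct stat st;
    int fd, saved;

    assert(path != NULL);
    fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0600);
    if (fd >= 0 || errno != EEXIST)
        return fd;                      /* created, or failed for another reason */

    fd = open(path, O_RDWR);            /* file already exists: inspect it */
    if (fd < 0)
        return -1;
    if (fstat(fd, &st) == 0 && (st.st_size == 0 || is_index)) {
        if (ftruncate(fd, 0) == 0)
            return fd;                  /* overwrite leftover file */
    } else if (errno == 0 || st.st_size > 0) {
        errno = EEXIST;                 /* non-empty heap file: refuse */
    }
    saved = errno;
    close(fd);
    errno = saved;
    return -1;
}

/* Hypothetical stand-in for mdunlink(): a missing file is not an error. */
int rel_unlink(const char *path)
{
    assert(path != NULL);
    if (unlink(path) == 0 || errno == ENOENT)
        return 0;
    return -1;
}
```

Treating a missing file as success makes the unlink step idempotent, which fits the point in the message that a failed delete after commit costs only disk space.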