Commit 9c08aea

Add new block-by-block strategy for CREATE DATABASE.
Because this strategy logs changes on a block-by-block basis, it avoids the need to checkpoint before and after the operation. However, because it logs each changed block individually, it might generate a lot of extra write-ahead logging if the template database is large. Therefore, the older strategy remains available via a new STRATEGY parameter to CREATE DATABASE, and a corresponding --strategy option to createdb.

Somewhat controversially, this patch assembles the list of relations to be copied to the new database by reading the pg_class relation of the template database. Cross-database access like this isn't normally possible, but it can be made to work here because there can't be any connections to the database being copied, nor can it contain any in-doubt transactions. Even so, we have to use lower-level interfaces than normal, since the table scan and relcache interfaces will not work for a database to which we're not connected. The advantage of this approach is that we do not need to rely on the filesystem to determine what ought to be copied, but instead on PostgreSQL's own knowledge of the database structure. This avoids, for example, copying stray files that happen to be located in the source database directory.

Dilip Kumar, with a fairly large number of cosmetic changes by me. Reviewed and tested by Ashutosh Sharma, Andres Freund, John Naylor, Greg Nancarrow, Neha Sharma. Additional feedback from Bruce Momjian, Heikki Linnakangas, Julien Rouhaud, Adam Brusselback, Kyotaro Horiguchi, Tomas Vondra, Andrew Dunstan, Álvaro Herrera, and others.

Discussion: http://postgr.es/m/CA+TgmoYtcdxBjLh31DLxUXHxFVMPGzrU5_T=CYCvRyFHywSBUQ@mail.gmail.com
1 parent bf902c1, commit 9c08aea

28 files changed, +1081 -157 lines changed


‎contrib/bloom/blinsert.c

Lines changed: 1 addition & 1 deletion
@@ -173,7 +173,7 @@ blbuildempty(Relation index)
      * Write the page and log it. It might seem that an immediate sync would
      * be sufficient to guarantee that the file exists on disk, but recovery
      * itself might remove it while replaying, for example, an
-     * XLOG_DBASE_CREATE or XLOG_TBLSPC_CREATE record. Therefore, we need
+     * XLOG_DBASE_CREATE* or XLOG_TBLSPC_CREATE record. Therefore, we need
      * this even when wal_level=minimal.
      */
     PageSetChecksumInplace(metapage, BLOOM_METAPAGE_BLKNO);

‎doc/src/sgml/monitoring.sgml

Lines changed: 4 additions & 0 deletions
@@ -1502,6 +1502,10 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
       <entry><literal>TwophaseFileWrite</literal></entry>
       <entry>Waiting for a write of a two phase state file.</entry>
      </row>
+     <row>
+      <entry><literal>VersionFileWrite</literal></entry>
+      <entry>Waiting for the version file to be written while creating a database.</entry>
+     </row>
      <row>
       <entry><literal>WALBootstrapSync</literal></entry>
       <entry>Waiting for WAL to reach durable storage during

‎doc/src/sgml/ref/create_database.sgml

Lines changed: 22 additions & 0 deletions
@@ -25,6 +25,7 @@ CREATE DATABASE <replaceable class="parameter">name</replaceable>
     [ [ WITH ] [ OWNER [=] <replaceable class="parameter">user_name</replaceable> ]
            [ TEMPLATE [=] <replaceable class="parameter">template</replaceable> ]
            [ ENCODING [=] <replaceable class="parameter">encoding</replaceable> ]
+           [ STRATEGY [=] <replaceable class="parameter">strategy</replaceable> ]
            [ LOCALE [=] <replaceable class="parameter">locale</replaceable> ]
            [ LC_COLLATE [=] <replaceable class="parameter">lc_collate</replaceable> ]
            [ LC_CTYPE [=] <replaceable class="parameter">lc_ctype</replaceable> ]
@@ -118,6 +119,27 @@ CREATE DATABASE <replaceable class="parameter">name</replaceable>
        </para>
       </listitem>
      </varlistentry>
+     <varlistentry id="create-database-strategy" xreflabel="CREATE DATABASE STRATEGY">
+      <term><replaceable class="parameter">strategy</replaceable></term>
+      <listitem>
+       <para>
+        Strategy to be used in creating the new database. If
+        the <literal>WAL_LOG</literal> strategy is used, the database will be
+        copied block by block and each block will be separately written
+        to the write-ahead log. This is the most efficient strategy in
+        cases where the template database is small, and therefore it is the
+        default. The older <literal>FILE_COPY</literal> strategy is also
+        available. This strategy writes a small record to the write-ahead log
+        for each tablespace used by the target database. Each such record
+        represents copying an entire directory to a new location at the
+        filesystem level. While this does reduce the write-ahead
+        log volume substantially, especially if the template database is large,
+        it also forces the system to perform a checkpoint both before and
+        after the creation of the new database. In some situations, this may
+        have a noticeable negative impact on overall system performance.
+       </para>
+      </listitem>
+     </varlistentry>
      <varlistentry>
       <term><replaceable class="parameter">locale</replaceable></term>
       <listitem>
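
The trade-off described in the new documentation paragraph can be put into rough numbers. The sketch below is only a back-of-the-envelope model, not code from the patch: every `demo_`/`DEMO_` name is hypothetical, the 8192-byte block size is an assumed default, and the estimate ignores record headers, page-hole removal, and WAL compression.

```c
#include <assert.h>
#include <stdint.h>

/* All names below are hypothetical and exist only for this illustration. */
#define DEMO_BLOCK_SIZE 8192u   /* assumption: default block size in bytes */

typedef enum DemoStrategy
{
    DEMO_WAL_LOG,               /* copy block by block, log every block */
    DEMO_FILE_COPY              /* copy directories, log one record per tablespace */
} DemoStrategy;

/*
 * Crude estimate of WAL payload for creating a database.
 * WAL_LOG: every copied block is written to WAL, so volume grows with
 * the template size.  FILE_COPY: one small record per tablespace, whose
 * size the caller supplies because this sketch does not know it.
 */
uint64_t
demo_estimate_wal_bytes(DemoStrategy strategy, uint64_t nblocks,
                        uint32_t ntablespaces, uint32_t small_record_bytes)
{
    assert(strategy == DEMO_WAL_LOG || strategy == DEMO_FILE_COPY);

    if (strategy == DEMO_WAL_LOG)
        return nblocks * DEMO_BLOCK_SIZE;

    return (uint64_t) ntablespaces * small_record_bytes;
}

/* FILE_COPY forces a checkpoint before and after; WAL_LOG forces none. */
int
demo_forced_checkpoints(DemoStrategy strategy)
{
    return (strategy == DEMO_FILE_COPY) ? 2 : 0;
}
```

Under these assumptions a 1000-block template costs roughly 8 MB of WAL with the block-by-block strategy and only a few bytes per tablespace with the file copy, which in turn pays for its savings with two forced checkpoints.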

‎doc/src/sgml/ref/createdb.sgml

Lines changed: 11 additions & 0 deletions
@@ -177,6 +177,17 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>-S <replaceable class="parameter">strategy</replaceable></option></term>
+      <term><option>--strategy=<replaceable class="parameter">strategy</replaceable></option></term>
+      <listitem>
+       <para>
+        Specifies the database creation strategy. See
+        <xref linkend="create-database-strategy" /> for more details.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>-T <replaceable class="parameter">template</replaceable></option></term>
       <term><option>--template=<replaceable class="parameter">template</replaceable></option></term>
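
A strategy option like this has to be mapped from its textual name to an internal choice. The following is a minimal sketch of such a mapping, not the code this commit adds: the `demo_` names are hypothetical, and case-insensitive matching is an assumption made for the illustration. Only the two strategy names and the `WAL_LOG` default come from the documentation above.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical names, used only for this illustration. */
typedef enum DemoCreateDbStrategy
{
    DEMO_STRATEGY_WAL_LOG,
    DEMO_STRATEGY_FILE_COPY,
    DEMO_STRATEGY_INVALID
} DemoCreateDbStrategy;

/* ASCII case-insensitive equality; returns 1 when the strings match. */
static int
demo_ascii_ieq(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);

    while (*a != '\0' && *b != '\0')
    {
        if (tolower((unsigned char) *a) != tolower((unsigned char) *b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;            /* both strings must end together */
}

/* Map a strategy name to an enum value; NULL selects the documented default. */
DemoCreateDbStrategy
demo_parse_strategy(const char *name)
{
    if (name == NULL)
        return DEMO_STRATEGY_WAL_LOG;
    if (demo_ascii_ieq(name, "wal_log"))
        return DEMO_STRATEGY_WAL_LOG;
    if (demo_ascii_ieq(name, "file_copy"))
        return DEMO_STRATEGY_FILE_COPY;
    return DEMO_STRATEGY_INVALID;
}
```

Returning a distinct invalid value leaves it to the caller to decide how to report an unknown strategy name.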

‎src/backend/access/heap/heapam_handler.c

Lines changed: 3 additions & 3 deletions
@@ -593,15 +593,15 @@ heapam_relation_set_new_filenode(Relation rel,
      */
     *minmulti = GetOldestMultiXactId();
 
-    srel = RelationCreateStorage(*newrnode, persistence);
+    srel = RelationCreateStorage(*newrnode, persistence, true);
 
     /*
      * If required, set up an init fork for an unlogged table so that it can
      * be correctly reinitialized on restart. An immediate sync is required
      * even if the page has been logged, because the write did not go through
      * shared_buffers and therefore a concurrent checkpoint may have moved the
      * redo pointer past our xlog record. Recovery may as well remove it
-     * while replaying, for example, XLOG_DBASE_CREATE or XLOG_TBLSPC_CREATE
+     * while replaying, for example, XLOG_DBASE_CREATE* or XLOG_TBLSPC_CREATE
      * record. Therefore, logging is necessary even if wal_level=minimal.
      */
     if (persistence == RELPERSISTENCE_UNLOGGED)
@@ -645,7 +645,7 @@ heapam_relation_copy_data(Relation rel, const RelFileNode *newrnode)
      * NOTE: any conflict in relfilenode value will be caught in
      * RelationCreateStorage().
      */
-    RelationCreateStorage(*newrnode, rel->rd_rel->relpersistence);
+    RelationCreateStorage(*newrnode, rel->rd_rel->relpersistence, true);
 
     /* copy main fork */
     RelationCopyStorage(RelationGetSmgr(rel), dstrel, MAIN_FORKNUM,

‎src/backend/access/nbtree/nbtree.c

Lines changed: 1 addition & 1 deletion
@@ -161,7 +161,7 @@ btbuildempty(Relation index)
      * Write the page and log it. It might seem that an immediate sync would
      * be sufficient to guarantee that the file exists on disk, but recovery
      * itself might remove it while replaying, for example, an
-     * XLOG_DBASE_CREATE or XLOG_TBLSPC_CREATE record. Therefore, we need
+     * XLOG_DBASE_CREATE* or XLOG_TBLSPC_CREATE record. Therefore, we need
      * this even when wal_level=minimal.
      */
     PageSetChecksumInplace(metapage, BTREE_METAPAGE);

‎src/backend/access/rmgrdesc/dbasedesc.c

Lines changed: 16 additions & 4 deletions
@@ -24,14 +24,23 @@ dbase_desc(StringInfo buf, XLogReaderState *record)
     char       *rec = XLogRecGetData(record);
     uint8       info = XLogRecGetInfo(record) & ~XLR_INFO_MASK;
 
-    if (info == XLOG_DBASE_CREATE)
+    if (info == XLOG_DBASE_CREATE_FILE_COPY)
     {
-        xl_dbase_create_rec *xlrec = (xl_dbase_create_rec *) rec;
+        xl_dbase_create_file_copy_rec *xlrec =
+        (xl_dbase_create_file_copy_rec *) rec;
 
         appendStringInfo(buf, "copy dir %u/%u to %u/%u",
                          xlrec->src_tablespace_id, xlrec->src_db_id,
                          xlrec->tablespace_id, xlrec->db_id);
     }
+    else if (info == XLOG_DBASE_CREATE_WAL_LOG)
+    {
+        xl_dbase_create_wal_log_rec *xlrec =
+        (xl_dbase_create_wal_log_rec *) rec;
+
+        appendStringInfo(buf, "create dir %u/%u",
+                         xlrec->tablespace_id, xlrec->db_id);
+    }
     else if (info == XLOG_DBASE_DROP)
     {
         xl_dbase_drop_rec *xlrec = (xl_dbase_drop_rec *) rec;
@@ -51,8 +60,11 @@ dbase_identify(uint8 info)
 
     switch (info & ~XLR_INFO_MASK)
     {
-        case XLOG_DBASE_CREATE:
-            id = "CREATE";
+        case XLOG_DBASE_CREATE_FILE_COPY:
+            id = "CREATE_FILE_COPY";
+            break;
+        case XLOG_DBASE_CREATE_WAL_LOG:
+            id = "CREATE_WAL_LOG";
             break;
         case XLOG_DBASE_DROP:
             id = "DROP";
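
The dispatch pattern in `dbase_identify` (mask off the bits that do not belong to the resource manager, then switch on what remains) can be tried in isolation. This is a standalone sketch, not the PostgreSQL source: the `DEMO_` constants and their numeric values are assumptions chosen for illustration, and only the three returned strings are taken from the hunk above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values, for illustration only. */
#define DEMO_XLR_INFO_MASK          0x0F    /* low bits not owned by the rmgr */
#define DEMO_DBASE_CREATE_FILE_COPY 0x00
#define DEMO_DBASE_CREATE_WAL_LOG   0x10
#define DEMO_DBASE_DROP             0x20

/* Return a record-type name for an info byte, or NULL if it is unknown. */
const char *
demo_dbase_identify(uint8_t info)
{
    const char *id = NULL;

    /* Ignore the masked-off low bits before comparing record types. */
    switch (info & ~DEMO_XLR_INFO_MASK)
    {
        case DEMO_DBASE_CREATE_FILE_COPY:
            id = "CREATE_FILE_COPY";
            break;
        case DEMO_DBASE_CREATE_WAL_LOG:
            id = "CREATE_WAL_LOG";
            break;
        case DEMO_DBASE_DROP:
            id = "DROP";
            break;
    }

    return id;
}
```

Because the low bits are masked off, an info byte such as `0x1A` still identifies as `CREATE_WAL_LOG` in this sketch.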

‎src/backend/access/transam/xlogutils.c

Lines changed: 3 additions & 3 deletions
@@ -484,7 +484,7 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
     {
         /* page exists in file */
         buffer = ReadBufferWithoutRelcache(rnode, forknum, blkno,
-                                           mode, NULL);
+                                           mode, NULL, true);
     }
     else
     {
@@ -509,7 +509,7 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
                 ReleaseBuffer(buffer);
             }
             buffer = ReadBufferWithoutRelcache(rnode, forknum,
-                                               P_NEW, mode, NULL);
+                                               P_NEW, mode, NULL, true);
         }
         while (BufferGetBlockNumber(buffer) < blkno);
         /* Handle the corner case that P_NEW returns non-consecutive pages */
@@ -519,7 +519,7 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
             LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
             ReleaseBuffer(buffer);
             buffer = ReadBufferWithoutRelcache(rnode, forknum, blkno,
-                                               mode, NULL);
+                                               mode, NULL, true);
         }
     }
 

‎src/backend/catalog/heap.c

Lines changed: 1 addition & 1 deletion
@@ -387,7 +387,7 @@ heap_create(const char *relname,
                                                 relpersistence,
                                                 relfrozenxid, relminmxid);
         else if (RELKIND_HAS_STORAGE(rel->rd_rel->relkind))
-            RelationCreateStorage(rel->rd_node, relpersistence);
+            RelationCreateStorage(rel->rd_node, relpersistence, true);
         else
             Assert(false);
     }

‎src/backend/catalog/storage.c

Lines changed: 22 additions & 12 deletions
@@ -112,12 +112,14 @@ AddPendingSync(const RelFileNode *rnode)
  * modules that need them.
  *
  * This function is transactional. The creation is WAL-logged, and if the
- * transaction aborts later on, the storage will be destroyed.
+ * transaction aborts later on, the storage will be destroyed. A caller
+ * that does not want the storage to be destroyed in case of an abort may
+ * pass register_delete = false.
  */
 SMgrRelation
-RelationCreateStorage(RelFileNode rnode, char relpersistence)
+RelationCreateStorage(RelFileNode rnode, char relpersistence,
+                      bool register_delete)
 {
-    PendingRelDelete *pending;
     SMgrRelation srel;
     BackendId   backend;
     bool        needs_wal;
@@ -149,15 +151,23 @@ RelationCreateStorage(RelFileNode rnode, char relpersistence)
     if (needs_wal)
         log_smgrcreate(&srel->smgr_rnode.node, MAIN_FORKNUM);
 
-    /* Add the relation to the list of stuff to delete at abort */
-    pending = (PendingRelDelete *)
-        MemoryContextAlloc(TopMemoryContext, sizeof(PendingRelDelete));
-    pending->relnode = rnode;
-    pending->backend = backend;
-    pending->atCommit = false;  /* delete if abort */
-    pending->nestLevel = GetCurrentTransactionNestLevel();
-    pending->next = pendingDeletes;
-    pendingDeletes = pending;
+    /*
+     * Add the relation to the list of stuff to delete at abort, if we are
+     * asked to do so.
+     */
+    if (register_delete)
+    {
+        PendingRelDelete *pending;
+
+        pending = (PendingRelDelete *)
+            MemoryContextAlloc(TopMemoryContext, sizeof(PendingRelDelete));
+        pending->relnode = rnode;
+        pending->backend = backend;
+        pending->atCommit = false;  /* delete if abort */
+        pending->nestLevel = GetCurrentTransactionNestLevel();
+        pending->next = pendingDeletes;
+        pendingDeletes = pending;
+    }
 
     if (relpersistence == RELPERSISTENCE_PERMANENT && !XLogIsNeeded())
     {
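
The effect of the new `register_delete` flag can be shown with a toy model of the pending-delete list. This is a simplified sketch rather than PostgreSQL code: the `demo_`/`Demo` names are hypothetical, `malloc` stands in for the backend's memory contexts, and the sketch only counts the entries it would clean up instead of removing any storage.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a pending-delete list entry. */
typedef struct DemoPendingDelete
{
    unsigned    relnode;                /* which relation to drop on abort */
    struct DemoPendingDelete *next;
} DemoPendingDelete;

static DemoPendingDelete *demo_pending_deletes = NULL;

/*
 * "Create" storage for relnode.  Only when register_delete is true is the
 * relation pushed onto the list of things to remove if the transaction aborts.
 */
void
demo_create_storage(unsigned relnode, bool register_delete)
{
    DemoPendingDelete *pending;

    if (!register_delete)
        return;                         /* caller handles cleanup itself */

    pending = malloc(sizeof(DemoPendingDelete));
    assert(pending != NULL);
    pending->relnode = relnode;
    pending->next = demo_pending_deletes;
    demo_pending_deletes = pending;
}

/* Simulate an abort: discard every registered entry and return the count. */
int
demo_abort_transaction(void)
{
    int         ndropped = 0;

    while (demo_pending_deletes != NULL)
    {
        DemoPendingDelete *next = demo_pending_deletes->next;

        free(demo_pending_deletes);
        demo_pending_deletes = next;
        ndropped++;
    }
    return ndropped;
}
```

In this model, storage created with `register_delete = false` is never touched by the abort path, which matches the behavior the updated function comment describes.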
