
Commit d2bddc2

Add huge_page_size setting for use on Linux.

This allows the huge page size to be set explicitly. The default is 0, meaning it will use the system default, as before.

Author: Odin Ugedal <odin@ugedal.com>
Discussion: https://postgr.es/m/20200608154639.20254-1-odin%40ugedal.com

1 parent d66b23b, commit d2bddc2

6 files changed: +141 -38 lines changed


‎doc/src/sgml/config.sgml

Lines changed: 27 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -1582,6 +1582,33 @@ include_dir 'conf.d'
15821582
</listitem>
15831583
</varlistentry>
15841584

1585+
<varlistentry id="guc-huge-page-size" xreflabel="huge_page_size">
1586+
<term><varname>huge_page_size</varname> (<type>integer</type>)
1587+
<indexterm>
1588+
<primary><varname>huge_page_size</varname> configuration parameter</primary>
1589+
</indexterm>
1590+
</term>
1591+
<listitem>
1592+
<para>
1593+
Controls the size of huge pages, when they are enabled with
1594+
<xref linkend="guc-huge-pages"/>.
1595+
The default is zero (<literal>0</literal>).
1596+
When set to <literal>0</literal>, the default huge page size on the
1597+
system will be used.
1598+
</para>
1599+
<para>
1600+
Some commonly available page sizes on modern 64-bit server architectures include:
1601+
<literal>2MB</literal> and <literal>1GB</literal> (Intel and AMD), <literal>16MB</literal> and
1602+
<literal>16GB</literal> (IBM POWER), and <literal>64kB</literal>, <literal>2MB</literal>,
1603+
<literal>32MB</literal> and <literal>1GB</literal> (ARM). For more information
1604+
about usage and support, see <xref linkend="linux-huge-pages"/>.
1605+
</para>
1606+
<para>
1607+
Non-default settings are currently supported only on Linux.
1608+
</para>
1609+
</listitem>
1610+
</varlistentry>
1611+
15851612
<varlistentry id="guc-temp-buffers" xreflabel="temp_buffers">
15861613
<term><varname>temp_buffers</varname> (<type>integer</type>)
15871614
<indexterm>

‎doc/src/sgml/runtime.sgml

Lines changed: 35 additions & 20 deletions
@@ -1391,41 +1391,55 @@ export PG_OOM_ADJUST_VALUE=0
13911391
using large values of <xref linkend="guc-shared-buffers"/>. To use this
13921392
feature in <productname>PostgreSQL</productname> you need a kernel
13931393
with <varname>CONFIG_HUGETLBFS=y</varname> and
1394-
<varname>CONFIG_HUGETLB_PAGE=y</varname>. You will also have to adjust
1395-
the kernel setting <varname>vm.nr_hugepages</varname>. To estimate the
1396-
number of huge pages needed, start <productname>PostgreSQL</productname>
1397-
without huge pages enabled and check the
1398-
postmaster's anonymous shared memory segment size, as well as the system's
1399-
huge page size, using the <filename>/proc</filename> file system. This might
1400-
look like:
1394+
<varname>CONFIG_HUGETLB_PAGE=y</varname>. You will also have to configure
1395+
the operating system to provide enough huge pages of the desired size.
1396+
To estimate the number of huge pages needed, start
1397+
<productname>PostgreSQL</productname> without huge pages enabled and check
1398+
the postmaster's anonymous shared memory segment size, as well as the
1399+
system's default and supported huge page sizes, using the
1400+
<filename>/proc</filename> and <filename>/sys</filename> file systems.
1401+
This might look like:
14011402
<programlisting>
14021403
$ <userinput>head -1 $PGDATA/postmaster.pid</userinput>
14031404
4170
14041405
$ <userinput>pmap 4170 | awk '/rw-s/ &amp;&amp; /zero/ {print $2}'</userinput>
14051406
6490428K
14061407
$ <userinput>grep ^Hugepagesize /proc/meminfo</userinput>
14071408
Hugepagesize: 2048 kB
1409+
$ <userinput>ls /sys/kernel/mm/hugepages</userinput>
1410+
hugepages-1048576kB hugepages-2048kB
14081411
</programlisting>
1412+
1413+
In this example the default is 2MB, but you can also explicitly request
1414+
either 2MB or 1GB with <xref linkend="guc-huge-page-size"/>.
1415+
1416+
Assuming <literal>2MB</literal> huge pages,
14091417
<literal>6490428</literal> / <literal>2048</literal> gives approximately
14101418
<literal>3169.154</literal>, so in this example we need at
1411-
least <literal>3170</literal> huge pages, which we can set with:
1419+
least <literal>3170</literal> huge pages. A larger setting would be
1420+
appropriate if other programs on the machine also need huge pages.
1421+
We can set this with:
1422+
<programlisting>
1423+
# <userinput>sysctl -w vm.nr_hugepages=3170</userinput>
1424+
</programlisting>
1425+
Don't forget to add this setting to <filename>/etc/sysctl.conf</filename>
1426+
so that it is reapplied after reboots. For non-default huge page sizes,
1427+
we can instead use:
14121428
<programlisting>
1413-
$ <userinput>sysctl -w vm.nr_hugepages=3170</userinput>
1429+
# <userinput>echo 3170 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages</userinput>
14141430
</programlisting>
1415-
A larger setting would be appropriate if other programs on the machine
1416-
also need huge pages. Don't forget to add this setting
1417-
to <filename>/etc/sysctl.conf</filename> so that it will be reapplied
1418-
after reboots.
1431+
It is also possible to provide these settings at boot time using
1432+
kernel parameters such as <literal>hugepagesz=2M hugepages=3170</literal>.
14191433
</para>
14201434

14211435
<para>
14221436
Sometimes the kernel is not able to allocate the desired number of huge
1423-
pages immediately, so it might be necessary to repeat the command or to
1424-
reboot. (Immediately after a reboot, most of the machine's memory
1425-
should be available to convert into huge pages.) To verify the huge
1426-
page allocation situation, use:
1437+
pages immediately due to fragmentation, so it might be necessary
1438+
to repeat the command or to reboot. (Immediately after a reboot, most of
1439+
the machine's memory should be available to convert into huge pages.)
1440+
To verify the huge page allocation situation for a given size, use:
14271441
<programlisting>
1428-
$ <userinput>grep Huge /proc/meminfo</userinput>
1442+
$ <userinput>cat /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages</userinput>
14291443
</programlisting>
14301444
</para>
14311445

@@ -1438,8 +1452,9 @@ $ <userinput>grep Huge /proc/meminfo</userinput>
14381452

14391453
<para>
14401454
The default behavior for huge pages in
1441-
<productname>PostgreSQL</productname> is to use them when possible and
1442-
to fall back to normal pages when failing. To enforce the use of huge
1455+
<productname>PostgreSQL</productname> is to use them when possible, with
1456+
the system's default huge page size, and
1457+
to fall back to normal pages on failure. To enforce the use of huge
14431458
pages, you can set <xref linkend="guc-huge-pages"/>
14441459
to <literal>on</literal> in <filename>postgresql.conf</filename>.
14451460
Note that with this setting <productname>PostgreSQL</productname> will fail to
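
The page-count estimate in the documentation hunk above (6490428 / 2048, rounded up to 3170) is a ceiling division. A minimal sketch in C, assuming both sizes are given in kB as `pmap` and `/proc/meminfo` report them; `huge_pages_needed` is a hypothetical name for illustration, not a PostgreSQL function:

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of PostgreSQL: number of huge pages needed
 * to cover a shared memory segment. Both arguments are in kB. Returns 0 if
 * page_kb is 0, since no sensible answer exists in that case.
 */
static uint64_t
huge_pages_needed(uint64_t segment_kb, uint64_t page_kb)
{
    if (page_kb == 0)
        return 0;
    /* Ceiling division: a partially used page still costs a whole page. */
    return (segment_kb + page_kb - 1) / page_kb;
}
```

With the figures from the example, `huge_pages_needed(6490428, 2048)` yields 3170. The same segment with 1GB pages (1048576 kB) would need 7 pages.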

‎src/backend/port/sysv_shmem.c

Lines changed: 45 additions & 17 deletions
@@ -32,6 +32,7 @@
3232
#endif
3333

3434
#include "miscadmin.h"
35+
#include "port/pg_bitutils.h"
3536
#include "portability/mem.h"
3637
#include "storage/dsm.h"
3738
#include "storage/fd.h"
@@ -448,7 +449,7 @@ PGSharedMemoryAttach(IpcMemoryId shmId,
448449
#ifdef MAP_HUGETLB
449450

450451
/*
451-
* Identify the huge page size to use.
452+
* Identify the huge page size to use, and compute the related mmap flags.
452453
*
453454
* Some Linux kernel versions have a bug causing mmap() to fail on requests
454455
* that are not a multiple of the hugepage size. Versions without that bug
@@ -464,25 +465,13 @@ PGSharedMemoryAttach(IpcMemoryId shmId,
464465
* hugepage sizes, we might want to think about more invasive strategies,
465466
* such as increasing shared_buffers to absorb the extra space.
466467
*
467-
* Returns the (real or assumed) page size into *hugepagesize,
468+
* Returns the (real, assumed or config provided) page size into *hugepagesize,
468469
* and the hugepage-related mmap flags to use into *mmap_flags.
469-
*
470-
* Currently *mmap_flags is always just MAP_HUGETLB. Someday, on systems
471-
* that support it, we might OR in additional bits to specify a particular
472-
* non-default huge page size.
473470
*/
474471
static void
475472
GetHugePageSize(Size *hugepagesize, int *mmap_flags)
476473
{
477-
/*
478-
* If we fail to find out the system's default huge page size, assume it
479-
* is 2MB. This will work fine when the actual size is less. If it's
480-
* more, we might get mmap() or munmap() failures due to unaligned
481-
* requests; but at this writing, there are no reports of any non-Linux
482-
* systems being picky about that.
483-
*/
484-
*hugepagesize = 2 * 1024 * 1024;
485-
*mmap_flags = MAP_HUGETLB;
474+
Size default_hugepagesize = 0;
486475

487476
/*
488477
* System-dependent code to find out the default huge page size.
@@ -491,6 +480,7 @@ GetHugePageSize(Size *hugepagesize, int *mmap_flags)
491480
* nnnn kB". Ignore any failures, falling back to the preset default.
492481
*/
493482
#ifdef __linux__
483+
494484
{
495485
FILE *fp = AllocateFile("/proc/meminfo", "r");
496486
char buf[128];
@@ -505,7 +495,7 @@ GetHugePageSize(Size *hugepagesize, int *mmap_flags)
505495
{
506496
if (ch == 'k')
507497
{
508-
*hugepagesize = sz * (Size) 1024;
498+
default_hugepagesize = sz * (Size) 1024;
509499
break;
510500
}
511501
/* We could accept other units besides kB, if needed */
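
The parsing step this hunk touches reads the `Hugepagesize:       nnnn kB` line from `/proc/meminfo` and converts it to bytes. The diff shows only the unit check, so the following is a self-contained sketch under the assumption that the line has the format described in the comment. `parse_hugepagesize_line` is a hypothetical name, and like the patch it accepts only the kB unit:

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: parse one /proc/meminfo line of the form
 * "Hugepagesize:    nnnn kB" and return the size in bytes, or 0 if the
 * line does not match. Only the kB unit is accepted.
 */
static size_t
parse_hugepagesize_line(const char *line)
{
    unsigned long sz;
    char        unit;

    /* A space in the format skips any run of whitespace in the input. */
    if (sscanf(line, "Hugepagesize: %lu %c", &sz, &unit) == 2 && unit == 'k')
        return (size_t) sz * 1024;
    return 0;
}
```

Returning 0 on failure matches how the patched function treats `default_hugepagesize`: zero means the system default is unknown.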
@@ -515,6 +505,44 @@ GetHugePageSize(Size *hugepagesize, int *mmap_flags)
515505
}
516506
}
517507
#endif /* __linux__ */
508+
509+
    if (huge_page_size != 0)
510+
    {
511+
        /* If huge page size is requested explicitly, use that. */
512+
        *hugepagesize = (Size) huge_page_size * 1024;
513+
    }
514+
    else if (default_hugepagesize != 0)
515+
    {
516+
        /* Otherwise use the system default, if we have it. */
517+
        *hugepagesize = default_hugepagesize;
518+
    }
519+
    else
520+
    {
521+
        /*
522+
         * If we fail to find out the system's default huge page size, or no
523+
         * huge page size is requested explicitly, assume it is 2MB. This will
524+
         * work fine when the actual size is less. If it's more, we might get
525+
         * mmap() or munmap() failures due to unaligned requests; but at this
526+
         * writing, there are no reports of any non-Linux systems being picky
527+
         * about that.
528+
         */
529+
        *hugepagesize = 2 * 1024 * 1024;
530+
    }
531+
532+
    *mmap_flags = MAP_HUGETLB;
533+
534+
    /*
535+
     * On recent enough Linux, also include the explicit page size, if
536+
     * necessary.
537+
     */
538+
#if defined(MAP_HUGE_MASK) && defined(MAP_HUGE_SHIFT)
539+
    if (*hugepagesize != default_hugepagesize)
540+
    {
541+
        int shift = pg_ceil_log2_64(*hugepagesize);
542+
543+
        *mmap_flags |= (shift & MAP_HUGE_MASK) << MAP_HUGE_SHIFT;
544+
    }
545+
#endif
518546
}
519547

520548
#endif /* MAP_HUGETLB */
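
The selection order in the hunk above (explicit setting, then system default, then a 2MB fallback) and the flag encoding can be tried outside the server. The sketch below makes several assumptions. `SKETCH_MAP_HUGE_SHIFT` and `SKETCH_MAP_HUGE_MASK` are local stand-ins set to the values Linux uses for `MAP_HUGE_SHIFT` and `MAP_HUGE_MASK`. `ceil_log2_u64` stands in for `pg_ceil_log2_64`. All function names are hypothetical.

```c
#include <stdint.h>

/* Assumed values; they mirror Linux's MAP_HUGE_SHIFT and MAP_HUGE_MASK. */
#define SKETCH_MAP_HUGE_SHIFT 26
#define SKETCH_MAP_HUGE_MASK  0x3f

/* Smallest n with 2^n >= v, for v >= 1 (stand-in for pg_ceil_log2_64). */
static int
ceil_log2_u64(uint64_t v)
{
    int         n = 0;

    while (n < 63 && ((uint64_t) 1 << n) < v)
        n++;
    return n;
}

/*
 * Pick the page size in bytes: the explicit setting (given in kB) wins,
 * then the system default (in bytes, 0 if unknown), then 2MB.
 */
static uint64_t
choose_huge_page_size(int setting_kb, uint64_t default_bytes)
{
    if (setting_kb != 0)
        return (uint64_t) setting_kb * 1024;
    if (default_bytes != 0)
        return default_bytes;
    return 2 * 1024 * 1024;
}

/*
 * Extra mmap() flag bits for a non-default page size: log2(size) stored
 * in the bits starting at SKETCH_MAP_HUGE_SHIFT. Returns 0 when the chosen
 * size equals the system default, so no extra bits are needed.
 */
static unsigned int
huge_page_size_flag_bits(uint64_t chosen_bytes, uint64_t default_bytes)
{
    if (chosen_bytes == default_bytes)
        return 0;
    return ((unsigned int) ceil_log2_u64(chosen_bytes) & SKETCH_MAP_HUGE_MASK)
        << SKETCH_MAP_HUGE_SHIFT;
}
```

Requesting 1GB pages on a system whose default is 2MB gives `30 << 26`, which is the bit pattern Linux exposes as `MAP_HUGE_1GB`. When the default is unknown and the 2MB fallback is chosen, the sketch sets the 2MB bits, as the patched code does, because the chosen size differs from the (zero) default.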
@@ -583,7 +611,7 @@ CreateAnonymousSegment(Size *size)
583611
"(currently %zu bytes), reduce PostgreSQL's shared "
584612
"memory usage, perhaps by reducing shared_buffers or "
585613
"max_connections.",
586-
*size) : 0));
614+
allocsize) : 0));
587615
}
588616

589617
*size=allocsize;
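
The header comment of `GetHugePageSize` notes that some kernels reject `mmap()` requests that are not a multiple of the huge page size, which is why the request is padded and why the error message in this hunk now reports `allocsize` rather than `*size`. The rounding code itself is not part of this diff. The following is only a sketch of that rounding, with the hypothetical name `round_up_to_page`:

```c
#include <stddef.h>

/*
 * Hypothetical helper: round request up to the next multiple of page_size.
 * page_size must be nonzero; it need not be a power of two.
 */
static size_t
round_up_to_page(size_t request, size_t page_size)
{
    size_t      rem = request % page_size;

    return rem == 0 ? request : request + (page_size - rem);
}
```

Using the modulo form rather than a bit mask keeps the sketch correct even if the configured size is not a power of two.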

‎src/backend/utils/misc/guc.c

Lines changed: 31 additions & 1 deletion
@@ -20,11 +20,14 @@
2020
#include <float.h>
2121
#include <math.h>
2222
#include <limits.h>
23-
#include <unistd.h>
23+
#ifndef WIN32
24+
#include <sys/mman.h>
25+
#endif
2426
#include <sys/stat.h>
2527
#ifdef HAVE_SYSLOG
2628
#include <syslog.h>
2729
#endif
30+
#include <unistd.h>
2831

2932
#include "access/commit_ts.h"
3033
#include "access/gin.h"
@@ -198,6 +201,7 @@ static bool check_max_wal_senders(int *newval, void **extra, GucSource source);
198201
static bool check_autovacuum_work_mem(int *newval, void **extra, GucSource source);
199202
static bool check_effective_io_concurrency(int *newval, void **extra, GucSource source);
200203
static bool check_maintenance_io_concurrency(int *newval, void **extra, GucSource source);
204+
static bool check_huge_page_size(int *newval, void **extra, GucSource source);
201205
static void assign_pgstat_temp_directory(const char *newval, void *extra);
202206
static bool check_application_name(char **newval, void **extra, GucSource source);
203207
static void assign_application_name(const char *newval, void *extra);
@@ -576,6 +580,7 @@ int ssl_renegotiation_limit;
576580
* need to be duplicated in all the different implementations of pg_shmem.c.
577581
*/
578582
int huge_pages;
583+
int huge_page_size;
579584

580585
/*
581586
* These variables are all dummies that don't do anything, except in some
@@ -3381,6 +3386,17 @@ static struct config_int ConfigureNamesInt[] =
33813386
        NULL, assign_tcp_user_timeout, show_tcp_user_timeout
33823387
    },
33833388

3389+
    {
3390+
        {"huge_page_size", PGC_POSTMASTER, RESOURCES_MEM,
3391+
            gettext_noop("The size of huge page that should be requested."),
3392+
            NULL,
3393+
            GUC_UNIT_KB
3394+
        },
3395+
        &huge_page_size,
3396+
        0, 0, INT_MAX,
3397+
        check_huge_page_size, NULL, NULL
3398+
    },
3399+
33843400
    /* End-of-list marker */
33853401
    {
33863402
        {NULL, 0, 0, NULL, NULL}, NULL, 0, 0, 0, NULL, NULL, NULL
@@ -11565,6 +11581,20 @@ check_maintenance_io_concurrency(int *newval, void **extra, GucSource source)
1156511581
return true;
1156611582
}
1156711583

11584+
static bool
11585+
check_huge_page_size(int *newval, void **extra, GucSource source)
11586+
{
11587+
#if !(defined(MAP_HUGE_MASK) && defined(MAP_HUGE_SHIFT))
11588+
    /* Recent enough Linux only, for now. See GetHugePageSize(). */
11589+
    if (*newval != 0)
11590+
    {
11591+
        GUC_check_errdetail("huge_page_size must be 0 on this platform.");
11592+
        return false;
11593+
    }
11594+
#endif
11595+
    return true;
11596+
}
11597+
1156811598
static void
1156911599
assign_pgstat_temp_directory(const char *newval, void *extra)
1157011600
{
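
The check hook added in this file decides at compile time, based on `MAP_HUGE_MASK` and `MAP_HUGE_SHIFT`, whether a nonzero `huge_page_size` can be accepted. The sketch below is a standalone illustration of the same rule, with one deliberate difference: it takes the platform capability as a runtime flag so that both outcomes can be exercised. That flag and the function name are assumptions for illustration.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the check hook: any value is accepted where
 * explicit huge page sizes are supported; elsewhere only 0 (meaning "use
 * the system default") passes.
 */
static bool
huge_page_size_is_acceptable(int newval_kb, bool explicit_size_supported)
{
    if (!explicit_size_supported && newval_kb != 0)
        return false;
    return true;
}
```

As in the hook shown in the diff, nothing here verifies that the kernel actually provides pages of the requested size. That only becomes apparent when the shared memory segment is mapped.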

‎src/backend/utils/misc/postgresql.conf.sample

Lines changed: 2 additions & 0 deletions
@@ -122,6 +122,8 @@
122122
# (change requires restart)
123123
#huge_pages = try                       # on, off, or try
124124
# (change requires restart)
125+
#huge_page_size = 0                     # zero for system default
126+
# (change requires restart)
125127
#temp_buffers = 8MB                     # min 800kB
126128
#max_prepared_transactions = 0          # zero disables the feature
127129
# (change requires restart)

‎src/include/storage/pg_shmem.h

Lines changed: 1 addition & 0 deletions
@@ -44,6 +44,7 @@ typedef struct PGShmemHeader /* standard header for all Postgres shmem */
4444
/* GUC variables */
4545
extern int shared_memory_type;
4646
extern int huge_pages;
47+
extern int huge_page_size;
4748

4849
/* Possible values for huge_pages */
4950
typedef enum
