Documentation Home
MySQL 9.4 Reference Manual
Related Documentation Download this Manual
PDF (US Ltr) - 41.2Mb
PDF (A4) - 41.3Mb
Man Pages (TGZ) - 262.8Kb
Man Pages (Zip) - 368.8Kb
Info (Gzip) - 4.1Mb
Info (Zip) - 4.1Mb


MySQL 9.4 Reference Manual  / ...  / The InnoDB Storage Engine  / InnoDB Configuration  / Configuring Optimizer Statistics for InnoDB  /  Configuring Persistent Optimizer Statistics Parameters

17.8.10.1 Configuring Persistent Optimizer Statistics Parameters

The persistent optimizer statistics feature improvesplan stability by storing statistics to disk and making them persistent across server restarts so that theoptimizer is more likely to make consistent choices each time for a given query.

Optimizer statistics are persisted to disk wheninnodb_stats_persistent=ON or when individual tables are defined withSTATS_PERSISTENT=1.innodb_stats_persistent is enabled by default.

Formerly, optimizer statistics were cleared when restarting the server and after some other types of operations, and recomputed on the next table access. Consequently, different estimates could be produced when recalculating statistics leading to different choices in query execution plans and variation in query performance.

Persistent statistics are stored in themysql.innodb_table_stats andmysql.innodb_index_stats tables. SeeSection 17.8.10.1.5, “InnoDB Persistent Statistics Tables”.

If you prefer not to persist optimizer statistics to disk, seeSection 17.8.10.2, “Configuring Non-Persistent Optimizer Statistics Parameters”

17.8.10.1.1 Configuring Automatic Statistics Calculation for Persistent OptimizerStatistics

Theinnodb_stats_auto_recalc variable, which is enabled by default, controls whether statistics are calculated automatically when a table undergoes changes to more than 10% of its rows. You can also configure automatic statistics recalculation for individual tables by specifying theSTATS_AUTO_RECALC clause when creating or altering a table.

Because of the asynchronous nature of automatic statistics recalculation, which occurs in the background, statistics may not be recalculated instantly after running a DML operation that affects more than 10% of a table, even wheninnodb_stats_auto_recalc is enabled. Statistics recalculation can be delayed by few seconds in some cases. If up-to-date statistics are required immediately, runANALYZE TABLE to initiate a synchronous (foreground) recalculation of statistics.

Ifinnodb_stats_auto_recalc is disabled, you can ensure the accuracy of optimizer statistics by executing theANALYZE TABLE statement after making substantial changes to indexed columns. You might also consider addingANALYZE TABLE to setup scripts that you run after loading data, and runningANALYZE TABLE on a schedule at times of low activity.

When an index is added to an existing table, or when a column is added or dropped, index statistics are calculated and added to theinnodb_index_stats table regardless of the value ofinnodb_stats_auto_recalc.

For a histogram withAUTO UPDATE enabled (seeHistogram Statistics Analysis), automatic recalculation of persistent statistics also causes the histogram to be updated.

17.8.10.1.2 Configuring Optimizer Statistics Parameters for Individual Tables

innodb_stats_persistent,innodb_stats_auto_recalc, andinnodb_stats_persistent_sample_pages are global variables. To override these system-wide settings and configure optimizer statistics parameters for individual tables, you can defineSTATS_PERSISTENT,STATS_AUTO_RECALC, andSTATS_SAMPLE_PAGES clauses inCREATE TABLE orALTER TABLE statements.

  • STATS_PERSISTENT specifies whether to enablepersistent statistics for anInnoDB table. The valueDEFAULT causes the persistent statistics setting for the table to be determined by theinnodb_stats_persistent setting. A value of1 enables persistent statistics for the table, while a value of0 disables the feature. After enabling persistent statistics for an individual table, useANALYZE TABLE to calculate statistics after table data is loaded.

  • STATS_AUTO_RECALC specifies whether to automatically recalculatepersistent statistics. The valueDEFAULT causes the persistent statistics setting for the table to be determined by theinnodb_stats_auto_recalc setting. A value of1 causes statistics to be recalculated when 10% of table data has changed. A value0 prevents automatic recalculation for the table. When using a value of 0, useANALYZE TABLE to recalculate statistics after making substantial changes to the table.

  • STATS_SAMPLE_PAGES specifies the number of index pages to sample when cardinality and other statistics are calculated for an indexed column, by anANALYZE TABLE operation, for example.

All three clauses are specified in the followingCREATE TABLE example:

CREATE TABLE `t1` (`id` int(8) NOT NULL auto_increment,`data` varchar(255),`date` datetime,PRIMARY KEY  (`id`),INDEX `DATE_IX` (`date`)) ENGINE=InnoDB,  STATS_PERSISTENT=1,  STATS_AUTO_RECALC=1,  STATS_SAMPLE_PAGES=25;
17.8.10.1.3 Configuring the Number of Sampled Pages for InnoDB Optimizer Statistics

The optimizer uses estimatedstatistics about key distributions to choose the indexes for an execution plan, based on the relativeselectivity of the index. Operations such asANALYZE TABLE causeInnoDB to sample random pages from each index on a table to estimate thecardinality of the index. This sampling technique is known as arandom dive.

Theinnodb_stats_persistent_sample_pages controls the number of sampled pages. You can adjust the setting at runtime to manage the quality of statistics estimates used by the optimizer. The default value is 20. Consider modifying the setting when encountering the following issues:

  1. Statistics are not accurate enough and the optimizer chooses suboptimal plans, as shown inEXPLAIN output. You can check the accuracy of statistics by comparing the actual cardinality of an index (determined by runningSELECT DISTINCT on the index columns) with the estimates in themysql.innodb_index_stats table.

    If it is determined that statistics are not accurate enough, the value ofinnodb_stats_persistent_sample_pages should be increased until the statistics estimates are sufficiently accurate. Increasinginnodb_stats_persistent_sample_pages too much, however, could causeANALYZE TABLE to run slowly.

  2. ANALYZE TABLE is too slow. In this caseinnodb_stats_persistent_sample_pages should be decreased untilANALYZE TABLE execution time is acceptable. Decreasing the value too much, however, could lead to the first problem of inaccurate statistics and suboptimal query execution plans.

    If a balance cannot be achieved between accurate statistics andANALYZE TABLE execution time, consider decreasing the number of indexed columns in the table or limiting the number of partitions to reduceANALYZE TABLE complexity. The number of columns in the table's primary key is also important to consider, as primary key columns are appended to each nonunique index.

    For related information, seeSection 17.8.10.3, “Estimating ANALYZE TABLE Complexity for InnoDB Tables”.

17.8.10.1.4 Including Delete-marked Records in Persistent Statistics Calculations

By default,InnoDB reads uncommitted data when calculating statistics. In the case of an uncommitted transaction that deletes rows from a table, delete-marked records are excluded when calculating row estimates and index statistics, which can lead to non-optimal execution plans for other transactions that are operating on the table concurrently using a transaction isolation level other thanREAD UNCOMMITTED. To avoid this scenario,innodb_stats_include_delete_marked can be enabled to ensure that delete-marked records are included when calculating persistent optimizer statistics.

Wheninnodb_stats_include_delete_marked is enabled,ANALYZE TABLE considers delete-marked records when recalculating statistics.

innodb_stats_include_delete_marked is a global setting that affects allInnoDB tables, and it is only applicable to persistent optimizer statistics.

17.8.10.1.5 InnoDB Persistent Statistics Tables

The persistent statistics feature relies on the internally managed tables in themysql database, namedinnodb_table_stats andinnodb_index_stats. These tables are set up automatically in all install, upgrade, and build-from-source procedures.

Table 17.6 Columns of innodb_table_stats

Column nameDescription
database_nameDatabase name
table_nameTable name, partition name, or subpartition name
last_updateA timestamp indicating the last time thatInnoDB updated this row
n_rowsThe number of rows in the table
clustered_index_sizeThe size of the primary index, in pages
sum_of_other_index_sizesThe total size of other (non-primary) indexes, in pages

Table 17.7 Columns of innodb_index_stats

Column nameDescription
database_nameDatabase name
table_nameTable name, partition name, or subpartition name
index_nameIndex name
last_updateA timestamp indicating the last time the row was updated
stat_nameThe name of the statistic, whose value is reported in thestat_value column
stat_valueThe value of the statistic that is named instat_name column
sample_sizeThe number of pages sampled for the estimate provided in thestat_value column
stat_descriptionDescription of the statistic that is named in thestat_name column

Theinnodb_table_stats andinnodb_index_stats tables include alast_update column that shows when index statistics were last updated:

mysql> SELECT * FROM innodb_table_stats \G*************************** 1. row ***************************           database_name: sakila              table_name: actor             last_update: 2014-05-28 16:16:44                  n_rows: 200    clustered_index_size: 1sum_of_other_index_sizes: 1...
mysql> SELECT * FROM innodb_index_stats \G*************************** 1. row ***************************   database_name: sakila      table_name: actor      index_name: PRIMARY     last_update: 2014-05-28 16:16:44       stat_name: n_diff_pfx01      stat_value: 200     sample_size: 1     ...

Theinnodb_table_stats andinnodb_index_stats tables can be updated manually, which makes it possible to force a specific query optimization plan or test alternative plans without modifying the database. If you manually update statistics, use theFLUSH TABLEtbl_name statement to load the updated statistics.

Persistent statistics are considered local information, because they relate to the server instance. Theinnodb_table_stats andinnodb_index_stats tables are therefore not replicated when automatic statistics recalculation takes place. If you runANALYZE TABLE to initiate a synchronous recalculation of statistics, the statement is replicated (unless you suppressed logging for it), and recalculation takes place on replicas.

17.8.10.1.6 InnoDB Persistent Statistics Tables Example

Theinnodb_table_stats table contains one row for each table. The following example demonstrates the type of data collected.

Tablet1 contains a primary index (columnsa,b) secondary index (columnsc,d), and unique index (columnse, f):

CREATE TABLE t1 (a INT, b INT, c INT, d INT, e INT, f INT,PRIMARY KEY (a, b), KEY i1 (c, d), UNIQUE KEY i2uniq (e, f)) ENGINE=INNODB;

After inserting five rows of sample data, tablet1 appears as follows:

mysql> SELECT * FROM t1;+---+---+------+------+------+------+| a | b | c    | d    | e    | f    |+---+---+------+------+------+------+| 1 | 1 |   10 |   11 |  100 |  101 || 1 | 2 |   10 |   11 |  200 |  102 || 1 | 3 |   10 |   11 |  100 |  103 || 1 | 4 |   10 |   12 |  200 |  104 || 1 | 5 |   10 |   12 |  100 |  105 |+---+---+------+------+------+------+

To immediately update statistics, runANALYZE TABLE (ifinnodb_stats_auto_recalc is enabled, statistics are updated automatically within a few seconds assuming that the 10% threshold for changed table rows is reached):

mysql> ANALYZE TABLE t1;+---------+---------+----------+----------+| Table   | Op      | Msg_type | Msg_text |+---------+---------+----------+----------+| test.t1 | analyze | status   | OK       |+---------+---------+----------+----------+

Table statistics for tablet1 show the last timeInnoDB updated the table statistics (2014-03-14 14:36:34), the number of rows in the table (5), the clustered index size (1 page), and the combined size of the other indexes (2 pages).

mysql> SELECT * FROM mysql.innodb_table_stats WHERE table_name like 't1'\G*************************** 1. row ***************************           database_name: test              table_name: t1             last_update: 2014-03-14 14:36:34                  n_rows: 5    clustered_index_size: 1sum_of_other_index_sizes: 2

Theinnodb_index_stats table contains multiple rows for each index. Each row in theinnodb_index_stats table provides data related to a particular index statistic which is named in thestat_name column and described in thestat_description column. For example:

mysql> SELECT index_name, stat_name, stat_value, stat_description       FROM mysql.innodb_index_stats WHERE table_name like 't1';+------------+--------------+------------+-----------------------------------+| index_name | stat_name    | stat_value | stat_description                  |+------------+--------------+------------+-----------------------------------+| PRIMARY    | n_diff_pfx01 |          1 | a                                 || PRIMARY    | n_diff_pfx02 |          5 | a,b                               || PRIMARY    | n_leaf_pages |          1 | Number of leaf pages in the index || PRIMARY    | size         |          1 | Number of pages in the index      || i1         | n_diff_pfx01 |          1 | c                                 || i1         | n_diff_pfx02 |          2 | c,d                               || i1         | n_diff_pfx03 |          2 | c,d,a                             || i1         | n_diff_pfx04 |          5 | c,d,a,b                           || i1         | n_leaf_pages |          1 | Number of leaf pages in the index || i1         | size         |          1 | Number of pages in the index      || i2uniq     | n_diff_pfx01 |          2 | e                                 || i2uniq     | n_diff_pfx02 |          5 | e,f                               || i2uniq     | n_leaf_pages |          1 | Number of leaf pages in the index || i2uniq     | size         |          1 | Number of pages in the index      |+------------+--------------+------------+-----------------------------------+

Thestat_name column shows the following types of statistics:

  • size: Wherestat_name=size, thestat_value column displays the total number of pages in the index.

  • n_leaf_pages: Wherestat_name=n_leaf_pages, thestat_value column displays the number of leaf pages in the index.

  • n_diff_pfxNN: Wherestat_name=n_diff_pfx01, thestat_value column displays the number of distinct values in the first column of the index. Wherestat_name=n_diff_pfx02, thestat_value column displays the number of distinct values in the first two columns of the index, and so on. Wherestat_name=n_diff_pfxNN, thestat_description column shows a comma separated list of the index columns that are counted.

To further illustrate then_diff_pfxNN statistic, which provides cardinality data, consider once again thet1 table example that was introduced previously. As shown below, thet1 table is created with a primary index (columnsa,b), a secondary index (columnsc,d), and a unique index (columnse,f):

CREATE TABLE t1 (  a INT, b INT, c INT, d INT, e INT, f INT,  PRIMARY KEY (a, b), KEY i1 (c, d), UNIQUE KEY i2uniq (e, f)) ENGINE=INNODB;

After inserting five rows of sample data, tablet1 appears as follows:

mysql> SELECT * FROM t1;+---+---+------+------+------+------+| a | b | c    | d    | e    | f    |+---+---+------+------+------+------+| 1 | 1 |   10 |   11 |  100 |  101 || 1 | 2 |   10 |   11 |  200 |  102 || 1 | 3 |   10 |   11 |  100 |  103 || 1 | 4 |   10 |   12 |  200 |  104 || 1 | 5 |   10 |   12 |  100 |  105 |+---+---+------+------+------+------+

When you query theindex_name,stat_name,stat_value, andstat_description, wherestat_name LIKE 'n_diff%', the following result set is returned:

mysql> SELECT index_name, stat_name, stat_value, stat_description       FROM mysql.innodb_index_stats       WHERE table_name like 't1' AND stat_name LIKE 'n_diff%';+------------+--------------+------------+------------------+| index_name | stat_name    | stat_value | stat_description |+------------+--------------+------------+------------------+| PRIMARY    | n_diff_pfx01 |          1 | a                || PRIMARY    | n_diff_pfx02 |          5 | a,b              || i1         | n_diff_pfx01 |          1 | c                || i1         | n_diff_pfx02 |          2 | c,d              || i1         | n_diff_pfx03 |          2 | c,d,a            || i1         | n_diff_pfx04 |          5 | c,d,a,b          || i2uniq     | n_diff_pfx01 |          2 | e                || i2uniq     | n_diff_pfx02 |          5 | e,f              |+------------+--------------+------------+------------------+

For thePRIMARY index, there are twon_diff% rows. The number of rows is equal to the number of columns in the index.

Note

For nonunique indexes,InnoDB appends the columns of the primary key.

  • Whereindex_name=PRIMARY andstat_name=n_diff_pfx01, thestat_value is1, which indicates that there is a single distinct value in the first column of the index (columna). The number of distinct values in columna is confirmed by viewing the data in columna in tablet1, in which there is a single distinct value (1). The counted column (a) is shown in thestat_description column of the result set.

  • Whereindex_name=PRIMARY andstat_name=n_diff_pfx02, thestat_value is5, which indicates that there are five distinct values in the two columns of the index (a,b). The number of distinct values in columnsa andb is confirmed by viewing the data in columnsa andb in tablet1, in which there are five distinct values: (1,1), (1,2), (1,3), (1,4) and (1,5). The counted columns (a,b) are shown in thestat_description column of the result set.

For the secondary index (i1), there are fourn_diff% rows. Only two columns are defined for the secondary index (c,d) but there are fourn_diff% rows for the secondary index becauseInnoDB suffixes all nonunique indexes with the primary key. As a result, there are fourn_diff% rows instead of two to account for the both the secondary index columns (c,d) and the primary key columns (a,b).

  • Whereindex_name=i1 andstat_name=n_diff_pfx01, thestat_value is1, which indicates that there is a single distinct value in the first column of the index (columnc). The number of distinct values in columnc is confirmed by viewing the data in columnc in tablet1, in which there is a single distinct value: (10). The counted column (c) is shown in thestat_description column of the result set.

  • Whereindex_name=i1 andstat_name=n_diff_pfx02, thestat_value is2, which indicates that there are two distinct values in the first two columns of the index (c,d). The number of distinct values in columnsc and is confirmed by viewing the data in columnsc andd in tablet1, in which there are two distinct values: (10,11) and (10,12). The counted columns (c,d) are shown in thestat_description column of the result set.

  • Whereindex_name=i1 andstat_name=n_diff_pfx03, thestat_value is2, which indicates that there are two distinct values in the first three columns of the index (c,d,a). The number of distinct values in columnsc,d, anda is confirmed by viewing the data in columnc,d, anda in tablet1, in which there are two distinct values: (10,11,1) and (10,12,1). The counted columns (c,d,a) are shown in thestat_description column of the result set.

  • Whereindex_name=i1 andstat_name=n_diff_pfx04, thestat_value is5, which indicates that there are five distinct values in the four columns of the index (c,d,a,b). The number of distinct values in columnsc,d,a andb is confirmed by viewing the data in columnsc,d,a, andb in tablet1, in which there are five distinct values: (10,11,1,1), (10,11,1,2), (10,11,1,3), (10,12,1,4), and (10,12,1,5). The counted columns (c,d,a,b) are shown in thestat_description column of the result set.

For the unique index (i2uniq), there are twon_diff% rows.

  • Whereindex_name=i2uniq andstat_name=n_diff_pfx01, thestat_value is2, which indicates that there are two distinct values in the first column of the index (columne). The number of distinct values in columne is confirmed by viewing the data in columne in tablet1, in which there are two distinct values: (100) and (200). The counted column (e) is shown in thestat_description column of the result set.

  • Whereindex_name=i2uniq andstat_name=n_diff_pfx02, thestat_value is5, which indicates that there are five distinct values in the two columns of the index (e,f). The number of distinct values in columnse andf is confirmed by viewing the data in columnse andf in tablet1, in which there are five distinct values: (100,101), (200,102), (100,103), (200,104), and (100,105). The counted columns (e,f) are shown in thestat_description column of the result set.

17.8.10.1.7 Retrieving Index Size Using the innodb_index_stats Table

You can retrieve the index size for tables, partitions, or subpartitions can using theinnodb_index_stats table. In the following example, index sizes are retrieved for tablet1. For a definition of tablet1 and corresponding index statistics, seeSection 17.8.10.1.6, “InnoDB Persistent Statistics Tables Example”.

mysql> SELECT SUM(stat_value) pages, index_name,       SUM(stat_value)*@@innodb_page_size size       FROM mysql.innodb_index_stats WHERE table_name='t1'       AND stat_name = 'size' GROUP BY index_name;+-------+------------+-------+| pages | index_name | size  |+-------+------------+-------+|     1 | PRIMARY    | 16384 ||     1 | i1         | 16384 ||     1 | i2uniq     | 16384 |+-------+------------+-------+

For partitions or subpartitions, you can use the same query with a modifiedWHERE clause to retrieve index sizes. For example, the following query retrieves index sizes for partitions of tablet1:

mysql> SELECT SUM(stat_value) pages, index_name,       SUM(stat_value)*@@innodb_page_size size       FROM mysql.innodb_index_stats WHERE table_name like 't1#P%'       AND stat_name = 'size' GROUP BY index_name;