Documentation Home
MySQL 5.7 Reference Manual
Related Documentation Download this Manual
PDF (US Ltr) - 35.1Mb
PDF (A4) - 35.2Mb
Man Pages (TGZ) - 256.4Kb
Man Pages (Zip) - 361.2Kb
Info (Gzip) - 3.4Mb
Info (Zip) - 3.4Mb
Excerpts from this Manual

MySQL 5.7 Reference Manual  / ...  / Optimization  / Optimization and Indexes  /  InnoDB and MyISAM Index Statistics Collection

8.3.7 InnoDB and MyISAM Index Statistics Collection

Storage engines collect statistics about tables for use by the optimizer. Table statistics are based on value groups, where a value group is a set of rows with the same key prefix value. For optimizer purposes, an important statistic is the average value group size.

MySQL uses the average value group size in the following ways:

  • To estimate how many rows must be read for eachref access

  • To estimate how many rows a partial join produces; that is, the number of rows that an operation of this form produces:

    (...) JOINtbl_name ONtbl_name.key =expr

As the average value group size for an index increases, the index is less useful for those two purposes because the average number of rows per lookup increases: For the index to be good for optimization purposes, it is best that each index value target a small number of rows in the table. When a given index value yields a large number of rows, the index is less useful and MySQL is less likely to use it.

The average value group size is related to table cardinality, which is the number of value groups. TheSHOW INDEX statement displays a cardinality value based onN/S, whereN is the number of rows in the table andS is the average value group size. That ratio yields an approximate number of value groups in the table.

For a join based on the<=> comparison operator,NULL is not treated differently from any other value:NULL <=> NULL, just asN <=>N for any otherN.

However, for a join based on the= operator,NULL is different from non-NULL values:expr1 =expr2 is not true whenexpr1 orexpr2 (or both) areNULL. This affectsref accesses for comparisons of the formtbl_name.key =expr: MySQL does not access the table if the current value ofexpr isNULL, because the comparison cannot be true.

For= comparisons, it does not matter how manyNULL values are in the table. For optimization purposes, the relevant value is the average size of the non-NULL value groups. However, MySQL does not currently enable that average size to be collected or used.

ForInnoDB andMyISAM tables, you have some control over collection of table statistics by means of theinnodb_stats_method andmyisam_stats_method system variables, respectively. These variables have three possible values, which differ as follows:

  • When the variable is set tonulls_equal, allNULL values are treated as identical (that is, they all form a single value group).

    If theNULL value group size is much higher than the average non-NULL value group size, this method skews the average value group size upward. This makes index appear to the optimizer to be less useful than it really is for joins that look for non-NULL values. Consequently, thenulls_equal method may cause the optimizer not to use the index forref accesses when it should.

  • When the variable is set tonulls_unequal,NULL values are not considered the same. Instead, eachNULL value forms a separate value group of size 1.

    If you have manyNULL values, this method skews the average value group size downward. If the average non-NULL value group size is large, countingNULL values each as a group of size 1 causes the optimizer to overestimate the value of the index for joins that look for non-NULL values. Consequently, thenulls_unequal method may cause the optimizer to use this index forref lookups when other methods may be better.

  • When the variable is set tonulls_ignored,NULL values are ignored.

If you tend to use many joins that use<=> rather than=,NULL values are not special in comparisons and oneNULL is equal to another. In this case,nulls_equal is the appropriate statistics method.

Theinnodb_stats_method system variable has a global value; themyisam_stats_method system variable has both global and session values. Setting the global value affects statistics collection for tables from the corresponding storage engine. Setting the session value affects statistics collection only for the current client connection. This means that you can force a table's statistics to be regenerated with a given method without affecting other clients by setting the session value ofmyisam_stats_method.

To regenerateMyISAM table statistics, you can use any of the following methods:

Some caveats regarding the use ofinnodb_stats_method andmyisam_stats_method:

  • You can force table statistics to be collected explicitly, as just described. However, MySQL may also collect statistics automatically. For example, if during the course of executing statements for a table, some of those statements modify the table, MySQL may collect statistics. (This may occur for bulk inserts or deletes, or someALTER TABLE statements, for example.) If this happens, the statistics are collected using whatever valueinnodb_stats_method ormyisam_stats_method has at the time. Thus, if you collect statistics using one method, but the system variable is set to the other method when a table's statistics are collected automatically later, the other method is used.

  • There is no way to tell which method was used to generate statistics for a given table.

  • These variables apply only toInnoDB andMyISAM tables. Other storage engines have only one method for collecting table statistics. Usually it is closer to thenulls_equal method.