Documentation Home
MySQL 9.1 Reference Manual
Related Documentation Download this Manual
PDF (US Ltr) - 40.4Mb
PDF (A4) - 40.5Mb
Man Pages (TGZ) - 259.5Kb
Man Pages (Zip) - 366.7Kb
Info (Gzip) - 4.1Mb
Info (Zip) - 4.1Mb


MySQL 9.1 Reference Manual  / ...  / SQL Statements  / Data Definition Statements  / CREATE TABLE Statement  /  Secondary Indexes and Generated Columns

15.1.20.9 Secondary Indexes and Generated Columns

InnoDB supports secondary indexes on virtual generated columns. Other index types are not supported. A secondary index defined on a virtual column is sometimes referred to as avirtual index.

A secondary index may be created on one or more virtual columns or on a combination of virtual columns and regular columns or stored generated columns. Secondary indexes that include virtual columns may be defined asUNIQUE.

When a secondary index is created on a virtual generated column, generated column values are materialized in the records of the index. If the index is acovering index (one that includes all the columns retrieved by a query), generated column values are retrieved from materialized values in the index structure instead of computedon the fly.

There are additional write costs to consider when using a secondary index on a virtual column due to computation performed when materializing virtual column values in secondary index records duringINSERT andUPDATE operations. Even with additional write costs, secondary indexes on virtual columns may be preferable to generatedstored columns, which are materialized in the clustered index, resulting in larger tables that require more disk space and memory. If a secondary index is not defined on a virtual column, there are additional costs for reads, as virtual column values must be computed each time the column's row is examined.

Values of an indexed virtual column are MVCC-logged to avoid unnecessary recomputation of generated column values during rollback or during a purge operation. The data length of logged values is limited by the index key limit of 767 bytes forCOMPACT andREDUNDANT row formats, and 3072 bytes forDYNAMIC andCOMPRESSED row formats.

Adding or dropping a secondary index on a virtual column is an in-place operation.

Indexing a Generated Column to Provide a JSON Column Index

As noted elsewhere,JSON columns cannot be indexed directly. To create an index that references such a column indirectly, you can define a generated column that extracts the information that should be indexed, then create an index on the generated column, as shown in this example:

mysql> CREATE TABLE jemp (    ->     c JSON,    ->     g INT GENERATED ALWAYS AS (c->"$.id"),    ->     INDEX i (g)    -> );Query OK, 0 rows affected (0.28 sec)mysql> INSERT INTO jemp (c) VALUES     >   ('{"id": "1", "name": "Fred"}'), ('{"id": "2", "name": "Wilma"}'),     >   ('{"id": "3", "name": "Barney"}'), ('{"id": "4", "name": "Betty"}');Query OK, 4 rows affected (0.04 sec)Records: 4  Duplicates: 0  Warnings: 0mysql> SELECT c->>"$.name" AS name     >     FROM jemp WHERE g > 2;+--------+| name   |+--------+| Barney || Betty  |+--------+2 rows in set (0.00 sec)mysql> EXPLAIN SELECT c->>"$.name" AS name     >    FROM jemp WHERE g > 2\G*************************** 1. row ***************************           id: 1  select_type: SIMPLE        table: jemp   partitions: NULL         type: rangepossible_keys: i          key: i      key_len: 5          ref: NULL         rows: 2     filtered: 100.00        Extra: Using where1 row in set, 1 warning (0.00 sec)mysql> SHOW WARNINGS\G*************************** 1. row ***************************  Level: Note   Code: 1003Message: /* select#1 */ select json_unquote(json_extract(`test`.`jemp`.`c`,'$.name'))AS `name` from `test`.`jemp` where (`test`.`jemp`.`g` > 2)1 row in set (0.00 sec)

(We have wrapped the output from the last statement in this example to fit the viewing area.)

When you useEXPLAIN on aSELECT or other SQL statement containing one or more expressions that use the-> or->> operator, these expressions are translated into their equivalents usingJSON_EXTRACT() and (if needed)JSON_UNQUOTE() instead, as shown here in the output fromSHOW WARNINGS immediately following thisEXPLAIN statement:

mysql> EXPLAIN SELECT c->>"$.name"     > FROM jemp WHERE g > 2 ORDER BY c->"$.name"\G*************************** 1. row ***************************           id: 1  select_type: SIMPLE        table: jemp   partitions: NULL         type: rangepossible_keys: i          key: i      key_len: 5          ref: NULL         rows: 2     filtered: 100.00        Extra: Using where; Using filesort1 row in set, 1 warning (0.00 sec)mysql> SHOW WARNINGS\G*************************** 1. row ***************************  Level: Note   Code: 1003Message: /* select#1 */ select json_unquote(json_extract(`test`.`jemp`.`c`,'$.name')) AS`c->>"$.name"` from `test`.`jemp` where (`test`.`jemp`.`g` > 2) order byjson_extract(`test`.`jemp`.`c`,'$.name')1 row in set (0.00 sec)

See the descriptions of the-> and->> operators, as well as those of theJSON_EXTRACT() andJSON_UNQUOTE() functions, for additional information and examples.

This technique also can be used to provide indexes that indirectly reference columns of other types that cannot be indexed directly, such asGEOMETRY columns.

It is also possible to create an index on aJSON column using theJSON_VALUE() function with an expression that can be used to optimize queries employing the expression. See the description of that function for more information and examples.

JSON columns and indirect indexing in NDB Cluster

It is also possible to use indirect indexing of JSON columns in MySQL NDB Cluster, subject to the following conditions:

  1. NDB handles aJSON column value internally as aBLOB. This means that anyNDB table having one or more JSON columns must have a primary key, else it cannot be recorded in the binary log.

  2. TheNDB storage engine does not support indexing of virtual columns. Since the default for generated columns isVIRTUAL, you must specify explicitly the generated column to which to apply the indirect index asSTORED.

TheCREATE TABLE statement used to create the tablejempn shown here is a version of thejemp table shown previously, with modifications making it compatible withNDB:

CREATE TABLE jempn (  a BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,  c JSON DEFAULT NULL,  g INT GENERATED ALWAYS AS (c->"$.id") STORED,  INDEX i (g)) ENGINE=NDB;

We can populate this table using the followingINSERT statement:

INSERT INTO jempn (c) VALUES  ('{"id": "1", "name": "Fred"}'),  ('{"id": "2", "name": "Wilma"}'),  ('{"id": "3", "name": "Barney"}'),  ('{"id": "4", "name": "Betty"}');

NowNDB can use indexi, as shown here:

mysql> EXPLAIN SELECT c->>"$.name" AS name    ->           FROM jempn WHERE g > 2\G*************************** 1. row ***************************           id: 1  select_type: SIMPLE        table: jempn   partitions: p0,p1,p2,p3         type: rangepossible_keys: i          key: i      key_len: 5          ref: NULL         rows: 3     filtered: 100.00        Extra: Using pushed condition (`test`.`jempn`.`g` > 2)1 row in set, 1 warning (0.01 sec)mysql> SHOW WARNINGS\G*************************** 1. row ***************************  Level: Note   Code: 1003Message: /* select#1 */ selectjson_unquote(json_extract(`test`.`jempn`.`c`,'$.name')) AS `name` from`test`.`jempn` where (`test`.`jempn`.`g` > 2)   1 row in set (0.00 sec)

You should keep in mind that a stored generated column, as well as any index on such a column, usesDataMemory.