Documentation Home
MySQL 8.0 Reference Manual
Related Documentation Download this Manual
PDF (US Ltr) - 43.3Mb
PDF (A4) - 43.4Mb
Man Pages (TGZ) - 297.2Kb
Man Pages (Zip) - 402.4Kb
Info (Gzip) - 4.3Mb
Info (Zip) - 4.3Mb
Excerpts from this Manual

25.6.11.1 NDB Cluster Disk Data Objects

NDB Cluster Disk Data storage is implemented using the following objects:

  • Tablespace: Acts as containers for other Disk Data objects. A tablespace contains one or more data files and one or more undo log file groups.

  • Data file: Stores column data. A data file is assigned directly to a tablespace.

  • Undo log file: Contains undo information required for rolling back transactions. Assigned to an undo log file group.

  • log file group: Contains one or more undo log files. Assigned to a tablespace.

Undo log files and data files are actual files in the file system of each data node; by default they are placed inndb_node_id_fs in theDataDir specified in the NDB Clusterconfig.ini file, and wherenode_id is the data node's node ID. It is possible to place these elsewhere by specifying either an absolute or relative path as part of the filename when creating the undo log or data file. Statements that create these files are shown later in this section.

Undo log files are used only by Disk Data tables, and are not needed or used byNDB tables that are stored in memory only.

NDB Cluster tablespaces and log file groups are not implemented as files.

Although not all Disk Data objects are implemented as files, they all share the same namespace. This means thateach Disk Data object must be uniquely named (and not merely each Disk Data object of a given type). For example, you cannot have a tablespace and a log file group both nameddd1.

Assuming that you have already set up an NDB Cluster with all nodes (including management and SQL nodes), the basic steps for creating an NDB Cluster table on disk are as follows:

  1. Create a log file group, and assign one or more undo log files to it (an undo log file is also sometimes referred to as anundofile).

  2. Create a tablespace; assign the log file group, as well as one or more data files, to the tablespace.

  3. Create a Disk Data table that uses this tablespace for data storage.

Each of these tasks can be accomplished using SQL statements in themysql client or other MySQL client application, as shown in the example that follows.

  1. We create a log file group namedlg_1 usingCREATE LOGFILE GROUP. This log file group is to be made up of two undo log files, which we nameundo_1.log andundo_2.log, whose initial sizes are 16 MB and 12 MB, respectively. (The default initial size for an undo log file is 128 MB.) Optionally, you can also specify a size for the log file group's undo buffer, or permit it to assume the default value of 8 MB. In this example, we set the UNDO buffer's size at 2 MB. A log file group must be created with an undo log file; so we addundo_1.log tolg_1 in thisCREATE LOGFILE GROUP statement:

    CREATE LOGFILE GROUP lg_1    ADD UNDOFILE 'undo_1.log'    INITIAL_SIZE 16M    UNDO_BUFFER_SIZE 2M    ENGINE NDBCLUSTER;

    To addundo_2.log to the log file group, use the followingALTER LOGFILE GROUP statement:

    ALTER LOGFILE GROUP lg_1    ADD UNDOFILE 'undo_2.log'    INITIAL_SIZE 12M    ENGINE NDBCLUSTER;

    Some items of note:

    • The.log file extension used here is not required. We employ it merely to make the log files easily recognizable.

    • EveryCREATE LOGFILE GROUP andALTER LOGFILE GROUP statement must include anENGINE option. The only permitted values for this option areNDBCLUSTER andNDB.

      Important

      There can exist at most one log file group in the same NDB Cluster at any given time.

    • When you add an undo log file to a log file group usingADD UNDOFILE 'filename', a file with the namefilename is created in thendb_node_id_fs directory within theDataDir of each data node in the cluster, wherenode_id is the node ID of the data node. Each undo log file is of the size specified in the SQL statement. For example, if an NDB Cluster has 4 data nodes, then theALTER LOGFILE GROUP statement just shown creates 4 undo log files, 1 each on in the data directory of each of the 4 data nodes; each of these files is namedundo_2.log and each file is 12 MB in size.

    • UNDO_BUFFER_SIZE is limited by the amount of system memory available.

    • SeeSection 15.1.16, “CREATE LOGFILE GROUP Statement”, andSection 15.1.6, “ALTER LOGFILE GROUP Statement”, for more information about these statements.

  2. Now we can create a tablespace—an abstract container for files used by Disk Data tables to store data. A tablespace is associated with a particular log file group; when creating a new tablespace, you must specify the log file group it uses for undo logging. You must also specify at least one data file; you can add more data files to the tablespace after the tablespace is created. It is also possible to drop data files from a tablespace (see example later in this section).

    Assume that we wish to create a tablespace namedts_1 which useslg_1 as its log file group. We want the tablespace to contain two data files, nameddata_1.dat anddata_2.dat, whose initial sizes are 32 MB and 48 MB, respectively. (The default value forINITIAL_SIZE is 128 MB.) We can do this using two SQL statements, as shown here:

    CREATE TABLESPACE ts_1    ADD DATAFILE 'data_1.dat'    USE LOGFILE GROUP lg_1    INITIAL_SIZE 32M    ENGINE NDBCLUSTER;ALTER TABLESPACE ts_1    ADD DATAFILE 'data_2.dat'    INITIAL_SIZE 48M;

    TheCREATE TABLESPACE statement creates a tablespacets_1 with the data filedata_1.dat, and associatests_1 with log file grouplg_1. TheALTER TABLESPACE adds the second data file (data_2.dat).

    Some items of note:

    • As is the case with the.log file extension used in this example for undo log files, there is no special significance for the.dat file extension; it is used merely for easy recognition.

    • When you add a data file to a tablespace usingADD DATAFILE 'filename', a file with the namefilename is created in thendb_node_id_fs directory within theDataDir of each data node in the cluster, wherenode_id is the node ID of the data node. Each data file is of the size specified in the SQL statement. For example, if an NDB Cluster has 4 data nodes, then theALTER TABLESPACE statement just shown creates 4 data files, 1 each in the data directory of each of the 4 data nodes; each of these files is nameddata_2.dat and each file is 48 MB in size.

    • NDB reserves 4% of each tablespace for use during data node restarts. This space is not available for storing data.

    • CREATE TABLESPACE statements must contain anENGINE clause; only tables using the same storage engine as the tablespace can be created in the tablespace. ForALTER TABLESPACE, anENGINE clause is accepted but is deprecated and subject to removal in a future release. ForNDB tablespaces, the only permitted values for this option areNDBCLUSTER andNDB.

    • In NDB 8.0.20 and later, allocation of extents is performed in round-robin fashion among all data files used by a given tablespace.

    • For more information about theCREATE TABLESPACE andALTER TABLESPACE statements, seeSection 15.1.21, “CREATE TABLESPACE Statement”, andSection 15.1.10, “ALTER TABLESPACE Statement”.

  3. Now it is possible to create a table whose unindexed columns are stored on disk using files in tablespacets_1:

    CREATE TABLE dt_1 (    member_id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,    last_name VARCHAR(50) NOT NULL,    first_name VARCHAR(50) NOT NULL,    dob DATE NOT NULL,    joined DATE NOT NULL,    INDEX(last_name, first_name)    )    TABLESPACE ts_1 STORAGE DISK    ENGINE NDBCLUSTER;

    TABLESPACE ts_1 STORAGE DISK tells theNDB storage engine to use tablespacets_1 for data storage on disk.

    Once tablets_1 has been created as shown, you can performINSERT,SELECT,UPDATE, andDELETE statements on it just as you would with any other MySQL table.

    It is also possible to specify whether an individual column is stored on disk or in memory by using aSTORAGE clause as part of the column's definition in aCREATE TABLE orALTER TABLE statement.STORAGE DISK causes the column to be stored on disk, andSTORAGE MEMORY causes in-memory storage to be used. SeeSection 15.1.20, “CREATE TABLE Statement”, for more information.

You can obtain information about theNDB disk data files and undo log files just created by querying theFILES table in theINFORMATION_SCHEMA database, as shown here:

mysql> SELECT              FILE_NAME AS File, FILE_TYPE AS Type,              TABLESPACE_NAME AS Tablespace, TABLE_NAME AS Name,              LOGFILE_GROUP_NAME AS 'File group',              FREE_EXTENTS AS Free, TOTAL_EXTENTS AS Total          FROM INFORMATION_SCHEMA.FILES          WHERE ENGINE='ndbcluster';+--------------+----------+------------+------+------------+------+---------+| File         | Type     | Tablespace | Name | File group | Free | Total   |+--------------+----------+------------+------+------------+------+---------+| ./undo_1.log | UNDO LOG | lg_1       | NULL | lg_1       |    0 | 4194304 || ./undo_2.log | UNDO LOG | lg_1       | NULL | lg_1       |    0 | 3145728 || ./data_1.dat | DATAFILE | ts_1       | NULL | lg_1       |   32 |      32 || ./data_2.dat | DATAFILE | ts_1       | NULL | lg_1       |   48 |      48 |+--------------+----------+------------+------+------------+------+---------+4 rows in set (0.00 sec)

For more information and examples, seeSection 28.3.15, “The INFORMATION_SCHEMA FILES Table”.

Indexing of columns implicitly stored on disk.  For tabledt_1 as defined in the example just shown, only thedob andjoined columns are stored on disk. This is because there are indexes on theid,last_name, andfirst_name columns, and so data belonging to these columns is stored in RAM. Only nonindexed columns can be held on disk; indexes and indexed column data continue to be stored in memory. This tradeoff between the use of indexes and conservation of RAM is something you must keep in mind as you design Disk Data tables.

You cannot add an index to a column that has been explicitly declaredSTORAGE DISK, without first changing its storage type toMEMORY; any attempt to do so fails with an error. A column whichimplicitly uses disk storage can be indexed; when this is done, the column's storage type is changed toMEMORY automatically. Byimplicitly, we mean a column whose storage type is not declared, but which is which inherited from the parent table. In the following CREATE TABLE statement (using the tablespacets_1 defined previously), columnsc2 andc3 use disk storage implicitly:

mysql> CREATE TABLE ti (    ->     c1 INT PRIMARY KEY,    ->     c2 INT,    ->     c3 INT,    ->     c4 INT    -> )    ->     STORAGE DISK    ->     TABLESPACE ts_1    ->     ENGINE NDBCLUSTER;Query OK, 0 rows affected (1.31 sec)

Becausec2,c3, andc4 are themselves not declared withSTORAGE DISK, it is possible to index them. Here, we add indexes toc2 andc3, using, respectively,CREATE INDEX andALTER TABLE:

mysql> CREATE INDEX i1 ON ti(c2);Query OK, 0 rows affected (2.72 sec)Records: 0  Duplicates: 0  Warnings: 0mysql> ALTER TABLE ti ADD INDEX i2(c3);Query OK, 0 rows affected (0.92 sec)Records: 0  Duplicates: 0  Warnings: 0

SHOW CREATE TABLE confirms that the indexes were added.

mysql> SHOW CREATE TABLE ti\G*************************** 1. row ***************************       Table: tiCreate Table: CREATE TABLE `ti` (  `c1` int(11) NOT NULL,  `c2` int(11) DEFAULT NULL,  `c3` int(11) DEFAULT NULL,  `c4` int(11) DEFAULT NULL,  PRIMARY KEY (`c1`),  KEY `i1` (`c2`),  KEY `i2` (`c3`)) /*!50100 TABLESPACE `ts_1` STORAGE DISK */ ENGINE=ndbcluster DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci1 row in set (0.00 sec)

You can see usingndb_desc that the indexed columns (emphasized text) now use in-memory rather than on-disk storage:

$> ./ndb_desc -d test t1-- t1 --Version: 33554433Fragment type: HashMapPartitionK Value: 6Min load factor: 78Max load factor: 80Temporary table: noNumber of attributes: 4Number of primary keys: 1Length of frm data: 317Max Rows: 0Row Checksum: 1Row GCI: 1SingleUserMode: 0ForceVarPart: 1PartitionCount: 4FragmentCount: 4PartitionBalance: FOR_RP_BY_LDMExtraRowGciBits: 0ExtraRowAuthorBits: 0TableStatus: RetrievedTable options:HashMap: DEFAULT-HASHMAP-3840-4-- Attributes --c1 Int PRIMARY KEY DISTRIBUTION KEY AT=FIXED ST=MEMORYc2 Int NULL AT=FIXED ST=MEMORYc3 Int NULL AT=FIXED ST=MEMORYc4 Int NULL AT=FIXED ST=DISK-- Indexes --PRIMARY KEY(c1) - UniqueHashIndexi2(c3) - OrderedIndexPRIMARY(c1) - OrderedIndexi1(c2) - OrderedIndex

Performance note.  The performance of a cluster using Disk Data storage is greatly improved if Disk Data files are kept on a separate physical disk from the data node file system. This must be done for each data node in the cluster to derive any noticeable benefit.

You can use absolute and relative file system paths withADD UNDOFILE andADD DATAFILE; relative paths are calculated with respect to the data node's data directory.

A log file group, a tablespace, and any Disk Data tables using these must be created in a particular order. This is also true for dropping these objects, subject to the following constraints:

  • A log file group cannot be dropped as long as any tablespaces use it.

  • A tablespace cannot be dropped as long as it contains any data files.

  • You cannot drop any data files from a tablespace as long as there remain any tables which are using the tablespace.

  • It is not possible to drop files created in association with a different tablespace other than the one with which the files were created.

For example, to drop all the objects created so far in this section, you can use the following statements:

mysql> DROP TABLE dt_1;mysql> ALTER TABLESPACE ts_1    -> DROP DATAFILE 'data_2.dat'    -> ENGINE NDBCLUSTER;mysql> ALTER TABLESPACE ts_1    -> DROP DATAFILE 'data_1.dat'    -> ENGINE NDBCLUSTER;mysql> DROP TABLESPACE ts_1    -> ENGINE NDBCLUSTER;mysql> DROP LOGFILE GROUP lg_1    -> ENGINE NDBCLUSTER;

These statements must be performed in the order shown, except that the twoALTER TABLESPACE ... DROP DATAFILE statements may be executed in either order.