

25.4.3.6 Defining NDB Cluster Data Nodes

The [ndbd] and [ndbd default] sections are used to configure the behavior of the cluster's data nodes.

[ndbd] and [ndbd default] are always used as the section names, whether you are using ndbd or ndbmtd binaries for the data node processes.

There are many parameters which control buffer sizes, pool sizes, timeouts, and so forth. The only mandatory parameter is either ExecuteOnComputer or HostName; this must be defined in the local [ndbd] section.

The parameter NoOfReplicas should be defined in the [ndbd default] section, as it is common to all Cluster data nodes. It is not strictly necessary to set NoOfReplicas, but it is good practice to set it explicitly.

Most data node parameters are set in the [ndbd default] section. Only those parameters explicitly stated as being able to set local values are permitted to be changed in the [ndbd] section. Where present, HostName and NodeId must be defined in the local [ndbd] section, and not in any other section of config.ini. In other words, settings for these parameters are specific to one data node.

For those parameters affecting memory usage or buffer sizes, it is possible to use K, M, or G as a suffix to indicate units of 1024, 1024×1024, or 1024×1024×1024. (For example, 100K means 100 × 1024 = 102400.)
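
A minimal config.ini fragment using these sections and suffixes is shown here; the host names and node IDs are illustrative examples only, not defaults:

  [ndbd default]
  NoOfReplicas = 2        # common to all data nodes
  DataMemory = 512M       # M suffix: 512 * 1024 * 1024 bytes

  [ndbd]
  # settings specific to one data node
  HostName = 198.51.100.10
  NodeId = 2

  [ndbd]
  HostName = 198.51.100.20
  NodeId = 3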

Parameter names and values are case-insensitive, unless used in a MySQL Server my.cnf or my.ini file, in which case they are case-sensitive.

Information about configuration parameters specific to NDB Cluster Disk Data tables can be found later in this section (see Disk Data Configuration Parameters).

All of these parameters also apply to ndbmtd (the multithreaded version of ndbd). Three additional data node configuration parameters (MaxNoOfExecutionThreads, ThreadConfig, and NoOfFragmentLogParts) apply to ndbmtd only; these have no effect when used with ndbd. For more information, see Multi-Threading Configuration Parameters (ndbmtd). See also Section 25.5.3, “ndbmtd — The NDB Cluster Data Node Daemon (Multi-Threaded)”.

Identifying data nodes.  The NodeId or Id value (that is, the data node identifier) can be allocated on the command line when the node is started or in the configuration file.

  • NodeId

    Version (or later): NDB 9.5.0
    Type or units: unsigned
    Default: [...]
    Range: 1 - 144
    Restart Type: Initial System Restart
      Requires a complete shutdown of the cluster, wiping and restoring the cluster file system from a backup, and then restarting the cluster. (NDB 9.5.0)

    A unique node ID is used as the node's address for all cluster internal messages. For data nodes, this is an integer in the range 1 to 144 inclusive. Each node in the cluster must have a unique identifier.

    NodeId is the only supported parameter name to use when identifying data nodes.

  • ExecuteOnComputer

    Version (or later): NDB 9.5.0
    Type or units: name
    Default: [...]
    Range: ...
    Deprecated: Yes (in NDB 7.5)
    Restart Type: System Restart
      Requires a complete shutdown and restart of the cluster. (NDB 9.5.0)

    This refers to the Id set for one of the computers defined in a [computer] section.

    Important

    This parameter is deprecated, and is subject to removal in a future release. Use the HostName parameter instead.

  • Dedicated

    The node ID for this node can be given out only to connections that explicitly request it. A management server that requests any node ID cannot use this one. This parameter can be used when running multiple management servers on the same host, and HostName is not sufficient for distinguishing among processes.

  • HostName

    Version (or later): NDB 9.5.0
    Type or units: name or IP address
    Default: localhost
    Range: ...
    Restart Type: Node Restart
      Requires a rolling restart of the cluster. (NDB 9.5.0)

    Specifying this parameter defines the hostname of the computer on which the data node is to reside. Use HostName to specify a host name other than localhost.

  • ServerPort

    Version (or later): NDB 9.5.0
    Type or units: unsigned
    Default: [...]
    Range: 1 - 64K
    Restart Type: System Restart
      Requires a complete shutdown and restart of the cluster. (NDB 9.5.0)

    Each node in the cluster uses a port to connect to other nodes. By default, this port is allocated dynamically in such a way as to ensure that no two nodes on the same host computer receive the same port number, so it should normally not be necessary to specify a value for this parameter.

    However, if you need to be able to open specific ports in a firewall to permit communication between data nodes and API nodes (including SQL nodes), you can set this parameter to the number of the desired port in an [ndbd] section or (if you need to do this for multiple data nodes) the [ndbd default] section of the config.ini file, and then open the port having that number for incoming connections from SQL nodes, API nodes, or both.
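
    For example, the following fragment fixes the port used by one data node; the host name, node ID, and port number shown are illustrative examples only, not defaults:

      [ndbd]
      NodeId = 2
      HostName = 198.51.100.10
      # Open this port in the firewall for incoming SQL and API node connections
      ServerPort = 11860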

    Note

    Connections from data nodes to management nodes are made using the ndb_mgmd management port (the management server's PortNumber), so outgoing connections to that port from any data nodes should always be permitted.

  • TcpBind_INADDR_ANY

    Setting this parameter to TRUE or 1 binds IP_ADDR_ANY so that connections can be made from anywhere (for autogenerated connections). The default is FALSE (0).

  • NodeGroup

    Version (or later): NDB 9.5.0
    Type or units: unsigned
    Default: [...]
    Range: 0 - 65536
    Restart Type: Initial System Restart
      Requires a complete shutdown of the cluster, wiping and restoring the cluster file system from a backup, and then restarting the cluster. (NDB 9.5.0)

    This parameter can be used to assign a data node to a specific node group. It is read only when the cluster is started for the first time, and cannot be used to reassign a data node to a different node group online. It is generally not desirable to use this parameter in the [ndbd default] section of the config.ini file, and care must be taken not to assign nodes to node groups in such a way that an invalid number of nodes is assigned to any node group.

    The NodeGroup parameter is chiefly intended for use in adding a new node group to a running NDB Cluster without having to perform a rolling restart. For this purpose, you should set it to 65536 (the maximum value). You are not required to set a NodeGroup value for all cluster data nodes, only for those nodes which are to be started and added to the cluster as a new node group at a later time. For more information, see Section 25.6.7.3, “Adding NDB Cluster Data Nodes Online: Detailed Example”.
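
    For example, two data nodes intended to be started later as a new node group might be configured as shown here; the host names and node IDs are illustrative only:

      [ndbd]
      NodeId = 6
      HostName = 198.51.100.30
      NodeGroup = 65536

      [ndbd]
      NodeId = 7
      HostName = 198.51.100.40
      NodeGroup = 65536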

  • LocationDomainId

    Version (or later): NDB 9.5.0
    Type or units: integer
    Default: 0
    Range: 0 - 16
    Restart Type: System Restart
      Requires a complete shutdown and restart of the cluster. (NDB 9.5.0)

    Assigns a data node to a specific availability domain (also known as an availability zone) within a cloud. By informing NDB which nodes are in which availability domains, performance can be improved in a cloud environment in the following ways:

    • If requested data is not found on the same node, reads can be directed to another node in the same availability domain.

    • Communication between nodes in different availability domains is guaranteed to use NDB transporters' WAN support without any further manual intervention.

    • The transporter's group number can be based on which availability domain is used, so that SQL and other API nodes also communicate with local data nodes in the same availability domain whenever possible.

    • The arbitrator can be selected from an availability domain in which no data nodes are present, or, if no such availability domain can be found, from a third availability domain.

    LocationDomainId takes an integer value between 0 and 16 inclusive, with 0 being the default; using 0 is the same as leaving the parameter unset.
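
    A sketch of two data nodes assigned to different availability domains is shown here; the host names and domain IDs are illustrative only:

      [ndbd]
      NodeId = 2
      HostName = 198.51.100.10
      LocationDomainId = 1

      [ndbd]
      NodeId = 3
      HostName = 198.51.100.20
      LocationDomainId = 2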

  • NoOfReplicas

    Version (or later): NDB 9.5.0
    Type or units: integer
    Default: 2
    Range: 1 - 4
    Restart Type: Initial System Restart
      Requires a complete shutdown of the cluster, wiping and restoring the cluster file system from a backup, and then restarting the cluster. (NDB 9.5.0)

    This global parameter can be set only in the [ndbd default] section, and defines the number of fragment replicas for each table stored in the cluster. This parameter also specifies the size of node groups. A node group is a set of nodes all storing the same information.

    Node groups are formed implicitly. The first node group is formed by the set of data nodes with the lowest node IDs, the next node group by the set of the next lowest node IDs, and so on. By way of example, assume that we have 4 data nodes and that NoOfReplicas is set to 2. The four data nodes have node IDs 2, 3, 4, and 5. Then the first node group is formed from nodes 2 and 3, and the second node group from nodes 4 and 5. It is important to configure the cluster in such a manner that nodes in the same node group are not placed on the same computer, because a single hardware failure would then cause the entire cluster to fail.
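
    The resulting grouping for this example can be summarized as shown here:

      NoOfReplicas = 2, data nodes with IDs 2, 3, 4, 5:

        node group 0: nodes 2 and 3
        node group 1: nodes 4 and 5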

    If no node IDs are provided, the order of the data nodes is the determining factor for the node group. Whether or not explicit assignments are made, they can be viewed in the output of the management client's SHOW command.

    The default value for NoOfReplicas is 2. This is the recommended value for most production environments. Setting this parameter's value to 3 or 4 is also supported.

    Warning

    Setting NoOfReplicas to 1 means that there is only a single copy of all Cluster data; in this case, the loss of a single data node causes the cluster to fail because there are no additional copies of the data stored by that node.

    The number of data nodes in the cluster must be evenly divisible by the value of this parameter. For example, if there are two data nodes, then NoOfReplicas must be equal to either 1 or 2, since 2/3 and 2/4 both yield fractional values; if there are four data nodes, then NoOfReplicas must be equal to 1, 2, or 4.

  • DataDir

    Version (or later): NDB 9.5.0
    Type or units: path
    Default: .
    Range: ...
    Restart Type: Initial Node Restart
      Requires a rolling restart of the cluster; each data node must be restarted with --initial. (NDB 9.5.0)

    This parameter specifies the directory where trace files, log files, pid files, and error logs are placed.

    The default is the data node process working directory.

  • FileSystemPath

    Version (or later): NDB 9.5.0
    Type or units: path
    Default: DataDir
    Range: ...
    Restart Type: Initial Node Restart
      Requires a rolling restart of the cluster; each data node must be restarted with --initial. (NDB 9.5.0)

    This parameter specifies the directory where all files created for metadata, REDO logs, UNDO logs (for Disk Data tables), and data files are placed. The default is the directory specified by DataDir.

    Note

    This directory must exist before the ndbd process is initiated.

    The recommended directory hierarchy for NDB Cluster includes /var/lib/mysql-cluster, under which a directory for the node's file system is created. The name of this subdirectory contains the node ID. For example, if the node ID is 2, this subdirectory is named ndb_2_fs.

  • BackupDataDir

    Version (or later): NDB 9.5.0
    Type or units: path
    Default: FileSystemPath
    Range: ...
    Restart Type: Initial Node Restart
      Requires a rolling restart of the cluster; each data node must be restarted with --initial. (NDB 9.5.0)

    This parameter specifies the directory in which backups are placed.

    Important

    The string '/BACKUP' is always appended to this value. For example, if you set the value of BackupDataDir to /var/lib/cluster-data, then all backups are stored under /var/lib/cluster-data/BACKUP. This also means that the effective default backup location is the directory named BACKUP under the location specified by the FileSystemPath parameter.
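
    The following fragment shows these three directory parameters used together; the paths are illustrative examples only:

      [ndbd default]
      DataDir = /var/lib/mysql-cluster
      FileSystemPath = /var/lib/mysql-cluster
      BackupDataDir = /var/lib/cluster-data
      # Backups are then written under /var/lib/cluster-data/BACKUP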

Data Memory, Index Memory, and String Memory

DataMemory and IndexMemory are [ndbd] parameters specifying the size of memory segments used to store the actual records and their indexes. In setting values for these, it is important to understand how DataMemory is used, as it usually needs to be updated to reflect actual usage by the cluster.

Note

IndexMemory is deprecated, and subject to removal in a future version of NDB Cluster. See the descriptions that follow for further information.

  • DataMemory

    Version (or later): NDB 9.5.0
    Type or units: bytes
    Default: 98M
    Range: 1M - 16T
    Restart Type: Node Restart
      Requires a rolling restart of the cluster. (NDB 9.5.0)

    This parameter defines the amount of space (in bytes) available for storing database records. The entire amount specified by this value is allocated in memory, so it is extremely important that the machine has sufficient physical memory to accommodate it.

    The memory allocated by DataMemory is used to store both the actual records and indexes. There is a 16-byte overhead on each record; an additional amount for each record is incurred because it is stored in a 32KB page with a 128-byte page overhead (see below). There is also a small amount wasted per page due to the fact that each record is stored in only one page.

    For variable-size table attributes, the data is stored on separate data pages, allocated from DataMemory. Variable-length records use a fixed-size part with an extra overhead of 4 bytes to reference the variable-size part. The variable-size part has 2 bytes overhead plus 2 bytes per attribute.

    The maximum record size is 30000 bytes.

    Resources assigned to DataMemory are used for storing all data and indexes. (Any memory configured as IndexMemory is automatically added to that used by DataMemory to form a common resource pool.)

    The memory space allocated by DataMemory consists of 32KB pages, which are allocated to table fragments. Each table is normally partitioned into the same number of fragments as there are data nodes in the cluster. Thus, for each node, there are the same number of fragments as are set in NoOfReplicas.

    Once a page has been allocated, it is currently not possible to return it to the pool of free pages, except by deleting the table. (This also means that DataMemory pages, once allocated to a given table, cannot be used by other tables.) Performing a data node recovery also compresses the partition, because all records are inserted into empty partitions from other live nodes.

    The DataMemory memory space also contains UNDO information: for each update, a copy of the unaltered record is allocated in DataMemory. There is also a reference to each copy in the ordered table indexes. Unique hash indexes are updated only when the unique index columns are updated, in which case a new entry in the index table is inserted and the old entry is deleted upon commit. For this reason, it is also necessary to allocate enough memory to handle the largest transactions performed by applications using the cluster. In any case, performing a few large transactions holds no advantage over using many smaller ones, for the following reasons:

    • Large transactions are not any faster than smaller ones

    • Large transactions increase the number of operations that are lost and must be repeated in the event of transaction failure

    • Large transactions use more memory

    The default value for DataMemory is 98MB. The minimum value is 1MB. There is no maximum size, but in reality the maximum size has to be adapted so that the process does not start swapping when the limit is reached. This limit is determined by the amount of physical RAM available on the machine and by the amount of memory that the operating system may commit to any one process. 32-bit operating systems are generally limited to 2-4GB per process; 64-bit operating systems can use more. For large databases, it may be preferable to use a 64-bit operating system for this reason.

  • IndexMemory

    Version (or later): NDB 9.5.0
    Type or units: bytes
    Default: 0
    Range: 1M - 1T
    Deprecated: Yes (in NDB 7.6)
    Restart Type: Node Restart
      Requires a rolling restart of the cluster. (NDB 9.5.0)

    The IndexMemory parameter is deprecated (and subject to future removal); any memory assigned to IndexMemory is allocated instead to the same pool as DataMemory, which is solely responsible for all resources needed for storing data and indexes in memory. In NDB 9.5, the use of IndexMemory in the cluster configuration file triggers a warning from the management server.

    You can estimate the size of a hash index using this formula:

      size = ((fragments * 32K) + (rows * 18)) * fragment_replicas

    fragments is the number of fragments, fragment_replicas is the number of fragment replicas (normally 2), and rows is the number of rows. If a table has one million rows, eight fragments, and two fragment replicas, the expected index memory usage is calculated as shown here:

      ((8 * 32K) + (1000000 * 18)) * 2
        = ((8 * 32768) + (1000000 * 18)) * 2
        = (262144 + 18000000) * 2
        = 18262144 * 2
        = 36524288 bytes = ~35MB

    Index statistics for ordered indexes (when these are enabled) are stored in the mysql.ndb_index_stat_sample table. Since this table has a hash index, this adds to index memory usage. An upper bound to the number of rows for a given ordered index can be calculated as follows:

      sample_size = key_size + ((key_attributes + 1) * 4)

      sample_rows = IndexStatSaveSize
                    * ((0.01 * IndexStatSaveScale * log2(rows * sample_size)) + 1)
                    / sample_size

    In the preceding formula, key_size is the size of the ordered index key in bytes, key_attributes is the number of attributes in the ordered index key, and rows is the number of rows in the base table.

    Assume that table t1 has 1 million rows and an ordered index named ix1 on two four-byte integers. Assume in addition that IndexStatSaveSize and IndexStatSaveScale are set to their default values (32K and 100, respectively). Using the two preceding formulas, we can calculate as follows:

      sample_size = 8 + ((1 + 2) * 4) = 20 bytes

      sample_rows = 32K
                    * ((0.01 * 100 * log2(1000000 * 20)) + 1)
                    / 20
                  = 32768 * ((1 * ~16.811) + 1) / 20
                  = 32768 * ~17.811 / 20
                  = ~29182 rows

    The expected index memory usage is thus 2 * 18 * 29182 = ~1050550 bytes.

    The minimum and default value for this parameter is 0 (zero).

  • StringMemory

    Version (or later): NDB 9.5.0
    Type or units: % or bytes
    Default: 25
    Range: 0 - 4294967039 (0xFFFFFEFF)
    Restart Type: System Restart
      Requires a complete shutdown and restart of the cluster. (NDB 9.5.0)

    This parameter determines how much memory is allocated for strings such as table names, and is specified in an [ndbd] or [ndbd default] section of the config.ini file. A value between 0 and 100 inclusive is interpreted as a percent of the maximum default value, which is calculated based on a number of factors including the number of tables, maximum table name size, maximum size of .FRM files, MaxNoOfTriggers, maximum column name size, and maximum default column value.

    A value greater than 100 is interpreted as a number of bytes.

    The default value is 25, that is, 25 percent of the default maximum.

    Under most circumstances, the default value should be sufficient, but when you have a great many NDB tables (1000 or more), it is possible to get Error 773 (Out of string memory, please modify StringMemory config parameter: Permanent error: Schema error), in which case you should increase this value. 25 (25 percent) is not excessive, and should prevent this error from recurring in all but the most extreme conditions.

The following example illustrates how memory is used for a table. Consider this table definition:

CREATE TABLE example (
  a INT NOT NULL,
  b INT NOT NULL,
  c INT NOT NULL,
  PRIMARY KEY(a),
  UNIQUE(b)
) ENGINE=NDBCLUSTER;

For each record, there are 12 bytes of data plus 12 bytes of overhead. Having no nullable columns saves 4 bytes of overhead. In addition, we have two ordered indexes on columns a and b consuming roughly 10 bytes each per record. There is a primary key hash index on the base table using roughly 29 bytes per record. The unique constraint is implemented by a separate table with b as primary key and a as a column. This other table consumes an additional 29 bytes of index memory per record in the example table, as well as 8 bytes of record data plus 12 bytes of overhead.

Thus, for one million records, we need 58MB for index memory to handle the hash indexes for the primary key and the unique constraint. We also need 64MB for the records of the base table and the unique index table, plus the two ordered index tables.
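
These figures can be derived from the per-record sizes given above, as shown here:

  index memory: (29 + 29) bytes * 1000000 records = 58MB

  data memory:  (12 + 12) bytes * 1000000 records (base table)      = 24MB
                + (8 + 12) bytes * 1000000 records (unique index)   = 20MB
                + 2 * ~10 bytes * 1000000 records (ordered indexes) = 20MB
                                                                    = 64MB in total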

You can see that hash indexes take up a fair amount of memory space; however, they provide very fast access to the data in return. They are also used in NDB Cluster to handle uniqueness constraints.

Currently, the only partitioning algorithm is hashing and ordered indexes are local to each node. Thus, ordered indexes cannot be used to handle uniqueness constraints in the general case.

An important point for both IndexMemory and DataMemory is that the total database size is the sum of all data memory and all index memory for each node group. Each node group is used to store replicated information, so if there are four nodes with two fragment replicas, there are two node groups. Thus, the total data memory available is 2 × DataMemory for each data node.

It is highly recommended that DataMemory and IndexMemory be set to the same values for all nodes. Data distribution is even over all nodes in the cluster, so the maximum amount of space available for any node can be no greater than that of the smallest node in the cluster.

DataMemory can be changed, but decreasing it can be risky; doing so can easily lead to a node or even an entire NDB Cluster that is unable to restart due to there being insufficient memory space. Increasing these values should be acceptable, but it is recommended that such upgrades are performed in the same manner as a software upgrade, beginning with an update of the configuration file, and then restarting the management server followed by restarting each data node in turn.

MinFreePct.  A proportion (5% by default) of data node resources including DataMemory is kept in reserve to ensure that the data node does not exhaust its memory when performing a restart. This can be adjusted using the MinFreePct data node configuration parameter (default 5).

Version (or later): NDB 9.5.0
Type or units: unsigned
Default: 5
Range: 0 - 100
Restart Type: Node Restart
  Requires a rolling restart of the cluster. (NDB 9.5.0)

Updates do not increase the amount of index memory used. Inserts take effect immediately; however, rows are not actually deleted until the transaction is committed.

Transaction parameters.  The next few [ndbd] parameters that we discuss are important because they affect the number of parallel transactions and the sizes of transactions that can be handled by the system. MaxNoOfConcurrentTransactions sets the number of parallel transactions possible in a node. MaxNoOfConcurrentOperations sets the number of records that can be in the update phase or locked simultaneously.

Both of these parameters (especially MaxNoOfConcurrentOperations) are likely targets for users setting specific values and not using the default value. The default value is set for systems using small transactions, to ensure that these do not use excessive memory.

MaxDMLOperationsPerTransaction sets the maximum number of DML operations that can be performed in a given transaction.

  • MaxNoOfConcurrentTransactions

    Version (or later): NDB 9.5.0
    Type or units: integer
    Default: 4096
    Range: 32 - 4294967039 (0xFFFFFEFF)
    Deprecated: Yes (in NDB 8.0)
    Restart Type: Node Restart
      Requires a rolling restart of the cluster. (NDB 9.5.0)

    Each cluster data node requires a transaction record for each active transaction in the cluster. The task of coordinating transactions is distributed among all of the data nodes. The total number of transaction records in the cluster is the number of transactions in any given node times the number of nodes in the cluster.

    Transaction records are allocated to individual MySQL servers. Each connection to a MySQL server requires at least one transaction record, plus an additional transaction object per table accessed by that connection. This means that a reasonable minimum for the total number of transactions in the cluster can be expressed as

      TotalNoOfConcurrentTransactions =
          (maximum number of tables accessed in any single transaction + 1)
          * number of SQL nodes

    Suppose that there are 10 SQL nodes using the cluster. A single join involving 10 tables requires 11 transaction records; if there are 10 such joins in a transaction, then 10 * 11 = 110 transaction records are required for this transaction, per MySQL server, or 110 * 10 = 1100 transaction records total. Each data node can be expected to handle TotalNoOfConcurrentTransactions / number of data nodes. For an NDB Cluster having 4 data nodes, this would mean setting MaxNoOfConcurrentTransactions on each data node to 1100 / 4 = 275. In addition, you should provide for failure recovery by ensuring that a single node group can accommodate all concurrent transactions; in other words, that each data node's MaxNoOfConcurrentTransactions is sufficient to cover a number of transactions equal to TotalNoOfConcurrentTransactions / number of node groups. If this cluster has a single node group, then MaxNoOfConcurrentTransactions should be set to 1100 (the same as the total number of concurrent transactions for the entire cluster).
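
    The arithmetic in this example can be summarized as shown here:

      per SQL node:   10 joins * (10 tables + 1) = 110 transaction records
      cluster total:  110 * 10 SQL nodes         = 1100 transaction records
      per data node:  1100 / 4 data nodes        = 275
      per node group: 1100 / 1 node group        = 1100  (value to use here)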

    In addition, each transaction involves at least one operation; for this reason, the value set for MaxNoOfConcurrentTransactions should always be no more than the value of MaxNoOfConcurrentOperations.

    This parameter must be set to the same value for all cluster data nodes. This is due to the fact that, when a data node fails, the oldest surviving node re-creates the transaction state of all transactions that were ongoing in the failed node.

    It is possible to change this value using a rolling restart, but the amount of traffic on the cluster must be such that no more transactions occur than the lower of the old and new levels while this is taking place.

    The default value is 4096.

  • MaxNoOfConcurrentOperations

    Version (or later): NDB 9.5.0
    Type or units: integer
    Default: 32K
    Range: 32 - 4294967039 (0xFFFFFEFF)
    Restart Type: Node Restart
      Requires a rolling restart of the cluster. (NDB 9.5.0)

    It is a good idea to adjust the value of this parameter according to the size and number of transactions. When performing transactions which involve only a few operations and records, the default value for this parameter is usually sufficient. Performing large transactions involving many records usually requires that you increase its value.

    Records are kept for each transaction updating cluster data, both in the transaction coordinator and in the nodes where the actual updates are performed. These records contain state information needed to find UNDO records for rollback, lock queues, and other purposes.

    This parameter should be set at a minimum to the number of records to be updated simultaneously in transactions, divided by the number of cluster data nodes. For example, in a cluster which has four data nodes and which is expected to handle one million concurrent updates using transactions, you should set this value to 1000000 / 4 = 250000. To help provide resiliency against failures, it is suggested that you set this parameter to a value that is high enough to permit an individual data node to handle the load for its node group. In other words, you should set the value equal to total number of concurrent operations / number of node groups. (In the case where there is a single node group, this is the same as the total number of concurrent operations for the entire cluster.)
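
    For the example just given, the two sizing rules work out as shown here:

      per data node:  1000000 operations / 4 data nodes = 250000
      per node group: 1000000 operations / 1 node group = 1000000  (value providing failure resiliency)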

    Because each transaction always involves at least one operation, the value of MaxNoOfConcurrentOperations should always be greater than or equal to the value of MaxNoOfConcurrentTransactions.

    Read queries which set locks also cause operation records to be created. Some extra space is allocated within individual nodes to accommodate cases where the distribution is not perfect over the nodes.

    When queries make use of the unique hash index, there are actually two operation records used per record in the transaction. The first record represents the read in the index table and the second handles the operation on the base table.

    The default value is 32768.

    This parameter actually handles two values that can be configured separately. The first of these specifies how many operation records are to be placed with the transaction coordinator. The second part specifies how many operation records are to be local to the database.

    A very large transaction performed on an eight-node cluster requires as many operation records in the transaction coordinator as there are reads, updates, and deletes involved in the transaction. However, the operation records are spread over all eight nodes. Thus, if it is necessary to configure the system for one very large transaction, it is a good idea to configure the two parts separately. MaxNoOfConcurrentOperations is always used to calculate the number of operation records in the transaction coordinator portion of the node.

    It is also important to have an idea of the memory requirements for operation records. These consume about 1KB per record.

  • MaxNoOfLocalOperations

    Version (or later): NDB 9.5.0
    Type or units: integer
    Default: UNDEFINED
    Range: 32 - 4294967039 (0xFFFFFEFF)
    Deprecated: Yes (in NDB 8.0)
    Restart Type: Node Restart
      Requires a rolling restart of the cluster. (NDB 9.5.0)

    By default, this parameter is calculated as 1.1 × MaxNoOfConcurrentOperations. This fits systems with many simultaneous transactions, none of them being very large. If there is a need to handle one very large transaction at a time and there are many nodes, it is a good idea to override the default value by explicitly specifying this parameter.

    This parameter is deprecated and subject to removal in a future NDB Cluster release. In addition, this parameter is incompatible with the TransactionMemory parameter; if you try to set values for both parameters in the cluster configuration file (config.ini), the management server refuses to start.

  • MaxDMLOperationsPerTransaction

    Version (or later): NDB 9.5.0
    Type or units: operations (DML)
    Default: 4294967295
    Range: 32 - 4294967295
    Restart Type: Node Restart
      Requires a rolling restart of the cluster. (NDB 9.5.0)

    This parameter limits the size of a transaction. The transaction is aborted if it requires more than this many DML operations.

    The value of this parameter cannot exceed that set for MaxNoOfConcurrentOperations.

Transaction temporary storage.  The next set of [ndbd] parameters is used to determine temporary storage when executing a statement that is part of a Cluster transaction. All records are released when the statement is completed and the cluster is waiting for the commit or rollback.

The default values for these parameters are adequate for most situations. However, users with a need to support transactions involving large numbers of rows or operations may need to increase these values to enable better parallelism in the system, whereas users whose applications require relatively small transactions can decrease the values to save memory.

  • MaxNoOfConcurrentIndexOperations

    Version (or later): NDB 9.5.0
    Type or units: integer
    Default: 8K
    Range: 0 - 4294967039 (0xFFFFFEFF)
    Deprecated: Yes (in NDB 8.0)
    Restart Type: Node Restart
      Requires a rolling restart of the cluster. (NDB 9.5.0)

    For queries using a unique hash index, another temporary set of operation records is used during a query's execution phase. This parameter sets the size of that pool of records. Thus, this record is allocated only while executing a part of a query. As soon as this part has been executed, the record is released. The state needed to handle aborts and commits is handled by the normal operation records, where the pool size is set by the parameter MaxNoOfConcurrentOperations.

    The default value of this parameter is 8192. Only in rare cases of extremely high parallelism using unique hash indexes should it be necessary to increase this value. Using a smaller value is possible and can save memory if the DBA is certain that a high degree of parallelism is not required for the cluster.

    This parameter is deprecated and subject to removal in a future NDB Cluster release. In addition, this parameter is incompatible with the TransactionMemory parameter; if you try to set values for both parameters in the cluster configuration file (config.ini), the management server refuses to start.

  • MaxNoOfFiredTriggers

    Version (or later): NDB 9.5.0
    Type or units: integer
    Default: 4000
    Range: 0 - 4294967039 (0xFFFFFEFF)
    Deprecated: Yes (in NDB 8.0)
    Restart Type: Node Restart
      Requires a rolling restart of the cluster. (NDB 9.5.0)

    The default value of MaxNoOfFiredTriggers is 4000, which is sufficient for most situations. In some cases it can even be decreased if the DBA feels certain the need for parallelism in the cluster is not high.

    A record is created when an operation is performed that affects a unique hash index. Inserting or deleting a record in a table with unique hash indexes or updating a column that is part of a unique hash index fires an insert or a delete in the index table. The resulting record is used to represent this index table operation while waiting for the original operation that fired it to complete. This operation is short-lived but can still require a large number of records in its pool for situations with many parallel write operations on a base table containing a set of unique hash indexes.

    This parameter is deprecated and subject to removal in a future NDB Cluster release. In addition, this parameter is incompatible with the TransactionMemory parameter; if you try to set values for both parameters in the cluster configuration file (config.ini), the management server refuses to start.

  • TransactionBufferMemory

    Version (or later): NDB 9.5.0
    Type or units: bytes
    Default: 1M
    Range: 1K - 4294967039 (0xFFFFFEFF)
    Restart Type: Node Restart
      Requires a rolling restart of the cluster. (NDB 9.5.0)

    The memory affected by this parameter is used for tracking operations fired when updating index tables and reading unique indexes. This memory is used to store the key and column information for these operations. It is only very rarely that the value for this parameter needs to be altered from the default.

    The default value for TransactionBufferMemory is 1MB.

    Normal read and write operations use a similar buffer, whose usage is even more short-lived. The compile-time parameter ZATTRBUF_FILESIZE (found in ndb/src/kernel/blocks/Dbtc/Dbtc.hpp) is set to 4000 × 128 bytes (500KB). A similar buffer for key information, ZDATABUF_FILESIZE (also in Dbtc.hpp), contains 4000 × 16 = 62.5KB of buffer space. Dbtc is the module that handles transaction coordination.

Transaction resource allocation parameters.  The parameters in the following list are used to allocate transaction resources in the transaction coordinator (DBTC). Leaving any one of these set to the default (0) dedicates transaction memory for 25% of estimated total data node usage for the corresponding resource. The actual maximum possible values for these parameters are typically limited by the amount of memory available to the data node; setting them has no impact on the total amount of memory allocated to the data node. In addition, you should keep in mind that they control numbers of reserved internal records for the data node independent of any settings for MaxDMLOperationsPerTransaction, MaxNoOfConcurrentIndexOperations, MaxNoOfConcurrentOperations, MaxNoOfConcurrentScans, MaxNoOfConcurrentTransactions, MaxNoOfFiredTriggers, MaxNoOfLocalScans, or TransactionBufferMemory (see Transaction parameters and Transaction temporary storage).

  • ReservedConcurrentIndexOperations

    Version (or later): NDB 9.5.0
    Type or units: numeric
    Default: 0
    Range: 0 - 4294967039 (0xFFFFFEFF)
    Restart Type: Node Restart
      Requires a rolling restart of the cluster. (NDB 9.5.0)

    Number of simultaneous index operations having dedicated resources on one data node.

  • ReservedConcurrentOperations

    Version (or later): NDB 9.5.0
    Type or units: numeric
    Default: 0
    Range: 0 - 4294967039 (0xFFFFFEFF)
    Restart Type: Node Restart
      Requires a rolling restart of the cluster. (NDB 9.5.0)

    Number of simultaneous operations having dedicated resources in transaction coordinators on one data node.

  • ReservedConcurrentScans

    Version (or later): NDB 9.5.0
    Type or units: numeric
    Default: 0
    Range: 0 - 4294967039 (0xFFFFFEFF)
    Restart Type: Node Restart
      Requires a rolling restart of the cluster. (NDB 9.5.0)

    Number of simultaneous scans having dedicated resources on one data node.

  • ReservedConcurrentTransactions

    Version (or later): NDB 9.5.0
    Type or units: numeric
    Default: 0
    Range: 0 - 4294967039 (0xFFFFFEFF)
    Restart Type: Node Restart
      Requires a rolling restart of the cluster. (NDB 9.5.0)

    Number of simultaneous transactions having dedicated resources on one data node.

  • ReservedFiredTriggers

    Version (or later): NDB 9.5.0
    Type or units: numeric
    Default: 0
    Range: 0 - 4294967039 (0xFFFFFEFF)
    Restart Type: Node Restart
      Requires a rolling restart of the cluster. (NDB 9.5.0)

    Number of triggers that have dedicated resources on one data node (ndbd).

  • ReservedLocalScans

    Version (or later): NDB 9.5.0
    Type or units: numeric
    Default: 0
    Range: 0 - 4294967039 (0xFFFFFEFF)
    Restart Type: Node Restart
      Requires a rolling restart of the cluster. (NDB 9.5.0)

    Number of simultaneous fragment scans having dedicated resources on one data node.

  • ReservedTransactionBufferMemory

    Version (or later): NDB 9.5.0
    Type or units: numeric
    Default: 0
    Range: 0 - 4294967039 (0xFFFFFEFF)
    Deprecated: Yes (in NDB 8.0)
    Restart Type: Node Restart
      Requires a rolling restart of the cluster. (NDB 9.5.0)

    Dynamic buffer space (in bytes) for key and attribute data allocated to each data node.

  • TransactionMemory

    Version (or later): NDB 9.5.0
    Type or units: bytes
    Default: 0
    Range: 0 - 16384G
    Restart Type: Node Restart
      Requires a rolling restart of the cluster. (NDB 9.5.0)

    Important

    A number of configuration parameters are incompatible with TransactionMemory; it is not possible to set any of these parameters concurrently with TransactionMemory, and if you attempt to do so, the management server is unable to start (see Parameters incompatible with TransactionMemory).

    This parameter determines the memory (in bytes) allocated for transactions on each data node. Setting of transaction memory is handled as follows:

    • If TransactionMemory is set, this value is used for determining transaction memory.

    • Otherwise, transaction memory is calculated as it was prior to NDB 8.0.

    Parameters incompatible with TransactionMemory.  The following parameters cannot be used concurrently with TransactionMemory and are therefore deprecated:

    • MaxNoOfConcurrentIndexOperations

    • MaxNoOfFiredTriggers

    • MaxNoOfLocalOperations

    • MaxNoOfLocalScans

    Explicitly setting any of the parameters just listed when TransactionMemory has also been set in the cluster configuration file (config.ini) keeps the management node from starting.

    For more information regarding resource allocation in NDB Cluster data nodes, see Section 25.4.3.13, “Data Node Memory Management”.
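
    A sketch of setting this parameter in config.ini is shown here; the 2G value is an arbitrary example, not a recommendation:

      [ndbd default]
      # Do not combine with MaxNoOfConcurrentIndexOperations, MaxNoOfFiredTriggers,
      # MaxNoOfLocalOperations, or MaxNoOfLocalScans; the management server
      # refuses to start if any of these is also set.
      TransactionMemory = 2G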

Scans and buffering.  There are additional [ndbd] parameters in the Dblqh module (in ndb/src/kernel/blocks/Dblqh/Dblqh.hpp) that affect reads and updates. These include ZATTRINBUF_FILESIZE, set by default to 10000 × 128 bytes (1250KB), and ZDATABUF_FILE_SIZE, set by default to 10000 × 16 bytes (roughly 156KB) of buffer space. To date, there have been neither any reports from users nor any results from our own extensive tests suggesting that either of these compile-time limits should be increased.

  • BatchSizePerLocalScan

    Version (or later): NDB 9.5.0
    Type or units: integer
    Default: 256
    Range: 1 - 992
    Deprecated: Yes (in NDB 8.0)
    Restart Type: Node Restart
      Requires a rolling restart of the cluster. (NDB 9.5.0)

    This parameter is used to calculate the number of lock records used to handle concurrent scan operations. It is deprecated, and subject to removal in a future release.

    BatchSizePerLocalScan has a strong connection to the BatchSize defined in the SQL nodes.

  • LongMessageBuffer

    Version (or later): NDB 9.5.0
    Type or units: bytes
    Default: 64M
    Range: 512K - 4294967039 (0xFFFFFEFF)
    Restart Type: Node Restart
      Requires a rolling restart of the cluster. (NDB 9.5.0)

    This is an internal buffer used for passing messages within individual nodes and between nodes. The default is 64MB.

    This parameter seldom needs to be changed from the default.

  • MaxFKBuildBatchSize

    Version (or later): NDB 9.5.0
    Type or units: integer
    Default: 64
    Range: 16 - 512
    Restart Type: Node Restart
      Requires a rolling restart of the cluster. (NDB 9.5.0)

    Maximum scan batch size used for building foreign keys. Increasing the value set for this parameter may speed up foreign key builds, at the expense of a greater impact on ongoing traffic.

  • MaxNoOfConcurrentScans

    Version (or later): NDB 9.5.0
    Type or units: integer
    Default: 256
    Range: 2 - 500
    Restart Type: Node Restart
      Requires a rolling restart of the cluster. (NDB 9.5.0)

    This parameter is used to control the number of parallel scans that can be performed in the cluster. Each transaction coordinator can handle the number of parallel scans defined for this parameter. Each scan query is performed by scanning all partitions in parallel. Each partition scan uses a scan record in the node where the partition is located, the number of records being the value of this parameter times the number of nodes. The cluster should be able to sustain MaxNoOfConcurrentScans scans concurrently from all nodes in the cluster.

    Scans are actually performed in two cases. The first of these cases occurs when no hash or ordered indexes exist to handle the query, in which case the query is executed by performing a full table scan. The second case is encountered when there is no hash index to support the query but there is an ordered index. Using the ordered index means executing a parallel range scan. The order is kept on the local partitions only, so it is necessary to perform the index scan on all partitions.

    The default value of MaxNoOfConcurrentScans is 256. The maximum value is 500.

  • MaxNoOfLocalScans

    Version (or later): NDB 9.5.0
    Type or units: integer
    Default: 4 * MaxNoOfConcurrentScans * [# of data nodes] + 2
    Range: 32 - 4294967039 (0xFFFFFEFF)
    Deprecated: Yes (in NDB 8.0)
    Restart Type: Node Restart
      Requires a rolling restart of the cluster. (NDB 9.5.0)

    Specifies the number of local scan records if many scans are not fully parallelized. When the number of local scan records is not provided, it is calculated as shown here:

    4 * MaxNoOfConcurrentScans * [# data nodes] + 2
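
    For example, with the default MaxNoOfConcurrentScans (256) and four data nodes, the calculated value is:

      4 * 256 * 4 + 2 = 4098 local scan records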

    This parameter is deprecated and subject to removal in a future NDB Cluster release. In addition, this parameter is incompatible with the TransactionMemory parameter; if you try to set values for both parameters in the cluster configuration file (config.ini), the management server refuses to start.

  • MaxParallelCopyInstances

    Version (or later): NDB 9.5.0
    Type or units: integer
    Default: 0
    Range: 0 - 64
    Restart Type: Node Restart
      Requires a rolling restart of the cluster. (NDB 9.5.0)

    This parameter sets the parallelization used in the copy phase of a node restart or system restart, when a node that is currently just starting is synchronized with a node that already has current data by copying over any changed records from the node that is up to date. Because full parallelism in such cases can lead to overload situations, MaxParallelCopyInstances provides a means to decrease it. This parameter's default value is 0, which means that the effective parallelism is equal to the number of LDM instances in the node just starting, as well as in the node updating it.

  • MaxParallelScansPerFragment

    Version (or later): NDB 9.5.0
    Type or units: bytes
    Default: 256
    Range: 1 - 4294967039 (0xFFFFFEFF)
    Restart Type: Node Restart
      Requires a rolling restart of the cluster. (NDB 9.5.0)

    It is possible to configure the maximum number of parallel scans (TUP scans and TUX scans) allowed before they begin queuing for serial handling. You can increase this value to take advantage of any unused CPU when performing a large number of scans in parallel, and so improve their performance.

  • MaxReorgBuildBatchSize

    Version (or later): NDB 9.5.0
    Type or units: integer
    Default: 64
    Range: 16 - 512
    Restart Type: Node Restart
      Requires a rolling restart of the cluster. (NDB 9.5.0)

    Maximum scan batch size used for reorganization of table partitions. Increasing the value set for this parameter may speed up reorganization, at the expense of a greater impact on ongoing traffic.

  • MaxUIBuildBatchSize

    Version (or later): NDB 9.5.0
    Type or units: integer
    Default: 64
    Range: 16 - 512
    Restart Type: Node Restart
      Requires a rolling restart of the cluster. (NDB 9.5.0)

    Maximum scan batch size used for building unique keys. Increasing the value set for this parameter may speed up such builds, at the expense of a greater impact on ongoing traffic.

Memory Allocation

MaxAllocate

Version (or later): NDB 9.5.0
Type or units: unsigned
Default: 32M
Range: 1M - 1G
Deprecated: Yes (in NDB 8.0)
Restart Type: Node Restart
  Requires a rolling restart of the cluster. (NDB 9.5.0)

This parameter was used in older versions of NDB Cluster, but has no effect in NDB 9.5. It is deprecated and subject to removal in a future release.

Multiple Transporters

NDB allocates multiple transporters for communication between pairs of data nodes. The number of transporters so allocated can be influenced by setting an appropriate value for the NodeGroupTransporters parameter.

NodeGroupTransporters

Version (or later): NDB 9.5.0
Type or units: integer
Default: 0
Range: 0 - 32
Restart Type: Node Restart
  Requires a rolling restart of the cluster. (NDB 9.5.0)

This parameter determines the number of transporters used between nodes in the same node group. The default value (0) means that the number of transporters used is the same as the number of LDMs in the node. This should be sufficient for most use cases; thus it should seldom be necessary to change this value from its default.

Setting NodeGroupTransporters to a number greater than the number of LDM threads or the number of TC threads, whichever is higher, causes NDB to use the maximum of these two numbers of threads. This means that a value greater than this is effectively ignored.

Hash Map Size

DefaultHashMapSize

Version (or later): NDB 9.5.0
Type or units: LDM threads
Default: 240
Range: 0 - 3840
Restart Type: Node Restart
  Requires a rolling restart of the cluster. (NDB 9.5.0)

The original intended use for this parameter was to facilitate upgrades and especially downgrades to and from very old releases with differing default hash map sizes. This is not an issue when upgrading from NDB Cluster 7.3 (or later) to later versions.

Decreasing this parameter online after any tables have been created or modified with DefaultHashMapSize equal to 3840 is not currently supported.

Logging and checkpointing.  The following [ndbd] parameters control log and checkpoint behavior.

  • FragmentLogFileSize

    Version (or later): NDB 9.5.0
    Type or units: bytes
    Default: 16M
    Range: 4M - 1G
    Restart Type: Initial Node Restart
      Requires a rolling restart of the cluster; each data node must be restarted with --initial. (NDB 9.5.0)

    Setting this parameter enables you to control directly the size of redo log files. This can be useful in situations when NDB Cluster is operating under a high load and it is unable to close fragment log files quickly enough before attempting to open new ones (only 2 fragment log files can be open at one time); increasing the size of the fragment log files gives the cluster more time before having to open each new fragment log file. The default value for this parameter is 16M.

    For more information about fragment log files, see the description for NoOfFragmentLogFiles.

  • InitialNoOfOpenFiles

    Version (or later): NDB 9.5.0
    Type or units: files
    Default: 27
    Range: 20 - 4294967039 (0xFFFFFEFF)
    Restart Type: Node Restart
      Requires a rolling restart of the cluster. (NDB 9.5.0)

    This parameter sets the initial number of internal threads to allocate for open files.

    The default value is 27.

  • InitFragmentLogFiles

    Version (or later): NDB 9.5.0
    Type or units: [see values]
    Default: SPARSE
    Range: SPARSE, FULL
    Restart Type: Initial Node Restart
      Requires a rolling restart of the cluster; each data node must be restarted with --initial. (NDB 9.5.0)

    By default, fragment log files are created sparsely when performing an initial start of a data node; that is, depending on the operating system and file system in use, not all bytes are necessarily written to disk. However, it is possible to override this behavior and force all bytes to be written, regardless of the platform and file system type being used, by means of this parameter. InitFragmentLogFiles takes either of two values:

    • SPARSE. Fragment log files are created sparsely. This is the default value.

    • FULL. Force all bytes of the fragment log file to be written to disk.

    Depending on your operating system and file system, setting InitFragmentLogFiles=FULL may help eliminate I/O errors on writes to the redo log.

  • EnablePartialLcp

    Version (or later): NDB 9.5.0
    Type or units: boolean
    Default: true
    Range: ...
    Restart Type: Node Restart
      Requires a rolling restart of the cluster. (NDB 9.5.0)

    When true, enable partial local checkpoints: this means that each LCP records only part of the full database, plus any records containing rows changed since the last LCP; if no rows have changed, the LCP updates only the LCP control file and does not update any data files.

    If EnablePartialLcp is disabled (false), each LCP uses only a single file and writes a full checkpoint; this requires the least amount of disk space for LCPs, but increases the write load for each LCP. The default value is enabled (true). The proportion of space used by partial LCPs can be modified by the setting for the RecoveryWork configuration parameter.

    For more information about files and directories used for full and partial LCPs, see NDB Cluster Data Node File System Directory.

    Setting this parameter to false also disables the calculation of disk write speed used by the adaptive LCP control mechanism.

  • LcpScanProgressTimeout

    Version (or later): NDB 9.5.0
    Type or units: second
    Default: 180
    Range: 0 - 4294967039 (0xFFFFFEFF)
    Restart Type: Node Restart
      Requires a rolling restart of the cluster. (NDB 9.5.0)

    A local checkpoint fragment scan watchdog checks periodically for no progress in each fragment scan performed as part of a local checkpoint, and shuts down the node if there is no progress after a given amount of time has elapsed. This interval can be set using the LcpScanProgressTimeout data node configuration parameter, which sets the maximum time for which the local checkpoint can be stalled before the LCP fragment scan watchdog shuts down the node.

    The default value is 180 seconds. Setting this parameter to 0 disables the LCP fragment scan watchdog altogether.

  • MaxNoOfOpenFiles

    Version (or later): NDB 9.5.0
    Type or units: unsigned
    Default: 0
    Range: 20 - 4294967039 (0xFFFFFEFF)
    Restart Type: Node Restart
      Requires a rolling restart of the cluster. (NDB 9.5.0)

    This parameter sets a ceiling on how many internal threads to allocate for open files. Any situation requiring a change in this parameter should be reported as a bug.

    The default value is 0. However, the minimum value to which this parameter can be set is 20.

  • MaxNoOfSavedMessages

    Version (or later): NDB 9.5.0
    Type or units: integer
    Default: 25
    Range: 0 - 4294967039 (0xFFFFFEFF)
    Restart Type: Node Restart
      Requires a rolling restart of the cluster. (NDB 9.5.0)

    This parameter sets the maximum number of errors written in the error log as well as the maximum number of trace files that are kept before overwriting the existing ones. Trace files are generated when, for whatever reason, the node crashes.

    The default is 25, which sets these maximums to 25 error messages and 25 trace files.

  • MaxLCPStartDelay

    Version (or later): NDB 9.5.0
    Type or units: seconds
    Default: 0
    Range: 0 - 600
    Restart Type: Node Restart
      Requires a rolling restart of the cluster. (NDB 9.5.0)

    In parallel data node recovery, only table data is actually copied and synchronized in parallel; synchronization of metadata such as dictionary and checkpoint information is done in a serial fashion. In addition, recovery of dictionary and checkpoint information cannot be executed in parallel with performing of local checkpoints. This means that, when starting or restarting many data nodes concurrently, data nodes may be forced to wait while a local checkpoint is performed, which can result in longer node recovery times.

    It is possible to force a delay in the local checkpoint to permit more (and possibly all) data nodes to complete metadata synchronization; once each data node's metadata synchronization is complete, all of the data nodes can recover table data in parallel, even while the local checkpoint is being executed. To force such a delay, set MaxLCPStartDelay, which determines the number of seconds the cluster can wait to begin a local checkpoint while data nodes continue to synchronize metadata. This parameter should be set in the [ndbd default] section of the config.ini file, so that it is the same for all data nodes. The maximum value is 600; the default is 0.

  • NoOfFragmentLogFiles

    Version (or later): NDB 9.5.0
    Type or units: integer
    Default: 16
    Range: 3 - 4294967039 (0xFFFFFEFF)
    Restart Type: Initial Node Restart
      Requires a rolling restart of the cluster; each data node must be restarted with --initial. (NDB 9.5.0)

    This parameter sets the number of REDO log files for the node, and thus the amount of space allocated to REDO logging. Because the REDO log files are organized in a ring, it is extremely important that the first and last log files in the set (sometimes referred to as the head and tail log files, respectively) do not meet. When these approach one another too closely, the node begins aborting all transactions encompassing updates due to a lack of room for new log records.

    A REDO log record is not removed until both required local checkpoints have been completed since that log record was inserted. Checkpointing frequency is determined by its own set of configuration parameters discussed elsewhere in this chapter.

    The default parameter value is 16, which by default means 16 sets of 4 files of 16MB each, for a total of 1024MB. The size of the individual log files is configurable using the FragmentLogFileSize parameter. In scenarios requiring a great many updates, the value for NoOfFragmentLogFiles may need to be set as high as 300 or even higher to provide sufficient space for REDO logs.
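
    The total REDO log space can thus be calculated as shown here:

      total REDO space = NoOfFragmentLogFiles * 4 * FragmentLogFileSize
                       = 16 * 4 * 16MB   = 1024MB            (defaults)
                       = 300 * 4 * 16MB  = 19200MB (~19GB)    (update-heavy example)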

    If the checkpointing is slow and there are so many writes to the database that the log files are full and the log tail cannot be cut without jeopardizing recovery, all updating transactions are aborted with internal error code 410 (Out of log file space temporarily). This condition prevails until a checkpoint has completed and the log tail can be moved forward.

    Important

    This parameter cannot be changed on the fly; you must restart the node using --initial. If you wish to change this value for all data nodes in a running cluster, you can do so using a rolling node restart (using --initial when starting each data node).

  • RecoveryWork

    Version (or later): NDB 9.5.0
    Type or units: integer
    Default: 60
    Range: 25 - 100
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    Percentage of storage overhead for LCP files. This parameter has an effect only when EnablePartialLcp is true, that is, only when partial local checkpoints are enabled. A higher value means:

    • Fewer records are written for each LCP, but LCPs use more space

    • More work is needed during restarts

    A lower value for RecoveryWork means:

    • More records are written during each LCP, but LCPs require less space on disk

    • Less work during restart and thus faster restarts, at the expense of more work during normal operations

    For example, setting RecoveryWork to 60 means that the total size of an LCP is roughly 1 + 0.6 = 1.6 times the size of the data to be checkpointed. This means that 60% more work is required during the restore phase of a restart compared to the work done during a restart that uses full checkpoints. (This is more than compensated for during other phases of the restart, such that the restart as a whole is still faster when using partial LCPs than when using full LCPs.) In order not to fill up the redo log, it is necessary to write at 1 + (1 / RecoveryWork) times the rate of data changes during checkpoints, with RecoveryWork expressed as a fraction; thus, when RecoveryWork = 60, it is necessary to write at approximately 1 + (1 / 0.6) = 2.67 times the change rate. In other words, if changes are being written at 10 MByte per second, the checkpoint needs to be written at roughly 26.7 MByte per second.

    Setting RecoveryWork = 40 means that only 1.4 times the total LCP size is needed (and thus the restore phase takes 10 to 15 percent less time). In this case, the checkpoint write rate is 3.5 times the rate of change.

    The NDB source distribution includes a test program for simulating LCPs. lcp_simulator.cc can be found in storage/ndb/src/kernel/blocks/backup/. To compile and run it on Unix platforms, execute the commands shown here:

    $> gcc lcp_simulator.cc
    $> ./a.out

    This program has no dependencies other than stdio.h, and does not require a connection to an NDB cluster or a MySQL server. By default, it simulates 300 LCPs (three sets of 100 LCPs, each consisting of inserts, updates, and deletes, in turn), reporting the size of the LCP after each one. You can alter the simulation by changing the values of recovery_work, insert_work, and delete_work in the source and recompiling. For more information, see the source of the program.

  • InsertRecoveryWork

    Version (or later): NDB 9.5.0
    Type or units: integer
    Default: 40
    Range: 0 - 70
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    Percentage of RecoveryWork used for inserted rows. A higher value increases the number of writes during a local checkpoint, and decreases the total size of the LCP. A lower value decreases the number of writes during an LCP, but results in more space being used for the LCP, which means that recovery takes longer. This parameter has an effect only when EnablePartialLcp is true, that is, only when partial local checkpoints are enabled.
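    The following sketch simply writes out the partial-LCP parameters discussed here with their default values, as a starting point for experimentation:

    [ndbd default]
    EnablePartialLcp=true
    # Higher RecoveryWork: smaller checkpoint writes, more restart work
    RecoveryWork=60
    # Portion of RecoveryWork applied to inserted rows
    InsertRecoveryWork=40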

  • EnableRedoControl

    Version (or later): NDB 9.5.0
    Type or units: boolean
    Default: true
    Range: ...
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    Enable adaptive checkpointing speed for controlling redo log usage.

    When enabled (the default), EnableRedoControl allows the data nodes greater flexibility with regard to the rate at which they write LCPs to disk. More specifically, enabling this parameter means that higher write rates can be employed, so that LCPs can complete and redo logs be trimmed more quickly, thereby reducing recovery time and disk space requirements. This functionality allows data nodes to make better use of the higher rate of I/O and greater bandwidth available from modern solid-state storage devices and protocols, such as solid-state drives (SSDs) using Non-Volatile Memory Express (NVMe).

    When NDB is deployed on systems whose I/O or bandwidth is constrained relative to those employing solid-state technology, such as those using conventional hard disks (HDDs), the EnableRedoControl mechanism can easily cause the I/O subsystem to become saturated, increasing wait times for data node input and output. In particular, this can cause issues with NDB Disk Data tables which have tablespaces or log file groups sharing a constrained I/O subsystem with data node LCP and redo log files; such problems potentially include node or cluster failure due to GCP stop errors. Set EnableRedoControl to false to disable it in such situations. Setting EnablePartialLcp to false also disables the adaptive calculation.
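    For example, a cluster whose data nodes write LCPs and redo logs to conventional HDDs might disable the adaptive mechanism as shown in this sketch:

    [ndbd default]
    # Avoid saturating a constrained I/O subsystem (for example, HDDs)
    EnableRedoControl=false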

Metadata objects.  The next set of [ndbd] parameters defines pool sizes for metadata objects, used to define the maximum number of attributes, tables, indexes, and trigger objects used by indexes, events, and replication between clusters.

Note

These act merely as suggestions to the cluster, and any that are not specified revert to the default values shown.

  • MaxNoOfAttributes

    Version (or later): NDB 9.5.0
    Type or units: integer
    Default: 1000
    Range: 32 - 4294967039 (0xFFFFFEFF)
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    This parameter sets a suggested maximum number of attributes that can be defined in the cluster; like MaxNoOfTables, it is not intended to function as a hard upper limit.

    (In older NDB Cluster releases, this parameter was sometimes treated as a hard limit for certain operations. This caused problems with NDB Cluster Replication, when it was possible to create more tables than could be replicated, and sometimes led to confusion when it was possible [or not possible, depending on the circumstances] to create more than MaxNoOfAttributes attributes.)

    The default value is 1000, with the minimum possible value being 32. The maximum is 4294967039. Each attribute consumes around 200 bytes of storage per node, because all metadata is fully replicated on the servers.

    When setting MaxNoOfAttributes, it is important to prepare in advance for any ALTER TABLE statements that you might want to perform in the future. This is because, during the execution of ALTER TABLE on a Cluster table, 3 times the number of attributes as in the original table are used, and a good practice is to permit double this amount. For example, if the NDB Cluster table having the greatest number of attributes (greatest_number_of_attributes) has 100 attributes, a good starting point for the value of MaxNoOfAttributes would be 6 * greatest_number_of_attributes = 600.

    You should also estimate the average number of attributes per table and multiply this by MaxNoOfTables. If this value is larger than the value obtained in the previous paragraph, you should use the larger value instead.

    Assuming that you can create all desired tables without any problems, you should also verify that this number is sufficient by trying an actual ALTER TABLE after configuring the parameter. If this is not successful, increase MaxNoOfAttributes by another multiple of MaxNoOfTables and test it again.
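    As a worked example under assumed figures: if the widest table has 100 attributes (6 × 100 = 600), and you estimate an average of 10 attributes per table with MaxNoOfTables = 128 (10 × 128 = 1280), the larger result should be used:

    [ndbd default]
    MaxNoOfTables=128
    # max(6 * 100, 10 * 128) = 1280
    MaxNoOfAttributes=1280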

  • MaxNoOfTables

    Version (or later): NDB 9.5.0
    Type or units: integer
    Default: 128
    Range: 8 - 20320
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    A table object is allocated for each table and for each unique hash index in the cluster. This parameter sets a suggested maximum number of table objects for the cluster as a whole; like MaxNoOfAttributes, it is not intended to function as a hard upper limit.

    (In older NDB Cluster releases, this parameter was sometimes treated as a hard limit for certain operations. This caused problems with NDB Cluster Replication, when it was possible to create more tables than could be replicated, and sometimes led to confusion when it was possible [or not possible, depending on the circumstances] to create more than MaxNoOfTables tables.)

    For each attribute that has a BLOB data type, an extra table is used to store most of the BLOB data. These tables also must be taken into account when defining the total number of tables.

    The default value of this parameter is 128. The minimum is 8 and the maximum is 20320. Each table object consumes approximately 20KB per node.

    Note

    The sum of MaxNoOfTables, MaxNoOfOrderedIndexes, and MaxNoOfUniqueHashIndexes must not exceed 2^32 − 2 (4294967294).

  • MaxNoOfOrderedIndexes

    Version (or later): NDB 9.5.0
    Type or units: integer
    Default: 128
    Range: 0 - 4294967039 (0xFFFFFEFF)
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    For each ordered index in the cluster, an object is allocated describing what is being indexed and its storage segments. By default, each index so defined also defines an ordered index. Each unique index and primary key has both an ordered index and a hash index. MaxNoOfOrderedIndexes sets the total number of ordered indexes that can be in use in the system at any one time.

    The default value of this parameter is 128. Each index object consumes approximately 10KB of data per node.

    Note

    The sum of MaxNoOfTables, MaxNoOfOrderedIndexes, and MaxNoOfUniqueHashIndexes must not exceed 2^32 − 2 (4294967294).

  • MaxNoOfUniqueHashIndexes

    Version (or later): NDB 9.5.0
    Type or units: integer
    Default: 64
    Range: 0 - 4294967039 (0xFFFFFEFF)
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    For each unique index that is not a primary key, a special table is allocated that maps the unique key to the primary key of the indexed table. By default, an ordered index is also defined for each unique index. To prevent this, you must specify the USING HASH option when defining the unique index.

    The default value is 64. Each index consumes approximately 15KB per node.

    Note

    The sum of MaxNoOfTables, MaxNoOfOrderedIndexes, and MaxNoOfUniqueHashIndexes must not exceed 2^32 − 2 (4294967294).

  • MaxNoOfTriggers

    Version (or later): NDB 9.5.0
    Type or units: integer
    Default: 768
    Range: 0 - 4294967039 (0xFFFFFEFF)
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    Internal update, insert, and delete triggers are allocated for each unique hash index. (This means that three triggers are created for each unique hash index.) However, an ordered index requires only a single trigger object. Backups also use three trigger objects for each normal table in the cluster.

    Replication between clusters also makes use of internal triggers.

    This parameter sets the maximum number of trigger objects in the cluster.

    The default value is 768.

  • MaxNoOfSubscriptions

    Version (or later): NDB 9.5.0
    Type or units: unsigned
    Default: 0
    Range: 0 - 4294967039 (0xFFFFFEFF)
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    Each NDB table in an NDB Cluster requires a subscription in the NDB kernel. For some NDB API applications, it may be necessary or desirable to change this parameter. However, for normal usage with MySQL servers acting as SQL nodes, there is no need to do so.

    The default value for MaxNoOfSubscriptions is 0, which is treated as equal to MaxNoOfTables. Each subscription consumes 108 bytes.

  • MaxNoOfSubscribers

    Version (or later): NDB 9.5.0
    Type or units: unsigned
    Default: 0
    Range: 0 - 4294967039 (0xFFFFFEFF)
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    This parameter is of interest only when using NDB Cluster Replication. The default value is 0. It is treated as 2 * MaxNoOfTables + 2 * [number of API nodes]. There is one subscription per NDB table for each of two MySQL servers (one acting as the replication source and the other as the replica). Each subscriber uses 16 bytes of memory.

    When using circular replication, multi-source replication, and other replication setups involving more than 2 MySQL servers, you should increase this parameter to the number of mysqld processes included in replication (this is often, but not always, the same as the number of clusters). For example, if you have a circular replication setup using three NDB Clusters, with one mysqld attached to each cluster, and each of these mysqld processes acts as a source and as a replica, you should set MaxNoOfSubscribers equal to 3 * MaxNoOfTables.

    For more information, see Section 25.7, “NDB Cluster Replication”.
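    Continuing the circular replication example, and assuming MaxNoOfTables = 128, the corresponding setting would look like this sketch:

    [ndbd default]
    MaxNoOfTables=128
    # 3 clusters, one mysqld each: 3 * MaxNoOfTables
    MaxNoOfSubscribers=384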

  • MaxNoOfConcurrentSubOperations

    Version (or later): NDB 9.5.0
    Type or units: unsigned
    Default: 256
    Range: 0 - 4294967039 (0xFFFFFEFF)
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    This parameter sets a ceiling on the number of operations that can be performed by all API nodes in the cluster at one time. The default value (256) is sufficient for normal operations, and might need to be adjusted only in scenarios where there are a great many API nodes each performing a high volume of operations concurrently.

Boolean parameters.  The behavior of data nodes is also affected by a set of [ndbd] parameters taking on boolean values. These parameters can each be specified as TRUE by setting them equal to 1 or Y, and as FALSE by setting them equal to 0 or N.

  • CompressedLCP

    Version (or later): NDB 9.5.0
    Type or units: boolean
    Default: false
    Range: true, false
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    Setting this parameter to 1 causes local checkpoint files to be compressed. The compression used is equivalent to gzip --fast, and can save 50% or more of the space required on the data node to store uncompressed checkpoint files. Compressed LCPs can be enabled for individual data nodes, or for all data nodes (by setting this parameter in the [ndbd default] section of the config.ini file).

    Important

    You cannot restore a compressed local checkpoint to a cluster running a MySQL version that does not support this feature.

    The default value is 0 (disabled).

  • CrashOnCorruptedTuple

    Version (or later): NDB 9.5.0
    Type or units: boolean
    Default: true
    Range: true, false
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    When this parameter is enabled (the default), it forces a data node to shut down whenever it encounters a corrupted tuple.

  • Diskless

    Version (or later): NDB 9.5.0
    Type or units: true|false (1|0)
    Default: false
    Range: true, false
    Restart Type: Initial System Restart: Requires a complete shutdown of the cluster, wiping and restoring the cluster file system from a backup, and then restarting the cluster. (NDB 9.5.0)

    It is possible to specify NDB Cluster tables as diskless, meaning that tables are not checkpointed to disk and that no logging occurs. Such tables exist only in main memory. A consequence of using diskless tables is that neither the tables nor the records in those tables survive a crash. However, when operating in diskless mode, it is possible to run ndbd on a diskless computer.

    Important

    This feature causes the entire cluster to operate in diskless mode.

    When this feature is enabled, NDB Cluster online backup is disabled. In addition, a partial start of the cluster is not possible.

    Diskless is disabled by default.

  • EncryptedFileSystem

    Version (or later): NDB 9.5.0
    Type or units: unsigned
    Default: 0
    Range: 0 - 1
    Restart Type: Initial Node Restart: Requires a rolling restart of the cluster; each data node must be restarted with --initial. (NDB 9.5.0)

    Encrypt LCP and tablespace files, including undo logs and redo logs. Disabled by default (0); set to 1 to enable.

    Important

    When file system encryption is enabled, you must supply a password to each data node when starting it, using one of the options --filesystem-password or --filesystem-password-from-stdin. Otherwise, the data node cannot start.

    For more information, see Section 25.6.19.4, “File System Encryption for NDB Cluster”.
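    A minimal sketch of enabling file system encryption for all data nodes; each node must then be started with one of the password options named above:

    [ndbd default]
    # Data nodes must be restarted with --initial after enabling this
    EncryptedFileSystem=1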

  • LateAlloc

    Version (or later): NDB 9.5.0
    Type or units: numeric
    Default: 1
    Range: 0 - 1
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    Allocate memory for this data node after a connection to the management server has been established. Enabled by default.

  • LockPagesInMainMemory

    Version (or later): NDB 9.5.0
    Type or units: numeric
    Default: 0
    Range: 0 - 2
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    For a number of operating systems, including Solaris and Linux, it is possible to lock a process into memory and so avoid any swapping to disk. This can be used to help guarantee the cluster's real-time characteristics.

    This parameter takes one of the integer values 0, 1, or 2, which act as shown in the following list:

    • 0: Disables locking. This is the default value.

    • 1: Performs the lock after allocating memory for the process.

    • 2: Performs the lock before memory for the process is allocated.

    If the operating system is not configured to permit unprivileged users to lock pages, then the data node process making use of this parameter may have to be run as system root. (LockPagesInMainMemory uses the mlockall function. From Linux kernel 2.6.9, unprivileged users can lock memory as limited by max locked memory. For more information, see ulimit -l and http://linux.die.net/man/2/mlock).
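    For example, after checking the account's lock limit with ulimit -l, locking might be enabled cluster-wide as in this sketch:

    [ndbd default]
    # Lock the data node process in memory after memory is allocated
    LockPagesInMainMemory=1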

    Note

    In older NDB Cluster releases, this parameter was a Boolean. 0 or false was the default setting, and disabled locking. 1 or true enabled locking of the process after its memory was allocated. NDB Cluster 9.5 treats true or false for the value of this parameter as an error.

    Important

    Beginning with glibc 2.10, glibc uses per-thread arenas to reduce lock contention on a shared pool, which consumes real memory. In general, a data node process does not need per-thread arenas, since it does not perform any memory allocation after startup. (This difference in allocators does not appear to affect performance significantly.)

    The glibc behavior is intended to be configurable via the MALLOC_ARENA_MAX environment variable, but a bug in this mechanism prior to glibc 2.16 meant that this variable could not be set to less than 8, so that the wasted memory could not be reclaimed. (Bug #15907219; see also http://sourceware.org/bugzilla/show_bug.cgi?id=13137 for more information concerning this issue.)

    One possible workaround for this problem is to use the LD_PRELOAD environment variable to preload a jemalloc memory allocation library to take the place of that supplied with glibc.
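    For example, a data node might be started with jemalloc preloaded as in this sketch (the library path shown is illustrative and varies by platform and distribution):

    $> LD_PRELOAD=/usr/lib64/libjemalloc.so ndbmtd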

  • ODirect

    Version (or later): NDB 9.5.0
    Type or units: boolean
    Default: false
    Range: true, false
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    Enabling this parameter causes NDB to attempt using O_DIRECT writes for LCP, backups, and redo logs, often lowering kswapd and CPU usage. When using NDB Cluster on Linux, enable ODirect if you are using a 2.6 or later kernel.

    ODirect is disabled by default.

  • ODirectSyncFlag

    Version (or later): NDB 9.5.0
    Type or units: boolean
    Default: false
    Range: true, false
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    When this parameter is enabled, redo log writes are performed such that each completed file system write is handled as a call to fsync. The setting for this parameter is ignored in certain cases, such as when ODirect is not enabled.

    Disabled by default.

  • RequireCertificate

    Version (or later): NDB 9.5.0
    Type or units: boolean
    Default: false
    Range: ...
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    If this parameter is set to true, the data node looks for a key and a valid and current certificate in the TLS search path, and cannot start if it does not find them.

  • RequireTls

    Version (or later): NDB 9.5.0
    Type or units: boolean
    Default: false
    Range: ...
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    If this parameter is set to true, connections to this data node must be authenticated using TLS.

  • RestartOnErrorInsert

    Version (or later): NDB 9.5.0
    Type or units: error code
    Default: 2
    Range: 0 - 4
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    This feature is accessible only when building the debug version where it is possible to insert errors in the execution of individual blocks of code as part of testing.

    This feature is disabled by default.

  • StopOnError

    Version (or later): NDB 9.5.0
    Type or units: boolean
    Default: 1
    Range: 0, 1
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    This parameter specifies whether a data node process should exit or perform an automatic restart when an error condition is encountered.

    This parameter's default value is 1; this means that, by default, an error causes the data node process to halt.

    When an error is encountered andStopOnError is 0, the data node process is restarted.

    Users of MySQL Cluster Manager should note that, when StopOnError equals 1, this prevents the MySQL Cluster Manager agent from restarting any data nodes after it has performed its own restart and recovery. See Starting and Stopping the Agent on Linux, for more information.

  • UseShm

    Version (or later): NDB 9.5.0
    Type or units: boolean
    Default: false
    Range: true, false
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    Enable a shared memory connection between this data node and the API node also running on this host. Set to 1 to enable.

Controlling Timeouts, Intervals, and Disk Paging

There are a number of [ndbd] parameters specifying timeouts and intervals between various actions in Cluster data nodes. Most of the timeout values are specified in milliseconds. Any exceptions to this are mentioned where applicable.

  • TimeBetweenWatchDogCheck

    Version (or later): NDB 9.5.0
    Type or units: milliseconds
    Default: 6000
    Range: 70 - 4294967039 (0xFFFFFEFF)
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    To prevent the main thread from getting stuck in an endless loop at some point, a watchdog thread checks the main thread. This parameter specifies the number of milliseconds between checks. If the process remains in the same state after three checks, the watchdog thread terminates it.

    This parameter can easily be changed for purposes of experimentation or to adapt to local conditions. It can be specified on a per-node basis although there seems to be little reason for doing so.

    The default timeout is 6000 milliseconds (6 seconds).

  • TimeBetweenWatchDogCheckInitial

    Version (or later): NDB 9.5.0
    Type or units: milliseconds
    Default: 6000
    Range: 70 - 4294967039 (0xFFFFFEFF)
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    This is similar to the TimeBetweenWatchDogCheck parameter, except that TimeBetweenWatchDogCheckInitial controls the amount of time that passes between execution checks inside a storage node in the early start phases during which memory is allocated.

    The default timeout is 6000 milliseconds (6 seconds).

  • StartPartialTimeout

    Version (or later): NDB 9.5.0
    Type or units: milliseconds
    Default: 30000
    Range: 0 - 4294967039 (0xFFFFFEFF)
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    This parameter specifies how long the Cluster waits for all data nodes to come up before the cluster initialization routine is invoked. This timeout is used to avoid a partial Cluster startup whenever possible.

    This parameter is overridden when performing an initial start or initial restart of the cluster.

    The default value is 30000 milliseconds (30 seconds). 0 disables the timeout, in which case the cluster may start only if all nodes are available.

  • StartPartitionedTimeout

    Version (or later): NDB 9.5.0
    Type or units: milliseconds
    Default: 0
    Range: 0 - 4294967039 (0xFFFFFEFF)
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    If the cluster is ready to start after waiting for StartPartialTimeout milliseconds but is still possibly in a partitioned state, the cluster waits until this timeout has also passed. If StartPartitionedTimeout is set to 0, the cluster waits indefinitely (2^32 − 1 milliseconds, or approximately 49.71 days).

    This parameter is overridden when performing an initial start or initial restart of the cluster.

  • StartFailureTimeout

    Version (or later): NDB 9.5.0
    Type or units: milliseconds
    Default: 0
    Range: 0 - 4294967039 (0xFFFFFEFF)
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    If a data node has not completed its startup sequence within the time specified by this parameter, the node startup fails. Setting this parameter to 0 (the default value) means that no data node timeout is applied.

    For nonzero values, this parameter is measured in milliseconds. For data nodes containing extremely large amounts of data, this parameter should be increased. For example, in the case of a data node containing several gigabytes of data, a period as long as 10 to 15 minutes (that is, 600000 to 1000000 milliseconds) might be required to perform a node restart.

  • StartNoNodeGroupTimeout

    Version (or later): NDB 9.5.0
    Type or units: milliseconds
    Default: 15000
    Range: 0 - 4294967039 (0xFFFFFEFF)
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    When a data node is configured with Nodegroup = 65536, it is regarded as not being assigned to any node group. When that is done, the cluster waits StartNoNodegroupTimeout milliseconds, then treats such nodes as though they had been added to the list passed to the --nowait-nodes option, and starts. The default value is 15000 (that is, the management server waits 15 seconds). Setting this parameter equal to 0 means that the cluster waits indefinitely.

    StartNoNodegroupTimeout must be the same for all data nodes in the cluster; for this reason, you should always set it in the [ndbd default] section of the config.ini file, rather than for individual data nodes.

    See Section 25.6.7, “Adding NDB Cluster Data Nodes Online”, for more information.

  • HeartbeatIntervalDbDb

    Version (or later): NDB 9.5.0
    Type or units: milliseconds
    Default: 5000
    Range: 10 - 4294967039 (0xFFFFFEFF)
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    One of the primary methods of discovering failed nodes is by the use of heartbeats. This parameter states how often heartbeat signals are sent and how often to expect to receive them. Heartbeats cannot be disabled.

    After missing four heartbeat intervals in a row, the node is declared dead. Thus, the maximum time for discovering a failure through the heartbeat mechanism is five times the heartbeat interval.

    The default heartbeat interval is 5000 milliseconds (5 seconds). This parameter must not be changed drastically and should not vary widely between nodes. If one node uses 5000 milliseconds and the node watching it uses 1000 milliseconds, obviously the node is declared dead very quickly. This parameter can be changed during an online software upgrade, but only in small increments.

    See also Network communication and latency, as well as the description of the ConnectCheckIntervalDelay configuration parameter.

  • HeartbeatIntervalDbApi

    Version (or later): NDB 9.5.0
    Type or units: milliseconds
    Default: 1500
    Range: 100 - 4294967039 (0xFFFFFEFF)
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    Each data node sends heartbeat signals to each MySQL server (SQL node) to ensure that it remains in contact. If a MySQL server fails to send a heartbeat in time it is declared dead, in which case all ongoing transactions are completed and all resources released. The SQL node cannot reconnect until all activities initiated by the previous MySQL instance have been completed. The three-heartbeat criteria for this determination are the same as described for HeartbeatIntervalDbDb.

    The default interval is 1500 milliseconds (1.5 seconds). This interval can vary between individual data nodes because each data node watches the MySQL servers connected to it, independently of all other data nodes.

    For more information, see Network communication and latency.

  • HeartbeatOrder

    Version (or later): NDB 9.5.0
    Type or units: numeric
    Default: 0
    Range: 0 - 65535
    Restart Type: System Restart: Requires a complete shutdown and restart of the cluster. (NDB 9.5.0)

    Data nodes send heartbeats to one another in a circular fashion whereby each data node monitors the previous one. If a heartbeat is not detected by a given data node, this node declares the previous data node in the circle dead (that is, no longer accessible by the cluster). The determination that a data node is dead is done globally; in other words, once a data node is declared dead, it is regarded as such by all nodes in the cluster.

    It is possible for heartbeats between data nodes residing on different hosts to be too slow compared to heartbeats between other pairs of nodes (for example, due to a very low heartbeat interval or a temporary connection problem), such that a data node is declared dead even though the node can still function as part of the cluster.

    In this type of situation, it may be that the order in which heartbeats are transmitted between data nodes makes a difference as to whether or not a particular data node is declared dead. If this declaration occurs unnecessarily, this can in turn lead to the unnecessary loss of a node group and thus to a failure of the cluster.

    Consider a setup where there are 4 data nodes A, B, C, and D running on 2 host computers host1 and host2, and that these data nodes make up 2 node groups, as shown in the following table:

    Table 25.9 Four data nodes A, B, C, D running on two host computers host1, host2; each data node belongs to one of two node groups.

    Node Group      Nodes Running on host1      Nodes Running on host2
    Node Group 0    Node A                      Node B
    Node Group 1    Node C                      Node D

    Suppose the heartbeats are transmitted in the order A->B->C->D->A. In this case, the loss of the heartbeat between the hosts causes node B to declare node A dead and node C to declare node B dead. This results in loss of Node Group 0, and so the cluster fails. On the other hand, if the order of transmission is A->B->D->C->A (and all other conditions remain as previously stated), the loss of the heartbeat causes nodes A and D to be declared dead; in this case, each node group has one surviving node, and the cluster survives.

    The HeartbeatOrder configuration parameter makes the order of heartbeat transmission user-configurable. The default value for HeartbeatOrder is zero; allowing the default value to be used on all data nodes causes the order of heartbeat transmission to be determined by NDB. If this parameter is used, it must be set to a nonzero value (maximum 65535) for every data node in the cluster, and this value must be unique for each data node; this causes the heartbeat transmission to proceed from data node to data node in the order of their HeartbeatOrder values from lowest to highest (and then directly from the data node having the highest HeartbeatOrder to the data node having the lowest value, to complete the circle). The values need not be consecutive. For example, to force the heartbeat transmission order A->B->D->C->A in the scenario outlined previously, you could set the HeartbeatOrder values as shown here:

    Table 25.10 HeartbeatOrder values to force a heartbeat transmission order of A->B->D->C->A.

    Node    HeartbeatOrder Value
    A       10
    B       20
    C       30
    D       25
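    Expressed in config.ini, the table above might translate into local [ndbd] sections such as the following sketch (node IDs are illustrative):

    [ndbd]
    NodeId=1
    HostName=host1
    # Node A
    HeartbeatOrder=10

    [ndbd]
    NodeId=2
    HostName=host2
    # Node B
    HeartbeatOrder=20

    [ndbd]
    NodeId=3
    HostName=host1
    # Node C
    HeartbeatOrder=30

    [ndbd]
    NodeId=4
    HostName=host2
    # Node D
    HeartbeatOrder=25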

    To use this parameter to change the heartbeat transmission order in a running NDB Cluster, you must first set HeartbeatOrder for each data node in the cluster in the global configuration (config.ini) file (or files). To cause the change to take effect, you must perform either of the following:

    • A complete shutdown and restart of the entire cluster.

    • Two rolling restarts of the cluster in succession. All nodes must be restarted in the same order in both rolling restarts.

    You can use DUMP 908 to observe the effect of this parameter in the data node logs.

  • ConnectCheckIntervalDelay

    Version (or later): NDB 9.5.0
    Type or units: milliseconds
    Default: 0
    Range: 0 - 4294967039 (0xFFFFFEFF)
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    This parameter enables connection checking between data nodes after one of them has failed heartbeat checks for 5 intervals of up to HeartbeatIntervalDbDb milliseconds.

    A data node that further fails to respond within an interval of ConnectCheckIntervalDelay milliseconds is considered suspect, and is considered dead after two such intervals. This can be useful in setups with known latency issues.

    The default value for this parameter is 0 (disabled).

  • TimeBetweenLocalCheckpoints

    Version (or later): NDB 9.5.0
    Type or units: number of 4-byte words, as a base-2 logarithm
    Default: 20
    Range: 0 - 31
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    This parameter is an exception in that it does not specify a time to wait before starting a new local checkpoint; rather, it is used to ensure that local checkpoints are not performed in a cluster where relatively few updates are taking place. In most clusters with high update rates, it is likely that a new local checkpoint is started immediately after the previous one has been completed.

    The size of all write operations executed since the start of the previous local checkpoint is added. This parameter is also exceptional in that it is specified as the base-2 logarithm of the number of 4-byte words, so that the default value 20 means 4MB (4 × 2^20 bytes) of write operations, 21 would mean 8MB, and so on up to a maximum value of 31, which equates to 8GB of write operations.

    All the write operations in the cluster are added together. Setting TimeBetweenLocalCheckpoints to 6 or less means that local checkpoints are executed continuously without pause, independent of the cluster's workload.

  • TimeBetweenGlobalCheckpoints

    Version (or later): NDB 9.5.0
    Type or units: milliseconds
    Default: 2000
    Range: 20 - 32000
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    When a transaction is committed, it is committed in main memory in all nodes on which the data is mirrored. However, transaction log records are not flushed to disk as part of the commit. The reasoning behind this behavior is that having the transaction safely committed on at least two autonomous host machines should meet reasonable standards for durability.

    It is also important to ensure that even the worst of cases—a complete crash of the cluster—is handled properly. To guarantee that this happens, all transactions taking place within a given interval are put into a global checkpoint, which can be thought of as a set of committed transactions that has been flushed to disk. In other words, as part of the commit process, a transaction is placed in a global checkpoint group. Later, this group's log records are flushed to disk, and then the entire group of transactions is safely committed to disk on all computers in the cluster.

    We recommend that, when you are using solid-state disks (especially those employing NVMe) with Disk Data tables, you reduce this value. In such cases, you should also ensure that MaxDiskDataLatency is set to a proper level.

    This parameter defines the interval between global checkpoints. The default is 2000 milliseconds.
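    For example, a deployment using NVMe storage with Disk Data tables might halve the interval, as in this sketch (the value is illustrative; test against your own workload):

    [ndbd default]
    TimeBetweenGlobalCheckpoints=1000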

  • TimeBetweenGlobalCheckpointsTimeout

    Version (or later): NDB 9.5.0
    Type or units: milliseconds
    Default: 120000
    Range: 10 - 4294967039 (0xFFFFFEFF)
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    This parameter defines the minimum timeout between global checkpoints. The default is 120000 milliseconds.

  • TimeBetweenEpochs

    Version (or later): NDB 9.5.0
    Type or units: milliseconds
    Default: 100
    Range: 0 - 32000
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    This parameter defines the interval between synchronization epochs for NDB Cluster Replication. The default value is 100 milliseconds.

    TimeBetweenEpochs is part of the implementation of micro-GCPs, which can be used to improve the performance of NDB Cluster Replication.

  • TimeBetweenEpochsTimeout

    Version (or later): NDB 9.5.0
    Type or units: milliseconds
    Default: 0
    Range: 0 - 256000
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    This parameter defines a timeout for synchronization epochs for NDB Cluster Replication. If a node fails to participate in a global checkpoint within the time determined by this parameter, the node is shut down. The default value is 0; in other words, the timeout is disabled.

    TimeBetweenEpochsTimeout is part of the implementation of micro-GCPs, which can be used to improve the performance of NDB Cluster Replication.

    The current value of this parameter and a warning are written to the cluster log whenever a GCP save takes longer than 1 minute or a GCP commit takes longer than 10 seconds.

    Setting this parameter to zero has the effect of disabling GCP stops caused by save timeouts, commit timeouts, or both. The maximum possible value for this parameter is 256000 milliseconds.

  • MaxBufferedEpochs

    Version (or later): NDB 9.5.0
    Type or units: epochs
    Default: 100
    Range: 0 - 100000
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    The number of unprocessed epochs by which a subscribing node can lag behind. Exceeding this number causes a lagging subscriber to be disconnected.

    The default value of 100 is sufficient for most normal operations. If a subscribing node does lag enough to cause disconnections, it is usually due to network or scheduling issues with regard to processes or threads. (In rare circumstances, the problem may be due to a bug in the NDB client.) It may be desirable to set the value lower than the default when epochs are longer.

    Disconnection prevents client issues from affecting the data node service, which might otherwise run out of memory for buffering data and eventually shut down. Instead, only the client is affected as a result of the disconnect (for example, by gap events in the binary log), forcing the client to reconnect or restart the process.

  • MaxBufferedEpochBytes

    Version (or later): NDB 9.5.0
    Type or units: bytes
    Default: 26214400
    Range: 26214400 (0x01900000) - 4294967039 (0xFFFFFEFF)
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    The total number of bytes allocated for buffering epochs by this node.

  • TimeBetweenInactiveTransactionAbortCheck

    Version (or later): NDB 9.5.0
    Type or units: milliseconds
    Default: 1000
    Range: 1000 - 4294967039 (0xFFFFFEFF)
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    Timeout handling is performed by checking a timer on each transaction once for every interval specified by this parameter. Thus, if this parameter is set to 1000 milliseconds, every transaction is checked for timing out once per second.

    The default value is 1000 milliseconds (1 second).

  • TransactionInactiveTimeout

    Version (or later): NDB 9.5.0
    Type or units: milliseconds
    Default: 4294967039 (0xFFFFFEFF)
    Range: 0 - 4294967039 (0xFFFFFEFF)
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    This parameter states the maximum time that is permitted to lapse between operations in the same transaction before the transaction is aborted.

    The default for this parameter is 4G (also the maximum). For a real-time database that needs to ensure that no transaction keeps locks for too long, this parameter should be set to a relatively small value. Setting it to 0 means that the application never times out. The unit is milliseconds.

  • TransactionDeadlockDetectionTimeout

    Version (or later): NDB 9.5.0
    Type or units: milliseconds
    Default: 1200
    Range: 50 - 4294967039 (0xFFFFFEFF)
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    When a node executes a query involving a transaction, the node waits for the other nodes in the cluster to respond before continuing. This parameter sets the amount of time that the transaction can spend executing within a data node, that is, the time that the transaction coordinator waits for each data node participating in the transaction to execute a request.

    A failure to respond can occur for any of the following reasons:

    • The node is dead

    • The operation has entered a lock queue

    • The node requested to perform the action could be heavily overloaded.

    This timeout parameter states how long the transaction coordinator waits for query execution by another node before aborting the transaction, and is important for both node failure handling and deadlock detection.

    The default timeout value is 1200 milliseconds (1.2 seconds).

    The minimum for this parameter is 50 milliseconds.

  • DiskSyncSize

    Version (or later): NDB 9.5.0
    Type or units: bytes
    Default: 4M
    Range: 32K - 4294967039 (0xFFFFFEFF)
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    This is the maximum number of bytes to store before flushing data to a local checkpoint file. This is done to prevent write buffering, which can impede performance significantly. This parameter is not intended to take the place of TimeBetweenLocalCheckpoints.

    Note

    When ODirect is enabled, it is not necessary to set DiskSyncSize; in fact, in such cases its value is simply ignored.

    The default value is 4M (4 megabytes).

  • MaxDiskWriteSpeed

    Version (or later): NDB 9.5.0
    Type or units: numeric
    Default: 20M
    Range: 1M - 1024G
    Restart Type: System Restart: Requires a complete shutdown and restart of the cluster. (NDB 9.5.0)

    Set the maximum rate for writing to disk, in bytes per second, by local checkpoints and backup operations when no restarts (by this data node or any other data node) are taking place in this NDB Cluster.

    For setting the maximum rate of disk writes allowed while this data node is restarting, use MaxDiskWriteSpeedOwnRestart. For setting the maximum rate of disk writes allowed while other data nodes are restarting, use MaxDiskWriteSpeedOtherNodeRestart. The minimum speed for disk writes by all LCPs and backup operations can be adjusted by setting MinDiskWriteSpeed.

  • MaxDiskWriteSpeedOtherNodeRestart

    Version (or later): NDB 9.5.0
    Type or units: numeric
    Default: 50M
    Range: 1M - 1024G
    Restart Type: System Restart: Requires a complete shutdown and restart of the cluster. (NDB 9.5.0)

    Set the maximum rate for writing to disk, in bytes per second, by local checkpoints and backup operations when one or more data nodes in this NDB Cluster are restarting, other than this node.

    For setting the maximum rate of disk writes allowed while this data node is restarting, use MaxDiskWriteSpeedOwnRestart. For setting the maximum rate of disk writes allowed when no data nodes are restarting anywhere in the cluster, use MaxDiskWriteSpeed. The minimum speed for disk writes by all LCPs and backup operations can be adjusted by setting MinDiskWriteSpeed.

  • MaxDiskWriteSpeedOwnRestart

    Version (or later): NDB 9.5.0
    Type or units: numeric
    Default: 200M
    Range: 1M - 1024G
    Restart Type: System Restart: Requires a complete shutdown and restart of the cluster. (NDB 9.5.0)

    Set the maximum rate for writing to disk, in bytes per second, by local checkpoints and backup operations while this data node is restarting.

    For setting the maximum rate of disk writes allowed while other data nodes are restarting, use MaxDiskWriteSpeedOtherNodeRestart. For setting the maximum rate of disk writes allowed when no data nodes are restarting anywhere in the cluster, use MaxDiskWriteSpeed. The minimum speed for disk writes by all LCPs and backup operations can be adjusted by setting MinDiskWriteSpeed.

  • MinDiskWriteSpeed

    Version (or later): NDB 9.5.0
    Type or units: numeric
    Default: 10M
    Range: 1M - 1024G
    Restart Type: System Restart: Requires a complete shutdown and restart of the cluster. (NDB 9.5.0)

    Set the minimum rate for writing to disk, in bytes per second, by local checkpoints and backup operations.

    The maximum rates of disk writes allowed for LCPs and backups under various conditions are adjustable using the parameters MaxDiskWriteSpeed, MaxDiskWriteSpeedOwnRestart, and MaxDiskWriteSpeedOtherNodeRestart. See the descriptions of these parameters for more information.

  • ApiFailureHandlingTimeout

    Version (or later): NDB 9.5.0
    Type or units: seconds
    Default: 600
    Range: 0 - 4294967039 (0xFFFFFEFF)
    Restart Type: ...

    Specifies the maximum time (in seconds) that the data node waits for API node failure handling to complete before escalating it to data node failure handling.

  • ArbitrationTimeout

    Version (or later): NDB 9.5.0
    Type or units: milliseconds
    Default: 7500
    Range: 10 - 4294967039 (0xFFFFFEFF)
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    This parameter specifies how long data nodes wait for a response from the arbitrator to an arbitration message. If this is exceeded, the network is assumed to have split.

    The default value is 7500 milliseconds (7.5 seconds).

  • Arbitration

    Version (or later): NDB 9.5.0
    Type or units: enumeration
    Default: Default
    Range: Default, Disabled, WaitExternal
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    The Arbitration parameter enables a choice of arbitration schemes, corresponding to one of 3 possible values for this parameter:

    • Default.  This enables arbitration to proceed normally, as determined by the ArbitrationRank settings for the management and API nodes. This is the default value.

    • Disabled.  Setting Arbitration = Disabled in the [ndbd default] section of the config.ini file accomplishes the same task as setting ArbitrationRank to 0 on all management and API nodes. When Arbitration is set in this way, any ArbitrationRank settings are ignored.

    • WaitExternal.  The Arbitration parameter also makes it possible to configure arbitration in such a way that the cluster waits until after the time determined by ArbitrationTimeout has passed for an external cluster manager application to perform arbitration instead of handling arbitration internally. This can be done by setting Arbitration = WaitExternal in the [ndbd default] section of the config.ini file. For best results with the WaitExternal setting, it is recommended that ArbitrationTimeout be 2 times as long as the interval required by the external cluster manager to perform arbitration.

    Important

    This parameter should be used only in the [ndbd default] section of the cluster configuration file. The behavior of the cluster is unspecified when Arbitration is set to different values for individual data nodes.
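    A sketch of external arbitration; the timeout shown assumes, per the recommendation above, an external cluster manager that needs up to 5000 milliseconds to arbitrate:

    [ndbd default]
    Arbitration=WaitExternal
    # Twice the external manager's arbitration interval (assumed 5000 ms)
    ArbitrationTimeout=10000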

  • RestartSubscriberConnectTimeout

    Version (or later): NDB 9.5.0
    Type or units: ms
    Default: 12000
    Range: 0 - 4294967039 (0xFFFFFEFF)
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    This parameter determines the time that a data node waits for subscribing API nodes to connect. Once this timeout expires, any missing API nodes are disconnected from the cluster. To disable this timeout, set RestartSubscriberConnectTimeout to 0.

    While this parameter is specified in milliseconds, the timeout itself is resolved to the next-greatest whole second.

  • KeepAliveSendInterval

    Version (or later): NDB 9.5.0
    Type or units: integer
    Default: 60000
    Range: 0 - 4294967039 (0xFFFFFEFF)
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    You can enable and control the interval between keep-alive signals sent between data nodes by setting this parameter. The default for KeepAliveSendInterval is 60000 milliseconds (one minute); setting it to 0 disables keep-alive signals. Values between 1 and 10 inclusive are treated as 10.

    This parameter may prove useful in environments which monitor and disconnect idle TCP connections, possibly causing unnecessary data node failures when the cluster is idle.

The heartbeat interval between management nodes and data nodes is always 100 milliseconds, and is not configurable.

Buffering and logging.  Several [ndbd] configuration parameters enable the advanced user to have more control over the resources used by node processes and to adjust various buffer sizes as needed.

These buffers are used as front ends to the file system when writing log records to disk. If the node is running in diskless mode, these parameters can be set to their minimum values without penalty, because disk writes are faked by the NDB storage engine's file system abstraction layer.

  • UndoIndexBuffer

    Version (or later): NDB 9.5.0
    Type or units: unsigned
    Default: 2M
    Range: 1M - 4294967039 (0xFFFFFEFF)
    Deprecated: Yes (in NDB 8.0)
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    This parameter formerly set the size of the undo index buffer, but has no effect in current versions of NDB Cluster.

    Use of this parameter in the cluster configuration file raises a deprecation warning; you should expect it to be removed in a future NDB Cluster release.

  • UndoDataBuffer

    Version (or later): NDB 9.5.0
    Type or units: unsigned
    Default: 16M
    Range: 1M - 4294967039 (0xFFFFFEFF)
    Deprecated: Yes (in NDB 8.0)
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    This parameter formerly set the size of the undo data buffer, but has no effect in current versions of NDB Cluster.

    Use of this parameter in the cluster configuration file raises a deprecation warning; you should expect it to be removed in a future NDB Cluster release.

  • RedoBuffer

    Version (or later): NDB 9.5.0
    Type or units: bytes
    Default: 32M
    Range: 1M - 4294967039 (0xFFFFFEFF)
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    All update activities also need to be logged. The REDO log makes it possible to replay these updates whenever the system is restarted. The NDB recovery algorithm uses a fuzzy checkpoint of the data together with the UNDO log, and then applies the REDO log to play back all changes up to the restoration point.

    RedoBuffer sets the size of the buffer in which the REDO log is written. The default value is 32MB; the minimum value is 1MB.

    If this buffer is too small, the NDB storage engine issues error code 1221 (REDO log buffers overloaded). For this reason, you should exercise care if you attempt to decrease the value of RedoBuffer as part of an online change in the cluster's configuration.

    ndbmtd allocates a separate buffer for each LDM thread (see ThreadConfig). For example, with 4 LDM threads, an ndbmtd data node actually has 4 buffers and allocates RedoBuffer bytes to each one, for a total of 4 * RedoBuffer bytes.
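    As an illustration of this multiplication, on an ndbmtd node configured with 4 LDM threads (an assumption for this sketch), the following setting results in 4 × 64MB = 256MB of redo buffer in total:

    [ndbd default]
    # With 4 LDM threads, total allocation is 4 * 64MB = 256MB
    RedoBuffer=64M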

  • EventLogBufferSize

    Version (or later): NDB 9.5.0
    Type or units: bytes
    Default: 8192
    Range: 0 - 64K
    Restart Type: System Restart: Requires a complete shutdown and restart of the cluster. (NDB 9.5.0)

    Controls the size of the circular buffer used for NDB log events within data nodes.

Controlling log messages.  In managing the cluster, it is very important to be able to control the number of log messages sent for various event types to stdout. For each event category, there are 16 possible event levels (numbered 0 through 15). Setting event reporting for a given event category to level 15 means all event reports in that category are sent to stdout; setting it to 0 means that no event reports in that category are made.

By default, only the startup message is sent to stdout, with the remaining event reporting level defaults being set to 0. The reason for this is that these messages are also sent to the management server's cluster log.

An analogous set of levels can be set for the management client to determine which event levels to record in the cluster log.

  • LogLevelStartup

    Version (or later): NDB 9.5.0
    Type or units: integer
    Default: 1
    Range: 0 - 15
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    The reporting level for events generated during startup of the process.

    The default level is 1.

  • LogLevelShutdown

    Version (or later): NDB 9.5.0
    Type or units: integer
    Default: 0
    Range: 0 - 15
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    The reporting level for events generated as part of graceful shutdown of a node.

    The default level is 0.

  • LogLevelStatistic

    Version (or later): NDB 9.5.0
    Type or units: integer
    Default: 0
    Range: 0 - 15
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    The reporting level for statistical events such as number of primary key reads, number of updates, number of inserts, information relating to buffer usage, and so on.

    The default level is 0.

  • LogLevelCheckpoint

    Version (or later): NDB 9.5.0
    Type or units: log level
    Default: 0
    Range: 0 - 15
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    The reporting level for events generated by local and global checkpoints.

    The default level is 0.

  • LogLevelNodeRestart

    Version (or later): NDB 9.5.0
    Type or units: integer
    Default: 0
    Range: 0 - 15
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    The reporting level for events generated during node restart.

    The default level is 0.

  • LogLevelConnection

    Version (or later): NDB 9.5.0
    Type or units: integer
    Default: 0
    Range: 0 - 15
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    The reporting level for events generated by connections between cluster nodes.

    The default level is 0.

  • LogLevelError

    Version (or later): NDB 9.5.0
    Type or units: integer
    Default: 0
    Range: 0 - 15
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    The reporting level for events generated by errors and warnings by the cluster as a whole. These errors do not cause any node failure but are still considered worth reporting.

    The default level is 0.

  • LogLevelCongestion

    Version (or later): NDB 9.5.0
    Type or units: level
    Default: 0
    Range: 0 - 15
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    The reporting level for events generated by congestion. These errors do not cause node failure but are still considered worth reporting.

    The default level is 0.

  • LogLevelInfo

    Version (or later): NDB 9.5.0
    Type or units: integer
    Default: 0
    Range: 0 - 15
    Restart Type: Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    The reporting level for events generated for information about the general state of the cluster.

    The default level is 0.

  • MemReportFrequency

    Version (or later)NDB 9.5.0
    Type or unitsunsigned
    Default0
    Range0 - 4294967039 (0xFFFFFEFF)
    Restart Type

    Node Restart:Requires arolling restart of the cluster. (NDB 9.5.0)

    This parameter controls how often data node memory usage reports are recorded in the cluster log; it is an integer value representing the number of seconds between reports.

    Each data node's data memory and index memory usage is logged as both a percentage and a number of 32 KB pages ofDataMemory, as set in theconfig.ini file. For example, ifDataMemory is equal to 100 MB, and a given data node is using 50 MB for data memory storage, the corresponding line in the cluster log might look like this:

    2006-12-24 01:18:16 [MgmSrvr] INFO -- Node 2: Data usage is 50%(1280 32K pages of total 2560)

    MemReportFrequency is not a required parameter. If used, it can be set for all cluster data nodes in the[ndbd default] section ofconfig.ini, and can also be set or overridden for individual data nodes in the corresponding[ndbd] sections of the configuration file. The minimum value—which is also the default value—is 0, in which case memory reports are logged only when memory usage reaches certain percentages (80%, 90%, and 100%), as mentioned in the discussion of statistics events inSection 25.6.3.2, “NDB Cluster Log Events”.
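
    For example, a minimal config.ini excerpt (the host name and values are illustrative) that requests a report every 30 seconds from all data nodes, while one node reports every 10 seconds, might look like this:

    [ndbd default]
    # Record data node memory usage in the cluster log every 30 seconds
    MemReportFrequency=30

    [ndbd]
    HostName=ndb_data1.example.com
    # Override for this node only: report every 10 seconds
    MemReportFrequency=10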

  • StartupStatusReportFrequency

    Version (or later)NDB 9.5.0
    Type or unitsseconds
    Default0
    Range0 - 4294967039 (0xFFFFFEFF)
    Restart Type

    Node Restart:Requires arolling restart of the cluster. (NDB 9.5.0)

    When a data node is started with the --initial option, it initializes the redo log file during Start Phase 4 (see Section 25.6.4, “Summary of NDB Cluster Start Phases”). When very large values are set for NoOfFragmentLogFiles, FragmentLogFileSize, or both, this initialization can take a long time. You can force reports on the progress of this process to be logged periodically, by means of the StartupStatusReportFrequency configuration parameter. In this case, progress is reported in the cluster log, in terms of both the number of files and the amount of space that have been initialized, as shown here:

    2009-06-20 16:39:23 [MgmSrvr] INFO -- Node 1: Local redo log file initialization status:
    #Total files: 80, Completed: 60
    #Total MBytes: 20480, Completed: 15557
    2009-06-20 16:39:23 [MgmSrvr] INFO -- Node 2: Local redo log file initialization status:
    #Total files: 80, Completed: 60
    #Total MBytes: 20480, Completed: 15570

    These reports are logged each StartupStatusReportFrequency seconds during Start Phase 4. If StartupStatusReportFrequency is 0 (the default), then reports are written to the cluster log only at the beginning and at the completion of the redo log file initialization process.
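
    As an illustration, with hypothetical redo log sizing, the following settings cause initialization progress to be logged every 30 seconds during Start Phase 4:

    [ndbd default]
    NoOfFragmentLogFiles=300
    FragmentLogFileSize=256M
    # Report redo log file initialization progress every 30 seconds
    StartupStatusReportFrequency=30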

Data Node Debugging Parameters

The following parameters are intended for use during testing or debugging of data nodes, and not for use in production.

  • DictTrace

    Version (or later)NDB 9.5.0
    Type or unitsbytes
    Defaultundefined
    Range0 - 100
    Restart Type

    Node Restart:Requires arolling restart of the cluster. (NDB 9.5.0)

    It is possible to cause logging of traces for events generated by creating and dropping tables usingDictTrace. This parameter is useful only in debugging NDB kernel code.DictTrace takes an integer value. 0 is the default, and means no logging is performed; 1 enables trace logging, and 2 enables logging of additionalDBDICT debugging output.

  • WatchDogImmediateKill

    Version (or later)NDB 9.5.0
    Type or unitsboolean
    Defaultfalse
    Rangetrue, false
    Restart Type

    Node Restart:Requires arolling restart of the cluster. (NDB 9.5.0)

    You can cause threads to be killed immediately whenever watchdog issues occur by enabling theWatchDogImmediateKill data node configuration parameter. This parameter should be used only when debugging or troubleshooting, to obtain trace files reporting exactly what was occurring the instant that execution ceased.

Backup parameters.  The[ndbd] parameters discussed in this section define memory buffers set aside for execution of online backups.

  • BackupDataBufferSize

    Version (or later)NDB 9.5.0
    Type or unitsbytes
    Default16M
    Range512K - 4294967039 (0xFFFFFEFF)
    DeprecatedYes (in NDB 7.6)
    Restart Type

    Node Restart:Requires arolling restart of the cluster. (NDB 9.5.0)

    In creating a backup, there are two buffers used for sending data to the disk. The backup data buffer is used to fill in data recorded by scanning a node's tables. Once this buffer has been filled to the level specified as BackupWriteSize, the pages are sent to disk. While flushing data to disk, the backup process can continue filling this buffer until it runs out of space. When this happens, the backup process pauses the scan and waits until some disk writes have completed, freeing up memory so that scanning may continue.

    The default value for this parameter is 16MB. The minimum is 512K.

  • BackupDiskWriteSpeedPct

    Version (or later)NDB 9.5.0
    Type or unitspercent
    Default50
    Range0 - 90
    Restart Type

    Node Restart:Requires arolling restart of the cluster. (NDB 9.5.0)

    BackupDiskWriteSpeedPct applies only when a backup is single-threaded; since NDB 9.5 supports multi-threaded backups, it is usually not necessary to adjust this parameter, which has no effect in the multi-threaded case. The discussion that follows is specific to single-threaded backups.

    During normal operation, data nodes attempt to maximize the disk write speed used for local checkpoints and backups while remaining within the bounds set by MinDiskWriteSpeed and MaxDiskWriteSpeed. Disk write throttling gives each LDM thread an equal share of the total budget, which allows parallel LCPs to take place without exceeding the disk I/O budget. Because a backup is executed by only one LDM thread, this sharing effectively imposes a budget cut on the backup, resulting in longer backup completion times and, if the rate of change is sufficiently high, in failure to complete the backup when the backup log buffer fill rate is higher than the achievable write rate.

    This problem can be addressed by using the BackupDiskWriteSpeedPct configuration parameter, which takes a value in the range 0 to 90 inclusive, interpreted as the percentage of the node's maximum write rate budget that is reserved prior to sharing out the remainder of the budget among LDM threads for LCPs. The LDM thread running the backup receives the whole write rate budget for the backup, plus its (reduced) share of the write rate budget for local checkpoints.

    The default value for this parameter is 50 (interpreted as 50%).
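
    As a worked example, assume 4 LDM threads and a hypothetical maximum write rate budget of 100 MB/s per node: with the default BackupDiskWriteSpeedPct of 50, 50 MB/s is reserved for the backup, and the remaining 50 MB/s is shared out at 12.5 MB/s per LDM thread, so the LDM thread executing the backup can write at up to 62.5 MB/s while each of the other three is limited to 12.5 MB/s.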

  • BackupLogBufferSize

    Version (or later)NDB 9.5.0
    Type or unitsbytes
    Default16M
    Range2M - 4294967039 (0xFFFFFEFF)
    Restart Type

    Node Restart:Requires arolling restart of the cluster. (NDB 9.5.0)

    The backup log buffer fulfills a role similar to that played by the backup data buffer, except that it is used for generating a log of all table writes made during execution of the backup. The same principles apply for writing these pages as with the backup data buffer, except that when there is no more space in the backup log buffer, the backup fails. For that reason, the size of the backup log buffer must be large enough to handle the load caused by write activities while the backup is being made. SeeSection 25.6.8.3, “Configuration for NDB Cluster Backups”.

    The default value for this parameter should be sufficient for most applications. In fact, it is more likely for a backup failure to be caused by insufficient disk write speed than it is for the backup log buffer to become full. If the disk subsystem is not configured for the write load caused by applications, the cluster is unlikely to be able to perform the desired operations.

    It is preferable to configure cluster nodes in such a manner that the processor becomes the bottleneck rather than the disks or the network connections.

    The default value for this parameter is 16MB.

  • BackupMemory

    Version (or later)NDB 9.5.0
    Type or unitsbytes
    Default32M
    Range0 - 4294967039 (0xFFFFFEFF)
    DeprecatedYes (in NDB 7.4)
    Restart Type

    Node Restart:Requires arolling restart of the cluster. (NDB 9.5.0)

    This parameter is deprecated, and subject to removal in a future version of NDB Cluster. Any setting made for it is ignored.

  • BackupReportFrequency

    Version (or later)NDB 9.5.0
    Type or unitsseconds
    Default0
    Range0 - 4294967039 (0xFFFFFEFF)
    Restart Type

    Node Restart:Requires arolling restart of the cluster. (NDB 9.5.0)

    This parameter controls how often backup status reports are issued in the management client during a backup, as well as how often such reports are written to the cluster log (provided cluster event logging is configured to permit it—seeLogging and checkpointing).BackupReportFrequency represents the time in seconds between backup status reports.

    The default value is 0.

  • BackupWriteSize

    Version (or later)NDB 9.5.0
    Type or unitsbytes
    Default256K
    Range32K - 4294967039 (0xFFFFFEFF)
    DeprecatedYes (in NDB 7.6)
    Restart Type

    Node Restart:Requires arolling restart of the cluster. (NDB 9.5.0)

    This parameter specifies the default size of messages written to disk by the backup log and backup data buffers.

    The default value for this parameter is 256KB.

  • BackupMaxWriteSize

    Version (or later)NDB 9.5.0
    Type or unitsbytes
    Default1M
    Range256K - 4294967039 (0xFFFFFEFF)
    DeprecatedYes (in NDB 7.6)
    Restart Type

    Node Restart:Requires arolling restart of the cluster. (NDB 9.5.0)

    This parameter specifies the maximum size of messages written to disk by the backup log and backup data buffers.

    The default value for this parameter is 1MB.

  • CompressedBackup

    Version (or later)NDB 9.5.0
    Type or unitsboolean
    Defaultfalse
    Rangetrue, false
    Restart Type

    Node Restart:Requires arolling restart of the cluster. (NDB 9.5.0)

    Enabling this parameter causes backup files to be compressed. The compression used is equivalent togzip --fast, and can save 50% or more of the space required on the data node to store uncompressed backup files. Compressed backups can be enabled for individual data nodes, or for all data nodes (by setting this parameter in the[ndbd default] section of theconfig.ini file).

    Important

    You cannot restore a compressed backup to a cluster running a MySQL version that does not support this feature.

    The default value is0 (disabled).

  • RequireEncryptedBackup

    Version (or later)NDB 9.5.0
    Type or unitsinteger
    Default0
    Range0 - 1
    Restart Type

    Node Restart:Requires arolling restart of the cluster. (NDB 9.5.0)

    If set to 1, backups must be encrypted. While it is possible to set this parameter for each data node individually, it is recommended that you set it in the[ndbd default] section of theconfig.ini global configuration file. For more information about performing encrypted backups, seeSection 25.6.8.2, “Using The NDB Cluster Management Client to Create a Backup”.
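
    For example, with RequireEncryptedBackup=1 set in the [ndbd default] section, a backup can then be started from the management client using the ENCRYPT PASSWORD option described in the section just cited (the password shown is a placeholder):

    ndb_mgm> START BACKUP ENCRYPT PASSWORD='my_password'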

Note

The location of the backup files is determined by theBackupDataDir data node configuration parameter.

Additional requirements.  When specifying these parameters, the following relationships must hold true; otherwise, the data node cannot start. A compliant example follows the list.

  • BackupDataBufferSize >= BackupWriteSize + 188KB

  • BackupLogBufferSize >= BackupWriteSize + 16KB

  • BackupMaxWriteSize >= BackupWriteSize
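
A sketch of a compliant [ndbd default] excerpt, using an enlarged write size while keeping all three relationships satisfied:

    [ndbd default]
    BackupWriteSize=512K      # base write size
    BackupMaxWriteSize=1M     # >= BackupWriteSize
    BackupDataBufferSize=16M  # >= BackupWriteSize + 188KB
    BackupLogBufferSize=16M   # >= BackupWriteSize + 16KB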

NDB Cluster Realtime Performance Parameters

The[ndbd] parameters discussed in this section are used in scheduling and locking of threads to specific CPUs on multiprocessor data node hosts.

Note

To make use of these parameters, the data node process must be run as system root.

  • BuildIndexThreads

    Version (or later)NDB 9.5.0
    Type or unitsnumeric
    Default128
    Range0 - 128
    Restart Type

    Node Restart:Requires arolling restart of the cluster. (NDB 9.5.0)

    This parameter determines the number of threads to create when rebuilding ordered indexes during a system or node start, as well as when runningndb_restore--rebuild-indexes. It is supported only when there is more than one fragment for the table per data node (for example, whenCOMMENT="NDB_TABLE=PARTITION_BALANCE=FOR_RA_BY_LDM_X_2" is used withCREATE TABLE).

    Setting this parameter to 0 disables multithreaded building of ordered indexes.

    This parameter is supported when usingndbd orndbmtd.

    You can enable multithreaded builds during data node initial restarts by setting theTwoPassInitialNodeRestartCopy data node configuration parameter toTRUE.

  • LockExecuteThreadToCPU

    Version (or later)NDB 9.5.0
    Type or unitsset of CPU IDs
    Default0
    Range...
    Restart Type

    Node Restart:Requires arolling restart of the cluster. (NDB 9.5.0)

    When used with ndbd, this parameter specifies the ID of the CPU assigned to handle the NDBCLUSTER execution thread. When used with ndbmtd, the value of this parameter is a comma-separated list of CPU IDs assigned to handle execution threads. Each CPU ID in the list should be an integer in the range 0 to 65535 (inclusive).

    The number of IDs specified should match the number of execution threads determined byMaxNoOfExecutionThreads. However, there is no guarantee that threads are assigned to CPUs in any given order when using this parameter. You can obtain more finely-grained control of this type usingThreadConfig.

    LockExecuteThreadToCPU has no default value.
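
    For example, a sketch with illustrative CPU IDs, pairing a four-thread ndbmtd configuration with four CPUs (recall that thread-to-CPU assignment order is not guaranteed):

    [ndbd]
    MaxNoOfExecutionThreads=4
    LockExecuteThreadToCPU=0,1,2,3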

  • LockMaintThreadsToCPU

    Version (or later)NDB 9.5.0
    Type or unitsCPU ID
    Default0
    Range0 - 64K
    Restart Type

    Node Restart:Requires arolling restart of the cluster. (NDB 9.5.0)

    This parameter specifies the ID of the CPU assigned to handleNDBCLUSTER maintenance threads.

    The value of this parameter is an integer in the range 0 to 65535 (inclusive). There is no default value.

  • Numa

    Version (or later)NDB 9.5.0
    Type or unitsnumeric
    Default1
    Range...
    Restart Type

    Node Restart:Requires arolling restart of the cluster. (NDB 9.5.0)

    This parameter determines whether Non-Uniform Memory Access (NUMA) is controlled by the operating system or by the data node process, whether the data node usesndbd orndbmtd. By default,NDB attempts to use an interleaved NUMA memory allocation policy on any data node where the host operating system provides NUMA support.

    Setting Numa = 0 means that the data node process does not itself attempt to set a policy for memory allocation, and permits this behavior to be determined by the operating system, which may be further guided by the separate numactl tool. That is, Numa = 0 yields the system default behavior, which can be customized by numactl. For many Linux systems, the system default behavior is to allocate socket-local memory to any given process at allocation time. This can be problematic when using ndbmtd; this is because ndbmtd allocates all memory at startup, leading to an imbalance, giving different access speeds for different sockets, especially when locking pages in main memory.

    Setting Numa = 1 means that the data node process uses libnuma to request interleaved memory allocation. (This can also be accomplished manually, on the operating system level, using numactl.) Using interleaved allocation in effect tells the data node process to ignore non-uniform memory access: it does not attempt to take any advantage of fast local memory, but instead tries to avoid imbalances due to slow remote memory. If interleaved allocation is not desired, set Numa to 0 so that the desired behavior can be determined on the operating system level.

    TheNuma configuration parameter is supported only on Linux systems wherelibnuma.so is available.

  • RealtimeScheduler

    Version (or later)NDB 9.5.0
    Type or unitsboolean
    Defaultfalse
    Rangetrue, false
    Restart Type

    Node Restart:Requires arolling restart of the cluster. (NDB 9.5.0)

    Setting this parameter to 1 enables real-time scheduling of data node threads.

    The default is 0 (scheduling disabled).

  • SchedulerExecutionTimer

    Version (or later)NDB 9.5.0
    Type or unitsµs
    Default50
    Range0 - 11000
    Restart Type

    Node Restart:Requires arolling restart of the cluster. (NDB 9.5.0)

    This parameter specifies the time in microseconds for threads to be executed in the scheduler before being sent. Setting it to 0 minimizes the response time; to achieve higher throughput, you can increase the value at the expense of longer response times.

    The default is 50 μsec, which our testing shows to increase throughput slightly in high-load cases without materially delaying requests.

  • SchedulerResponsiveness

    Version (or later)NDB 9.5.0
    Type or unitsinteger
    Default5
    Range0 - 10
    Restart Type

    Node Restart:Requires arolling restart of the cluster. (NDB 9.5.0)

    Set the balance in theNDB scheduler between speed and throughput. This parameter takes an integer whose value is in the range 0-10 inclusive, with 5 as the default. Higher values provide better response times relative to throughput. Lower values provide increased throughput at the expense of longer response times.

  • SchedulerSpinTimer

    Version (or later)NDB 9.5.0
    Type or unitsµs
    Default0
    Range0 - 500
    Restart Type

    Node Restart:Requires arolling restart of the cluster. (NDB 9.5.0)

    This parameter specifies the time in microseconds for threads to be executed in the scheduler before sleeping.

    Note

    IfSpinMethod is set, any setting for this parameter is ignored.

  • SpinMethod

    Version (or later)NDB 9.5.0
    Type or unitsenumeration
    DefaultStaticSpinning
    RangeCostBasedSpinning, LatencyOptimisedSpinning, DatabaseMachineSpinning, StaticSpinning
    Restart Type

    Node Restart:Requires arolling restart of the cluster. (NDB 9.5.0)

    This parameter provides a simple interface to control adaptive spinning on data nodes, with four possible values furnishing presets for spin parameter values, as shown in the following list:

    1. StaticSpinning (default): SetsEnableAdaptiveSpinning tofalse andSchedulerSpinTimer to 0. (SetAllowedSpinOverhead is not relevant in this case.)

    2. CostBasedSpinning: SetsEnableAdaptiveSpinning totrue,SchedulerSpinTimer to 100, andSetAllowedSpinOverhead to 200.

    3. LatencyOptimisedSpinning: SetsEnableAdaptiveSpinning totrue,SchedulerSpinTimer to 200, andSetAllowedSpinOverhead to 1000.

    4. DatabaseMachineSpinning: SetsEnableAdaptiveSpinning totrue,SchedulerSpinTimer to 500, andSetAllowedSpinOverhead to 10000. This is intended for use in cases where threads own their own CPUs.

    The spin parameters modified bySpinMethod are described in the following list:

    • SchedulerSpinTimer: This is the same as the data node configuration parameter of that name. The setting applied to this parameter bySpinMethod overrides any value set in theconfig.ini file.

    • EnableAdaptiveSpinning: Enables or disables adaptive spinning. Disabling it causes spinning to be performed without making any checks for CPU resources. This parameter cannot be set directly in the cluster configuration file, and under most circumstances should not need to be, but can be enabled directly usingDUMP 104004 1 or disabled withDUMP 104004 0 in thendb_mgm management client.

    • SetAllowedSpinOverhead: Sets the amount of CPU time to allow for gaining latency. This parameter cannot be set directly in theconfig.ini file. In most cases, the setting applied by SpinMethod should be satisfactory, but if it is necessary to change it directly, you can useDUMP 104002overhead to do so, whereoverhead is a value ranging from 0 to 10000, inclusive; see the description of the indicatedDUMP command for details.

    On platforms lacking usable spin instructions, such as PowerPC and some SPARC platforms, spin time is set to 0 in all situations, and values forSpinMethod other thanStaticSpinning are ignored.
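
    For example, a sketch of an [ndbd default] entry selecting the latency-oriented preset described above:

    [ndbd default]
    # Sets EnableAdaptiveSpinning=true, SchedulerSpinTimer=200, SetAllowedSpinOverhead=1000
    SpinMethod=LatencyOptimisedSpinning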

  • TwoPassInitialNodeRestartCopy

    Version (or later)NDB 9.5.0
    Type or unitsboolean
    Defaulttrue
    Rangetrue, false
    Restart Type

    Node Restart:Requires arolling restart of the cluster. (NDB 9.5.0)

    Multithreaded building of ordered indexes can be enabled for initial restarts of data nodes by setting this configuration parameter totrue (the default value), which enables two-pass copying of data during initial node restarts.

    You must also setBuildIndexThreads to a nonzero value.
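
    For example, a sketch with an illustrative thread count, enabling multithreaded index builds during initial node restarts:

    [ndbd default]
    TwoPassInitialNodeRestartCopy=true
    # Use 8 threads when rebuilding ordered indexes
    BuildIndexThreads=8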

Multi-Threading Configuration Parameters (ndbmtd). ndbmtd runs by default as a single-threaded process and must be configured to use multiple threads, using either of two methods, both of which require setting configuration parameters in theconfig.ini file. The first method is simply to set an appropriate value for theMaxNoOfExecutionThreads configuration parameter. A second method makes it possible to set up more complex rules forndbmtd multithreading usingThreadConfig. The next few paragraphs provide information about these parameters and their use with multithreaded data nodes.

Note

A backup using parallelism on the data nodes requires that multiple LDMs are in use on all data nodes in the cluster prior to taking the backup. For more information, seeSection 25.6.8.5, “Taking an NDB Backup with Parallel Data Nodes”, as well asRestoring from a backup taken in parallel.

  • AutomaticThreadConfig

    Version (or later)NDB 9.5.0
    Type or unitsboolean
    Defaultfalse
    Rangetrue, false
    Restart Type

    Initial System Restart:Requires a complete shutdown of the cluster, wiping and restoring the cluster file system from abackup, and then restarting the cluster. (NDB 9.5.0)

    When set to 1, enables automatic thread configuration using the number of CPUs available to a data node, taking into account any limits set by taskset, numactl, virtual machines, Docker, and other such means of controlling which CPUs are available to a given application. (On Windows platforms, automatic thread configuration uses all CPUs which are online.) Alternatively, you can set NumCPUs to the desired number of CPUs, up to 1024, the maximum number of CPUs that can be handled by automatic thread configuration. Any settings for ThreadConfig and MaxNoOfExecutionThreads are ignored. In addition, enabling this parameter automatically disables ClassicFragmentation.

  • ClassicFragmentation

    Version (or later)NDB 9.5.0
    Type or unitsboolean
    Defaulttrue
    Rangetrue, false
    Restart Type

    Node Restart:Requires arolling restart of the cluster. (NDB 9.5.0)

    When enabled (set totrue),NDB distributes fragments among LDMs such that the default number of partitions per node is equal to the minimum number of local data manager (LDM) threads per data node.

    For new clusters, settingClassicFragmentation tofalse when first setting up the cluster is preferable; doing so causes the number of partitions per node to be equal to the value ofPartitionsPerNode, ensuring that all partitions are spread out evenly between all LDMs.

    This parameter andAutomaticThreadConfig are mutually exclusive; enablingAutomaticThreadConfig automatically disablesClassicFragmentation.
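
    For example, a sketch of an [ndbd default] excerpt for a new cluster, using an illustrative value for the PartitionsPerNode parameter described later in this section:

    [ndbd default]
    ClassicFragmentation=false
    PartitionsPerNode=4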

  • EnableMultithreadedBackup

    Version (or later)NDB 9.5.0
    Type or unitsunsigned
    Default1
    Range0 - 1
    Restart Type

    Node Restart:Requires arolling restart of the cluster. (NDB 9.5.0)

    Enables multi-threaded backup. If each data node has at least 2 LDMs, all LDM threads participate in the backup, which is created using one subdirectory per LDM thread, with each subdirectory containing .ctl, .Data, and .log backup files.

    This parameter is normally enabled (set to 1) forndbmtd. To force a single-threaded backup that can be restored easily using older versions ofndb_restore, disable multi-threaded backup by setting this parameter to 0. This must be done for each data node in the cluster.

    SeeSection 25.6.8.5, “Taking an NDB Backup with Parallel Data Nodes”, andRestoring from a backup taken in parallel, for more information.

  • MaxNoOfExecutionThreads

    Version (or later)NDB 9.5.0
    Type or unitsinteger
    Default2
    Range2 - 72
    Restart Type

    System Restart:Requires a complete shutdown and restart of the cluster. (NDB 9.5.0)

    This parameter directly controls the number of execution threads used byndbmtd, up to a maximum of 72. Although this parameter is set in[ndbd] or[ndbd default] sections of theconfig.ini file, it is exclusive tondbmtd and does not apply tondbd.

    EnablingAutomaticThreadConfig causes any setting for this parameter to be ignored.

    SettingMaxNoOfExecutionThreads sets the number of threads for each type as determined by a matrix in the filestorage/ndb/src/common/mt_thr_config.cpp. This table shows these numbers of threads for possible values ofMaxNoOfExecutionThreads.

    Table 25.11 MaxNoOfExecutionThreads values and the corresponding number of threads by thread type (LQH, TC, Send, Receive)

    MaxNoOfExecutionThreads Value | LDM Threads | TC Threads | Send Threads | Receive Threads
    0 .. 3                        | 1           | 0          | 0            | 1
    4 .. 6                        | 2           | 0          | 0            | 1
    7 .. 8                        | 4           | 0          | 0            | 1
    9                             | 4           | 2          | 0            | 1
    10                            | 4           | 2          | 1            | 1
    11                            | 4           | 3          | 1            | 1
    12                            | 6           | 2          | 1            | 1
    13                            | 6           | 3          | 1            | 1
    14                            | 6           | 3          | 1            | 2
    15                            | 6           | 3          | 2            | 2
    16                            | 8           | 3          | 1            | 2
    17                            | 8           | 4          | 1            | 2
    18                            | 8           | 4          | 2            | 2
    19                            | 8           | 5          | 2            | 2
    20                            | 10          | 4          | 2            | 2
    21                            | 10          | 5          | 2            | 2
    22                            | 10          | 5          | 2            | 3
    23                            | 10          | 6          | 2            | 3
    24                            | 12          | 5          | 2            | 3
    25                            | 12          | 6          | 2            | 3
    26                            | 12          | 6          | 3            | 3
    27                            | 12          | 7          | 3            | 3
    28                            | 12          | 7          | 3            | 4
    29                            | 12          | 8          | 3            | 4
    30                            | 12          | 8          | 4            | 4
    31                            | 12          | 9          | 4            | 4
    32                            | 16          | 8          | 3            | 3
    33                            | 16          | 8          | 3            | 4
    34                            | 16          | 8          | 4            | 4
    35                            | 16          | 9          | 4            | 4
    36                            | 16          | 10         | 4            | 4
    37                            | 16          | 10         | 4            | 5
    38                            | 16          | 11         | 4            | 5
    39                            | 16          | 11         | 5            | 5
    40                            | 20          | 10         | 4            | 4
    41                            | 20          | 10         | 4            | 5
    42                            | 20          | 11         | 4            | 5
    43                            | 20          | 11         | 5            | 5
    44                            | 20          | 12         | 5            | 5
    45                            | 20          | 12         | 5            | 6
    46                            | 20          | 13         | 5            | 6
    47                            | 20          | 13         | 6            | 6
    48                            | 24          | 12         | 5            | 5
    49                            | 24          | 12         | 5            | 6
    50                            | 24          | 13         | 5            | 6
    51                            | 24          | 13         | 6            | 6
    52                            | 24          | 14         | 6            | 6
    53                            | 24          | 14         | 6            | 7
    54                            | 24          | 15         | 6            | 7
    55                            | 24          | 15         | 7            | 7
    56                            | 24          | 16         | 7            | 7
    57                            | 24          | 16         | 7            | 8
    58                            | 24          | 17         | 7            | 8
    59                            | 24          | 17         | 8            | 8
    60                            | 24          | 18         | 8            | 8
    61                            | 24          | 18         | 8            | 9
    62                            | 24          | 19         | 8            | 9
    63                            | 24          | 19         | 9            | 9
    64                            | 32          | 16         | 7            | 7
    65                            | 32          | 16         | 7            | 8
    66                            | 32          | 17         | 7            | 8
    67                            | 32          | 17         | 8            | 8
    68                            | 32          | 18         | 8            | 8
    69                            | 32          | 18         | 8            | 9
    70                            | 32          | 19         | 8            | 9
    71                            | 32          | 20         | 8            | 9
    72                            | 32          | 20         | 8            | 10

    There is always one SUMA (replication) thread.

    NoOfFragmentLogParts should be set equal to the number of LDM threads used by ndbmtd, as determined by the setting for this parameter. The ratio of redo log parts to LDM threads should not be greater than 4:1; a configuration in which it is greater is specifically disallowed.

    The number of LDM threads also determines the number of partitions used by an NDB table that is not explicitly partitioned; this is the number of LDM threads times the number of data nodes in the cluster. (If ndbd is used on the data nodes rather than ndbmtd, then there is always a single LDM thread; in this case, the number of partitions created automatically is simply equal to the number of data nodes.) See Section 25.2.2, “NDB Cluster Nodes, Node Groups, Fragment Replicas, and Partitions”, for more information.

    Adding large tablespaces for Disk Data tables when using more than the default number of LDM threads may cause issues with resource and CPU usage if the disk page buffer is insufficiently large; see the description of theDiskPageBufferMemory configuration parameter, for more information.

    The thread types are described later in this section (seeThreadConfig).

    Setting this parameter outside the permitted range of values causes the management server to abort on startup with the error Error line number: Illegal value value for parameter MaxNoOfExecutionThreads (where number is the line number and value is the illegal value).

    ForMaxNoOfExecutionThreads, a value of 0 or 1 is rounded up internally byNDB to 2, so that 2 is considered this parameter's default and minimum value.
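
    For example, per the preceding table, the following sketch of an [ndbd default] excerpt for a host with 20 available CPU threads yields 10 LDM, 4 TC, 2 send, and 2 receive threads, with NoOfFragmentLogParts set to match the number of LDM threads:

    [ndbd default]
    MaxNoOfExecutionThreads=20
    NoOfFragmentLogParts=10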

    MaxNoOfExecutionThreads is generally intended to be set equal to the number of CPU threads available, and to allocate a number of threads of each type suitable to typical workloads. It does not assign particular threads to specified CPUs. For cases where it is desirable to vary from the settings provided, or to bind threads to CPUs, you should useThreadConfig instead, which allows you to allocate each thread directly to a desired type, CPU, or both.

    The multithreaded data node process always spawns, at a minimum, the threads listed here:

    • 1 local query handler (LDM) thread

    • 1 receive thread

    • 1 subscription manager (SUMA or replication) thread

    For aMaxNoOfExecutionThreads value of 8 or less, no TC threads are created, and TC handling is instead performed by the main thread.

    Changing the number of LDM threads normally requires a system restart, whether it is changed using this parameter orThreadConfig, but it is possible to effect the change using a node initial restart (NI) provided the following two conditions are met:

    • Each LDM thread handles a maximum of 8 fragments, and

    • The total number of table fragments is an integer multiple of the number of LDM threads.

  • MaxSendDelay

    Version (or later)NDB 9.5.0
    Type or unitsmicroseconds
    Default0
    Range0 - 11000
    Restart Type

    Node Restart:Requires arolling restart of the cluster. (NDB 9.5.0)

    This parameter can be used to cause data nodes to wait momentarily before sending data to API nodes; in some circumstances, described in the following paragraphs, this can result in more efficient sending of larger volumes of data and higher overall throughput.

    MaxSendDelay can be useful when there are a great many API nodes at or near the saturation point, which can result in waves of increasing and decreasing performance. This occurs when the data nodes are able to send results back to the API nodes relatively quickly, producing many small packets; small packets take longer to process per byte than large ones, which slows down the API nodes until the data nodes begin sending larger packets again.

    To handle this type of scenario, you can setMaxSendDelay to a nonzero value, which helps to ensure that responses are not sent back to the API nodes so quickly. When this is done, responses are sent immediately when there is no other competing traffic, but when there is, settingMaxSendDelay causes the data nodes to wait long enough to ensure that they send larger packets. In effect, this introduces an artificial bottleneck into the send process, which can actually improve throughput significantly.

  • NoOfFragmentLogParts

    Version (or later)NDB 9.5.0
    Type or unitsnumeric
    Default4
    Range4, 6, 8, 10, 12, 16, 20, 24, 32
    Restart Type

    Initial Node Restart:Requires arolling restart of the cluster; each data node must be restarted with--initial. (NDB 9.5.0)

    Set the number of log file groups for redo logs belonging to thisndbmtd. The value of this parameter should be set equal to the number of LDM threads used byndbmtd as determined by the setting forMaxNoOfExecutionThreads. A configuration using more than 4 redo log parts per LDM is disallowed.

    See the description ofMaxNoOfExecutionThreads for more information.

  • NumCPUs

    Version (or later)NDB 9.5.0
    Type or unitsinteger
    Default0
    Range0 - 1024
    Restart Type

    Initial System Restart:Requires a complete shutdown of the cluster, wiping and restoring the cluster file system from abackup, and then restarting the cluster. (NDB 9.5.0)

    Cause automatic thread configuration to use only this many CPUs. Has no effect ifAutomaticThreadConfig is not enabled.

  • PartitionsPerNode

    Version (or later)NDB 9.5.0
    Type or unitsinteger
    Default2
    Range1 - 32
    Restart Type

    Node Restart:Requires arolling restart of the cluster. (NDB 9.5.0)

    Sets the number of partitions used on each node when creating a newNDB table. This makes it possible to avoid splitting up tables into an excessive number of partitions when the number of local data managers (LDMs) grows high.

    While it is possible to set this parameter to different values on different data nodes, and there are no known issues with doing so, it is not likely to be of any advantage; for this reason, it is recommended simply to set it once, for all data nodes, in the [ndbd default] section of the global config.ini file.

    IfClassicFragmentation is enabled, any setting for this parameter is ignored. (Remember that enablingAutomaticThreadConfig disablesClassicFragmentation.)

  • ThreadConfig

    Version (or later)NDB 9.5.0
    Type or unitsstring
    Default''
    Range...
    Restart Type

    System Restart:Requires a complete shutdown and restart of the cluster. (NDB 9.5.0)

    This parameter is used withndbmtd to assign threads of different types to different CPUs. Its value is a string whose format has the following syntax:

    ThreadConfig := entry[,entry[,...]]

    entry := type={param[,param[,...]]}

    type := ldm | query | recover | main | recv | send | rep | io | tc | watchdog | idxbld

    param := count=number
      | cpubind=cpu_list
      | cpuset=cpu_list
      | spintime=number
      | realtime={0|1}
      | nosend={0|1}
      | thread_prio={0..10}
      | cpubind_exclusive=cpu_list
      | cpuset_exclusive=cpu_list

    The curly braces ({...}) surrounding the list of parameters are required, even if there is only one parameter in the list.

    Aparam (parameter) specifies any or all of the following information:

    • The number of threads of the given type (count).

    • The set of CPUs to which the threads of the given type are to be nonexclusively bound. This is determined by one of cpubind or cpuset. cpubind causes each thread to be bound (nonexclusively) to a CPU in the set; cpuset means that each thread is bound (nonexclusively) to the set of CPUs specified.

      On Solaris, you can instead specify a set of CPUs to which the threads of the given type are to be bound exclusively. cpubind_exclusive causes each thread to be bound exclusively to a CPU in the set; cpuset_exclusive means that each thread is bound exclusively to the set of CPUs specified.

      Only one ofcpubind,cpuset,cpubind_exclusive, orcpuset_exclusive can be provided in a single configuration.

    • spintime determines the wait time in microseconds the thread spins before going to sleep.

      The default value forspintime is the value of theSchedulerSpinTimer data node configuration parameter.

      spintime does not apply to I/O threads, watchdog, or offline index build threads, and so cannot be set for these thread types.

    • realtime can be set to 0 or 1. If it is set to 1, the threads run with real-time priority. This also means thatthread_prio cannot be set.

      Therealtime parameter is set by default to the value of theRealtimeScheduler data node configuration parameter.

      realtime cannot be set for offline index build threads.

    • By settingnosend to 1, you can prevent amain,ldm,rep, ortc thread from assisting the send threads. This parameter is 0 by default, and cannot be used with other types of threads.

    • thread_prio is a thread priority level that can be set from 0 to 10, with 10 representing the greatest priority. The default is 5. The precise effects of this parameter are platform-specific, and are described later in this section.

      The thread priority level cannot be set for offline index build threads.

    thread_prio settings and effects by platform.  The implementation ofthread_prio differs between Linux/FreeBSD, Solaris, and Windows. In the following list, we discuss its effects on each of these platforms in turn:

    • Linux and FreeBSD: We mapthread_prio to a value to be supplied to thenice system call. Since a lower niceness value for a process indicates a higher process priority, increasingthread_prio has the effect of lowering thenice value.

      Table 25.12 Mapping of thread_prio to nice values on Linux and FreeBSD

      thread_prio value | nice value
      0                 | 19
      1                 | 16
      2                 | 12
      3                 | 8
      4                 | 4
      5                 | 0
      6                 | -4
      7                 | -8
      8                 | -12
      9                 | -16
      10                | -20

      Some operating systems may provide for a maximum process niceness level of 20, but this is not supported by all targeted versions; for this reason, we choose 19 as the maximumnice value that can be set.

    • Solaris: Settingthread_prio on Solaris sets the Solaris FX priority, with mappings as shown in the following table:

      Table 25.13 Mapping of thread_prio to FX priority on Solaris

      thread_prio value | Solaris FX priority
      0                 | 15
      1                 | 20
      2                 | 25
      3                 | 30
      4                 | 35
      5                 | 40
      6                 | 45
      7                 | 50
      8                 | 55
      9                 | 59
      10                | 60

      Athread_prio setting of 9 is mapped on Solaris to the special FX priority value 59, which means that the operating system also attempts to force the thread to run alone on its own CPU core.

    • Windows: We mapthread_prio to a Windows thread priority value passed to the Windows APISetThreadPriority() function. This mapping is shown in the following table:

      Table 25.14 Mapping of thread_prio to Windows thread priority

      thread_prio value | Windows thread priority
      0 - 1             | THREAD_PRIORITY_LOWEST
      2 - 3             | THREAD_PRIORITY_BELOW_NORMAL
      4 - 5             | THREAD_PRIORITY_NORMAL
      6 - 7             | THREAD_PRIORITY_ABOVE_NORMAL
      8 - 10            | THREAD_PRIORITY_HIGHEST

    Thetype attribute represents an NDB thread type. The thread types supported, and the range of permittedcount values for each, are provided in the following list:

    • ldm: Local query handler (DBLQH kernel block) that handles data. The more LDM threads that are used, the more highly partitioned the data becomes.

      (When ClassicFragmentation is set to 0, the number of partitions is independent of the number of LDM threads, and depends on the value of PartitionsPerNode instead.) Each LDM thread maintains its own sets of data and index partitions, as well as its own redo log. ldm can be set to any value in the range 0 to 332 inclusive. When setting it to 0, main, rep, and tc must also be 0, and recv must be set to 1; doing this causes ndbmtd to emulate ndbd.

      Each LDM thread is normally grouped with 1 query thread to form an LDM group. A set of 4 to 8 LDM groups is grouped into a round robin group. Each LDM thread can be assisted in execution by any query or recover threads in the same round robin group. NDB attempts to form round robin groups such that all threads in each round robin group are locked to CPUs that are attached to the same L3 cache, within the limits of the range stated for a round robin group's size.

      Changing the number of LDM threads normally requires a system restart to be effective and safe for cluster operations; this requirement is relaxed in certain cases, as explained later in this section. This is also true when this is done usingMaxNoOfExecutionThreads.

      Adding large tablespaces (hundreds of gigabytes or more) for Disk Data tables when using more than the default number of LDMs may cause issues with resource and CPU usage ifDiskPageBufferMemory is not sufficiently large.

      Ifldm is not included in theThreadConfig value string, oneldm thread is created.

    • query: A query thread is tied to an LDM and together with it forms an LDM group; acts only onREAD COMMITTED queries. The number of query threads must be set to 0, 1, 2, or 3 times the number of LDM threads. Query threads are not used, unless this is overridden by settingquery to a nonzero value, or by enabling theAutomaticThreadConfig parameter.

      A query thread also acts as a recovery thread (see next item), although the reverse is not true.

      Changing the number of query threads requires a node restart.

    • recover: A recovery thread restores data from a fragment as part of an LCP.

      Changing the number of recovery threads requires a node restart.

    • tc: Transaction coordinator thread (DBTC kernel block) containing the state of an ongoing transaction. The maximum number of TC threads is 128.

      Optimally, every new transaction can be assigned to a new TC thread. In most cases 1 TC thread per 2 LDM threads is sufficient to guarantee that this can happen. In cases where the number of writes is relatively small when compared to the number of reads, it is possible that only 1 TC thread per 4 LQH threads is required to maintain transaction states. Conversely, in applications that perform a great many updates, it may be necessary for the ratio of TC threads to LDM threads to approach 1 (for example, 3 TC threads to 4 LDM threads).

      Settingtc to 0 causes TC handling to be done by the main thread. In most cases, this is effectively the same as setting it to 1.

      Range: 0 - 128.

    • main: Data dictionary and transaction coordinator (DBDIH andDBTC kernel blocks), providing schema management. It is also possible to specify zero or two main threads.

      Range: 0-2.

      Setting main to 0 and rep to 1 causes the main blocks to be placed into the rep thread; the combined thread is shown in the ndbinfo.threads table as main_rep.

      It is also possible to set bothmain andrep to 0, in which case both threads are placed in the firstrecv thread; the resulting combined thread is namedmain_rep_recv in thethreads table.

      If main is omitted from the ThreadConfig value string, one main thread is created.

    • recv: Receive thread (CMVMI kernel block). Each receive thread handles one or more sockets for communicating with other nodes in an NDB Cluster, with one socket per node. NDB Cluster supports multiple receive threads; the maximum is 16 such threads.

      Range: 1 - 64.

      Ifrecv is omitted from theThreadConfig value string, onerecv thread is created.

    • send: Send thread (CMVMI kernel block). To increase throughput, it is possible to perform sends from one or more separate, dedicated threads (maximum 8).

      Using an excessive number of send threads can have an adverse effect on scalability.

      Previously, all threads handled their own sending directly; this can still be made to happen by setting the number of send threads to 0 (this also happens when MaxNoOfExecutionThreads is set to less than 10). While doing so can have an adverse impact on throughput, it can also, in some cases, provide decreased latency.

      Range: 0 - 64.

    • rep: Replication thread (SUMA kernel block). This thread can also be combined with the main thread (see range information).

      Range: 0-1.

      Setting rep to 0 and main to 1 causes the rep blocks to be placed into the main thread; the combined thread is shown in the ndbinfo.threads table as main_rep.

      It is also possible to set bothmain andrep to 0, in which case both threads are placed in the firstrecv thread; the resulting combined thread is namedmain_rep_recv in thethreads table.

      Ifrep is omitted from theThreadConfig value string, onerep thread is created.

    • io: File system and other miscellaneous operations. These are not demanding tasks, and are always handled as a group by a single, dedicated I/O thread.

      Range: 1 only.

    • watchdog: Parameter settings associated with this type are actually applied to several threads, each having a specific use. These threads include the SocketServer thread, which receives connection setups from other nodes; the SocketClient thread, which attempts to set up connections to other nodes; and the thread watchdog thread, which checks that threads are progressing.

      Range: 1 only.

    • idxbld: Offline index build threads. Unlike the other thread types listed previously, which are permanent, these are temporary threads which are created and used only during node or system restarts, or when runningndb_restore--rebuild-indexes. They may be bound to CPU sets which overlap with CPU sets bound to permanent thread types.

      thread_prio,realtime, andspintime values cannot be set for offline index build threads. In addition,count is ignored for this type of thread.

      Ifidxbld is not specified, the default behavior is as follows:

      • Offline index build threads are not bound if the I/O thread is also not bound, and these threads use any available cores.

      • If the I/O thread is bound, then the offline index build threads are bound to the entire set of bound CPUs, due to the fact that there should be no other tasks for these threads to perform.

      Range: 0 - 1.

    Changing ThreadConfig normally requires a system initial restart, but this requirement can be relaxed under certain circumstances:

    • If, following the change, the number of LDM threads remains the same as before, nothing more than a simple node restart (rolling restart, orN) is required to implement the change.

    • Otherwise (that is, if the number of LDM threads changes), it is still possible to effect the change using a node initial restart (NI) provided the following two conditions are met:

      1. Each LDM thread handles a maximum of 8 fragments, and

      2. The total number of table fragments is an integer multiple of the number of LDM threads.

    In any other case, a system initial restart is needed to change this parameter.

    NDB can distinguish between thread types by both of the following criteria:

    • Whether the thread is an execution thread. Threads of typemain,ldm,query,recv,rep,tc, andsend are execution threads;io,recover,watchdog, andidxbld threads are not considered execution threads.

    • Whether the allocation of threads to a given task is permanent or temporary. Currently all thread types exceptidxbld are considered permanent;idxbld threads are regarded as temporary threads.

    Simple examples:

    # Example 1.
    ThreadConfig=ldm={count=2,cpubind=1,2},main={cpubind=12},rep={cpubind=11}

    # Example 2.
    ThreadConfig=main={cpubind=0},ldm={count=4,cpubind=1,2,5,6},io={cpubind=3}

    It is usually desirable when configuring thread usage for a data node host to reserve one or more CPUs for operating system and other tasks. Thus, for a host machine with 24 CPUs, you might want to use 20 CPU threads (leaving 4 for other uses), with 8 LDM threads, 4 TC threads (half the number of LDM threads), 3 send threads, 3 receive threads, and 1 thread each for schema management, asynchronous replication, and I/O operations. (This is almost the same distribution of threads used when MaxNoOfExecutionThreads is set equal to 20.) The following ThreadConfig setting performs these assignments, additionally binding all of these threads to specific CPUs:

    ThreadConfig=ldm={count=8,cpubind=1,2,3,4,5,6,7,8},main={cpubind=9},io={cpubind=9}, \
    rep={cpubind=10},tc={count=4,cpubind=11,12,13,14},recv={count=3,cpubind=15,16,17}, \
    send={count=3,cpubind=18,19,20}

    It should be possible in most cases to bind the main (schema management) thread and the I/O thread to the same CPU, as we have done in the example just shown.

    The following example incorporates groups of CPUs defined using bothcpuset andcpubind, as well as use of thread prioritization.

    ThreadConfig=ldm={count=4,cpuset=0-3,thread_prio=8,spintime=200}, \
    ldm={count=4,cpubind=4-7,thread_prio=8,spintime=200}, \
    tc={count=4,cpuset=8-9,thread_prio=6},send={count=2,thread_prio=10,cpubind=10-11}, \
    main={count=1,cpubind=10},rep={count=1,cpubind=11}

    In this case we create two LDM groups; the first uses cpuset and the second uses cpubind. thread_prio and spintime are set to the same values for each group. This means there are eight LDM threads in total. (You should ensure that NoOfFragmentLogParts is also set to 8.) The four TC threads use only two CPUs; when using cpuset, it is possible to specify fewer CPUs than there are threads in the group. (This is not true for cpubind.) The two send threads use cpubind to bind them to CPUs 10 and 11. The main and rep threads can reuse these CPUs.

    This example shows howThreadConfig andNoOfFragmentLogParts might be set up for a 24-CPU host with hyperthreading, leaving CPUs 10, 11, 22, and 23 available for operating system functions and interrupts:

    NoOfFragmentLogParts=10
    ThreadConfig=ldm={count=10,cpubind=0-4,12-16,thread_prio=9,spintime=200}, \
    tc={count=4,cpuset=6-7,18-19,thread_prio=8},send={count=1,cpuset=8}, \
    recv={count=1,cpuset=20},main={count=1,cpuset=9,21},rep={count=1,cpuset=9,21}, \
    io={count=1,cpuset=9,21,thread_prio=8},watchdog={count=1,cpuset=9,21,thread_prio=9}

    The next few examples include settings foridxbld. The first two of these demonstrate how a CPU set defined foridxbld can overlap those specified for other (permanent) thread types, the first usingcpuset and the second usingcpubind:

    ThreadConfig=main,ldm={count=4,cpuset=1-4},tc={count=4,cpuset=5,6,7}, \
    io={cpubind=8},idxbld={cpuset=1-8}

    ThreadConfig=main,ldm={count=1,cpubind=1},idxbld={count=1,cpubind=1}

    The next example specifies a CPU for the I/O thread, but not for the index build threads:

    ThreadConfig=main,ldm={count=4,cpuset=1-4},tc={count=4,cpuset=5,6,7}, \
    io={cpubind=8}

    Since theThreadConfig setting just shown locks threads to eight cores numbered 1 through 8, it is equivalent to the setting shown here:

    ThreadConfig=main,ldm={count=4,cpuset=1-4},tc={count=4,cpuset=5,6,7}, \
    io={cpubind=8},idxbld={cpuset=1,2,3,4,5,6,7,8}

    In order to take advantage of the enhanced stability that the use of ThreadConfig offers, it is necessary to ensure that CPUs are isolated, and that they are not subject to interrupts or to being scheduled for other tasks by the operating system. On many Linux systems, you can do this by setting IRQBALANCE_BANNED_CPUS in /etc/sysconfig/irqbalance to 0xFFFFF0, and by using the isolcpus boot option in grub.conf. For specific information, see your operating system or platform documentation.
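
    As a sketch only, with file locations and CPU numbers that vary by distribution, the two settings just mentioned might look like this (0xFFFFF0 masks CPUs 4 through 23):

    # /etc/sysconfig/irqbalance: keep irqbalance away from the CPUs used by ndbmtd
    IRQBALANCE_BANNED_CPUS=0xFFFFF0

    # grub.conf kernel command line: isolate the same CPUs from the scheduler
    kernel ... isolcpus=4-23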

Disk Data Configuration Parameters.  Configuration parameters affecting Disk Data behavior include the following:

  • DiskPageBufferEntries

    Version (or later)NDB 9.5.0
    Type or units32K pages
    Default10
    Range1 - 1000
    Restart Type

    Node Restart:Requires arolling restart of the cluster. (NDB 9.5.0)

    This is the number of page entries (page references) to allocate. It is specified as a number of 32K pages inDiskPageBufferMemory. The default is sufficient for most cases but you may need to increase the value of this parameter if you encounter problems with very large transactions on Disk Data tables. Each page entry requires approximately 100 bytes.

  • DiskPageBufferMemory

    Version (or later)NDB 9.5.0
    Type or unitsbytes
    Default64M
    Range4M - 16T
    Restart Type

    Node Restart:Requires arolling restart of the cluster. (NDB 9.5.0)

    This determines the amount of space, in bytes, used for caching pages on disk, and is set in the[ndbd] or[ndbd default] section of theconfig.ini file.

    If the value for DiskPageBufferMemory is set too low in conjunction with using more than the default number of LDM threads in ThreadConfig (for example {ldm=6...}), problems can arise when trying to add a large (for example 500G) data file to a disk-based NDB table, wherein the process takes an indefinitely long time while occupying one of the CPU cores.

    This is due to the fact that, as part of adding a data file to a tablespace, extent pages are locked into memory in an extra PGMAN worker thread, for quick metadata access. When adding a large file, this worker has insufficient memory for all of the data file metadata. In such cases, you should either increaseDiskPageBufferMemory, or add smaller tablespace files. You may also need to adjustDiskPageBufferEntries.

    You can query thendbinfo.diskpagebuffer table to help determine whether the value for this parameter should be increased to minimize unnecessary disk seeks. SeeSection 25.6.15.31, “The ndbinfo diskpagebuffer Table”, for more information.
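
    For example, a sketch of such a query, computing an approximate page cache hit ratio per data node from the columns of the ndbinfo.diskpagebuffer table; consistently low values suggest increasing DiskPageBufferMemory:

    SELECT node_id,
           100 * SUM(page_requests_direct_return) /
             SUM(page_requests_direct_return + page_requests_wait_io)
             AS hit_ratio_pct
      FROM ndbinfo.diskpagebuffer
     GROUP BY node_id;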

  • SharedGlobalMemory

    Version (or later)NDB 9.5.0
    Type or unitsbytes
    Default128M
    Range0 - 64T
    Restart Type

    Node Restart:Requires arolling restart of the cluster. (NDB 9.5.0)

    This parameter determines the amount of memory that is used for log buffers, disk operations (such as page requests and wait queues), and metadata for tablespaces, log file groups, UNDO files, and data files. The shared global memory pool also provides memory used for satisfying the memory requirements of the UNDO_BUFFER_SIZE option used with CREATE LOGFILE GROUP and ALTER LOGFILE GROUP statements, including any default value implied for this option by the setting of the InitialLogFileGroup data node configuration parameter. SharedGlobalMemory can be set in the [ndbd] or [ndbd default] section of the config.ini configuration file, and is measured in bytes.

    The default value is128M.

  • DiskIOThreadPool

    Version (or later)NDB 9.5.0
    Type or unitsthreads
    Default2
    Range0 - 4294967039 (0xFFFFFEFF)
    Restart Type

    Node Restart:Requires arolling restart of the cluster. (NDB 9.5.0)

    This parameter determines the number of unbound threads used for Disk Data file access. BeforeDiskIOThreadPool was introduced, exactly one thread was spawned for each Disk Data file, which could lead to performance issues, particularly when using very large data files. WithDiskIOThreadPool, you can—for example—access a single large data file using several threads working in parallel.

    This parameter applies to Disk Data I/O threads only.

    The optimum value for this parameter depends on your hardware and configuration; the factors to consider include the following:

    • Physical distribution of Disk Data files.  You can obtain better performance by placing data files, undo log files, and the data node file system on separate physical disks. If you do this with some or all of these sets of files, then you can (and should) setDiskIOThreadPool higher to enable separate threads to handle the files on each disk.

      You should also disableDiskDataUsingSameDisk when using a separate disk or disks for Disk Data files; this increases the rate at which checkpoints of Disk Data tablespaces can be performed.

    • Disk performance and types.  The number of threads that can be accommodated for Disk Data file handling is also dependent on the speed and throughput of the disks. Faster disks and higher throughput allow for more disk I/O threads. Our test results indicate that solid-state disk drives can handle many more disk I/O threads than conventional disks, and thus higher values forDiskIOThreadPool.

      DecreasingTimeBetweenGlobalCheckpoints is also recommended when using solid-state disk drives, in particular those using NVMe. See alsoDisk Data latency parameters.

    The default value for this parameter is 2.
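
    For example, a sketch with illustrative paths and thread count, placing data files and undo log files on their own disks as recommended above:

    [ndbd default]
    FileSystemPathDataFiles=/data/disk1/ndb
    FileSystemPathUndoFiles=/data/disk2/ndb
    # More I/O threads, since the file sets reside on separate physical disks
    DiskIOThreadPool=8
    DiskDataUsingSameDisk=false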

  • Disk Data file system parameters.  The parameters in the following list make it possible to place NDB Cluster Disk Data files in specific directories without the need for using symbolic links.

    • FileSystemPathDD

      Version (or later)NDB 9.5.0
      Type or unitsfilename
      DefaultFileSystemPath
      Range...
      Restart Type

      Initial Node Restart:Requires arolling restart of the cluster; each data node must be restarted with--initial. (NDB 9.5.0)

      If this parameter is specified, then NDB Cluster Disk Data data files and undo log files are placed in the indicated directory. This can be overridden for data files, undo log files, or both, by specifying values forFileSystemPathDataFiles,FileSystemPathUndoFiles, or both, as explained for these parameters. It can also be overridden for data files by specifying a path in theADD DATAFILE clause of aCREATE TABLESPACE orALTER TABLESPACE statement, and for undo log files by specifying a path in theADD UNDOFILE clause of aCREATE LOGFILE GROUP orALTER LOGFILE GROUP statement. IfFileSystemPathDD is not specified, thenFileSystemPath is used.

      If aFileSystemPathDD directory is specified for a given data node (including the case where the parameter is specified in the[ndbd default] section of theconfig.ini file), then starting that data node with--initial causes all files in the directory to be deleted.

    • FileSystemPathDataFiles

      Version (or later)NDB 9.5.0
      Type or unitsfilename
      DefaultFileSystemPathDD
      Range...
      Restart Type

      Initial Node Restart:Requires arolling restart of the cluster; each data node must be restarted with--initial. (NDB 9.5.0)

      If this parameter is specified, then NDB Cluster Disk Data data files are placed in the indicated directory. This overrides any value set forFileSystemPathDD. This parameter can be overridden for a given data file by specifying a path in theADD DATAFILE clause of aCREATE TABLESPACE orALTER TABLESPACE statement used to create that data file. IfFileSystemPathDataFiles is not specified, thenFileSystemPathDD is used (orFileSystemPath, ifFileSystemPathDD has also not been set).

      If aFileSystemPathDataFiles directory is specified for a given data node (including the case where the parameter is specified in the[ndbd default] section of theconfig.ini file), then starting that data node with--initial causes all files in the directory to be deleted.

    • FileSystemPathUndoFiles

      Version (or later)NDB 9.5.0
      Type or unitsfilename
      DefaultFileSystemPathDD
      Range...
      Restart Type

      Initial Node Restart:Requires arolling restart of the cluster; each data node must be restarted with--initial. (NDB 9.5.0)

      If this parameter is specified, then NDB Cluster Disk Data undo log files are placed in the indicated directory. This overrides any value set for FileSystemPathDD. This parameter can be overridden for a given undo log file by specifying a path in the ADD UNDOFILE clause of a CREATE LOGFILE GROUP or ALTER LOGFILE GROUP statement used to create that undo log file. If FileSystemPathUndoFiles is not specified, then FileSystemPathDD is used (or FileSystemPath, if FileSystemPathDD has also not been set).

      If a FileSystemPathUndoFiles directory is specified for a given data node (including the case where the parameter is specified in the [ndbd default] section of the config.ini file), then starting that data node with --initial causes all files in the directory to be deleted.

    For more information, see Section 25.6.11.1, “NDB Cluster Disk Data Objects”.
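
    A minimal sketch of how the three parameters just described might be combined in the [ndbd default] section; the paths shown are hypothetical:

    [ndbd default]
    # Hypothetical layout: FileSystemPathDD sets the Disk Data default,
    # overridden separately for data files and for undo log files
    FileSystemPathDD = /data/dd
    FileSystemPathDataFiles = /data/dd/data
    FileSystemPathUndoFiles = /data/dd/undo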

  • Disk Data object creation parameters.  The next two parameters enable you—when starting the cluster for the first time—to cause a Disk Data log file group, tablespace, or both, to be created without the use of SQL statements.

    • InitialLogFileGroup

      Version (or later)   NDB 9.5.0
      Type or units        string
      Default              [see documentation]
      Range                ...
      Restart Type         System Restart: Requires a complete shutdown and restart of the cluster. (NDB 9.5.0)

      This parameter can be used to specify a log file group that is created when performing an initial start of the cluster. InitialLogFileGroup is specified as shown here:

      InitialLogFileGroup = [name=name;] [undo_buffer_size=size;] file-specification-list

      file-specification-list:
          file-specification[; file-specification[; ...]]

      file-specification:
          filename:size

      The name of the log file group is optional and defaults to DEFAULT-LG. The undo_buffer_size is also optional; if omitted, it defaults to 64M. Each file-specification corresponds to an undo log file, and at least one must be specified in the file-specification-list. Undo log files are placed according to any values that have been set for FileSystemPath, FileSystemPathDD, and FileSystemPathUndoFiles, just as if they had been created as the result of a CREATE LOGFILE GROUP or ALTER LOGFILE GROUP statement.

      Consider the following:

      InitialLogFileGroup = name=LG1; undo_buffer_size=128M; undo1.log:250M; undo2.log:150M

      This is equivalent to the following SQL statements:

      CREATE LOGFILE GROUP LG1
          ADD UNDOFILE 'undo1.log'
          INITIAL_SIZE 250M
          UNDO_BUFFER_SIZE 128M
          ENGINE NDBCLUSTER;

      ALTER LOGFILE GROUP LG1
          ADD UNDOFILE 'undo2.log'
          INITIAL_SIZE 150M
          ENGINE NDBCLUSTER;

      This log file group is created when the data nodes are started with --initial.

      Resources for the initial log file group are added to the global memory pool along with those indicated by the value of SharedGlobalMemory.

      This parameter, if used, should always be set in the [ndbd default] section of the config.ini file. The behavior of an NDB Cluster when different values are set on different data nodes is not defined.

    • InitialTablespace

      Version (or later)   NDB 9.5.0
      Type or units        string
      Default              [see documentation]
      Range                ...
      Restart Type         System Restart: Requires a complete shutdown and restart of the cluster. (NDB 9.5.0)

      This parameter can be used to specify an NDB Cluster Disk Data tablespace that is created when performing an initial start of the cluster. InitialTablespace is specified as shown here:

      InitialTablespace = [name=name;] [extent_size=size;] file-specification-list

      The name of the tablespace is optional and defaults to DEFAULT-TS. The extent_size is also optional; it defaults to 1M. The file-specification-list uses the same syntax as shown with the InitialLogFileGroup parameter, the only difference being that each file-specification used with InitialTablespace corresponds to a data file. At least one must be specified in the file-specification-list. Data files are placed according to any values that have been set for FileSystemPath, FileSystemPathDD, and FileSystemPathDataFiles, just as if they had been created as the result of a CREATE TABLESPACE or ALTER TABLESPACE statement.

      For example, consider the following line specifying InitialTablespace in the [ndbd default] section of the config.ini file (as with InitialLogFileGroup, this parameter should always be set in the [ndbd default] section, as the behavior of an NDB Cluster when different values are set on different data nodes is not defined):

      InitialTablespace = name=TS1; extent_size=8M; data1.dat:2G; data2.dat:4G

      This is equivalent to the following SQL statements:

      CREATE TABLESPACE TS1
          ADD DATAFILE 'data1.dat'
          EXTENT_SIZE 8M
          INITIAL_SIZE 2G
          ENGINE NDBCLUSTER;

      ALTER TABLESPACE TS1
          ADD DATAFILE 'data2.dat'
          INITIAL_SIZE 4G
          ENGINE NDBCLUSTER;

      This tablespace is created when the data nodes are started with --initial, and can be used whenever creating NDB Cluster Disk Data tables thereafter.

  • Disk Data latency parameters.  The two parameters listed here can be used to improve handling of latency issues with NDB Cluster Disk Data tables.

    • MaxDiskDataLatency

      Version (or later)   NDB 9.5.0
      Type or units        ms
      Default              0
      Range                0 - 8000
      Restart Type         Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

      This parameter controls the maximum allowed mean latency for disk access (maximum 8000 milliseconds). When this limit is reached, NDB begins to abort transactions in order to decrease pressure on the Disk Data I/O subsystem. Use 0 to disable the latency check.

    • DiskDataUsingSameDisk

      Version (or later)   NDB 9.5.0
      Type or units        boolean
      Default              true
      Range                ...
      Restart Type         Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

      Set this parameter to false if your Disk Data tablespaces use one or more separate disks. Doing so allows checkpoints of tablespaces to be executed at a higher rate than is normally used when disks are shared.

      When DiskDataUsingSameDisk is true, NDB decreases the rate of Disk Data checkpointing whenever an in-memory checkpoint is in progress, to help ensure that disk load remains constant.
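
    As a sketch, assuming Disk Data tablespaces on dedicated disks and a hypothetical latency ceiling of 1000 milliseconds:

    [ndbd default]
    # Abort transactions when mean disk access latency exceeds 1 second
    MaxDiskDataLatency = 1000
    # Tablespaces have their own disks, so checkpoint them at the higher rate
    DiskDataUsingSameDisk = false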

Disk Data and GCP Stop errors.  Errors encountered when using Disk Data tables, such as Node nodeid killed this node because GCP stop was detected (error 2303), are often referred to as GCP stop errors. Such errors occur when the redo log is not flushed to disk quickly enough; this is usually due to slow disks and insufficient disk throughput.

You can help prevent these errors from occurring by using faster disks, and by placing Disk Data files on a separate disk from the data node file system. Reducing the value of TimeBetweenGlobalCheckpoints tends to decrease the amount of data to be written for each global checkpoint, and so may provide some protection against redo log buffer overflows when trying to write a global checkpoint; however, reducing this value also allows less time in which to write the GCP, so this must be done with caution.

In addition to the considerations given for DiskPageBufferMemory as explained previously, it is also very important that the DiskIOThreadPool configuration parameter be set correctly; having DiskIOThreadPool set too high is very likely to cause GCP stop errors (Bug #37227).

GCP stops can be caused by save or commit timeouts; the TimeBetweenEpochsTimeout data node configuration parameter determines the timeout for commits. However, it is possible to disable both types of timeouts by setting this parameter to 0.
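
A hedged sketch combining the mitigations just described; the values are illustrative only, not recommendations:

  [ndbd default]
  # Default is 2000 ms; smaller GCPs mean less redo data per flush
  TimeBetweenGlobalCheckpoints = 1000
  # Disable GCP commit/save timeouts entirely (use with caution)
  TimeBetweenEpochsTimeout = 0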

Parameters for configuring send buffer memory allocation.  Send buffer memory is allocated dynamically from a memory pool shared between all transporters, which means that the size of the send buffer can be adjusted as necessary. (Previously, the NDB kernel used a fixed-size send buffer for every node in the cluster, which was allocated when the node started and could not be changed while the node was running.) The TotalSendBufferMemory and OverloadLimit data node configuration parameters permit the setting of limits on this memory allocation. For more information about the use of these parameters (as well as SendBufferMemory), see Section 25.4.3.14, “Configuring NDB Cluster Send Buffer Parameters”.
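
For illustration, the shared send buffer pool might be enlarged for data nodes with many transporters; the value shown is hypothetical:

  [ndbd default]
  # Shared pool from which all transporter send buffers are allocated
  TotalSendBufferMemory = 256M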

See also Section 25.6.7, “Adding NDB Cluster Data Nodes Online”.

Redo log over-commit handling.  It is possible to control a data node's handling of operations when too much time is taken flushing redo logs to disk. This occurs when a given redo log flush takes longer than RedoOverCommitLimit seconds, more than RedoOverCommitCounter times, causing any pending transactions to be aborted. When this happens, the API node that sent the transaction can handle the operations that should have been committed either by queuing the operations and re-trying them, or by aborting them, as determined by DefaultOperationRedoProblemAction. The data node configuration parameters for setting the timeout and the number of times it may be exceeded before the API node takes this action are described in the following list:

  • RedoOverCommitCounter

    Version (or later)   NDB 9.5.0
    Type or units        numeric
    Default              3
    Range                1 - 4294967039 (0xFFFFFEFF)
    Restart Type         Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    When RedoOverCommitLimit is exceeded this many times or more while trying to write a given redo log to disk, any transactions that were not committed as a result are aborted, and an API node where any of these transactions originated handles the operations making up those transactions according to its value for DefaultOperationRedoProblemAction (by either queuing the operations to be re-tried, or aborting them).

  • RedoOverCommitLimit

    Version (or later)   NDB 9.5.0
    Type or units        seconds
    Default              20
    Range                1 - 4294967039 (0xFFFFFEFF)
    Restart Type         Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    This parameter sets an upper limit in seconds for trying to write a given redo log to disk before timing out. The data node keeps a count of the number of times it tries to flush this redo log but takes longer than RedoOverCommitLimit; when this count exceeds the value of RedoOverCommitCounter, any transactions that were not committed as a result of the flush timeout are aborted. When this occurs, the API node where any of these transactions originated handles the operations making up those transactions according to its DefaultOperationRedoProblemAction setting (it either queues the operations to be re-tried, or aborts them).
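
To make the defaults concrete: a redo log flush must take longer than 20 seconds (RedoOverCommitLimit) on 3 or more occasions (RedoOverCommitCounter) before pending transactions are aborted. A sketch with tighter, hypothetical limits:

  [ndbd default]
  # Hypothetical: abort pending transactions once 2 redo log flushes
  # have each taken longer than 10 seconds
  RedoOverCommitLimit = 10
  RedoOverCommitCounter = 2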

Controlling restart attempts.  It is possible to exercise finely-grained control over restart attempts by data nodes when they fail to start, using the MaxStartFailRetries and StartFailRetryDelay data node configuration parameters.

MaxStartFailRetries limits the total number of retries made before giving up on starting the data node, while StartFailRetryDelay sets the number of seconds between retry attempts. These parameters are listed here:

  • StartFailRetryDelay

    Version (or later)   NDB 9.5.0
    Type or units        unsigned
    Default              0
    Range                0 - 4294967039 (0xFFFFFEFF)
    Restart Type         Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    Use this parameter to set the number of seconds between restart attempts by the data node in the event of failure on startup. The default is 0 (no delay).

    Both this parameter and MaxStartFailRetries are ignored unless StopOnError is equal to 0.

  • MaxStartFailRetries

    Version (or later)   NDB 9.5.0
    Type or units        unsigned
    Default              3
    Range                0 - 4294967039 (0xFFFFFEFF)
    Restart Type         Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    Use this parameter to limit the number of restart attempts made by the data node in the event that it fails on startup. The default is 3 attempts.

    Both this parameter and StartFailRetryDelay are ignored unless StopOnError is equal to 0.
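
A minimal sketch enabling automatic retries; StopOnError must be 0 for the other two parameters to have any effect, and the retry values shown are hypothetical:

  [ndbd default]
  StopOnError = 0           # required; otherwise the two settings below are ignored
  MaxStartFailRetries = 5   # hypothetical: give up after 5 failed start attempts
  StartFailRetryDelay = 30  # hypothetical: wait 30 seconds between attempts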

NDB index statistics parameters.  The parameters in the following list relate to NDB index statistics generation.

  • IndexStatAutoCreate

    Version (or later)   NDB 9.5.0
    Type or units        integer
    Default              1
    Range                0, 1
    Restart Type         Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    Enable (set equal to 1) or disable (set equal to 0) automatic statistics collection when indexes are created.

  • IndexStatAutoUpdate

    Version (or later)   NDB 9.5.0
    Type or units        integer
    Default              1
    Range                0, 1
    Restart Type         Node Restart: Requires a rolling restart of the cluster. (NDB 9.5.0)

    Enable (set equal to 1) or disable (set equal to 0) monitoring of indexes for changes, and trigger automatic statistics updates when these are detected. The degree of change needed to trigger the updates is determined by the settings for the IndexStatTriggerPct and IndexStatTriggerScale options.

  • IndexStatSaveSize

    Version (or later)   NDB 9.5.0
    Type or units        bytes
    Default              32768
    Range                0 - 4294967039 (0xFFFFFEFF)
    Restart Type         Initial Node Restart: Requires a rolling restart of the cluster; each data node must be restarted with --initial. (NDB 9.5.0)

    Maximum space in bytes allowed for the saved statistics of any given index in the NDB system tables and in the mysqld memory cache.

    At least one sample is always produced, regardless of any size limit. This size is scaled by IndexStatSaveScale.


  • IndexStatSaveScale

    Version (or later)   NDB 9.5.0
    Type or units        percentage
    Default              100
    Range                0 - 4294967039 (0xFFFFFEFF)
    Restart Type         Initial Node Restart: Requires a rolling restart of the cluster; each data node must be restarted with --initial. (NDB 9.5.0)

    The size specified by IndexStatSaveSize is scaled by the value of IndexStatSaveScale for a large index, times 0.01. This is further multiplied by the logarithm to the base 2 of the index size. Setting IndexStatSaveScale equal to 0 disables the scaling effect.

  • IndexStatTriggerPct

    Version (or later)   NDB 9.5.0
    Type or units        percentage
    Default              100
    Range                0 - 4294967039 (0xFFFFFEFF)
    Restart Type         Initial Node Restart: Requires a rolling restart of the cluster; each data node must be restarted with --initial. (NDB 9.5.0)

    Percentage change in updates that triggers an index statistics update. The value is scaled by IndexStatTriggerScale. You can disable this trigger altogether by setting IndexStatTriggerPct to 0.

  • IndexStatTriggerScale

    Version (or later)   NDB 9.5.0
    Type or units        percentage
    Default              100
    Range                0 - 4294967039 (0xFFFFFEFF)
    Restart Type         Initial Node Restart: Requires a rolling restart of the cluster; each data node must be restarted with --initial. (NDB 9.5.0)

    Scale IndexStatTriggerPct by this amount times 0.01 for a large index. A value of 0 disables scaling.

  • IndexStatUpdateDelay

    Version (or later)   NDB 9.5.0
    Type or units        seconds
    Default              60
    Range                0 - 4294967039 (0xFFFFFEFF)
    Restart Type         Initial Node Restart: Requires a rolling restart of the cluster; each data node must be restarted with --initial. (NDB 9.5.0)

    Minimum delay in seconds between automatic index statistics updates for a given index. Setting this variable to 0 disables any delay. The default is 60 seconds.
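
As an illustration, the following hypothetical settings keep automatic creation and updating of index statistics enabled while making automatic updates less frequent than the defaults:

  [ndbd default]
  IndexStatAutoCreate = 1      # collect statistics when an index is created
  IndexStatAutoUpdate = 1      # monitor indexes and update statistics on change
  IndexStatTriggerPct = 200    # hypothetical: require a larger change before updating
  IndexStatUpdateDelay = 120   # hypothetical: at least 2 minutes between updates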

Restart types.  Information about the restart types used by the parameter descriptions in this section is shown in the following table:

Table 25.15 NDB Cluster restart types

Symbol   Restart Type   Description
N        Node           The parameter can be updated using a rolling restart (see Section 25.6.5, “Performing a Rolling Restart of an NDB Cluster”)
S        System         All cluster nodes must be shut down completely, then restarted, to effect a change in this parameter
I        Initial        Data nodes must be restarted using the --initial option