PDF (A4) - 40.9Mb
Man Pages (TGZ) - 259.7Kb
Man Pages (Zip) - 366.9Kb
Info (Gzip) - 4.1Mb
Info (Zip) - 4.1Mb
NDB replication in NDB 9.2 supports the use of the generic MySQL Server Multithreaded Applier mechanism (MTA), which allows independent binary log transactions to be applied in parallel on a replica, increasing peak replication throughput.
Requirements
The MySQL Server MTA implementation delegates the processing of separate binary log transactions to a pool of worker threads (whose size is configurable), and coordinates the worker threads to ensure that transaction dependencies encoded in the binary log are respected, and that commit ordering is maintained if required (seeSection 19.2.3, “Replication Threads”). To use this functionality with NDB Cluster, it is necessary that the replica be configured to use multiple worker threads. To do this, setreplica_parallel_workers to control the number of worker threads on the replica. The default is 4.
MTA Configuration: Source
If set on the sourcemysqld,replica_parallel_type must beLOGICAL_CLOCK (the default value).
NDB does not supportreplica_parallel_type=DATABASE.
In addition, it is recommended that you set the amount of memory used to track binary log transaction writesets on the source (binlog_transaction_dependency_history_size) to, whereE *PE is the average epoch size (as the number of operations per epoch) andP is the maximum expected parallelism. SeeWriteset Tracking Memory Usage, for more information.
MTA Configuration: Replica
Replicamysqld configuration for theNDB MTA requires thatreplica_parallel_workers is greater than 1. The recommended starting value when first enabling MTA is 4, which is the default.
In addition,replica_preserve_commit_order must beON. This is also the default value.
Transaction Dependency and Writeset Handling
Transaction dependencies are detected using analysis of each transaction's writeset, that is, the set of rows (table, key values) written by the transaction. Where two transactions modify the same row they are considered to be dependent, and must be applied in order (in other words, serially) to avoid deadlocks or incorrect results. Where a table has secondary unique keys, these values are also added to the transaction's writeset to detect the case where there are transaction dependencies implied by different transactions affecting the same unique key value, and so requiring ordering. Where dependencies cannot be efficiently determined,mysqld falls back to considering transactions dependent for reasons of safety.
Transaction dependencies are encoded in the binary log by the sourcemysqld. Dependencies are encoded in anANONYMOUS_GTID event using a scheme called 'Logical clock'. (SeeSection 19.1.4.1, “Replication Mode Concepts”.)
The writeset implementation employed by MySQL (and NDB Cluster) uses hash-based conflict detection based on matching 64-bit row hashes of relevant table and index values. This detects reliably when the same key is seen twice, but can also produce false positives if different table and index values hash to the same 64-bit value; this may result in artificial dependencies which can reduce the available parallelism.
Transaction dependencies are forced by any of the following:
DDL statements
Binary log rotation or encountering binary log file boundaries
Writeset history size limitations
Writes which reference parent foreign keys in the target table
More specifically, transactions which perform inserts, updates, and deletes on foreign keyparent tables are serialized relative to all preceding and following transactions, and not just to those transactions affecting tables involved in a constraint relationship. Conversely, transactions performing inserts, updates and deletes on foreign keychild tables (referencing) are not especially serialized with regard to one another.
The MySQL MTA implementation attempts to apply independent binary log transactions in parallel.NDB records all changes occurring in all user transactions committing in an epoch (TimeBetweenEpochs, default 100 milliseconds), in one binary log transaction, referred to as an epoch transaction. Therefore, for two consecutive epoch transactions to be independent, and possible to apply in parallel, it is required that no row is modified in both epochs. If any single row is modified in both epochs, then they are dependent, and are applied serially, which can limit the expolitable parallelism available.
Epoch transactions are considered independent based on the set of rows modified on the source cluster in the epoch, but not including the generatedmysql.ndb_apply_statusWRITE_ROW events that convey epoch metadata. This avoids every epoch transaction being trivially dependent on the preceding epoch, but does require that the binlog is applied at the replica with the commit order preserved. This also implies that an NDB binary log with writeset dependencies is not suitable for use by a replica database using a different MySQL storage engine.
It may be possible or desirable to modify application transaction behavior to avoid patterns of repeated modifications to the same rows, in separate transactions over a short time period, to increase exploitable apply parallelism.
Writeset Tracking Memory Usage
The amount of memory used to track binary log transaction writesets can be set using thebinlog_transaction_dependency_history_size server system variable, which defaults to 25000 row hashes.
If an average binary log transaction modifiesN rows, then to be able to identify independent (parallelizable) transactions up to a parallelism level ofP, we needbinlog_transaction_dependency_history_size to be at least. (The maximum is 1000000.)N *P
The finite size of the history results in a finite maximum dependency length that can be reliably determined, giving a finite parallelism that can be expressed. Any row not found in the history may be dependent on the last transaction purged from the history.
Writeset history does not act like a sliding window over the lastN transactions; rather, it is a finite buffer which is allowed to fill up completely, then its contents entirely discarded when it becomes full. This means that the history size follows a sawtooth pattern over time, and therefore the maximum detectable dependency length also follows a sawtooth pattern over time, such that independent transactions may still be marked as dependent if the writeset history buffer has been reset between their being processed.
In this scheme, each transaction in a binary log file is annotated with asequence_number (1, 2, 3, ...), and as well as the sequence number of the most recent binary log transaction that it depends on, to which we refer aslast_committed.
Within a given binary log file, the first transaction hassequence_number 1 andlast_committed 0.
Where a binary log transaction depends on its immediate predecessor, its application is serialized. If the dependency is on an earlier transaction then it may be possible to apply the transaction in parallel with the preceding independent transactions.
The content ofANONYMOUS_GTID events, includingsequence_number andlast_committed (and thus the transaction dependencies), can be seen usingmysqlbinlog.
TheANONYMOUS_GTID events generated on the source are handled separately from the compressed transaction payload with bulkBEGIN,TABLE_MAP*,WRITE_ROW*,UPDATE_ROW*,DELETE_ROW*, andCOMMIT events, allowing dependencies to be determined prior to decompression. This means that the replica coordinator thread can delegate transaction payload decompression to a worker thread, providing automatic parallel decompression of independent transactions on the replica.
Known Limitations
Secondary unique columns. Tables with secondary unique columns (that is, unique keys other than the primary key) have all columns sent to the source so that unique-key related conflicts can be detected.
Where the current binary logging mode does not include all columns, but only changed columns (--ndb-log-updated-only=OFF,--ndb-log-update-minimal=ON,--ndb-log-update-as-write=OFF), this can increase the volume of data sent from data nodes to SQL nodes.
The impact depends on both the rate of modification (update or delete) of rows in such tables and the volume of data in columns which are not actually modified.
Replicating NDB to InnoDB. NDB binary log injector transaction dependency tracking intentionally ignores the inter-transaction dependencies created by generatedmysql.ndb_apply_status metadata events, which are handled separately as part of the commit of the epoch transaction on the replica applier. For replication toInnoDB, there is no special handling; this may result in reduced performance or other issues when using anInnoDB multithreaded applier to consume anNDB MTA binary log.
PDF (A4) - 40.9Mb
Man Pages (TGZ) - 259.7Kb
Man Pages (Zip) - 366.9Kb
Info (Gzip) - 4.1Mb
Info (Zip) - 4.1Mb