
17.6.1.5 Converting Tables from MyISAM to InnoDB

If you have MyISAM tables that you want to convert to InnoDB for better reliability and scalability, review the following guidelines and tips before converting.

Note

Partitioned MyISAM tables created in previous versions of MySQL are not compatible with MySQL 8.0. Such tables must be prepared prior to upgrade, either by removing the partitioning, or by converting them to InnoDB. See Section 26.6.2, “Partitioning Limitations Relating to Storage Engines”, for more information.
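
As a minimal sketch, assuming a hypothetical partitioned MyISAM table named t1, either of the following statements, run on the server before upgrading to MySQL 8.0, prepares the table:

ALTER TABLE t1 REMOVE PARTITIONING;   -- keep MyISAM, drop the partitioning

ALTER TABLE t1 ENGINE=InnoDB;         -- convert to InnoDB, which supports partitioning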

Adjusting Memory Usage for MyISAM and InnoDB

As you transition away from MyISAM tables, lower the value of the key_buffer_size configuration option to free memory no longer needed for caching results. Increase the value of the innodb_buffer_pool_size configuration option, which performs a similar role of allocating cache memory for InnoDB tables. The InnoDB buffer pool caches both table data and index data, speeding up lookups for queries and keeping query results in memory for reuse. For guidance regarding buffer pool size configuration, see Section 10.12.3.1, “How MySQL Uses Memory”.
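
Both variables are dynamic in MySQL 8.0, so as a rough sketch you can adjust them at runtime; the values below are illustrative assumptions rather than recommendations, and you should also persist the change in your option file:

SET GLOBAL key_buffer_size = 16 * 1024 * 1024;                -- shrink the MyISAM key cache
SET GLOBAL innodb_buffer_pool_size = 8 * 1024 * 1024 * 1024;  -- grow the InnoDB buffer pool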

Handling Too-Long Or Too-Short Transactions

Because MyISAM tables do not support transactions, you might not have paid much attention to the autocommit configuration option and the COMMIT and ROLLBACK statements. These keywords are important to allow multiple sessions to read and write InnoDB tables concurrently, providing substantial scalability benefits in write-heavy workloads.

While a transaction is open, the system keeps a snapshot of the data as seen at the beginning of the transaction, which can cause substantial overhead if the system inserts, updates, and deletes millions of rows while a stray transaction keeps running. Thus, take care to avoid transactions that run for too long:

  • If you are using a mysql session for interactive experiments, always COMMIT (to finalize the changes) or ROLLBACK (to undo the changes) when finished. Close down interactive sessions rather than leave them open for long periods, to avoid keeping transactions open for long periods by accident.

  • Make sure that any error handlers in your application also ROLLBACK incomplete changes or COMMIT completed changes.

  • ROLLBACK is a relatively expensive operation, because INSERT, UPDATE, and DELETE operations are written to InnoDB tables prior to the COMMIT, with the expectation that most changes are committed successfully and rollbacks are rare. When experimenting with large volumes of data, avoid making changes to large numbers of rows and then rolling back those changes.

  • When loading large volumes of data with a sequence of INSERT statements, periodically COMMIT the results to avoid having transactions that last for hours. In typical load operations for data warehousing, if something goes wrong, you truncate the table (using TRUNCATE TABLE) and start over from the beginning rather than doing a ROLLBACK.

The preceding tips save memory and disk space that can be wasted during too-long transactions. When transactions are shorter than they should be, the problem is excessive I/O. With each COMMIT, MySQL makes sure each change is safely recorded to disk, which involves some I/O.

  • For most operations on InnoDB tables, you should use the setting autocommit=0. From an efficiency perspective, this avoids unnecessary I/O when you issue large numbers of consecutive INSERT, UPDATE, or DELETE statements. From a safety perspective, this allows you to issue a ROLLBACK statement to recover lost or garbled data if you make a mistake on the mysql command line, or in an exception handler in your application.

  • autocommit=1 is suitable for InnoDB tables when running a sequence of queries for generating reports or analyzing statistics. In this situation, there is no I/O penalty related to COMMIT or ROLLBACK, and InnoDB can automatically optimize the read-only workload.

  • If you make a series of related changes, finalize all the changes at once with a single COMMIT at the end. For example, if you insert related pieces of information into several tables, do a single COMMIT after making all the changes. Or if you run many consecutive INSERT statements, do a single COMMIT after all the data is loaded; if you are doing millions of INSERT statements, perhaps split up the huge transaction by issuing a COMMIT every ten thousand or hundred thousand records, so the transaction does not grow too large (see the sketch after this list).

  • Remember that even a SELECT statement opens a transaction, so after running some report or debugging queries in an interactive mysql session, either issue a COMMIT or close the mysql session.
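
The following sketch puts these points together. It assumes hypothetical tables innodb_table and staging_table with an integer primary key named id; the batch size is an arbitrary example.

SET autocommit = 0;

-- Load one range of rows per transaction so no single transaction grows too large.
INSERT INTO innodb_table SELECT * FROM staging_table WHERE id BETWEEN 1 AND 100000;
COMMIT;

INSERT INTO innodb_table SELECT * FROM staging_table WHERE id BETWEEN 100001 AND 200000;
COMMIT;

-- If a batch goes wrong, only the uncommitted portion needs to be undone.
ROLLBACK;

SET autocommit = 1;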

For related information, see Section 17.7.2.2, “autocommit, Commit, and Rollback”.

Handling Deadlocks

You might see warning messages referring to deadlocks in the MySQL error log, or the output of SHOW ENGINE INNODB STATUS. A deadlock is not a serious issue for InnoDB tables, and often does not require any corrective action. When two transactions start modifying multiple tables, accessing the tables in a different order, they can reach a state where each transaction is waiting for the other and neither can proceed. When deadlock detection is enabled (the default), MySQL immediately detects this condition and cancels (rolls back) the smaller transaction, allowing the other to proceed. If deadlock detection is disabled using the innodb_deadlock_detect configuration option, InnoDB relies on the innodb_lock_wait_timeout setting to roll back transactions in case of a deadlock.

Either way, your applications need error-handling logic to restart a transaction that is forcibly cancelled due to a deadlock. When you re-issue the same SQL statements as before, the original timing issue no longer applies. Either the other transaction has already finished and yours can proceed, or the other transaction is still in progress and your transaction waits until it finishes.

If deadlock warnings occur constantly, you might review the application code to reorder the SQL operations in a consistent way, or to shorten the transactions. You can test with the innodb_print_all_deadlocks option enabled to see all deadlock warnings in the MySQL error log, rather than only the last warning in the SHOW ENGINE INNODB STATUS output.
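
For example, to capture every deadlock while testing (innodb_print_all_deadlocks is dynamic, so no restart is needed):

SET GLOBAL innodb_print_all_deadlocks = ON;

-- Review the LATEST DETECTED DEADLOCK section of the engine status output.
SHOW ENGINE INNODB STATUS\G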

For more information, see Section 17.7.5, “Deadlocks in InnoDB”.

Storage Layout

To get the best performance from InnoDB tables, you can adjust a number of parameters related to storage layout.

When you convert MyISAM tables that are large, frequently accessed, and hold vital data, investigate and consider the innodb_file_per_table and innodb_page_size variables, and the ROW_FORMAT and KEY_BLOCK_SIZE clauses of the CREATE TABLE statement.

During your initial experiments, the most important setting is innodb_file_per_table. When this setting is enabled, which is the default, new InnoDB tables are implicitly created in file-per-table tablespaces. In contrast with the InnoDB system tablespace, file-per-table tablespaces allow disk space to be reclaimed by the operating system when a table is truncated or dropped. File-per-table tablespaces also support DYNAMIC and COMPRESSED row formats and associated features such as table compression, efficient off-page storage for long variable-length columns, and large index prefixes. For more information, see Section 17.6.3.2, “File-Per-Table Tablespaces”.
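
As a minimal sketch, you can confirm the setting and then create a compressed table in its own file-per-table tablespace; the table name and KEY_BLOCK_SIZE value here are illustrative assumptions:

SELECT @@innodb_file_per_table;

CREATE TABLE compressed_example (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  payload VARCHAR(2000)
) ENGINE=InnoDB ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8;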

You can also store InnoDB tables in a shared general tablespace, which supports multiple tables and all row formats. For more information, see Section 17.6.3.3, “General Tablespaces”.
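
A minimal sketch of a general tablespace; the tablespace name, data file name, and table definition are illustrative assumptions:

CREATE TABLESPACE ts_conversion ADD DATAFILE 'ts_conversion.ibd' ENGINE=InnoDB;

CREATE TABLE shared_example (
  id INT UNSIGNED NOT NULL PRIMARY KEY,
  name VARCHAR(100)
) ENGINE=InnoDB TABLESPACE ts_conversion;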

Converting an Existing Table

To convert a non-InnoDB table to use InnoDB, use ALTER TABLE:

ALTER TABLE table_name ENGINE=InnoDB;
Cloning the Structure of a Table

You might make an InnoDB table that is a clone of a MyISAM table, rather than using ALTER TABLE to perform conversion, to test the old and new table side-by-side before switching.

Create an empty InnoDB table with identical column and index definitions. Use SHOW CREATE TABLE table_name\G to see the full CREATE TABLE statement to use. Change the ENGINE clause to ENGINE=INNODB.
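
For example, assuming a hypothetical MyISAM table named myisam_table, the clone might be created like this (the column list below is a placeholder for whatever SHOW CREATE TABLE prints):

SHOW CREATE TABLE myisam_table\G

CREATE TABLE innodb_table (
  id INT UNSIGNED NOT NULL PRIMARY KEY,
  name VARCHAR(100),
  KEY idx_name (name)
) ENGINE=InnoDB;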

Transferring Data

To transfer a large volume of data into an empty InnoDB table created as shown in the previous section, insert the rows with INSERT INTO innodb_table SELECT * FROM myisam_table ORDER BY primary_key_columns.

You can also create the indexes for the InnoDB table after inserting the data. Historically, creating new secondary indexes was a slow operation for InnoDB, but now you can create the indexes after the data is loaded with relatively little overhead from the index creation step.
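
For example, assuming hypothetical columns last_name and created_at on innodb_table, the secondary indexes can be added in one statement after the load:

ALTER TABLE innodb_table
  ADD INDEX idx_last_name (last_name),
  ADD INDEX idx_created_at (created_at);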

If you have UNIQUE constraints on secondary keys, you can speed up a table import by turning off the uniqueness checks temporarily during the import operation:

SET unique_checks=0;
... import operation ...
SET unique_checks=1;

For big tables, this saves disk I/O because InnoDB can use its change buffer to write secondary index records as a batch. Be certain that the data contains no duplicate keys. unique_checks permits but does not require storage engines to ignore duplicate keys.

For better control over the insertion process, you can insert big tables in pieces:

INSERT INTO newtable SELECT * FROM oldtable
   WHERE yourkey > something AND yourkey <= somethingelse;

After all records are inserted, you can rename the tables.
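
A minimal sketch of the final swap; the table names are illustrative assumptions:

RENAME TABLE oldtable TO oldtable_myisam_backup, newtable TO oldtable;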

During the conversion of big tables, increase the size of the InnoDB buffer pool to reduce disk I/O. Typically, the recommended buffer pool size is 50 to 75 percent of system memory. You can also increase the size of the InnoDB log files.
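
The values below are illustrative assumptions only. innodb_buffer_pool_size is dynamic; innodb_redo_log_capacity is available and dynamic as of MySQL 8.0.30 (earlier 8.0 releases size the redo log with innodb_log_file_size, which requires a restart):

SET GLOBAL innodb_buffer_pool_size = 12 * 1024 * 1024 * 1024;
SET GLOBAL innodb_redo_log_capacity = 4 * 1024 * 1024 * 1024;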

Storage Requirements

If you intend to make several temporary copies of your data in InnoDB tables during the conversion process, it is recommended that you create the tables in file-per-table tablespaces so that you can reclaim the disk space when you drop the tables. When the innodb_file_per_table configuration option is enabled (the default), newly created InnoDB tables are implicitly created in file-per-table tablespaces.

Whether you convert the MyISAM table directly or create a cloned InnoDB table, make sure that you have sufficient disk space to hold both the old and new tables during the process. InnoDB tables require more disk space than MyISAM tables. If an ALTER TABLE operation runs out of space, it starts a rollback, and that can take hours if it is disk-bound. For inserts, InnoDB uses the change buffer to merge secondary index records into indexes in batches, which saves a lot of disk I/O. For rollback, no such mechanism is used, and the rollback can take 30 times longer than the insertion.

In the case of a runaway rollback, if you do not have valuable data in your database, it may be advisable to kill the database process rather than wait for millions of disk I/O operations to complete. For the complete procedure, see Section 17.21.3, “Forcing InnoDB Recovery”.

Defining Primary Keys

The PRIMARY KEY clause is a critical factor affecting the performance of MySQL queries and the space usage for tables and indexes. The primary key uniquely identifies a row in a table. Every row in the table should have a primary key value, and no two rows can have the same primary key value.

These are guidelines for the primary key, followed by more detailed explanations.

  • Declare a PRIMARY KEY for each table. Typically, it is the most important column that you refer to in WHERE clauses when looking up a single row.

  • Declare the PRIMARY KEY clause in the original CREATE TABLE statement, rather than adding it later through an ALTER TABLE statement.

  • Choose the column and its data type carefully. Prefer numeric columns over character or string ones.

  • Consider using an auto-increment column if there is not another stable, unique, non-null, numeric column to use.

  • An auto-increment column is also a good choice if there is any doubt whether the value of the primary key column could ever change. Changing the value of a primary key column is an expensive operation, possibly involving rearranging data within the table and within each secondary index.

Consider adding a primary key to any table that does not already have one. Use the smallest practical numeric type based on the maximum projected size of the table. This can make each row slightly more compact, which can yield substantial space savings for large tables. The space savings are multiplied if the table has any secondary indexes, because the primary key value is repeated in each secondary index entry. In addition to reducing data size on disk, a small primary key also lets more data fit into the buffer pool, speeding up all kinds of operations and improving concurrency.

If the table already has a primary key on some longer column, such as a VARCHAR, consider adding a new unsigned AUTO_INCREMENT column and switching the primary key to that, even if that column is not referenced in queries. This design change can produce substantial space savings in the secondary indexes. You can designate the former primary key columns as UNIQUE NOT NULL to enforce the same constraints as the PRIMARY KEY clause, that is, to prevent duplicate or null values across all those columns.
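
As a minimal sketch, assuming a hypothetical table customers whose current primary key is a VARCHAR column named customer_code, the switch might look like this:

ALTER TABLE customers
  DROP PRIMARY KEY,
  ADD COLUMN id INT UNSIGNED NOT NULL AUTO_INCREMENT FIRST,
  ADD PRIMARY KEY (id),
  ADD UNIQUE KEY uk_customer_code (customer_code);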

If you spread related information across multiple tables, typically each table uses the same column for its primary key. For example, a personnel database might have several tables, each with a primary key of employee number. A sales database might have some tables with a primary key of customer number, and other tables with a primary key of order number. Because lookups using the primary key are very fast, you can construct efficient join queries for such tables.

If you leave the PRIMARY KEY clause out entirely, MySQL creates an invisible one for you. It is a 6-byte value that might be longer than you need, thus wasting space. Because it is hidden, you cannot refer to it in queries.

Application Performance Considerations

The reliability and scalability features of InnoDB require more disk storage than equivalent MyISAM tables. You might change the column and index definitions slightly, for better space utilization, reduced I/O and memory consumption when processing result sets, and better query optimization plans making efficient use of index lookups.

If you set up a numeric ID column for the primary key, use that value to cross-reference with related values in any other tables, particularly for join queries. For example, rather than accepting a country name as input and doing queries searching for the same name, do one lookup to determine the country ID, then do other queries (or a single join query) to look up relevant information across several tables. Rather than storing a customer or catalog item number as a string of digits, potentially using up several bytes, convert it to a numeric ID for storing and querying. A 4-byte unsigned INT column can index over 4 billion items (with the US meaning of billion: 1000 million). For the ranges of the different integer types, see Section 13.1.2, “Integer Types (Exact Value) - INTEGER, INT, SMALLINT, TINYINT, MEDIUMINT, BIGINT”.
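
For example, with hypothetical tables country (country_id, name) and city (city_id, country_id, name), a join can use the compact numeric key rather than repeating the country name lookup:

SELECT ci.name
  FROM city AS ci
  JOIN country AS co ON co.country_id = ci.country_id
 WHERE co.name = 'Finland';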

Understanding Files Associated with InnoDB Tables

InnoDB files require more care and planning than MyISAM files do.