▲Home Download Cheat Sheet Documentation Quickstart Installation Tutorial Features Security Performance Advanced Reference Commands Functions • Aggregate• Window Data Types SQL Grammar System Tables Javadoc PDF (2 MB) Support FAQ Error Analyzer Google Group Appendix History License Build Links MVStore Architecture Migration to 2.0 | PerformancePerformance ComparisonPolePosition Benchmark Database Performance Tuning Using the Built-In Profiler Application Profiling Database Profiling Statement Execution Plans How Data is Stored and How Indexes Work Fast Database Import Performance ComparisonIn many cases H2 is faster than other(open source and not open source) database engines.Please note this is mostly a single connection benchmark run on one computer,with many very simple operations running against the database.This benchmark does not include very complex queries.The embedded mode of H2 is faster than the client-server mode becausethe per-statement overhead is greatly reduced. Embedded
Client-Server
Benchmark Results and CommentsH2Version 2.0.202 (2021-11-25) was used for the test.For most operations, the performance of H2 is about the same as for HSQLDB.One situation where H2 is slow is large result sets, because they are buffered todisk if more than a certain number of records are returned.The advantage of buffering is: there is no limit on the result set size. HSQLDBVersion 2.5.1 was used for the test.Cached tables are used in this test ( DerbyVersion 10.14.2.0 was used for the test. Derby is clearly the slowest embedded database in this test.This seems to be a structural problem, because all operations are really slow.It will be hard for the developers of Derby to improve the performance to a reasonable level.A few problems have been identified: leaving autocommit on is a problem for Derby.If it is switched off during the whole test, the results are about 20% better for Derby.Derby calls PostgreSQLVersion 13.4 was used for the test.The following options where changed in MySQLVersion 8.0.27 was used for the test.MySQL was run with the InnoDB backend. The setting SQLiteSQLite 3.36.0.3, configured to useWAL and with The benchmarks used include multi-threaded scenarios, and we were not able to get the SQLite JDBC driver we used to workwith them. Help with configuring the driver for multi-threaded usage is welcome. FirebirdFirebird 3.0 (default installation) was tested, but failed on multi-threaded part of the test.It is likely possible to run the performance test with the Firebird database,and any information on how to configure Firebird for this are welcome. Why Oracle / MS SQL Server / DB2 are Not ListedThe license of these databases does not allow to publish benchmark results.This doesn't mean that they are fast. They are in fact quite slow,and need a lot of memory. But you will need to test this yourself. About this BenchmarkHow to RunThis test was as follows: build benchmark Separate Process per DatabaseFor each database, a new process is started, to ensure the previous test does not impactthe current test. Number of ConnectionsThis is mostly a single-connection benchmark.BenchB uses multiple connections; the other tests use one connection. Real-World TestsGood benchmarks emulate real-world use cases. This benchmark includes 4 test cases:BenchSimple uses one table and many small updates / deletes.BenchA is similar to the TPC-A test, but single connection / single threaded (see also: www.tpc.org).BenchB is similar to the TPC-B test, using multiple connections (one thread per connection).BenchC is similar to the TPC-C test, but single connection / single threaded. Comparing Embedded with Server DatabasesThis is mainly a benchmark for embedded databases (where the application runs in the samevirtual machine as the database engine). However MySQL and PostgreSQL are not Javadatabases and cannot be embedded into a Java application.For the Java databases, both embedded and server modes are tested. Test PlatformThis test is run on Fedora v.34 with Oracle JVM 1.8 and SSD drive. Multiple RunsWhen a Java benchmark is run first, the code is not fully compiled andtherefore runs slower than when running multiple times. A benchmarkshould always run the same test multiple times and ignore the first run(s).This benchmark runs three times, but only the last run is measured. Memory UsageIt is not enough to measure the time taken, the memory usage is important as well.Performance can be improved by using a bigger cache, but the amount of memory is limited.HSQLDB tables are kept fully in memory by default; this benchmarkuses 'disk based' tables for all databases.Unfortunately, it is not so easy to calculate the memory usage of PostgreSQLand MySQL, because they run in a different process than the test. This benchmark currentlydoes not print memory usage of those databases. Delayed OperationsSome databases delay some operations (for example flushing the buffers)until after the benchmark is run. This benchmark waits betweeneach database tested, and each database runs in a different process (sequentially). Transaction Commit / DurabilityDurability means transaction committed to the database will not be lost.Some databases (for example MySQL) try to enforce this by default bycalling Using Prepared StatementsWherever possible, the test cases use prepared statements. Currently Not Tested: Startup TimeThe startup time of a database engine is important as well for embedded use.This time is not measured currently.Also, not tested is the time used to create a database and open an existing database.Here, one (wrapper) connection is opened at the start,and for each step a new connection is opened and then closed. PolePosition BenchmarkThe PolePosition is an open source benchmark. The algorithms are all quite simple.It was developed / sponsored by db4o.This test was not run for a longer time, so please be aware that the results beloware for older database versions (H2 version 1.1, HSQLDB 1.8, Java 1.4).
There are a few problems with the PolePosition test:
Database Performance TuningKeep Connections Open or Use a Connection PoolIf your application opens and closes connections a lot (for example, for each request),you should consider using aconnection pool.Opening a connection using If you open and close connections a lot but don't want to use a connection pool,consider keeping a 'sentinel' connection open for as long as the application runs,or use delayed database closing. See alsoClosing a database. Use a Modern JVMNewer JVMs are faster. Upgrading to the latest version of your JVM can provide a "free" boost to performance.Switching from the default Client JVM to the Server JVM using the Virus ScannersSome virus scanners scan files every time they are accessed.It is very important for performance that database files are not scanned for viruses.The database engine never interprets the data stored in the files as programs,that means even if somebody would store a virus in a database file, this wouldbe harmless (when the virus does not run, it cannot spread).Some virus scanners allow to exclude files by suffix. Ensure files ending with Using the Trace OptionsIf the performance hot spots are in the database engine, in many cases the performancecan be optimized by creating additional indexes, or changing the schema. Sometimes theapplication does not directly generate the SQL statements, for example if an O/R mapping toolis used. To view the SQL statements and JDBC API calls, you can use the trace options.For more information, seeUsing the Trace Options. Index UsageThis database uses indexes to improve the performance of Index HintsIf you have determined that H2 is not using the optimal index for your query, you can use index hints to forceH2 to use specific indexes. SELECT * FROM TEST USE INDEX (index_name_1, index_name_2) WHERE X=1 Only indexes in the list will be used when choosing an index to use on the given table. Thereis no significance to order in this list. It is possible that no index in the list is chosen, in which case a full table scan will be used. An empty list of index names forces a full table scan to be performed. Each index in the list must exist. How Data is Stored InternallyFor persistent databases, if a table is created with a single column primary key of type H2 internally stores table data and indexes in the form of b-trees.Each b-tree stores entries as a list of unique keys (one or more columns) and data (zero or more columns).The table data is always organized in the form of a "data b-tree" with a single column key of type For each additional index, one new "index b-tree" is created. The key of this b-tree consists of the indexed columns,plus the key of the data b-tree. If a primary key is created after the table has been created, or if the primary keycontains multiple column, or if the primary key is not of the data types listed above, then the primary keyis stored in a new index b-tree. OptimizerThis database uses a cost based optimizer. For simple and queries and queries with medium complexity(less than 7 tables in the join), the expected cost (running time) of all possible plans is calculated,and the plan with the lowest cost is used. For more complex queries, the algorithm first triesall possible combinations for the first few tables, and the remaining tables added using a greedy algorithm(this works well for most joins). Afterwards a genetic algorithm is used to test at most 2000 distinct plans.Only left-deep plans are evaluated. Expression OptimizationAfter the statement is parsed, all expressions are simplified automatically if possible. Operationsare evaluated only once if all parameters are constant. Functions are also optimized, but onlyif the function is constant (always returns the same result for the same parameter values).If the COUNT(*) OptimizationIf the query only counts all rows of a table, then the data is not accessed.However, this is only possible if no Updating Optimizer Statistics / Column SelectivityWhen executing a query, at most one index per join can be used.If the same table is joined multiple times, for each join only one index is used(the same index could be used for both joins, or each join could use a different index).Example: for the query If a table has multiple indexes, sometimes more than one index could be used.Example: if there is a table The SQL statement In-Memory (Hash) IndexesUsing in-memory indexes, specially in-memory hash indexes, can speed upqueries and data manipulation. In-memory indexes are automatically usedfor in-memory databases, but can also be created for persistent databasesusing In-memory hash indexes are backed by a hash table and are usually faster thanregular indexes. However, hash indexes only supports direct lookup ( Use Prepared StatementsIf possible, use prepared statements with parameters. Prepared Statements and IN(...)Avoid generating SQL statements with a variable size IN(...) list.Instead, use a prepared statement with arrays as in the following example: PreparedStatement prep = conn.prepareStatement( "SELECT * FROM TEST WHERE ID = ANY(?)");prep.setObject(1, new Long[] { 1L, 2L });ResultSet rs = prep.executeQuery();Optimization ExamplesSee Cache Size and TypeBy default the cache size of H2 is quite small. Consider using a larger cache size, or enablethe second level soft reference cache. See alsoCache Settings. Data TypesEach data type has different storage and performance characteristics:
Sorted Insert OptimizationTo reduce disk space usage and speed up table creation, anoptimization for sorted inserts is available. When used, b-tree pagesare split at the insertion point. To use this optimization, add CREATE TABLE TEST(ID INT PRIMARY KEY, NAME VARCHAR) AS SORTED SELECT X, SPACE(100) FROM SYSTEM_RANGE(1, 100);INSERT INTO TEST SORTED SELECT X, SPACE(100) FROM SYSTEM_RANGE(101, 200); Using the Built-In ProfilerA very simple Java profiler is built-in. To use it, use the following template: import org.h2.util.Profiler;Profiler prof = new Profiler();prof.startCollecting();// .... some long running process, at least a few secondsprof.stopCollecting();System.out.println(prof.getTop(3)); Application ProfilingAnalyze FirstBefore trying to optimize performance, it is important to understand where the problem is (what part of the application is slow).Blind optimization or optimization based on guesses should be avoided, because usually it is not an efficient strategy.There are various ways to analyze an application. Sometimes two implementations can be compared using A simple way to profile an application is to use the built-in profiling tool of java. Example: java -Xrunhprof:cpu=samples,depth=16 com.acme.Test Unfortunately, it is only possible to profile the application from start to end. Another solution is to createa number of full thread dumps. To do that, first run A simple profiling tool is included in H2. To use it, the application needs to be changed slightly. Example: import org.h2.util;...Profiler profiler = new Profiler();profiler.startCollecting();// application codeSystem.out.println(profiler.getTop(3)); The profiler is built into the H2 Console tool, to analyze databases that open slowly.To use it, run the H2 Console, and then click on 'Test Connection'.Afterwards, click on "Test successful" and you get the most common stack traces,which helps to find out why it took so long to connect. You will only get the stack tracesif opening the database took more than a few seconds. Database ProfilingThe SET TRACE_LEVEL_FILE 2;DROP TABLE IF EXISTS TEST;CREATE TABLE TEST(ID INT PRIMARY KEY, NAME VARCHAR(255));@LOOP 1000 INSERT INTO TEST VALUES(?, ?);SET TRACE_LEVEL_FILE 0; After running the test case, convert the java -cp h2*.jar org.h2.tools.ConvertTraceFile -traceFile "~/test.trace.db" -script "~/test.sql" The generated file ------------------------------------------- SQL Statement Statistics-- time: total time in milliseconds (accumulated)-- count: how many times the statement ran-- result: total update count or row count------------------------------------------- self accu time count result sql-- 62% 62% 158 1000 1000 INSERT INTO TEST VALUES(?, ?);-- 37% 100% 93 1 0 CREATE TABLE TEST(ID INT PRIMARY KEY...-- 0% 100% 0 1 0 DROP TABLE IF EXISTS TEST;-- 0% 100% 0 1 0 SET TRACE_LEVEL_FILE 3; Statement Execution PlansThe SQL statement EXPLAIN SELECT * FROM TEST WHERE ID=1;SELECT TEST.ID, TEST.NAMEFROM PUBLIC.TEST /* PUBLIC.PRIMARY_KEY_2: ID = 1 */WHERE ID = 1 For joins, the tables in the execution plan are sorted in the order they are processed.The following query shows the database first processes the table CREATE TABLE CUSTOMER(ID IDENTITY, NAME VARCHAR);CREATE TABLE INVOICE(ID IDENTITY, CUSTOMER_ID INT REFERENCES CUSTOMER(ID), AMOUNT NUMBER);EXPLAIN SELECT I.ID, C.NAME FROM CUSTOMER C, INVOICE IWHERE I.ID=10 AND AMOUNT>0 AND C.ID=I.CUSTOMER_ID;SELECT I.ID, C.NAMEFROM PUBLIC.INVOICE I /* PUBLIC.PRIMARY_KEY_9: ID = 10 */ /* WHERE (I.ID = 10) AND (AMOUNT > 0) */INNER JOIN PUBLIC.CUSTOMER C /* PUBLIC.PRIMARY_KEY_5: ID = I.CUSTOMER_ID */ ON 1=1WHERE (C.ID = I.CUSTOMER_ID) AND ((I.ID = 10) AND (AMOUNT > 0)) Displaying the Scan Count
EXPLAIN ANALYZE SELECT * FROM TEST;SELECT TEST.ID, TEST.NAMEFROM PUBLIC.TEST /* PUBLIC.TEST.tableScan */ /* scanCount: 1000 *//*total: 85TEST.TEST_DATA read: 85 (100%)*/ The cache will prevent the pages are read twice. H2 reads all columns of the rowunless only the columns in the index are read. Except for large CLOB and BLOB, which are not store in the table. Special OptimizationsFor certain queries, the database doesn't need to read all rows, or doesn't need to sort the result even if For queries of the form For queries of the form For queries of the form For queries of the form How Data is Stored and How Indexes WorkInternally, each row in a table is identified by a unique number, the row id.The rows of a table are stored with the row id as the key.The row id is a number of type long.If a table has a single column primary key of type CREATE TABLE ADDRESS(FIRST_NAME VARCHAR, NAME VARCHAR, CITY VARCHAR, PHONE VARCHAR);INSERT INTO ADDRESS VALUES('John', 'Miller', 'Berne', '123 456 789');INSERT INTO ADDRESS VALUES('Philip', 'Jones', 'Berne', '123 012 345');SELECT _ROWID_, * FROM ADDRESS;The data is stored in the database as follows:
Access by row id is fast because the data is sorted by this key.Please note the row id is not available until after the row was added(that means, it can not be used in generated columns or constraints).If the query condition does not contain the row id (and if no other index can be used), then all rows of the table are scanned.A table scan iterates over all rows in the table, in the order of the row id.To find out what strategy the database uses to retrieve the data, use SELECT * FROM ADDRESS WHERE NAME = 'Miller';EXPLAIN SELECT PHONE FROM ADDRESS WHERE NAME = 'Miller';SELECT PHONEFROM PUBLIC.ADDRESS /* PUBLIC.ADDRESS.tableScan */WHERE NAME = 'Miller'; IndexesAn index internally is basically just a table that contains the indexed column(s), plus the row id: CREATE INDEX INDEX_PLACE ON ADDRESS(CITY, NAME, FIRST_NAME); In the index, the data is sorted by the indexed columns.So this index contains the following data:
When the database uses an index to query the data, it searches the index for the given data,and (if required) reads the remaining columns in the main data table (retrieved using the row id).An index on city, name, and first name (multi-column index) allows to quickly search for rows when the city, name, and first name are known.If only the city and name, or only the city is known, then this index is also used (so creating an additional index on just the city is not needed).This index is also used when reading all rows, sorted by the indexed columns.However, if only the first name is known, then this index is not used: EXPLAIN SELECT PHONE FROM ADDRESS WHERE CITY = 'Berne' AND NAME = 'Miller' AND FIRST_NAME = 'John';SELECT PHONEFROM PUBLIC.ADDRESS /* PUBLIC.INDEX_PLACE: FIRST_NAME = 'John' AND CITY = 'Berne' AND NAME = 'Miller' */WHERE (FIRST_NAME = 'John') AND ((CITY = 'Berne') AND (NAME = 'Miller'));EXPLAIN SELECT PHONE FROM ADDRESS WHERE CITY = 'Berne';SELECT PHONEFROM PUBLIC.ADDRESS /* PUBLIC.INDEX_PLACE: CITY = 'Berne' */WHERE CITY = 'Berne';EXPLAIN SELECT * FROM ADDRESS ORDER BY CITY, NAME, FIRST_NAME;SELECT ADDRESS.FIRST_NAME, ADDRESS.NAME, ADDRESS.CITY, ADDRESS.PHONEFROM PUBLIC.ADDRESS /* PUBLIC.INDEX_PLACE */ORDER BY 3, 2, 1/* index sorted */;EXPLAIN SELECT PHONE FROM ADDRESS WHERE FIRST_NAME = 'John';SELECT PHONEFROM PUBLIC.ADDRESS /* PUBLIC.ADDRESS.tableScan */WHERE FIRST_NAME = 'John'; If your application often queries the table for a phone number, then it makes sense to createan additional index on it: CREATE INDEX IDX_PHONE ON ADDRESS(PHONE); This index contains the phone number, and the row id:
Using Multiple IndexesWithin a query, only one index per logical table is used.Using the condition EXPLAIN SELECT NAME FROM ADDRESS WHERE PHONE = '123 567 789'UNION SELECT NAME FROM ADDRESS WHERE CITY = 'Berne';(SELECT NAMEFROM PUBLIC.ADDRESS /* PUBLIC.IDX_PHONE: PHONE = '123 567 789' */WHERE PHONE = '123 567 789')UNION(SELECT NAMEFROM PUBLIC.ADDRESS /* PUBLIC.INDEX_PLACE: CITY = 'Berne' */WHERE CITY = 'Berne') Fast Database ImportIf you have to import a lot of rows, use a PreparedStatement or use CSV import.Please note that |