Documentation Home
MySQL 8.4 Reference Manual
Related Documentation Download this Manual
PDF (US Ltr) - 40.2Mb
PDF (A4) - 40.3Mb
Man Pages (TGZ) - 262.0Kb
Man Pages (Zip) - 367.6Kb
Info (Gzip) - 4.0Mb
Info (Zip) - 4.0Mb


10.2.1.16 ORDER BY Optimization

This section describes when MySQL can use an index to satisfy anORDER BY clause, thefilesort operation used when an index cannot be used, and execution plan information available from the optimizer aboutORDER BY.

AnORDER BY with and withoutLIMIT may return rows in different orders, as discussed inSection 10.2.1.19, “LIMIT Query Optimization”.

Use of Indexes to Satisfy ORDER BY

In some cases, MySQL may use an index to satisfy anORDER BY clause and avoid the extra sorting involved in performing afilesort operation.

The index may also be used even if theORDER BY does not match the index exactly, as long as all unused portions of the index and all extraORDER BY columns are constants in theWHERE clause. If the index does not contain all columns accessed by the query, the index is used only if index access is cheaper than other access methods.

Assuming that there is an index on(key_part1,key_part2), the following queries may use the index to resolve theORDER BY part. Whether the optimizer actually does so depends on whether reading the index is more efficient than a table scan if columns not in the index must also be read.

  • In this query, the index on(key_part1,key_part2) enables the optimizer to avoid sorting:

    SELECT * FROM t1  ORDER BYkey_part1,key_part2;

    However, the query usesSELECT *, which may select more columns thankey_part1 andkey_part2. In that case, scanning an entire index and looking up table rows to find columns not in the index may be more expensive than scanning the table and sorting the results. If so, the optimizer probably does not use the index. IfSELECT * selects only the index columns, the index is used and sorting avoided.

    Ift1 is anInnoDB table, the table primary key is implicitly part of the index, and the index can be used to resolve theORDER BY for this query:

    SELECTpk,key_part1,key_part2 FROM t1  ORDER BYkey_part1,key_part2;
  • In this query,key_part1 is constant, so all rows accessed through the index are inkey_part2 order, and an index on(key_part1,key_part2) avoids sorting if theWHERE clause is selective enough to make an index range scan cheaper than a table scan:

    SELECT * FROM t1  WHEREkey_part1 =constant  ORDER BYkey_part2;
  • In the next two queries, whether the index is used is similar to the same queries withoutDESC shown previously:

    SELECT * FROM t1  ORDER BYkey_part1 DESC,key_part2 DESC;SELECT * FROM t1  WHEREkey_part1 =constant  ORDER BYkey_part2 DESC;
  • Two columns in anORDER BY can sort in the same direction (bothASC, or bothDESC) or in opposite directions (oneASC, oneDESC). A condition for index use is that the index must have the same homogeneity, but need not have the same actual direction.

    If a query mixesASC andDESC, the optimizer can use an index on the columns if the index also uses corresponding mixed ascending and descending columns:

    SELECT * FROM t1  ORDER BYkey_part1 DESC,key_part2 ASC;

    The optimizer can use an index on (key_part1,key_part2) ifkey_part1 is descending andkey_part2 is ascending. It can also use an index on those columns (with a backward scan) ifkey_part1 is ascending andkey_part2 is descending. SeeSection 10.3.13, “Descending Indexes”.

  • In the next two queries,key_part1 is compared to a constant. The index is used if theWHERE clause is selective enough to make an index range scan cheaper than a table scan:

    SELECT * FROM t1  WHEREkey_part1 >constant  ORDER BYkey_part1 ASC;SELECT * FROM t1  WHEREkey_part1 <constant  ORDER BYkey_part1 DESC;
  • In the next query, theORDER BY does not namekey_part1, but all rows selected have a constantkey_part1 value, so the index can still be used:

    SELECT * FROM t1  WHEREkey_part1 =constant1 ANDkey_part2 >constant2  ORDER BYkey_part2;

In some cases, MySQLcannot use indexes to resolve theORDER BY, although it may still use indexes to find the rows that match theWHERE clause. Examples:

  • The query usesORDER BY on different indexes:

    SELECT * FROM t1 ORDER BYkey1,key2;
  • The query usesORDER BY on nonconsecutive parts of an index:

    SELECT * FROM t1 WHEREkey2=constant ORDER BYkey1_part1,key1_part3;
  • The index used to fetch the rows differs from the one used in theORDER BY:

    SELECT * FROM t1 WHEREkey2=constant ORDER BYkey1;
  • The query usesORDER BY with an expression that includes terms other than the index column name:

    SELECT * FROM t1 ORDER BY ABS(key);SELECT * FROM t1 ORDER BY -key;
  • The query joins many tables, and the columns in theORDER BY are not all from the first nonconstant table that is used to retrieve rows. (This is the first table in theEXPLAIN output that does not have aconst join type.)

  • The query has differentORDER BY andGROUP BY expressions.

  • There is an index on only a prefix of a column named in theORDER BY clause. In this case, the index cannot be used to fully resolve the sort order. For example, if only the first 10 bytes of aCHAR(20) column are indexed, the index cannot distinguish values past the 10th byte and afilesort is needed.

  • The index does not store rows in order. For example, this is true for aHASH index in aMEMORY table.

Availability of an index for sorting may be affected by the use of column aliases. Suppose that the columnt1.a is indexed. In this statement, the name of the column in the select list isa. It refers tot1.a, as does the reference toa in theORDER BY, so the index ont1.a can be used:

SELECT a FROM t1 ORDER BY a;

In this statement, the name of the column in the select list is alsoa, but it is the alias name. It refers toABS(a), as does the reference toa in theORDER BY, so the index ont1.a cannot be used:

SELECT ABS(a) AS a FROM t1 ORDER BY a;

In the following statement, theORDER BY refers to a name that is not the name of a column in the select list. But there is a column int1 nameda, so theORDER BY refers tot1.a and the index ont1.a can be used. (The resulting sort order may be completely different from the order forABS(a), of course.)

SELECT ABS(a) AS b FROM t1 ORDER BY a;

Previously (MySQL 8.3 and lower),GROUP BY sorted implicitly under certain conditions. In MySQL 8.4, that no longer occurs, so specifyingORDER BY NULL at the end to suppress implicit sorting (as was done previously) is no longer necessary. However, query results may differ from previous MySQL versions. To produce a given sort order, provide anORDER BY clause.

Use of filesort to Satisfy ORDER BY

If an index cannot be used to satisfy anORDER BY clause, MySQL performs afilesort operation that reads table rows and sorts them. Afilesort constitutes an extra sorting phase in query execution.

To obtain memory forfilesort operations, the optimizer allocates memory buffers incrementally as needed, up to the size indicated by thesort_buffer_size system variable. This enables users to setsort_buffer_size to larger values to speed up larger sorts, without concern for excessive memory use for small sorts. (This benefit may not occur for multiple concurrent sorts on Windows, which has a weak multithreadedmalloc.)

Afilesort operation uses temporary disk files as necessary if the result set is too large to fit in memory. Some types of queries are particularly suited to completely in-memoryfilesort operations. For example, the optimizer can usefilesort to efficiently handle in memory, without temporary files, theORDER BY operation for queries (and subqueries) of the following form:

SELECT ... FROMsingle_table ... ORDER BYnon_index_column [DESC] LIMIT [M,]N;

Such queries are common in web applications that display only a few rows from a larger result set. Examples:

SELECT col1, ... FROM t1 ... ORDER BY name LIMIT 10;SELECT col1, ... FROM t1 ... ORDER BY RAND() LIMIT 15;
Influencing ORDER BY Optimization

To increaseORDER BY speed, check whether you can get MySQL to use indexes rather than an extra sorting phase. If this is not possible, try the following strategies:

  • Increase thesort_buffer_size variable value. Ideally, the value should be large enough for the entire result set to fit in the sort buffer (to avoid writes to disk and merge passes).

    Take into account that the size of column values stored in the sort buffer is affected by themax_sort_length system variable value. For example, if tuples store values of long string columns and you increase the value ofmax_sort_length, the size of sort buffer tuples increases as well and may require you to increasesort_buffer_size.

    To monitor the number of merge passes (to merge temporary files), check theSort_merge_passes status variable.

  • Increase theread_rnd_buffer_size variable value so that more rows are read at a time.

  • Change thetmpdir system variable to point to a dedicated file system with large amounts of free space. The variable value can list several paths that are used in round-robin fashion; you can use this feature to spread the load across several directories. Separate the paths by colon characters (:) on Unix and semicolon characters (;) on Windows. The paths should name directories in file systems located on differentphysical disks, not different partitions on the same disk.

ORDER BY Execution Plan Information Available

WithEXPLAIN (seeSection 10.8.1, “Optimizing Queries with EXPLAIN”), you can check whether MySQL can use indexes to resolve anORDER BY clause:

  • If theExtra column ofEXPLAIN output does not containUsing filesort, the index is used and afilesort is not performed.

  • If theExtra column ofEXPLAIN output containsUsing filesort, the index is not used and afilesort is performed.

In addition, if afilesort is performed, optimizer trace output includes afilesort_summary block. For example:

"filesort_summary": {  "rows": 100,  "examined_rows": 100,  "number_of_tmp_files": 0,  "peak_memory_used": 25192,  "sort_mode": "<sort_key, packed_additional_fields>"}

peak_memory_used indicates the maximum memory used at any one time during the sort. This is a value up to but not necessarily as large as the value of thesort_buffer_size system variable. The optimizer allocates sort-buffer memory incrementally, beginning with a small amount and adding more as necessary, up tosort_buffer_size bytes.)

Thesort_mode value provides information about the contents of tuples in the sort buffer:

  • <sort_key, rowid>: This indicates that sort buffer tuples are pairs that contain the sort key value and row ID of the original table row. Tuples are sorted by sort key value and the row ID is used to read the row from the table.

  • <sort_key, additional_fields>: This indicates that sort buffer tuples contain the sort key value and columns referenced by the query. Tuples are sorted by sort key value and column values are read directly from the tuple.

  • <sort_key, packed_additional_fields>: Like the previous variant, but the additional columns are packed tightly together instead of using a fixed-length encoding.

EXPLAIN does not distinguish whether the optimizer does or does not perform afilesort in memory. Use of an in-memoryfilesort can be seen in optimizer trace output. Look forfilesort_priority_queue_optimization. For information about the optimizer trace, seeSection 10.15, “Tracing the Optimizer”.