Movatterモバイル変換


[0]ホーム

URL:


Skip to content
DEV Community
Log in Create account

DEV Community

Tomas
Tomas

Posted on

Slash Slow Queries by Rewriting OR to UNION

Problem Definition

If two query conditions are connected byOR, the database optimizer may not be able to use indexes to execute the query. For example, for the following SQL statement:

select*fromcustomerwherec_phonelike'139%'orc_name='Ray'
Enter fullscreen modeExit fullscreen mode

Even if both fields have indexes, they can be rewritten as aUNION query to leverage the indexes and improve query performance:

select*fromcustomerwherec_phonelike'139%'unionselect*fromcustomerwherec_name='Ray'
Enter fullscreen modeExit fullscreen mode

However, this transformation does not always improve performance. It requires certain conditions and cost-based estimation.

If the database supportsINDEX MERGING, you can also enable theINDEX MERGING optimization strategy by adjusting database parameters to improve database performance.

Applicable Conditions

  1. The OR-connected conditions must be indexable.
  2. The estimated cost of the rewrittenUNION statement must be lower than the original SQL.
  3. If theOR branch conditions are mutually exclusive, rewrite them asUNION ALL instead ofUNION.

Case Analysis

Case 1. Conditions cannot utilize indexes, no rewrite optimization

select*fromcustomerwherec_phone='1'orc_phonelike'%139%'
Enter fullscreen modeExit fullscreen mode

Analysis: One conditionc_phone like '%139%' cannot use indexes. Even after rewriting, a full table scan is still required. PawSQL does not rewrite.

Case 2. Filter condition selectivity is low enough, no rewrite optimization

select*fromcustomerwherecustkey=1and(c_phone='1'orc_phonelike'%139%')
Enter fullscreen modeExit fullscreen mode

Analysis:custkey is the primary key.custkey = 1 uniquely locates one record, so no need to rewrite.

Case 3. Conditions met, use rewrite optimization

selectdistinct*fromcustomerwherec_phonelike'139%'orc_name='Ray'
Enter fullscreen modeExit fullscreen mode

Analysis: Both conditions can use indexes and the selectivity is less than 10%. Can be rewritten.distinct can be removed after rewriting byunion.

select*fromcustomerwherec_phone='1'unionselect*fromcustomerwherec_phonelike'139%'
Enter fullscreen modeExit fullscreen mode

Case 4.OR condition branches are mutually exclusive, rewrite toUNION ALL

select*fromcustomerwherecustkey=1or(custkey=2andc_phonelike'139%')
Enter fullscreen modeExit fullscreen mode

Analysis: The two condition branchescustkey = 2 and c_phone like '139%' andcustkey = 1 are mutually exclusive. Rewrite toUNION ALL:

select*fromcustomerwherec_custkey=2andc_phonelike'139%'unionallselect*fromcustomerwherec_custkey=1
Enter fullscreen modeExit fullscreen mode

Case 5. Containsorder by andlimit, use rewrite optimization

select*fromordersowhereO_ORDERDATE>='2021-01-01'and(O_ORDERPRIORITY=1orO_SHIPPRIORITY=1)orderbyO_ORDERDATEdesclimit10
Enter fullscreen modeExit fullscreen mode

Analysis: Although the selectivity ofO_ORDERPRIORITY = 1 andO_SHIPPRIORITY = 1 is high, the total cost is lower by leveraging indexes to avoid sorting. PawSQL rewrites the optimization.

select*from((select/*QB_2*/*fromordersasowhereo.O_ORDERDATE>='2021-01-01'ando.O_SHIPPRIORITY=1orderbyo.O_ORDERDATEdesclimit10)union(select/*QB_1*/*fromordersasowhereo.O_ORDERDATE>='2021-01-01'ando.O_ORDERPRIORITY='1'orderbyo.O_ORDERDATEdesclimit10))asPawDT_1702555889039orderbyPawDT_1702555889039.O_ORDERDATEdesclimit10
Enter fullscreen modeExit fullscreen mode

Case 6. Contains grouping and aggregates, use rewrite optimization

selectO_ORDERDATE,count(1)fromordersowhereO_ORDERDATE>='2021-01-01'and(O_ORDERPRIORITY=1orO_SHIPPRIORITY=1)groupbyO_ORDERDATE
Enter fullscreen modeExit fullscreen mode

Analysis: Although the selectivity ofO_ORDERPRIORITY = 1 andO_SHIPPRIORITY = 1 is high, the total cost is lower by leveraging indexes to avoid sorting. PawSQL rewrites the optimization.

selectPawDT_1702884016144.O_ORDERDATE,count(1)from(select/*QB_2*/o.O_ORDERDATE,o.O_ORDERKEYfromtpch_pkfk.ordersasowhereo.O_ORDERDATE>='2021-01-01'ando.O_SHIPPRIORITY=1unionselect/*QB_1*/o.O_ORDERDATE,o.O_ORDERKEYfromtpch_pkfk.ordersasowhereo.O_ORDERDATE>='2021-01-01'ando.O_ORDERPRIORITY='1')asPawDT_1702884016144groupbyPawDT_1702884016144.O_ORDERDATE
Enter fullscreen modeExit fullscreen mode

Performance Validation

Case 5. 900x performance improvement

  • Before optimization (execution time 432.322ms)

Image description

  • After optimization (execution time 0.189ms)

Image description

Case 6. 15x performance improvement

  • Before optimization (2.816ms)

Image description

  • After optimization (0.204 ms)

Image description

About PawSQL

PawSQL focuses on automatic and intelligent SQL optimization for databases. It supports MySQL, PostgreSQL, openGauss, Oracle and more. PawSQL products include:

  • PawSQL Cloud, an online automated SQL optimization tool that provides SQL review, intelligent query rewriting, cost-based index recommendations for DBAs and developers.
  • PawSQL Advisor, an IntelliJ plugin for developers, can be installed from IDEA/DataGrip marketplace by searching "PawSQL Advisor".
  • PawSQL Engine, the backend optimization engine of PawSQL products, can be deployed independently and provides SQL optimization services via HTTP/JSON interfaces. PawSQL Engine is provided as a Docker image.

source

Top comments(1)

Subscribe
pic
Create template

Templates let you quickly answer FAQs or store snippets for re-use.

Dismiss
CollapseExpand
 
linuxguist profile image
Nathan S.R.
I develop open source softwares that can be used by all users freely.
  • Education
    Electronics & Communication Engineer
  • Work
    Former Database Team Manager - 24x7 , DBA
  • Joined

Thanks for this very useful post. I just wanted to add that, there is a very easy way now, to test all the SQLs described here, using the free & portable tools, mentioned in my latest post here :dev.to/linuxguist/learn-sql-quickl...

Are you sure you want to hide this comment? It will become hidden in your post, but will still be visible via the comment'spermalink.

For further actions, you may consider blocking this person and/orreporting abuse

I am the founder of PawSQL, a startup revolutionizing SQL optimization. I share database tuning insights and PawSQL's tech, empowering data pros with effective strategies.
  • Work
    Chief Architect@PawSQL
  • Joined

More fromTomas

DEV Community

We're a place where coders share, stay up-to-date and grow their careers.

Log in Create account

[8]ページ先頭

©2009-2025 Movatter.jp