Movatterモバイル変換


[0]ホーム

URL:


Skip to content
DEV Community
Log in Create account

DEV Community

Aditya Agrawal
Aditya Agrawal

Posted on • Originally published atadiagr.com

     

Navigating PostgreSQL - Join Strategies

Nested Loops

Nested loops join is one of the simplest join strategies that PostgreSQL uses. It works by taking each row from one table (outer table) and finding matching rows in another table (inner table) by scanning or using an index.

How Nested Loops Join Works

When PostgreSQL performs a nested loops join:

  1. It first gets a row from the outer table (usually the table with fewer matching rows after filters)
  2. For each outer row, it scans the inner table (either sequentially or using an index) to find matching rows
  3. When matches are found, it combines the rows according to the join condition
  4. This process repeats for every row from the outer table

Nested loops are most efficient when:

  • The outer table has very few rows after filtering
  • The inner table has an index on the join columns
  • The join needs to return only a small number of rows

think of this like a double for loop.

for(orderinorders){for(productinproducts){if(order.product_id==product.id){//jointheorderandproduct}}}
Enter fullscreen modeExit fullscreen mode

Nested join

For example, in this query plan:

postgres=#explainselect*fromordersojoinproductsponp.id=o.product_idwhereo.id<107219;QUERYPLAN---------------------------------------------------------------------------------------NestedLoop(cost=0.70..14.81rows=11width=50)->IndexScanusingorders_pkeyonorderso(cost=0.42..2.62rows=11width=27)IndexCond:(id<107219)->IndexScanusingproducts_pkeyonproductsp(cost=0.27..1.11rows=1width=23)IndexCond:(id=o.product_id)(5rows)
Enter fullscreen modeExit fullscreen mode

Note: order is the outer table and product is the inner table.

Understanding the Query Plan

  • The index scan on orders returns11 rows.
  • For each of these rows, the query performs an index scan on products.
  • The index scan on products returns1 row for each of the11 orders.

Hash Join / Hash

Hash join is another join strategy that PostgreSQL uses. It works by creating a hash table from one of the tables and then using that hash table to find matching rows in another table.

How Hash Join Works

When PostgreSQL performs a hash join:

  1. It creates a hash table from one of the tables (usually the smaller one)
  2. It then uses that hash table to find matching rows in another table
  3. When matches are found, it combines the rows according to the join condition

Hash join is most efficient when:

  • The join needs to return only a small number of rows
  • The inner table has an index on the join columns

think of this like a hash map.

hash_map={}for(productinproducts){hash_map[product.id]=product}for(orderinorders){if(hash_map[order.product_id]){//jointheorderandproduct}}
Enter fullscreen modeExit fullscreen mode

Hash Join

postgres=#explainselect*fromordersojoinproductsponp.id=o.product_idwhereo.id<117219;QUERYPLAN-----------------------------------------------------------------------------------------HashJoin(cost=15.68..339.95rows=10536width=50)HashCond:(o.product_id=p.id)->IndexScanusingorders_pkeyonorderso(cost=0.42..296.81rows=10536width=27)IndexCond:(id<117219)->Hash(cost=9.00..9.00rows=500width=23)->SeqScanonproductsp(cost=0.00..9.00rows=500width=23)(6rows)
Enter fullscreen modeExit fullscreen mode

Note: orders is the outer table and products is the inner table.

Understanding the Query Plan

  • The index scan on orders returns10,536 rows.
  • The hash join creates a hash table from the products table (smaller table).
  • For each row in the orders table, the query uses the hash table to find matching rows in the products table.
  • The hash join returns10,536 rows.

Why Did PostgreSQL Choose a Hash Join Instead of a Nested Loop?

Key Differences from Previous Query(id < 107219):

  • The previous query had only11 rows from orders, so a Nested Loop Join was efficient.
  • This query has10,536 rows, so a Nested Loop would require10,536 index lookups in products, which would be slow.
  • Instead, PostgreSQL builds a hash table once (cheap) and does fast lookups(O(1)) instead of repeated index scans(O(log N)).

Merge Join

Merge join is another join strategy that PostgreSQL uses. It works by sorting the two tables and then merging them together.

How Merge Join Works

When PostgreSQL performs a merge join:

  1. It sorts the two tables by the join columns
  2. It then merges the two tables together
  3. When matches are found, it combines the rows according to the join condition

Merge join is most efficient when:

  • Both tables are already sorted (or can be sorted efficiently).
  • There's no suitable index for a Nested Loop.
  • The tables are large, making Hash Join expensive.

think of this like a merge sort.

merge_sort(left,right){if(left.length==0)returnright;if(right.length==0)returnleft;}
Enter fullscreen modeExit fullscreen mode

Merge Join

explainSELECT*FROMemployeeseJOINsalarysONe.id=s.employee_idWHEREe.age<40;QUERYPLAN--------------------------------------------------------------MergeJoin(cost=198.11..268.19rows=10width=488)MergeCond:(e.id=s.employee_id)->IndexScanusingemployees_idonemployeese(cost=0.29..656.28rows=101width=244)Filter:(age<40)->Sort(cost=197.83..200.33rows=1000width=244)SortKey:s.employee_id->SeqScanonsalarys(cost=0.00..148.00rows=1000width=244)
Enter fullscreen modeExit fullscreen mode

Understanding the Query Plan

  • The index scan on employees returns101 rows.
  • The sort operation sorts the salary table by the join column (employee_id).
  • The merge join then merges the two sorted tables together.
  • The merge join returns10 rows.

Why Did PostgreSQL Choose a Merge Join Instead of Nested Loop or Hash Join?

Key Differences from Previous Queries:

  • The employees table is filtered (age < 40), reducing rows to 101.
  • The salary table is not indexed on employee_id, so a Hash Join would require creating a large hash table.
  • The tables can be efficiently sorted and merged, making a Merge Join the best option.

Summary

Join TypeWhen It's UsedHow It Works
Nested Loop JoinBest for small outer tables (<1000 rows) or when inner table has indexes on join columns. Ideal when outer table is heavily filtered.Iterates over each row of the outer table and checks for matches in the inner table using either sequential scan or index scan. Works like a double for-loop.
Hash JoinOptimal for medium to large tables with equality joins. Used when no useful indexes exist and memory is sufficient.Builds an in-memory hash table from the smaller table, then probes it using rows from larger table. Hash table enables O(1) lookups.
Merge JoinBest when data is already sorted (e.g. clustered index) or when sort cost is acceptable. Good for large tables that exceed memory.First sorts both inputs on join key if needed, then walks through them in parallel, matching rows. Similar to merge phase in merge sort.
Join TypeTime ComplexitySpace Complexity
Nested Loop JoinO(n * m) without index, O(n * log m) with index where n=outer rows, m=inner rowsO(1) - Minimal memory usage
Hash JoinO(n + m) for building and probingO(n) where n = size of smaller table
Merge JoinO(n log n + m log m) if sorting needed, O(n + m) if pre-sortedO(1) if pre-sorted, O(n + m) if sorting needed

References

Next Steps

  • Apply all our learnings to a real query and optimize it.

Top comments(0)

Subscribe
pic
Create template

Templates let you quickly answer FAQs or store snippets for re-use.

Dismiss

Are you sure you want to hide this comment? It will become hidden in your post, but will still be visible via the comment'spermalink.

For further actions, you may consider blocking this person and/orreporting abuse

I'm a software engineer who builds reliable systems. I enjoy mentoring others, love learning new things, value simplicity, stay positive, keep up with new technologies, and have a good sense of humor.
  • Location
    Bengaluru, India
  • Joined

More fromAditya Agrawal

DEV Community

We're a place where coders share, stay up-to-date and grow their careers.

Log in Create account

[8]ページ先頭

©2009-2025 Movatter.jp