Movatterモバイル変換


[0]ホーム

URL:


Add Neon Auth to your app without leaving Cursor or Claude
/PostgreSQL Indexes/JSON Index

PostgreSQL JSON Index

Summary: in this tutorial, you will learn how to create a PostgreSQL JSON index for a JSONB column to improve query performance.

Introduction to PostgreSQL JSON index

JSONB (binary JSON) is a data type that allows you to store JSON data and query it efficiently.

When a JSONB column has a complex JSON structure, utilizing an index can significantly improve query performance.

PostgreSQL uses theGIN index type for indexing a column with JSONB data type.GIN stands for Generalized Inverted Index.

Note that you can utilize the GIN index for tsvector or array columns.

To create aGIN index for a JSONB column, you can use the followingCREATE INDEX statement:

CREATE INDEX index_nameON table_nameUSING GIN(jsonb_column);

This statement creates aGIN index on thejsonb_column. ThisGIN index is suitable for general-purpose queries on JSONB data.

When creating aGIN index on a JSONB column, you can use a specificGIN operator class.

The operator class determines how PostgreSQL builds the index and how it optimizes the queries on the indexed column.

For example, The followingCREATE INDEX statement creates aGIN index on thejsonb_column withjsonb_path_ops operator class:

CREATE INDEX index_nameON table_nameUSING GIN(jsonb_column jsonb_path_ops);

This index is optimized for the queries that use the @> (contains), ? (exists), and @@ JSONB operators. It can be useful for searches involving keys or values within JSONB documents.

The following table displays theGIN operator classes:

NameIndexable Operators
array_ops&& (anyarray,anyarray)
@> (anyarray,anyarray)
<@ (anyarray,anyarray)
= (anyarray,anyarray)
jsonb_ops@> (jsonb,jsonb)
@? (jsonb,jsonpath)
@@ (jsonb,jsonpath)
? (jsonb,text)
?| (jsonb,text[])
?& (jsonb,text[])
jsonb_path_ops@> (jsonb,jsonb)
@? (jsonb,jsonpath)
@@ (jsonb,jsonpath)
tsvector_ops@@ (tsvector,tsquery)

Note that if you don’t explicitly specify aGIN operator class, the statement will use thejsonb_ops operator by default, which is suitable for most cases.

Additionally, PostgreSQL allows you to create aGIN index for a specific field in JSON documents as follows:

CREATE INDEX index_nameON table_nameUSING GIN ((data->'field_name') jsonb_path_ops);

This index can improve the queries that involve searching values within thefield_name of JSON documents stored in the JSONB column (data).

PostgreSQL JSON index examples

We’ll use the tables in thesample database.

1) Setting up a sample table

First,create a new table calledcustomer_json that stores the customer information in JSON format:

CREATE TABLE customer_json(   idSERIAL PRIMARY KEY,   data JSONBNOT NULL);

Second, insert data from thecustomer,address,city, andcountry tables into thecustomer_json table:

WITH json_cteAS(  SELECT    jsonb_build_object(      'first_name',  first_name,      'last_name',  last_name,      'email',  email,      'phone',  a.phone,      'address',      jsonb_build_object(        'address', a.address,        'city', i.city,        'postal_code', a.postal_code,        'district',  a.district,        'country', o.country      )    ):: jsonbAS data  FROM    customer c    INNER JOIN address aON a.address_id= c.address_id    INNER JOIN city iON i.city_id= a.city_id    INNER JOIN country oON o.country_id= i.country_id)INSERT INTO customer_json(data)SELECT  dataFROM  json_cte;

Third, retrieve the email of the customer whose first name isJohn:

SELECT   data ->> 'first_name' first_name,   data ->> 'last_name' last_name,   data ->> 'phone' phoneFROM   customer_jsonWHERE   data @> '{"first_name": "John"}';

Output:

first_name | last_name  |    phone------------+------------+------------- John       | Farnsworth | 51917807050(1 row)

Finally, explain and analyze the above query:

EXPLAIN ANALYZESELECT   data ->> 'first_name' first_name,   data ->> 'last_name' last_name,   data ->> 'phone' phoneFROM   customer_jsonWHERE   data @> '{"first_name": "John"}';

Output:

QUERY PLAN--------------------------------------------------------------------------------------------------------- Seq Scan on customer_json  (cost=0.00..31.50 rows=1 width=96) (actual time=0.063..0.118 rows=1 loops=1)   Filter: (data @> '{"first_name": "John"}'::jsonb)   Rows Removed by Filter: 598 Planning Time: 1.109 ms Execution Time: 0.128 ms(5 rows)

The output indicates that PostgreSQL has to scan the entirecustomer_json table to search for the customer.

To improve the performance of the query, you can create aGIN index on the data column of thecustomer_json table.

2) Creating an index on the JSONB column

First,create an index on thedata column of thecustomer_json table:

CREATE INDEX customer_json_indexON customer_jsonUSING GIN(data);

Second, execute the query that searches for the customer whose first name isJohn:

EXPLAIN ANALYZESELECT   data ->> 'first_name' first_name,   data ->> 'last_name' last_name,   data ->> 'phone' phoneFROM   customer_jsonWHERE   data @> '{"first_name": "John"}';

Output:

QUERY PLAN----------------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on customer_json  (cost=21.51..25.53 rows=1 width=96) (actual time=0.024..0.024 rows=1 loops=1)   Recheck Cond: (data @> '{"first_name": "John"}'::jsonb)   Heap Blocks: exact=1   ->  Bitmap Index Scan on customer_json_index  (cost=0.00..21.51 rows=1 width=0) (actual time=0.014..0.014 rows=1 loops=1)         Index Cond: (data @> '{"first_name": "John"}'::jsonb) Planning Time: 0.164 ms Execution Time: 0.045 ms(7 rows)

The query plan indicates that PostgreSQL uses thecustomer_json_index to improve the performance.

This time, the execution time is significantly smaller0.045ms vs.0.128 ms, about 2 – 3 times faster than a query without using theGIN index.

3) Creating an index on the JSONB column with the GIN operator class

First,drop thecustomer_json_index index:

DROP INDEX customer_json_index;

Second, create aGIN index on the data column of thecustomer_json table with aGIN operator class:

CREATE INDEX customer_json_indexON customer_jsonUSING GIN(data jsonb_path_ops);

Third, explain the query that finds the customer whose first name isJohn:

EXPLAIN ANALYZESELECT   data ->> 'first_name' first_name,   data ->> 'last_name' last_name,   data ->> 'phone' phoneFROM   customer_jsonWHERE   data @> '{"first_name": "John"}';

Output:

QUERY PLAN----------------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on customer_json  (cost=12.82..16.84 rows=1 width=96) (actual time=0.014..0.015 rows=1 loops=1)   Recheck Cond: (data @> '{"first_name": "John"}'::jsonb)   Heap Blocks: exact=1   ->  Bitmap Index Scan on customer_json_index  (cost=0.00..12.82 rows=1 width=0) (actual time=0.008..0.008 rows=1 loops=1)         Index Cond: (data @> '{"first_name": "John"}'::jsonb) Planning Time: 0.120 ms Execution Time: 0.034 ms(7 rows)

The query plan shows that the query does use thecustomer_json_index for improved performance.

Finally, explain the query that searches for the customer where the value in thefirst_name field within the data column is John:

EXPLAIN ANALYZESELECT * FROM customer_jsonWHERE data->>'first_name' = 'John';

Output:

QUERY PLAN---------------------------------------------------------------------------------------------------------- Seq Scan on customer_json  (cost=0.00..32.98 rows=3 width=275) (actual time=0.161..0.284 rows=1 loops=1)   Filter: ((data ->> 'first_name'::text) = 'John'::text)   Rows Removed by Filter: 598 Planning Time: 0.085 ms Execution Time: 0.298 ms(5 rows)

In this plan, the query cannot fully utilize theGIN indexcustomer_json_index. The reason is that the query does not use the JSONB operator (@,@?,@@) that thejsonb_path_ops operator class is optimized for.

4) Creating an index on a specific field of a JSONB column

First, drop thecustomer_json_index index:

DROP INDEX customer_json_index;

Second, create aGIN index on thefirst_name field of thecustomer_json table using theGIN operator class:

CREATE INDEX customer_json_indexON customer_jsonUSING GIN((data->'first_name'));

Third, explain the query that finds the rows where the “first_name” field in thedata JSONB column contains the value"John":

EXPLAIN ANALYZESELECT   data ->> 'first_name' first_name,   data ->> 'last_name' last_name,   data ->> 'phone' phoneFROM   customer_jsonWHERE   data->'first_name' @> '"John"';

Output:

QUERY PLAN---------------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on customer_json  (cost=8.58..23.72 rows=6 width=96) (actual time=0.031..0.032 rows=1 loops=1)   Recheck Cond: ((data -> 'first_name'::text) @> '"John"'::jsonb)   Heap Blocks: exact=1   ->  Bitmap Index Scan on customer_json_index  (cost=0.00..8.58 rows=6 width=0) (actual time=0.015..0.015 rows=1 loops=1)         Index Cond: ((data -> 'first_name'::text) @> '"John"'::jsonb) Planning Time: 0.167 ms Execution Time: 0.133 ms(7 rows)

The output indicates that the query uses thecustomer_json_index index.

Summary

  • Use theGIN index to create an index for a JSONB column of a table to improve query performance.

Last updated on

Was this page helpful?
Thank you for your feedback!

[8]ページ先頭

©2009-2025 Movatter.jp