Franck Pachot for MongoDB

Posted on • Edited on

     

JSONB DeTOASTing (read amplification) when using PostgreSQL as a Document Database

PostgreSQL limits tuple sizes to a quarter of the block size, which generally caps them at around 2KB. In document data modeling, where a document represents a business transaction, sizes often exceed this limit. Storing the entire transaction as a single large JSONB leads to compression and splitting via TOAST (The Oversized-Attribute Storage Technique). While this is suitable for static documents that are read infrequently, it is less optimal for queries that access the document. Let's walk through an example to see how to detect this issue.
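
The roughly 2KB threshold is derived from the block size and can be tuned per table. A minimal sketch to check it (my_table is an illustrative table name; toast_tuple_target is the storage parameter, with a default of 2040 bytes on 8kB blocks):

show block_size;   -- typically 8192; TOAST kicks in around block_size / 4
-- Lower the target to move values out-of-line earlier, or raise it to keep more data inline:
-- alter table my_table set (toast_tuple_target = 4080);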

Using Postgres as a Document Database with JSONB

I create a user profile table similar to the one in the previous post, but add a "bio" field with a large text value:

create table users (id bigserial primary key, data jsonb not null);

INSERT INTO users (data)
SELECT jsonb_build_object(
 'name', 'u'||n::text,
 'bio', (SELECT string_agg(chr(trunc(random()*(126-33)+33)::int),'') FROM generate_series(1,5000)),
 'email', jsonb_build_array('u'||n::text||'@compuserve.com'),
 'login', jsonb_build_object(
  'last', to_char(current_timestamp,'YYYY-MM-DD'),
  'count', 1
 )
)
FROM generate_series(1,100000) n;

vacuum analyze users;

I check the size of the table, and also the TOAST overflow:

with users_cte as (
 select * from pg_class where oid = 'users'::regclass
)
select oid, relname, relpages, reltuples, reltoastrelid from users_cte
union all
select oid, relname, relpages, reltuples, reltoastrelid from pg_class
 where oid = (select reltoastrelid from users_cte);

  oid  |    relname     | relpages | reltuples | reltoastrelid
-------+----------------+----------+-----------+---------------
 32314 | users          |      736 |    100000 |         32318
 32318 | pg_toast_32314 |    71430 |    300000 |             0

The table contains 100000 rows across 736 pages, with three chunks per row stored externally in the TOAST table. This results in a total of 736 + 71430 = 72166 pages. Each chunk is approximately (71430 * 8192) / 300000 = 1951 bytes, ensuring that tuples remain under 2KB.
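
These numbers can be cross-checked from SQL: pg_column_size() reports the stored (compressed) size of each JSONB value, which, divided by the TOAST chunk size (close to 2000 bytes with 8kB blocks), gives an estimate of chunks per row. A sketch, assuming that chunk-size approximation:

select avg(pg_column_size(data))::int         as avg_stored_jsonb_bytes,
       ceil(avg(pg_column_size(data)) / 2000) as approx_chunks_per_row
from users;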

I use EXPLAIN ANALYZE to query all documents, which shows the number of pages accessed. To actually retrieve the JSON documents, I add the SERIALIZE option, as EXPLAIN does not fetch the values by default:

explain (analyze, verbose, buffers, wal, serialize text, costs off)
select id, data from users;

                                QUERY PLAN
--------------------------------------------------------------------------
 Seq Scan on public.users (actual time=0.040..10.973 rows=100000 loops=1)
   Output: id, data
   Buffers: shared read=736
 Planning Time: 0.038 ms
 Serialization: time=2482.359 ms  output=509831kB  format=text
   Buffers: shared hit=384887 read=72253
 Execution Time: 2504.164 ms

The table scan read 736 pages from the base table (shared read=736), while reading the JSONB content accessed 72253 pages from the TOAST table (shared read=72253). Each TOAST page was accessed an average of 384887 / 72253 ≈ 5 times, fortunately staying in the shared buffer cache (shared hit=384887), but accessing shared buffers still costs CPU and lightweight locks.

We observe a fivefold read amplification when querying the JSONB column, as it requires de-toasting.
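
The same amplification is also visible outside of a single statement, in the cumulative I/O statistics. A sketch using the standard pg_statio_user_tables view (counters accumulate since the last statistics reset, so compare snapshots taken around the workload):

select relname, heap_blks_read, heap_blks_hit, toast_blks_read, toast_blks_hit
from pg_statio_user_tables
where relname = 'users';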

Reading JSON fields with JSON path

In a document database, you can retrieve the entire document or access specific fields for filtering and sorting. For example, I include a projection of the "name" field.

explain (analyze, verbose, buffers, wal, serialize text, costs off)
select id, data, (data->>'name') from users;

                                 QUERY PLAN
---------------------------------------------------------------------------
 Seq Scan on public.users (actual time=0.085..532.367 rows=100000 loops=1)
   Output: id, data, (data ->> 'name'::text)
   Buffers: shared hit=384887 read=72989
 Planning Time: 0.039 ms
 Serialization: time=2276.025 ms  output=510222kB  format=text
   Buffers: shared hit=457140
 Execution Time: 2819.235 ms

PostgreSQL has no optimization to share de-toasting between expressions on the same JSONB column, which can lead to de-toasting the same document multiple times. As shown in the Output of EXPLAIN VERBOSE, the scan projects the extracted field in addition to the whole document, and it accesses the TOAST pages many times, as indicated by shared hit=384887 read=72989. Serializing the whole document then causes an additional shared hit=457140.

I continue by projecting one more field, "login.last":

explain (analyze, verbose, buffers, wal, serialize text, costs off)
select id, data, (data->'login'->>'last'), (data->>'name') from users;

                                          QUERY PLAN
---------------------------------------------------------------------------------------------
 Seq Scan on public.users (actual time=0.079..855.246 rows=100000 loops=1)
   Output: id, data, ((data -> 'login'::text) ->> 'last'::text), (data ->> 'name'::text)
   Buffers: shared hit=842027 read=72989
 Planning Time: 0.040 ms
 Serialization: time=2261.679 ms  output=511589kB  format=text
   Buffers: shared hit=457140
 Execution Time: 3128.017 ms

Even though both projections occur in the same scan, the JSONB document is de-toasted twice, once for each field, resulting in shared hit=842027 read=72989.

To avoid retrieving the entire document, I project only the fields I need. I also run the query with an additional field, "login.count", but exclude "data" from the SELECT statement:

explain (analyze, verbose, buffers, wal, serialize text, costs off)
select id, (data->'login'->>'count')::int, (data->'login'->>'last'), (data->>'name') from users;

                                                          QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on public.users (actual time=0.087..1149.260 rows=100000 loops=1)
   Output: id, (((data -> 'login'::text) ->> 'count'::text))::integer, ((data -> 'login'::text) ->> 'last'::text), (data ->> 'name'::text)
   Buffers: shared hit=1299167 read=72989
 Planning Time: 0.042 ms
 Serialization: time=21.837 ms

I saved the reads during serialization of the result, but the scan shows shared hit=1299167 read=72989. It read only the necessary data from disk: 72989 pages from the base and TOAST tables. However, to read just three fields, it accessed the buffers 1299167 / 72989 ≈ 18 times, leading to excessive CPU usage and potential lightweight lock contention under concurrent access.

Workaround: JSONB functions to transform JSON into records

To work around this problem, do not use the JSON operators to access individual fields; instead, use a JSON function to extract the required fields into SQL columns, like this:

select"user"."name","login".last,"login"."count"from"users",-- the table with JSONBlateraljsonb_to_record("data")-- first level fieldsas"user"("name"text,loginjsonb),lateraljsonb_to_record("login")-- sub-document fieldsas"login"("last"date,countint)

This de-toasts the JSONB document only once, bringing the number of shared buffer hits and reads back to the same level as when reading a single field:

                                                   QUERY PLAN
---------------------------------------------------------------------------------------------------------------
 Nested Loop (actual time=0.092..793.016 rows=100000 loops=1)
   Output: "user".name, login.last, login.count
   Buffers: shared hit=384887 read=72989
   ->  Nested Loop (actual time=0.085..656.744 rows=100000 loops=1)
         Output: "user".name, "user".login
         Buffers: shared hit=384887 read=72989
         ->  Seq Scan on public.users (actual time=0.041..11.807 rows=100000 loops=1)
               Output: users.id, users.data
               Buffers: shared read=736
         ->  Function Scan on pg_catalog.jsonb_to_record "user" (actual time=0.006..0.006 rows=1 loops=100000)
               Output: "user".name, "user".login
               Function Call: jsonb_to_record(users.data)
               Buffers: shared hit=384887 read=72253
   ->  Function Scan on pg_catalog.jsonb_to_record login (actual time=0.001..0.001 rows=1 loops=100000)
         Output: login.last, login.count
         Function Call: jsonb_to_record("user".login)
 Planning Time: 0.068 ms
 Serialization: time=25.103 ms  output=3017kB  format=text
 Execution Time: 828.917 ms

Using jsonb_to_record(), the JSONB of each row is de-toasted once and accessed as a single document, and the individual fields are then extracted from the record in process memory. However, this approach cannot be used when the field access belongs to an index condition.
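
When a field must appear in an index condition, the operator form is required so that the planner can match an expression index. A minimal sketch of that case (the index name and lookup value are illustrative):

create index users_name on users ((data->>'name'));
-- The WHERE clause can use the index, but extracting other fields from "data"
-- in the SELECT list still de-toasts the document again:
select id, data->>'name', data->'login'->>'last'
 from users
 where data->>'name' = 'u42';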


In PostgreSQL, JSONB is a datatype for a single column, designed to store and retrieve the entire document. In contrast, a document database like MongoDB reads and writes individual fields, keeping the document in memory for quick access, like an object cache. Its WiredTiger storage engine decompresses the on-disk representation when loading it into memory and compresses it during cache eviction or checkpoint.

I do not recommend using PostgreSQL blindly as a document database. Instead, fields that require individual access should be normalized into SQL columns rather than being embedded in a JSONB document. To identify suboptimal designs, examine the shared buffer hits using EXPLAIN ANALYZE with the SERIALIZE option. If you choose a document model, then use a document database that was built to access and index document fields efficiently.
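
One possible way to normalize the hot fields while keeping the document is to add STORED generated columns (PostgreSQL 12+); this is a sketch, and the column names are illustrative:

alter table users
 add column name        text    generated always as (data->>'name') stored,
 add column login_count integer generated always as ((data->'login'->>'count')::integer) stored;

-- Queries on these columns read only the small base tuples, without de-toasting the JSONB:
select id, name, login_count from users where name = 'u42';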

Top comments (1)

Franck Pachot
There was a discussion about this on LinkedIn saying that it is not specific to the JSON data type, and that's true, I can observe the same with an ARRAY: dbfiddle.uk/Sl2ipvEV (I thought it could use an extended format in memory to avoid that, but apparently not). However, I see more developers accessing multiple JSON fields because the API looks like accessing an in-memory object.
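
A minimal sketch along the lines of that fiddle, showing the same pattern with a plain text[] column (values large enough to be TOASTed; the element values are illustrative):

create table arr (id bigserial primary key, data text[] not null);
insert into arr (data)
 select array(select md5(random()::text) from generate_series(1,300))
 from generate_series(1,10000);
vacuum analyze arr;
explain (analyze, buffers, costs off)
 select id, data[1], data[2], data[3] from arr;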
