Franck Pachot for MongoDB

Posted on • Edited on

     

JSONB DeTOASTing (read amplification) when using PostgreSQL as a Document Database

PostgreSQL limits tuple sizes to a quarter of the block size, which generally caps them at around 2KB. In document data modeling, where a document represents a business transaction, sizes often exceed this limit. Storing the entire transaction as a single large JSONB leads to compression and splitting via TOAST (The Oversized-Attribute Storage Technique). While this is suitable for static documents that are read infrequently, it is less optimal for queries that access the document. Let's walk through an example to see how to detect this issue.
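
The roughly 2KB threshold is derived from the block size and can be tuned per table. A minimal sketch to check it (my_table is an illustrative table name; toast_tuple_target is the storage parameter, with a default of 2040 bytes on 8kB blocks):

show block_size;   -- typically 8192; TOAST kicks in around block_size / 4
-- Lower the target to move values out-of-line earlier, or raise it to keep more data inline:
-- alter table my_table set (toast_tuple_target = 4080);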

Using Postgres as a Document Database with JSONB

I create a user profile table similar to the one in the previous post, but add a "bio" field with a large text value:

create table users (id bigserial primary key, data jsonb not null);

INSERT INTO users (data)
SELECT jsonb_build_object(
 'name', 'u'||n::text,
 'bio', (SELECT string_agg(chr(trunc(random()*(126-33)+33)::int),'') FROM generate_series(1,5000)),
 'email', jsonb_build_array('u'||n::text||'@compuserve.com'),
 'login', jsonb_build_object(
  'last', to_char(current_timestamp,'YYYY-MM-DD'),
  'count', 1
 )
)
FROM generate_series(1,100000) n;

vacuum analyze users;

I check the size of the table, and also the TOAST overflow:

with users_cte as (
 select * from pg_class where oid = 'users'::regclass
)
select oid, relname, relpages, reltuples, reltoastrelid from users_cte
union all
select oid, relname, relpages, reltuples, reltoastrelid from pg_class
 where oid = (select reltoastrelid from users_cte);

  oid  |    relname     | relpages | reltuples | reltoastrelid
-------+----------------+----------+-----------+---------------
 32314 | users          |      736 |    100000 |         32318
 32318 | pg_toast_32314 |    71430 |    300000 |             0

The table contains 100000 rows across 736 pages, with three chunks per row stored externally in the TOAST table. This results in a total of 736 + 71430 = 72166 pages. Each chunk is approximately (71430 * 8192) / 300000 = 1951 bytes, ensuring that tuples remain under 2KB.
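
These numbers can be cross-checked from SQL: pg_column_size() reports the stored (compressed) size of each JSONB value, which, divided by the TOAST chunk size (close to 2000 bytes with 8kB blocks), gives an estimate of chunks per row. A sketch, assuming that chunk-size approximation:

select avg(pg_column_size(data))::int         as avg_stored_jsonb_bytes,
       ceil(avg(pg_column_size(data)) / 2000) as approx_chunks_per_row
from users;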

I use EXPLAIN ANALYZE to query all documents, which shows the number of pages accessed. To actually retrieve the JSON documents, I add the SERIALIZE option, as EXPLAIN does not fetch the values by default:

explain (analyze, verbose, buffers, wal, serialize text, costs off)
select id, data from users;

                                QUERY PLAN
--------------------------------------------------------------------------
 Seq Scan on public.users (actual time=0.040..10.973 rows=100000 loops=1)
   Output: id, data
   Buffers: shared read=736
 Planning Time: 0.038 ms
 Serialization: time=2482.359 ms  output=509831kB  format=text
   Buffers: shared hit=384887 read=72253
 Execution Time: 2504.164 ms

The table scan read 736 pages from the base table (shared read=736), while reading the JSONB content accessed 72253 pages from the TOAST table (shared read=72253). Each TOAST page was accessed an average of 384887 / 72253 ≈ 5 times, fortunately staying in the shared buffer cache (shared hit=384887), but accessing shared buffers still costs CPU and lightweight locks.

We observe a fivefold read amplification when querying the JSONB column, as it requires de-toasting.
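
The same amplification is also visible outside of a single statement, in the cumulative I/O statistics. A sketch using the standard pg_statio_user_tables view (counters accumulate since the last statistics reset, so compare snapshots taken around the workload):

select relname, heap_blks_read, heap_blks_hit, toast_blks_read, toast_blks_hit
from pg_statio_user_tables
where relname = 'users';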

Reading JSON fields with JSON path

In a document database, you can retrieve the entire document or access specific fields for filtering and sorting. For example, I include a projection of the "name" field.

explain (analyze, verbose, buffers, wal, serialize text, costs off)
select id, data, (data->>'name') from users;

                                 QUERY PLAN
---------------------------------------------------------------------------
 Seq Scan on public.users (actual time=0.085..532.367 rows=100000 loops=1)
   Output: id, data, (data ->> 'name'::text)
   Buffers: shared hit=384887 read=72989
 Planning Time: 0.039 ms
 Serialization: time=2276.025 ms  output=510222kB  format=text
   Buffers: shared hit=457140
 Execution Time: 2819.235 ms

PostgreSQL has no optimization to share de-toasting between expressions on the same JSONB column, which can lead to de-toasting the same document multiple times. As shown in the Output of EXPLAIN VERBOSE, the scan projects the extracted field in addition to the whole document, and it accesses the TOAST pages many times, as indicated by shared hit=384887 read=72989. Serializing the whole document then causes an additional shared hit=457140.

I continue by projecting one more field, "login.last":

explain (analyze, verbose, buffers, wal, serialize text, costs off)
select id, data, (data->'login'->>'last'), (data->>'name') from users;

                                          QUERY PLAN
---------------------------------------------------------------------------------------------
 Seq Scan on public.users (actual time=0.079..855.246 rows=100000 loops=1)
   Output: id, data, ((data -> 'login'::text) ->> 'last'::text), (data ->> 'name'::text)
   Buffers: shared hit=842027 read=72989
 Planning Time: 0.040 ms
 Serialization: time=2261.679 ms  output=511589kB  format=text
   Buffers: shared hit=457140
 Execution Time: 3128.017 ms

Even though both projections occur in the same scan, the JSONB document is de-toasted twice, once for each field, resulting in shared hit=842027 read=72989.

To avoid retrieving the entire document, I project only the fields I need. I also run the query with an additional field, "login.count", but exclude "data" from the SELECT statement:

explain (analyze, verbose, buffers, wal, serialize text, costs off)
select id, (data->'login'->>'count')::int, (data->'login'->>'last'), (data->>'name') from users;

                                                          QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on public.users (actual time=0.087..1149.260 rows=100000 loops=1)
   Output: id, (((data -> 'login'::text) ->> 'count'::text))::integer, ((data -> 'login'::text) ->> 'last'::text), (data ->> 'name'::text)
   Buffers: shared hit=1299167 read=72989
 Planning Time: 0.042 ms
 Serialization: time=21.837 ms

I saved the reads during serialization of the result, but the scan shows shared hit=1299167 read=72989. It read only the necessary data from disk: 72989 pages from the base and TOAST tables. However, to read just three fields, it accessed the buffers 1299167 / 72989 ≈ 18 times, leading to excessive CPU usage and potential lightweight lock contention under concurrent access.

Workaround: JSONB functions to transform JSON into records

To work around this problem, do not use the JSON operators to access individual fields; instead, use a JSON function to extract the required fields into SQL columns, like this:

select"user"."name","login".last,"login"."count"from"users",-- the table with JSONBlateraljsonb_to_record("data")-- first level fieldsas"user"("name"text,loginjsonb),lateraljsonb_to_record("login")-- sub-document fieldsas"login"("last"date,countint)

This de-toasts the JSONB document only once, bringing the number of shared buffer hits and reads back to the same level as when reading a single field:

                                                   QUERY PLAN
---------------------------------------------------------------------------------------------------------------
 Nested Loop (actual time=0.092..793.016 rows=100000 loops=1)
   Output: "user".name, login.last, login.count
   Buffers: shared hit=384887 read=72989
   ->  Nested Loop (actual time=0.085..656.744 rows=100000 loops=1)
         Output: "user".name, "user".login
         Buffers: shared hit=384887 read=72989
         ->  Seq Scan on public.users (actual time=0.041..11.807 rows=100000 loops=1)
               Output: users.id, users.data
               Buffers: shared read=736
         ->  Function Scan on pg_catalog.jsonb_to_record "user" (actual time=0.006..0.006 rows=1 loops=100000)
               Output: "user".name, "user".login
               Function Call: jsonb_to_record(users.data)
               Buffers: shared hit=384887 read=72253
   ->  Function Scan on pg_catalog.jsonb_to_record login (actual time=0.001..0.001 rows=1 loops=100000)
         Output: login.last, login.count
         Function Call: jsonb_to_record("user".login)
 Planning Time: 0.068 ms
 Serialization: time=25.103 ms  output=3017kB  format=text
 Execution Time: 828.917 ms

Using jsonb_to_record(), the JSONB of each row is de-toasted once and accessed as a single document, and the individual fields are then extracted from the record in process memory. However, this approach cannot be used when the field access belongs to an index condition.
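
When a field must appear in an index condition, the operator form is required so that the planner can match an expression index. A minimal sketch of that case (the index name and lookup value are illustrative):

create index users_name on users ((data->>'name'));
-- The WHERE clause can use the index, but extracting other fields from "data"
-- in the SELECT list still de-toasts the document again:
select id, data->>'name', data->'login'->>'last'
 from users
 where data->>'name' = 'u42';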


In PostgreSQL, JSONB is a datatype for a single column, designed to store and retrieve the entire document. In contrast, a document database like MongoDB reads and writes individual fields, keeping the document in memory for quick access, like an object cache. Its WiredTiger storage engine decompresses the on-disk representation when loading it into memory and compresses it during cache eviction or checkpoint.

I do not recommend using PostgreSQL blindly as a document database. Instead, fields that require individual access should be normalized into SQL columns rather than being embedded in a JSONB document. To identify suboptimal designs, examine the shared buffer hits using EXPLAIN ANALYZE with the SERIALIZE option. If you choose a document model, then use a document database that was built to access and index document fields efficiently.
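
One possible way to normalize the hot fields while keeping the document is to add STORED generated columns (PostgreSQL 12+); this is a sketch, and the column names are illustrative:

alter table users
 add column name        text    generated always as (data->>'name') stored,
 add column login_count integer generated always as ((data->'login'->>'count')::integer) stored;

-- Queries on these columns read only the small base tuples, without de-toasting the JSONB:
select id, name, login_count from users where name = 'u42';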

Top comments (1)

Franck Pachot
There was a discussion about this on LinkedIn saying that it is not specific to the JSON data type, and that's true, I can observe the same with an ARRAY: dbfiddle.uk/Sl2ipvEV (I thought it could use an extended format in memory to avoid that, but apparently not). However, I see more developers accessing multiple JSON fields because the API looks like accessing an in-memory object.
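
A minimal sketch along the lines of that fiddle, showing the same pattern with a plain text[] column (values large enough to be TOASTed; the element values are illustrative):

create table arr (id bigserial primary key, data text[] not null);
insert into arr (data)
 select array(select md5(random()::text) from generate_series(1,300))
 from generate_series(1,10000);
vacuum analyze arr;
explain (analyze, buffers, costs off)
 select id, data[1], data[2], data[3] from arr;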
