Commit 3838fa2
Build de-escaped JSON strings in larger chunks during lexing
During COPY BINARY with large JSONB blobs, it was found that half
the time was spent parsing JSON, with much of that spent in separate
appendStringInfoChar() calls for each input byte.

Add lookahead loop to json_lex_string() to allow batching multiple bytes
via appendBinaryStringInfo(). Also use this same logic when de-escaping
is not done, to avoid code duplication.

Report and proof of concept patch by Jelte Fennema, reworked by Andres
Freund and John Naylor

Discussion: https://www.postgresql.org/message-id/CAGECzQQuXbies_nKgSiYifZUjBk6nOf2%3DTSXqRjj2BhUh8CTeA%40mail.gmail.com
Discussion: https://www.postgresql.org/message-id/flat/PR3PR83MB0476F098CBCF68AF7A1CA89FF7B49@PR3PR83MB0476.EURPRD83.prod.outlook.com
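To make the cost difference concrete, here is a minimal sketch of per-byte versus batched appending. It assumes a hypothetical `Buf` type as a stand-in for PostgreSQL's StringInfo; `buf_init`, `buf_append_bytes`, `copy_per_byte`, and `copy_batched` are illustrative names, not part of the PostgreSQL source. The backslash is copied through unchanged as a placeholder for real escape decoding.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for StringInfo; not the PostgreSQL API. */
typedef struct Buf
{
    char   *data;
    size_t  len;
    size_t  cap;
} Buf;

static void
buf_init(Buf *b)
{
    b->cap = 16;
    b->len = 0;
    b->data = malloc(b->cap);
    assert(b->data != NULL);
    b->data[0] = '\0';
}

/* Append n bytes with one capacity check and one memcpy. */
static void
buf_append_bytes(Buf *b, const char *src, size_t n)
{
    if (b->len + n + 1 > b->cap)
    {
        while (b->len + n + 1 > b->cap)
            b->cap *= 2;
        b->data = realloc(b->data, b->cap);
        assert(b->data != NULL);
    }
    memcpy(b->data + b->len, src, n);
    b->len += n;
    b->data[b->len] = '\0';
}

/* One append call per input byte; returns the number of calls made. */
static size_t
copy_per_byte(Buf *out, const char *s, const char *end)
{
    size_t calls = 0;

    for (; s < end; s++)
    {
        buf_append_bytes(out, s, 1);
        calls++;
    }
    return calls;
}

/* Scan ahead to the next backslash and append each plain run in one call. */
static size_t
copy_batched(Buf *out, const char *s, const char *end)
{
    size_t calls = 0;

    while (s < end)
    {
        const char *p = s;

        while (p < end && *p != '\\')
            p++;
        if (p > s)
        {
            buf_append_bytes(out, s, (size_t) (p - s));
            calls++;
        }
        if (p < end)
        {
            /* Placeholder: a real lexer would decode the escape here. */
            buf_append_bytes(out, p, 1);
            calls++;
            p++;
        }
        s = p;
    }
    return calls;
}
```

For the 12-byte input `hello\nworld` (with a literal backslash), the per-byte version makes 12 append calls while the batched version makes 3, and both produce the same buffer contents.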
1 parent a6434b9, commit 3838fa2

1 file changed: 39 additions and 19 deletions

src/common/jsonapi.c
@@ -686,15 +686,6 @@ json_lex_string(JsonLexContext *lex)
                 lex->token_terminator = s;
                 return JSON_INVALID_TOKEN;
             }
-        else if (*s == '"')
-            break;
-        else if ((unsigned char) *s < 32)
-        {
-            /* Per RFC4627, these characters MUST be escaped. */
-            /* Since *s isn't printable, exclude it from the context string */
-            lex->token_terminator = s;
-            return JSON_ESCAPING_REQUIRED;
-        }
         else if (*s == '\\')
         {
             /* OK, we have an escape character. */
@@ -849,22 +840,51 @@ json_lex_string(JsonLexContext *lex)
                 return JSON_ESCAPING_INVALID;
             }
         }
-        else if (lex->strval != NULL)
+        else
         {
+            char       *p;
+
             if (hi_surrogate != -1)
                 return JSON_UNICODE_LOW_SURROGATE;
 
-            appendStringInfoChar(lex->strval, *s);
-        }
-    }
+            /*
+             * Skip to the first byte that requires special handling, so we
+             * can batch calls to appendBinaryStringInfo.
+             */
+            for (p = s; p < end; p++)
+            {
+                if (*p == '\\' || *p == '"')
+                    break;
+                else if ((unsigned char) *p < 32)
+                {
+                    /* Per RFC4627, these characters MUST be escaped. */
+                    /*
+                     * Since *p isn't printable, exclude it from the context
+                     * string
+                     */
+                    lex->token_terminator = p;
+                    return JSON_ESCAPING_REQUIRED;
+                }
+            }
 
-    if (hi_surrogate != -1)
-        return JSON_UNICODE_LOW_SURROGATE;
+            if (lex->strval != NULL)
+                appendBinaryStringInfo(lex->strval, s, p - s);
 
-    /* Hooray, we found the end of the string! */
-    lex->prev_token_terminator = lex->token_terminator;
-    lex->token_terminator = s + 1;
-    return JSON_SUCCESS;
+            if (*p == '"')
+            {
+                /* Hooray, we found the end of the string! */
+                lex->prev_token_terminator = lex->token_terminator;
+                lex->token_terminator = p + 1;
+                return JSON_SUCCESS;
+            }
+
+            /*
+             * s will be incremented at the top of the loop, so set it to just
+             * behind our lookahead position
+             */
+            s = p - 1;
+        }
+    }
 }
 
 /*
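
The control flow of the patched loop can be tried in isolation with a toy lexer. This is a simplified sketch, not the PostgreSQL function: `LexStatus`, `lex_string`, and its parameters are hypothetical, only the `\"`, `\\`, and `\n` escapes are handled, there is no surrogate tracking, and the caller is assumed to supply an output buffer of at least `end - input` bytes. Unlike the diff, it checks `p < end` explicitly before testing for the closing quote, because the sketch does not assume a terminated input.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum
{
    LEX_OK,
    LEX_UNTERMINATED,
    LEX_NEEDS_ESCAPE,
    LEX_BAD_ESCAPE
} LexStatus;

/*
 * input points at the opening quote. On success, out holds the de-escaped
 * bytes (not NUL-terminated), *out_len their count, and *token_end points
 * just past the closing quote.
 */
static LexStatus
lex_string(const char *input, const char *end, char *out,
           size_t *out_len, const char **token_end)
{
    const char *s = input;
    size_t      n = 0;

    assert(input < end && *input == '"');

    for (;;)
    {
        s++;                    /* advance past the byte handled last time */
        if (s >= end)
            return LEX_UNTERMINATED;

        if (*s == '\\')
        {
            s++;
            if (s >= end)
                return LEX_UNTERMINATED;
            switch (*s)
            {
                case '"':
                case '\\':
                    out[n++] = *s;
                    break;
                case 'n':
                    out[n++] = '\n';
                    break;
                default:
                    return LEX_BAD_ESCAPE;
            }
        }
        else
        {
            const char *p;

            /* Look ahead to the first byte that needs special handling. */
            for (p = s; p < end; p++)
            {
                if (*p == '\\' || *p == '"')
                    break;
                if ((unsigned char) *p < 32)
                    return LEX_NEEDS_ESCAPE;
            }

            /* Copy the whole plain run in one call. */
            memcpy(out + n, s, (size_t) (p - s));
            n += (size_t) (p - s);

            if (p < end && *p == '"')
            {
                *out_len = n;
                *token_end = p + 1;
                return LEX_OK;
            }

            /* s is incremented at the top of the loop, so step back one. */
            s = p - 1;
        }
    }
}
```

Because the closing-quote check now lives inside the lookahead branch, an empty string or a quote that directly follows an escape is handled by a zero-length run: the inner loop stops immediately at `p == s`, nothing is copied, and the function returns success.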
