A blazingly fast JSON serializing & deserializing library
bytedance/sonic
A blazingly fast JSON serializing & deserializing library, accelerated by JIT (just-in-time compiling) and SIMD (single-instruction-multiple-data).
- Go: 1.17~1.24
- Notice: Go 1.24.0 is not supported due to the issue; please use a higher Go version or add the build flag `--ldflags="-checklinkname=0"`
- OS: Linux / MacOS / Windows
- CPU: AMD64 / ARM64 (ARM64 requires Go 1.20 or above)
- Runtime object binding without code generation
- Complete APIs for JSON value manipulation
- Fast, fast, fast!
See go.dev.
For all sizes of JSON and all scenarios of usage, Sonic performs best.
- Medium (13KB, 300+ key, 6 layers)
```
goversion: 1.17.1
goos: darwin
goarch: amd64
cpu: Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
BenchmarkEncoder_Generic_Sonic-16                      32393 ns/op     402.40 MB/s    11965 B/op      4 allocs/op
BenchmarkEncoder_Generic_Sonic_Fast-16                 21668 ns/op     601.57 MB/s    10940 B/op      4 allocs/op
BenchmarkEncoder_Generic_JsonIter-16                   42168 ns/op     309.12 MB/s    14345 B/op    115 allocs/op
BenchmarkEncoder_Generic_GoJson-16                     65189 ns/op     199.96 MB/s    23261 B/op     16 allocs/op
BenchmarkEncoder_Generic_StdLib-16                    106322 ns/op     122.60 MB/s    49136 B/op    789 allocs/op
BenchmarkEncoder_Binding_Sonic-16                       6269 ns/op    2079.26 MB/s    14173 B/op      4 allocs/op
BenchmarkEncoder_Binding_Sonic_Fast-16                  5281 ns/op    2468.16 MB/s    12322 B/op      4 allocs/op
BenchmarkEncoder_Binding_JsonIter-16                   20056 ns/op     649.93 MB/s     9488 B/op      2 allocs/op
BenchmarkEncoder_Binding_GoJson-16                      8311 ns/op    1568.32 MB/s     9481 B/op      1 allocs/op
BenchmarkEncoder_Binding_StdLib-16                     16448 ns/op     792.52 MB/s     9479 B/op      1 allocs/op
BenchmarkEncoder_Parallel_Generic_Sonic-16              6681 ns/op    1950.93 MB/s    12738 B/op      4 allocs/op
BenchmarkEncoder_Parallel_Generic_Sonic_Fast-16         4179 ns/op    3118.99 MB/s    10757 B/op      4 allocs/op
BenchmarkEncoder_Parallel_Generic_JsonIter-16           9861 ns/op    1321.84 MB/s    14362 B/op    115 allocs/op
BenchmarkEncoder_Parallel_Generic_GoJson-16            18850 ns/op     691.52 MB/s    23278 B/op     16 allocs/op
BenchmarkEncoder_Parallel_Generic_StdLib-16            45902 ns/op     283.97 MB/s    49174 B/op    789 allocs/op
BenchmarkEncoder_Parallel_Binding_Sonic-16              1480 ns/op    8810.09 MB/s    13049 B/op      4 allocs/op
BenchmarkEncoder_Parallel_Binding_Sonic_Fast-16         1209 ns/op   10785.23 MB/s    11546 B/op      4 allocs/op
BenchmarkEncoder_Parallel_Binding_JsonIter-16           6170 ns/op    2112.58 MB/s     9504 B/op      2 allocs/op
BenchmarkEncoder_Parallel_Binding_GoJson-16             3321 ns/op    3925.52 MB/s     9496 B/op      1 allocs/op
BenchmarkEncoder_Parallel_Binding_StdLib-16             3739 ns/op    3486.49 MB/s     9480 B/op      1 allocs/op

BenchmarkDecoder_Generic_Sonic-16                      66812 ns/op     195.10 MB/s    57602 B/op    723 allocs/op
BenchmarkDecoder_Generic_Sonic_Fast-16                 54523 ns/op     239.07 MB/s    49786 B/op    313 allocs/op
BenchmarkDecoder_Generic_StdLib-16                    124260 ns/op     104.90 MB/s    50869 B/op    772 allocs/op
BenchmarkDecoder_Generic_JsonIter-16                   91274 ns/op     142.81 MB/s    55782 B/op   1068 allocs/op
BenchmarkDecoder_Generic_GoJson-16                     88569 ns/op     147.17 MB/s    66367 B/op    973 allocs/op
BenchmarkDecoder_Binding_Sonic-16                      32557 ns/op     400.38 MB/s    28302 B/op    137 allocs/op
BenchmarkDecoder_Binding_Sonic_Fast-16                 28649 ns/op     455.00 MB/s    24999 B/op     34 allocs/op
BenchmarkDecoder_Binding_StdLib-16                    111437 ns/op     116.97 MB/s    10576 B/op    208 allocs/op
BenchmarkDecoder_Binding_JsonIter-16                   35090 ns/op     371.48 MB/s    14673 B/op    385 allocs/op
BenchmarkDecoder_Binding_GoJson-16                     28738 ns/op     453.59 MB/s    22039 B/op     49 allocs/op
BenchmarkDecoder_Parallel_Generic_Sonic-16             12321 ns/op    1057.91 MB/s    57233 B/op    723 allocs/op
BenchmarkDecoder_Parallel_Generic_Sonic_Fast-16        10644 ns/op    1224.64 MB/s    49362 B/op    313 allocs/op
BenchmarkDecoder_Parallel_Generic_StdLib-16            57587 ns/op     226.35 MB/s    50874 B/op    772 allocs/op
BenchmarkDecoder_Parallel_Generic_JsonIter-16          38666 ns/op     337.12 MB/s    55789 B/op   1068 allocs/op
BenchmarkDecoder_Parallel_Generic_GoJson-16            30259 ns/op     430.79 MB/s    66370 B/op    974 allocs/op
BenchmarkDecoder_Parallel_Binding_Sonic-16              5965 ns/op    2185.28 MB/s    27747 B/op    137 allocs/op
BenchmarkDecoder_Parallel_Binding_Sonic_Fast-16         5170 ns/op    2521.31 MB/s    24715 B/op     34 allocs/op
BenchmarkDecoder_Parallel_Binding_StdLib-16            27582 ns/op     472.58 MB/s    10576 B/op    208 allocs/op
BenchmarkDecoder_Parallel_Binding_JsonIter-16          13571 ns/op     960.51 MB/s    14685 B/op    385 allocs/op
BenchmarkDecoder_Parallel_Binding_GoJson-16            10031 ns/op    1299.51 MB/s    22111 B/op     49 allocs/op

BenchmarkGetOne_Sonic-16                                3276 ns/op    3975.78 MB/s       24 B/op      1 allocs/op
BenchmarkGetOne_Gjson-16                                9431 ns/op    1380.81 MB/s        0 B/op      0 allocs/op
BenchmarkGetOne_Jsoniter-16                            51178 ns/op     254.46 MB/s    27936 B/op    647 allocs/op
BenchmarkGetOne_Parallel_Sonic-16                      216.7 ns/op   60098.95 MB/s       24 B/op      1 allocs/op
BenchmarkGetOne_Parallel_Gjson-16                       1076 ns/op   12098.62 MB/s        0 B/op      0 allocs/op
BenchmarkGetOne_Parallel_Jsoniter-16                   17741 ns/op     734.06 MB/s    27945 B/op    647 allocs/op
BenchmarkSetOne_Sonic-16                                9571 ns/op    1360.61 MB/s     1584 B/op     17 allocs/op
BenchmarkSetOne_Sjson-16                               36456 ns/op     357.22 MB/s    52180 B/op      9 allocs/op
BenchmarkSetOne_Jsoniter-16                            79475 ns/op     163.86 MB/s    45862 B/op    964 allocs/op
BenchmarkSetOne_Parallel_Sonic-16                      850.9 ns/op   15305.31 MB/s     1584 B/op     17 allocs/op
BenchmarkSetOne_Parallel_Sjson-16                      18194 ns/op     715.77 MB/s    52247 B/op      9 allocs/op
BenchmarkSetOne_Parallel_Jsoniter-16                   33560 ns/op     388.05 MB/s    45892 B/op    964 allocs/op
BenchmarkLoadNode/LoadAll()-16                         11384 ns/op    1143.93 MB/s     6307 B/op     25 allocs/op
BenchmarkLoadNode_Parallel/LoadAll()-16                 5493 ns/op    2370.68 MB/s     7145 B/op     25 allocs/op
BenchmarkLoadNode/Interface()-16                       17722 ns/op     734.85 MB/s    13323 B/op     88 allocs/op
BenchmarkLoadNode_Parallel/Interface()-16              10330 ns/op    1260.70 MB/s    15178 B/op     88 allocs/op
```
See bench.sh for benchmark codes.
See INTRODUCTION.md.
Default behaviors are mostly consistent with `encoding/json`, except for the HTML escaping form (see Escape HTML) and the `SortKeys` feature (optional support, see Sort Keys), which are NOT in conformity with RFC8259.
```go
import "github.com/bytedance/sonic"

var data YourSchema
// Marshal
output, err := sonic.Marshal(&data)
// Unmarshal
err = sonic.Unmarshal(output, &data)
```
Sonic supports decoding JSON from `io.Reader` and encoding objects into `io.Writer`, which aims at handling multiple values as well as reducing memory consumption.
- encoder
```go
var o1 = map[string]interface{}{
	"a": "b",
}
var o2 = 1
var w = bytes.NewBuffer(nil)
var enc = sonic.ConfigDefault.NewEncoder(w)
enc.Encode(o1)
enc.Encode(o2)
fmt.Println(w.String())
// Output:
// {"a":"b"}
// 1
```
- decoder
```go
var o = map[string]interface{}{}
var r = strings.NewReader(`{"a":"b"}{"1":"2"}`)
var dec = sonic.ConfigDefault.NewDecoder(r)
dec.Decode(&o)
dec.Decode(&o)
fmt.Printf("%+v", o)
// Output:
// map[1:2 a:b]
```
```go
import "github.com/bytedance/sonic/decoder"

var input = `1`
var data interface{}

// default float64
dc := decoder.NewDecoder(input)
dc.Decode(&data) // data == float64(1)
// use json.Number
dc = decoder.NewDecoder(input)
dc.UseNumber()
dc.Decode(&data) // data == json.Number("1")
// use int64
dc = decoder.NewDecoder(input)
dc.UseInt64()
dc.Decode(&data) // data == int64(1)

root, err := sonic.GetFromString(input)
// Get json.Number
jn := root.Number()
jm := root.InterfaceUseNumber().(json.Number) // jn == jm
// Get float64
fn := root.Float64()
fm := root.Interface().(float64) // fn == fm
```
Because sorting costs performance (roughly 10%), sonic does not sort map keys by default. If your component depends on sorted keys (like zstd), use it like this:
```go
import "github.com/bytedance/sonic"
import "github.com/bytedance/sonic/encoder"

// Binding map only
m := map[string]interface{}{}
v, err := encoder.Encode(m, encoder.SortMapKeys)

// Or ast.Node.SortKeys() before marshal
root, err := sonic.Get(JSON)
err = root.SortKeys()
```
Because HTML escaping costs performance (roughly 15%), sonic does not enable it by default. You can use the `encoder.EscapeHTML` option to turn it on (aligned with `encoding/json.HTMLEscape`).
```go
import "github.com/bytedance/sonic/encoder"

v := map[string]string{"&&": "<>"}
ret, err := encoder.Encode(v, encoder.EscapeHTML) // ret == `{"\u0026\u0026":"\u003c\u003e"}`
```
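The escaped form can be previewed with the standard library's `json.HTMLEscape`, which the text above names as the reference behavior. This is a minimal sketch using only `encoding/json`, not sonic; `escapeHTML` is a hypothetical wrapper.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// escapeHTML rewrites <, > and & inside JSON string literals as \uXXXX escapes.
func escapeHTML(src []byte) string {
	var buf bytes.Buffer
	json.HTMLEscape(&buf, src)
	return buf.String()
}

func main() {
	fmt.Println(escapeHTML([]byte(`{"&&":"<>"}`))) // {"\u0026\u0026":"\u003c\u003e"}
}
```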
Sonic encodes primitive objects (struct/map...) as compact-format JSON by default, except when marshaling `json.RawMessage` or `json.Marshaler`: sonic validates their output JSON but does NOT compact it, for performance reasons. We provide the option `encoder.CompactMarshaler` to add the compacting step.
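To make "compacting" concrete: it removes insignificant whitespace from JSON that is already valid. The sketch below shows the transformation with the standard library's `json.Compact`; it illustrates the effect only and makes no claim about how `encoder.CompactMarshaler` is implemented. `compact` is a hypothetical wrapper.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// compact strips insignificant whitespace from a valid JSON document.
func compact(src []byte) (string, error) {
	var buf bytes.Buffer
	if err := json.Compact(&buf, src); err != nil {
		return "", err // src was not valid JSON
	}
	return buf.String(), nil
}

func main() {
	// Imagine this is what a json.Marshaler or json.RawMessage returned.
	raw := []byte("{ \"a\" : [1, 2,\n 3] }")
	out, err := compact(raw)
	fmt.Println(out, err) // {"a":[1,2,3]} <nil>
}
```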
If there is invalid syntax in the input JSON, sonic returns `decoder.SyntaxError`, which supports pretty-printing of the error position.
```go
import "github.com/bytedance/sonic"
import "github.com/bytedance/sonic/decoder"

var data interface{}
err := sonic.UnmarshalString("[[[}]]", &data)
if err != nil {
	/* One line by default */
	println(err.Error()) // "Syntax error at index 3: invalid char\n\n\t[[[}]]\n\t...^..\n"
	/* Pretty print */
	if e, ok := err.(decoder.SyntaxError); ok {
		/* Syntax error at index 3: invalid char

			[[[}]]
			...^..
		*/
		print(e.Description())
	} else if me, ok := err.(*decoder.MismatchTypeError); ok {
		// decoder.MismatchTypeError is new to Sonic v1.6.0
		print(me.Description())
	}
}
```
If there is a mismatch-typed value for a given key, sonic reports `decoder.MismatchTypeError` (if there are many, it reports the last one), but still skips the wrong value and keeps decoding the rest of the JSON.
```go
import "github.com/bytedance/sonic"
import "github.com/bytedance/sonic/decoder"

var data = struct {
	A int
	B int
}{}
err := sonic.UnmarshalString(`{"A":"1","B":1}`, &data)
println(err.Error()) // Mismatch type int with value string "at index 5: mismatched type with value\n\n\t{\"A\":\"1\",\"B\":1}\n\t.....^.........\n"
fmt.Printf("%+v", data) // {A:0 B:1}
```
Sonic/`ast.Node` is a completely self-contained AST for JSON. It implements both serialization and deserialization and provides robust APIs for getting and modifying generic data.
Search partial JSON by given paths, each of which must be a non-negative integer, a string, or nil.
```go
import "github.com/bytedance/sonic"

input := []byte(`{"key1":[{},{"key2":{"key3":[1,2,3]}}]}`)

// no path, returns entire json
root, err := sonic.Get(input)
raw := root.Raw() // == string(input)

// multiple paths
root, err = sonic.Get(input, "key1", 1, "key2")
sub := root.Get("key3").Index(2).Int64() // == 3
```
Tip: since `Index()` uses an offset to locate data, which is much faster than scanning like `Get()`, we suggest you use it as much as possible. Sonic also provides another API, `IndexOrGet()`, which uses the offset underneath while also ensuring the key matches.
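The idea behind an index-or-get lookup can be sketched on a plain slice of key/value pairs: try the positional guess first, and fall back to a key scan only if the key at that position does not match. This is an illustration of the concept only, not sonic's implementation; `pair` and `indexOrGet` are hypothetical names.

```go
package main

import "fmt"

// pair is a hypothetical stand-in for one key/value entry of a JSON object.
type pair struct {
	Key   string
	Value interface{}
}

// indexOrGet checks the entry at idx first and scans by key only on a miss.
func indexOrGet(obj []pair, idx int, key string) (interface{}, bool) {
	if idx >= 0 && idx < len(obj) && obj[idx].Key == key {
		return obj[idx].Value, true // fast path: O(1) offset lookup
	}
	for _, p := range obj { // slow path: O(n) scan
		if p.Key == key {
			return p.Value, true
		}
	}
	return nil, false
}

func main() {
	obj := []pair{{"a", 1}, {"b", 2}}
	fmt.Println(indexOrGet(obj, 1, "b")) // 2 true (offset hit)
	fmt.Println(indexOrGet(obj, 0, "b")) // 2 true (found by scan)
	fmt.Println(indexOrGet(obj, 0, "z")) // <nil> false
}
```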
`Searcher` provides some options for users to meet different needs:
```go
opts := ast.SearchOptions{ CopyReturn: true ... }
val, err := sonic.GetWithOptions(JSON, opts, "key")
```
- `CopyReturn`: Indicates the searcher to copy the result JSON string instead of referring to the input. This can help reduce memory usage if you cache the results.
- `ConcurrentRead`: Since `ast.Node` uses a lazy-load design, it doesn't support concurrent reads by default. If you want to read it concurrently, please specify this option.
- `ValidateJSON`: Indicates the searcher to validate the entire JSON. This option is enabled by default, which slows down the search a little.
Modify the JSON content with `Set()` / `Unset()`.
```go
import "github.com/bytedance/sonic"
import "github.com/bytedance/sonic/ast"

// Set
exist, err := root.Set("key4", ast.NewBool(true)) // exist == false
alias1 := root.Get("key4")
println(alias1.Valid()) // true
alias2 := root.Index(1)
println(alias1 == alias2) // true

// Unset
exist, err = root.UnsetByIndex(1) // exist == true
println(root.Get("key4").Check()) // "value not exist"
```
To encode `ast.Node` as JSON, use `MarshalJSON()` or `json.Marshal()` (you MUST pass the node's pointer).
```go
import (
	"encoding/json"
	"github.com/bytedance/sonic"
)

buf, err := root.MarshalJSON()
println(string(buf)) // {"key1":[{},{"key2":{"key3":[1,2,3]}}]}
exp, err := json.Marshal(&root) // WARN: use pointer
println(string(buf) == string(exp)) // true
```
- validation: `Check()`, `Error()`, `Valid()`, `Exist()`
- searching: `Index()`, `Get()`, `IndexPair()`, `IndexOrGet()`, `GetByPath()`
- go-type casting: `Int64()`, `Float64()`, `String()`, `Number()`, `Bool()`, `Map[UseNumber|UseNode]()`, `Array[UseNumber|UseNode]()`, `Interface[UseNumber|UseNode]()`
- go-type packing: `NewRaw()`, `NewNumber()`, `NewNull()`, `NewBool()`, `NewString()`, `NewObject()`, `NewArray()`
- iteration: `Values()`, `Properties()`, `ForEach()`, `SortKeys()`
- modification: `Set()`, `SetByIndex()`, `Add()`
Sonic provides an advanced API for fully parsing JSON into non-standard types (neither `struct` nor `map[string]interface{}`) without using any intermediate representation (`ast.Node` or `interface{}`). For example, you might have the following types, which are like `interface{}` but actually not `interface{}`:
```go
type UserNode interface{}

// the following types implement the UserNode interface.
type (
	UserNull    struct{}
	UserBool    struct{ Value bool }
	UserInt64   struct{ Value int64 }
	UserFloat64 struct{ Value float64 }
	UserString  struct{ Value string }
	UserObject  struct{ Value map[string]UserNode }
	UserArray   struct{ Value []UserNode }
)
```
Sonic provides the following API to return the preorder traversal of a JSON AST. `ast.Visitor` is a SAX-style interface, as used in some C++ JSON libraries. You should implement `ast.Visitor` yourself and pass it to the `ast.Preorder()` method. In your visitor you can build custom types to represent JSON values. Your visitor may need an O(n)-space container (such as a stack) to record the object / array hierarchy.
```go
func Preorder(str string, visitor Visitor, opts *VisitorOptions) error

type Visitor interface {
	OnNull() error
	OnBool(v bool) error
	OnString(v string) error
	OnInt64(v int64, n json.Number) error
	OnFloat64(v float64, n json.Number) error
	OnObjectBegin(capacity int) error
	OnObjectKey(key string) error
	OnObjectEnd() error
	OnArrayBegin(capacity int) error
	OnArrayEnd() error
}
```
See ast/visitor.go for detailed usage. We also implement a demo visitor for `UserNode` in ast/visitor_test.go.
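As a rough illustration of the stack-based approach described above, here is a minimal sketch of a visitor that assembles a `UserNode` tree. The `Visitor` interface and `UserNode` types are copied from the text so the sketch compiles on its own. The `builder` and `frame` types are hypothetical, and the events are replayed by hand instead of calling `ast.Preorder()`, so the event order shown is an assumption inferred from the method names. It is not the demo visitor from ast/visitor_test.go.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Visitor is copied from the interface quoted above so the sketch compiles alone.
type Visitor interface {
	OnNull() error
	OnBool(v bool) error
	OnString(v string) error
	OnInt64(v int64, n json.Number) error
	OnFloat64(v float64, n json.Number) error
	OnObjectBegin(capacity int) error
	OnObjectKey(key string) error
	OnObjectEnd() error
	OnArrayBegin(capacity int) error
	OnArrayEnd() error
}
type UserNode interface{}
type (
	UserNull    struct{}
	UserBool    struct{ Value bool }
	UserInt64   struct{ Value int64 }
	UserFloat64 struct{ Value float64 }
	UserString  struct{ Value string }
	UserObject  struct{ Value map[string]UserNode }
	UserArray   struct{ Value []UserNode }
)

// frame is one container that has been opened but not yet closed.
type frame struct {
	obj map[string]UserNode // non-nil while an object is open
	arr []UserNode          // used while an array is open
	key string              // most recent object key
}

// builder is a hypothetical visitor: a stack of open containers plus the root.
type builder struct {
	stack []frame
	root  UserNode
}

// put attaches v to the innermost open container, or makes it the root.
func (b *builder) put(v UserNode) error {
	if len(b.stack) == 0 {
		b.root = v
		return nil
	}
	top := &b.stack[len(b.stack)-1]
	if top.obj != nil {
		top.obj[top.key] = v
	} else {
		top.arr = append(top.arr, v)
	}
	return nil
}

// push opens a container; pop closes it and attaches it to its parent.
func (b *builder) push(f frame) error { b.stack = append(b.stack, f); return nil }
func (b *builder) pop() error {
	top := b.stack[len(b.stack)-1]
	b.stack = b.stack[:len(b.stack)-1]
	if top.obj != nil {
		return b.put(UserObject{top.obj})
	}
	return b.put(UserArray{top.arr})
}

func (b *builder) OnNull() error                            { return b.put(UserNull{}) }
func (b *builder) OnBool(v bool) error                      { return b.put(UserBool{v}) }
func (b *builder) OnString(v string) error                  { return b.put(UserString{v}) }
func (b *builder) OnInt64(v int64, _ json.Number) error     { return b.put(UserInt64{v}) }
func (b *builder) OnFloat64(v float64, _ json.Number) error { return b.put(UserFloat64{v}) }
func (b *builder) OnObjectKey(k string) error               { b.stack[len(b.stack)-1].key = k; return nil }
func (b *builder) OnObjectEnd() error                       { return b.pop() }
func (b *builder) OnArrayEnd() error                        { return b.pop() }
func (b *builder) OnObjectBegin(n int) error                { return b.push(frame{obj: make(map[string]UserNode, n)}) }
func (b *builder) OnArrayBegin(n int) error                 { return b.push(frame{arr: make([]UserNode, 0, n)}) }

// build replays by hand the events assumed for {"a":[1,true]}.
func build() UserNode {
	var v Visitor = &builder{}
	v.OnObjectBegin(1)
	v.OnObjectKey("a")
	v.OnArrayBegin(2)
	v.OnInt64(1, json.Number("1"))
	v.OnBool(true)
	v.OnArrayEnd()
	v.OnObjectEnd()
	return v.(*builder).root
}

func main() {
	arr := build().(UserObject).Value["a"].(UserArray)
	fmt.Println(len(arr.Value), arr.Value[0], arr.Value[1]) // 2 {1} {true}
}
```

The stack grows with nesting depth, which is the O(n)-space container the text mentions. A real visitor would also return errors for malformed event sequences instead of assuming they are well-formed.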
For developers who want to use sonic in different scenarios, we provide some integrated configs as `sonic.API`:
- `ConfigDefault`: sonic's default config (`EscapeHTML=false`, `SortKeys=false`...), which runs sonic fast while ensuring security.
- `ConfigStd`: the std-compatible config (`EscapeHTML=true`, `SortKeys=true`...).
- `ConfigFastest`: the fastest config (`NoQuoteTextMarshaler=true`), which runs sonic as fast as possible.
Sonic DOES NOT ensure support for all environments, due to the difficulty of developing high-performance code. In environments sonic does not support, the implementation falls back to `encoding/json`, so all the configs above behave like `ConfigStd`.
Since sonic uses golang-asm as a JIT assembler, which is NOT very suitable for runtime compiling, the first run of a huge schema may cause request timeouts or even process OOM. For better stability, we advise calling `Pretouch()` for huge-schema or compact-memory applications before `Marshal()` / `Unmarshal()`.
```go
import (
	"reflect"
	"github.com/bytedance/sonic"
	"github.com/bytedance/sonic/option"
)

func init() {
	var v HugeStruct

	// For most large types (nesting depth <= option.DefaultMaxInlineDepth)
	err := sonic.Pretouch(reflect.TypeOf(v))

	// with more CompileOption...
	err = sonic.Pretouch(reflect.TypeOf(v),
		// If the type is too deep nesting (nesting depth > option.DefaultMaxInlineDepth),
		// you can set compile recursive loops in Pretouch for better stability in JIT.
		option.WithCompileRecursiveDepth(loop),
		// For a large nested struct, try to set a smaller depth to reduce compiling time.
		option.WithCompileMaxInlineDepth(depth),
	)
}
```
When decoding string values without any escaped characters, sonic references them from the original JSON buffer instead of allocating a new buffer to copy them. This helps CPU performance a lot but may keep the whole JSON buffer in memory as long as the decoded objects are in use. In practice, we found that the extra memory introduced by referencing the JSON buffer is usually 20% ~ 80% of the decoded objects. Once an application holds these objects for a long time (for example, caching the decoded objects for reuse), its in-use memory on the server may go up.
- `Config.CopyString` / `decoder.CopyString()`: We provide this option for `Decode()` / `Unmarshal()` users who choose not to reference the JSON buffer, which may reduce CPU performance to some degree.
- `GetFromStringNoCopy()`: For memory safety, `sonic.Get()` / `sonic.GetFromString()` now copy the returned JSON. If you want to get JSON more quickly and do not care about memory usage, you can use `GetFromStringNoCopy()` to return JSON directly referenced from the source.
For alignment with `encoding/json`, we provide APIs that take `[]byte` as an argument, but for safety a string-to-bytes copy is performed at the same time, which may cost performance when the original JSON is huge. Therefore, you can use `UnmarshalString()` and `GetFromString()` to pass a string, as long as your original data is a string or a nocopy-cast is safe for your `[]byte`. We also provide the API `MarshalString()` for a convenient nocopy-cast of the encoded JSON `[]byte`, which is safe since sonic's output bytes are always duplicated and unique.
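For readers unfamiliar with the term, a nocopy-cast reinterprets a `[]byte` as a `string` (or the reverse) without duplicating the bytes. A minimal sketch with the standard `unsafe` package, assuming Go 1.20 or newer, is shown below; `bytesToString` and `stringToBytes` are hypothetical helpers, not sonic functions.

```go
package main

import (
	"fmt"
	"unsafe"
)

// bytesToString reinterprets b as a string without copying.
// The caller must not modify b for as long as the string is in use.
func bytesToString(b []byte) string {
	return unsafe.String(unsafe.SliceData(b), len(b))
}

// stringToBytes reinterprets s as a []byte without copying.
// The result must be treated as read-only.
func stringToBytes(s string) []byte {
	return unsafe.Slice(unsafe.StringData(s), len(s))
}

func main() {
	b := []byte(`{"a":1}`)
	s := bytesToString(b)
	fmt.Println(s, len(stringToBytes(s)))

	// Breaking the contract: writing to b also changes what s refers to,
	// which is why the cast is only safe when nobody else mutates the bytes.
	b[2] = 'z'
	fmt.Println(s)
}
```

This is why the cast is described as safe only when the bytes are unique to the caller: nothing else can mutate them after the conversion.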
To ensure data security, `sonic.Encoder` quotes and escapes string values from `encoding.TextMarshaler` interfaces by default, which may degrade performance considerably if most of your data comes in that form. We provide `encoder.NoQuoteTextMarshaler` to skip these operations, which means you MUST ensure their output strings are escaped and quoted following RFC8259.
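To see what the default quoting and escaping protects against, the sketch below marshals a `TextMarshaler` whose text contains a double quote, using the standard library's `encoding/json` for illustration. `Label` and `encode` are hypothetical names, and sonic is not called here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Label is a hypothetical type whose text form may contain characters
// that are not legal inside a raw JSON string.
type Label string

// MarshalText returns the raw text, unquoted and unescaped.
func (l Label) MarshalText() ([]byte, error) { return []byte(l), nil }

// encode marshals l; the encoder quotes and escapes the MarshalText output.
func encode(l Label) string {
	out, err := json.Marshal(l)
	if err != nil {
		return ""
	}
	return string(out)
}

func main() {
	fmt.Println(encode(Label(`say "hi"`))) // "say \"hi\""
}
```

If the quoting step were skipped, `MarshalText` itself would have to return the already quoted and escaped form, otherwise the surrounding document would no longer be valid JSON.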
In a fully-parsed scenario, `Unmarshal()` performs better than `Get()` + `Node.Interface()`. But if you only have part of the schema for a specific JSON, you can combine `Get()` and `Unmarshal()`:
```go
import "github.com/bytedance/sonic"

node, err := sonic.GetFromString(_TwitterJson, "statuses", 3, "user")
var user User // your partial schema...
err = sonic.UnmarshalString(node.Raw(), &user)
```
Even if you don't have any schema, use `ast.Node` as the container of generic values instead of `map` or `interface`:
```go
import "github.com/bytedance/sonic"

root, err := sonic.GetFromString(_TwitterJson)
user := root.GetByPath("statuses", 3, "user") // === root.Get("statuses").Index(3).Get("user")
err = user.Check()

// err = user.LoadAll() // only call this when you want to use 'user' concurrently...
go someFunc(user)
```
Why? Because `ast.Node` stores its children using an array:
- An array's performance is much better than a map's when inserting (deserializing) and scanning (serializing) data;
- Hashing (`map[x]`) is not as efficient as indexing (`array[x]`), which `ast.Node` can perform on both arrays and objects;
- Using `Interface()` / `Map()` means sonic must parse all the underlying values, while `ast.Node` can parse them on demand.
CAUTION: `ast.Node` DOESN'T ensure concurrent safety directly, due to its lazy-load design. However, you can call `Node.Load()` / `Node.LoadAll()` to achieve that, which may reduce performance, though it still works faster than converting to `map` or `interface{}`.
For generic data, `ast.Node` should be enough for your needs in most cases.
However, `ast.Node` is designed for partially processing a JSON string. It has some special designs, such as lazy-load, which might not be suitable for directly parsing the whole JSON string like `Unmarshal()`. Although `ast.Node` is better than `map` or `interface{}`, it is still a kind of intermediate representation if your final types are customized and you have to convert the above types to your custom types after parsing.
For better performance in the previous case, `ast.Visitor` is the better choice. It performs JSON decoding like `Unmarshal()`, and you can directly use your final types to represent a JSON AST without any intermediate representation.
But `ast.Visitor` is not a very handy API. You might need to write a lot of code to implement your visitor and carefully maintain the tree hierarchy during decoding. Please read the comments in ast/visitor.go carefully if you decide to use this API.
Sonic uses memory pools in many places, such as `encoder.Encode` and `ast.Node.MarshalJSON`, to improve performance, which may increase in-use memory when the server's load is high. See issue 614. Therefore, we introduced some options to let users control the behavior of the memory pool. See the option package.
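The general trade-off of buffer pooling can be sketched with the standard library's `sync.Pool`. The text does not say how sonic's pools are implemented, so `sync.Pool`, the `maxPooledCap` threshold, and `encodePooled` are all assumptions made for illustration; the option package should be consulted for sonic's actual knobs.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"sync"
)

// bufPool recycles encode buffers so hot paths avoid repeated allocation.
var bufPool = sync.Pool{New: func() interface{} { return new(bytes.Buffer) }}

// maxPooledCap is a hypothetical cap: larger buffers are dropped instead of
// pooled, so one huge payload does not stay resident in the pool.
const maxPooledCap = 64 << 10

// encodePooled marshals v using a pooled buffer and returns a private copy.
func encodePooled(v interface{}) (string, error) {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset()
	err := json.NewEncoder(buf).Encode(v)
	out := buf.String() // copies the bytes before the buffer is reused
	if buf.Cap() <= maxPooledCap {
		bufPool.Put(buf)
	}
	return out, err
}

func main() {
	out, err := encodePooled(map[string]int{"a": 1})
	fmt.Printf("%q %v\n", out, err)
}
```

Without a cap like this, buffers that grew under peak load would remain in the pool, which is the kind of in-use memory growth the paragraph above describes.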
For security, sonic uses an FSM algorithm to validate JSON when decoding raw JSON or encoding `json.Marshaler`, which is much slower (1~10x) than the SIMD-searching-pair algorithm. If you have many redundant JSON values and DO NOT NEED to strictly validate JSON correctness, you can enable the options below:
- `Config.NoValidateSkipJSON`: for faster skipping of JSON when decoding, such as unknown fields, json.Unmarshaler (json.RawMessage), mismatched values, and redundant array elements
- `Config.NoValidateJSONMarshaler`: avoids validating JSON when encoding `json.Marshaler`
- `SearchOptions.ValidateJSON`: indicates whether to validate the located JSON value when calling `Get`
tidwall/gjson provides a comprehensive and popular JSON-Path API, and a lot of older code relies heavily on it. Therefore, we provide a wrapper library, which combines gjson's API with sonic's SIMD algorithm to boost performance. See cloudwego/gjson.
Sonic is a subproject of CloudWeGo. We are committed to building a cloud native ecosystem.