bytedance/sonic


A blazingly fast JSON serializing & deserializing library, accelerated by JIT (just-in-time compiling) and SIMD (single-instruction-multiple-data).

Requirement

  • Go: 1.17~1.24
    • Notice: Go 1.24.0 is not supported due to the issue; please use a higher Go version or add the build flag --ldflags="-checklinkname=0"
  • OS: Linux / MacOS / Windows
  • CPU: AMD64 / ARM64 (ARM64 requires Go 1.20 or above)

Features

  • Runtime object binding without code generation
  • Complete APIs for JSON value manipulation
  • Fast, fast, fast!

APIs

See go.dev

Benchmarks

For all sizes of JSON and all scenarios of usage, Sonic performs best.

  • Medium (13KB, 300+ key, 6 layers)
    goversion: 1.17.1
    goos: darwin
    goarch: amd64
    cpu: Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
    BenchmarkEncoder_Generic_Sonic-16  32393 ns/op  402.40 MB/s  11965 B/op  4 allocs/op
    BenchmarkEncoder_Generic_Sonic_Fast-16  21668 ns/op  601.57 MB/s  10940 B/op  4 allocs/op
    BenchmarkEncoder_Generic_JsonIter-16  42168 ns/op  309.12 MB/s  14345 B/op  115 allocs/op
    BenchmarkEncoder_Generic_GoJson-16  65189 ns/op  199.96 MB/s  23261 B/op  16 allocs/op
    BenchmarkEncoder_Generic_StdLib-16  106322 ns/op  122.60 MB/s  49136 B/op  789 allocs/op
    BenchmarkEncoder_Binding_Sonic-16  6269 ns/op  2079.26 MB/s  14173 B/op  4 allocs/op
    BenchmarkEncoder_Binding_Sonic_Fast-16  5281 ns/op  2468.16 MB/s  12322 B/op  4 allocs/op
    BenchmarkEncoder_Binding_JsonIter-16  20056 ns/op  649.93 MB/s  9488 B/op  2 allocs/op
    BenchmarkEncoder_Binding_GoJson-16  8311 ns/op  1568.32 MB/s  9481 B/op  1 allocs/op
    BenchmarkEncoder_Binding_StdLib-16  16448 ns/op  792.52 MB/s  9479 B/op  1 allocs/op
    BenchmarkEncoder_Parallel_Generic_Sonic-16  6681 ns/op  1950.93 MB/s  12738 B/op  4 allocs/op
    BenchmarkEncoder_Parallel_Generic_Sonic_Fast-16  4179 ns/op  3118.99 MB/s  10757 B/op  4 allocs/op
    BenchmarkEncoder_Parallel_Generic_JsonIter-16  9861 ns/op  1321.84 MB/s  14362 B/op  115 allocs/op
    BenchmarkEncoder_Parallel_Generic_GoJson-16  18850 ns/op  691.52 MB/s  23278 B/op  16 allocs/op
    BenchmarkEncoder_Parallel_Generic_StdLib-16  45902 ns/op  283.97 MB/s  49174 B/op  789 allocs/op
    BenchmarkEncoder_Parallel_Binding_Sonic-16  1480 ns/op  8810.09 MB/s  13049 B/op  4 allocs/op
    BenchmarkEncoder_Parallel_Binding_Sonic_Fast-16  1209 ns/op  10785.23 MB/s  11546 B/op  4 allocs/op
    BenchmarkEncoder_Parallel_Binding_JsonIter-16  6170 ns/op  2112.58 MB/s  9504 B/op  2 allocs/op
    BenchmarkEncoder_Parallel_Binding_GoJson-16  3321 ns/op  3925.52 MB/s  9496 B/op  1 allocs/op
    BenchmarkEncoder_Parallel_Binding_StdLib-16  3739 ns/op  3486.49 MB/s  9480 B/op  1 allocs/op
    BenchmarkDecoder_Generic_Sonic-16  66812 ns/op  195.10 MB/s  57602 B/op  723 allocs/op
    BenchmarkDecoder_Generic_Sonic_Fast-16  54523 ns/op  239.07 MB/s  49786 B/op  313 allocs/op
    BenchmarkDecoder_Generic_StdLib-16  124260 ns/op  104.90 MB/s  50869 B/op  772 allocs/op
    BenchmarkDecoder_Generic_JsonIter-16  91274 ns/op  142.81 MB/s  55782 B/op  1068 allocs/op
    BenchmarkDecoder_Generic_GoJson-16  88569 ns/op  147.17 MB/s  66367 B/op  973 allocs/op
    BenchmarkDecoder_Binding_Sonic-16  32557 ns/op  400.38 MB/s  28302 B/op  137 allocs/op
    BenchmarkDecoder_Binding_Sonic_Fast-16  28649 ns/op  455.00 MB/s  24999 B/op  34 allocs/op
    BenchmarkDecoder_Binding_StdLib-16  111437 ns/op  116.97 MB/s  10576 B/op  208 allocs/op
    BenchmarkDecoder_Binding_JsonIter-16  35090 ns/op  371.48 MB/s  14673 B/op  385 allocs/op
    BenchmarkDecoder_Binding_GoJson-16  28738 ns/op  453.59 MB/s  22039 B/op  49 allocs/op
    BenchmarkDecoder_Parallel_Generic_Sonic-16  12321 ns/op  1057.91 MB/s  57233 B/op  723 allocs/op
    BenchmarkDecoder_Parallel_Generic_Sonic_Fast-16  10644 ns/op  1224.64 MB/s  49362 B/op  313 allocs/op
    BenchmarkDecoder_Parallel_Generic_StdLib-16  57587 ns/op  226.35 MB/s  50874 B/op  772 allocs/op
    BenchmarkDecoder_Parallel_Generic_JsonIter-16  38666 ns/op  337.12 MB/s  55789 B/op  1068 allocs/op
    BenchmarkDecoder_Parallel_Generic_GoJson-16  30259 ns/op  430.79 MB/s  66370 B/op  974 allocs/op
    BenchmarkDecoder_Parallel_Binding_Sonic-16  5965 ns/op  2185.28 MB/s  27747 B/op  137 allocs/op
    BenchmarkDecoder_Parallel_Binding_Sonic_Fast-16  5170 ns/op  2521.31 MB/s  24715 B/op  34 allocs/op
    BenchmarkDecoder_Parallel_Binding_StdLib-16  27582 ns/op  472.58 MB/s  10576 B/op  208 allocs/op
    BenchmarkDecoder_Parallel_Binding_JsonIter-16  13571 ns/op  960.51 MB/s  14685 B/op  385 allocs/op
    BenchmarkDecoder_Parallel_Binding_GoJson-16  10031 ns/op  1299.51 MB/s  22111 B/op  49 allocs/op
    BenchmarkGetOne_Sonic-16  3276 ns/op  3975.78 MB/s  24 B/op  1 allocs/op
    BenchmarkGetOne_Gjson-16  9431 ns/op  1380.81 MB/s  0 B/op  0 allocs/op
    BenchmarkGetOne_Jsoniter-16  51178 ns/op  254.46 MB/s  27936 B/op  647 allocs/op
    BenchmarkGetOne_Parallel_Sonic-16  216.7 ns/op  60098.95 MB/s  24 B/op  1 allocs/op
    BenchmarkGetOne_Parallel_Gjson-16  1076 ns/op  12098.62 MB/s  0 B/op  0 allocs/op
    BenchmarkGetOne_Parallel_Jsoniter-16  17741 ns/op  734.06 MB/s  27945 B/op  647 allocs/op
    BenchmarkSetOne_Sonic-16  9571 ns/op  1360.61 MB/s  1584 B/op  17 allocs/op
    BenchmarkSetOne_Sjson-16  36456 ns/op  357.22 MB/s  52180 B/op  9 allocs/op
    BenchmarkSetOne_Jsoniter-16  79475 ns/op  163.86 MB/s  45862 B/op  964 allocs/op
    BenchmarkSetOne_Parallel_Sonic-16  850.9 ns/op  15305.31 MB/s  1584 B/op  17 allocs/op
    BenchmarkSetOne_Parallel_Sjson-16  18194 ns/op  715.77 MB/s  52247 B/op  9 allocs/op
    BenchmarkSetOne_Parallel_Jsoniter-16  33560 ns/op  388.05 MB/s  45892 B/op  964 allocs/op
    BenchmarkLoadNode/LoadAll()-16  11384 ns/op  1143.93 MB/s  6307 B/op  25 allocs/op
    BenchmarkLoadNode_Parallel/LoadAll()-16  5493 ns/op  2370.68 MB/s  7145 B/op  25 allocs/op
    BenchmarkLoadNode/Interface()-16  17722 ns/op  734.85 MB/s  13323 B/op  88 allocs/op
    BenchmarkLoadNode_Parallel/Interface()-16  10330 ns/op  1260.70 MB/s  15178 B/op  88 allocs/op
  • Small (400B, 11 keys, 3 layers): small benchmarks
  • Large (635KB, 10000+ key, 6 layers): large benchmarks

See bench.sh for benchmark codes.

How it works

See INTRODUCTION.md.

Usage

Marshal/Unmarshal

Default behaviors are mostly consistent with encoding/json, except for the HTML escaping form (see Escape HTML) and the SortKeys feature (optional support, see Sort Keys), which is NOT in conformity with RFC8259.

    import "github.com/bytedance/sonic"

    var data YourSchema
    // Marshal
    output, err := sonic.Marshal(&data)
    // Unmarshal
    err = sonic.Unmarshal(output, &data)

Streaming IO

Sonic supports decoding JSON from an io.Reader or encoding objects into an io.Writer. This aims at handling multiple values as well as reducing memory consumption.

  • encoder

    var o1 = map[string]interface{}{
        "a": "b",
    }
    var o2 = 1
    var w = bytes.NewBuffer(nil)
    var enc = sonic.ConfigDefault.NewEncoder(w)
    enc.Encode(o1)
    enc.Encode(o2)
    fmt.Println(w.String())
    // Output:
    // {"a":"b"}
    // 1

  • decoder

    var o = map[string]interface{}{}
    var r = strings.NewReader(`{"a":"b"}{"1":"2"}`)
    var dec = sonic.ConfigDefault.NewDecoder(r)
    dec.Decode(&o)
    dec.Decode(&o)
    fmt.Printf("%+v", o)
    // Output:
    // map[1:2 a:b]

Use Number/Use Int64

    import (
        "encoding/json"

        "github.com/bytedance/sonic"
        "github.com/bytedance/sonic/decoder"
    )

    var input = `1`
    var data interface{}

    // default float64
    dc := decoder.NewDecoder(input)
    dc.Decode(&data) // data == float64(1)

    // use json.Number
    dc = decoder.NewDecoder(input)
    dc.UseNumber()
    dc.Decode(&data) // data == json.Number("1")

    // use int64
    dc = decoder.NewDecoder(input)
    dc.UseInt64()
    dc.Decode(&data) // data == int64(1)

    root, err := sonic.GetFromString(input)

    // Get json.Number
    jn := root.Number()
    jm := root.InterfaceUseNumber().(json.Number) // jn == jm

    // Get float64
    fn := root.Float64()
    fm := root.Interface().(float64) // fn == fm

Sort Keys

On account of the performance loss from sorting (roughly 10%), sonic doesn't enable this feature by default. If your component depends on it to work (like zstd), use it like this:

    import "github.com/bytedance/sonic"
    import "github.com/bytedance/sonic/encoder"

    // Binding map only
    m := map[string]interface{}{}
    v, err := encoder.Encode(m, encoder.SortMapKeys)

    // Or ast.Node.SortKeys() before marshal
    root, err := sonic.Get(JSON)
    err = root.SortKeys()
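Components that hash, diff, or compress encoded JSON usually need byte-identical output for equal maps, which is what sorted keys provide. The sketch below illustrates the property with encoding/json, which always writes map keys in sorted order; it does not call sonic, and digest is a hypothetical helper.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// digest hashes the JSON encoding of m. encoding/json writes map keys in
// sorted order, so equal maps always produce the same bytes and the same hash.
func digest(m map[string]int) (string, error) {
	out, err := json.Marshal(m)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(out)
	return hex.EncodeToString(sum[:]), nil
}

func main() {
	a := map[string]int{"x": 1, "y": 2, "z": 3}
	b := map[string]int{"z": 3, "y": 2, "x": 1}
	da, _ := digest(a)
	db, _ := digest(b)
	fmt.Println(da == db)
}
```

Without sorted keys, Go's randomized map iteration order could produce different bytes for the same map from one run to the next.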

Escape HTML

On account of the performance loss (roughly 15%), sonic doesn't enable this feature by default. You can use the encoder.EscapeHTML option to enable it (aligned with encoding/json.HTMLEscape).

    import "github.com/bytedance/sonic/encoder"

    v := map[string]string{"&&": "<>"}
    ret, err := encoder.Encode(v, encoder.EscapeHTML) // ret == `{"\u0026\u0026":"\u003c\u003e"}`
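For comparison, this is the behavior the option aligns with. The sketch uses the standard library's Encoder.SetEscapeHTML to produce both forms of the same map; the helper name encode is hypothetical and sonic itself is not called.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// encode marshals v with encoding/json, toggling HTML escaping.
func encode(v interface{}, escapeHTML bool) (string, error) {
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf)
	enc.SetEscapeHTML(escapeHTML)
	if err := enc.Encode(v); err != nil {
		return "", err
	}
	// Encode appends a trailing newline; trim it for comparison.
	return string(bytes.TrimRight(buf.Bytes(), "\n")), nil
}

func main() {
	v := map[string]string{"&&": "<>"}
	escaped, _ := encode(v, true)
	plain, _ := encode(v, false)
	fmt.Println(escaped) // {"\u0026\u0026":"\u003c\u003e"}
	fmt.Println(plain)   // {"&&":"<>"}
}
```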

Compact Format

Sonic encodes primitive objects (struct/map...) as compact-format JSON by default, except when marshaling json.RawMessage or json.Marshaler: sonic validates their output JSON but does NOT compact it, for performance reasons. We provide the option encoder.CompactMarshaler to add the compacting step.

Print Error

If there is invalid syntax in the input JSON, sonic will return decoder.SyntaxError, which supports pretty-printing of the error position:

    import "github.com/bytedance/sonic"
    import "github.com/bytedance/sonic/decoder"

    var data interface{}
    err := sonic.UnmarshalString("[[[}]]", &data)
    if err != nil {
        /* One line by default */
        println(err.Error()) // "Syntax error at index 3: invalid char\n\n\t[[[}]]\n\t...^..\n"
        /* Pretty print */
        if e, ok := err.(decoder.SyntaxError); ok {
            /* Syntax error at index 3: invalid char

                [[[}]]
                ...^..
            */
            print(e.Description())
        } else if me, ok := err.(*decoder.MismatchTypeError); ok {
            // decoder.MismatchTypeError is new to Sonic v1.6.0
            print(me.Description())
        }
    }

Mismatched Types [Sonic v1.6.0]

If there is a mismatch-typed value for a given key, sonic will report decoder.MismatchTypeError (if there are many, it reports the last one), but it still skips the wrong value and keeps decoding the rest of the JSON.

    import "github.com/bytedance/sonic"
    import "github.com/bytedance/sonic/decoder"

    var data = struct {
        A int
        B int
    }{}
    err := sonic.UnmarshalString(`{"A":"1","B":1}`, &data)
    println(err.Error()) // Mismatch type int with value string "at index 5: mismatched type with value\n\n\t{\"A\":\"1\",\"B\":1}\n\t.....^.........\n"
    fmt.Printf("%+v", data) // {A:0 B:1}
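The standard library follows the same skip-and-continue approach, which makes the behavior easy to try without sonic: encoding/json leaves the mismatched field at its zero value, fills the remaining fields, and returns a *json.UnmarshalTypeError (it reports the earliest mismatch, whereas the text above says sonic reports the last). The types and helper in this sketch are hypothetical.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// pair mirrors the struct used in the example above.
type pair struct {
	A int
	B int
}

// decodePair unmarshals input and reports whether a type mismatch occurred.
func decodePair(input string) (pair, bool) {
	var p pair
	err := json.Unmarshal([]byte(input), &p)
	var typeErr *json.UnmarshalTypeError
	return p, errors.As(err, &typeErr)
}

func main() {
	p, mismatched := decodePair(`{"A":"1","B":1}`)
	fmt.Printf("%+v %v\n", p, mismatched) // {A:0 B:1} true
}
```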

Ast.Node

Sonic/ast.Node is a completely self-contained AST for JSON. It implements both serialization and deserialization, and provides robust APIs for obtaining and modifying generic data.

Get/Index

Search partial JSON by the given paths, each of which must be a non-negative integer, a string, or nil.

    import "github.com/bytedance/sonic"

    input := []byte(`{"key1":[{},{"key2":{"key3":[1,2,3]}}]}`)

    // no path, returns entire json
    root, err := sonic.Get(input)
    raw := root.Raw() // == string(input)

    // multiple paths
    root, err = sonic.Get(input, "key1", 1, "key2")
    sub := root.Get("key3").Index(2).Int64() // == 3

Tip: since Index() uses an offset to locate data, which is much faster than scanning like Get(), we suggest you use it as much as possible. Sonic also provides another API, IndexOrGet(), which uses the offset underneath while also ensuring the key is matched.

SearchOption

The searcher provides some options for users to meet different needs:

    opts := ast.SearchOption{CopyReturn: true ...}
    val, err := sonic.GetWithOptions(JSON, opts, "key")

  • CopyReturn: tells the searcher to copy the result JSON string instead of referencing the input. This can help reduce memory usage if you cache the results.
  • ConcurentRead: since ast.Node uses a lazy-load design, it doesn't support concurrent reads by default. If you want to read it concurrently, specify this option.
  • ValidateJSON: tells the searcher to validate the entire JSON. This option is enabled by default, which slows down the search a little.

Set/Unset

Modify the JSON content with Set()/Unset():

    import "github.com/bytedance/sonic"
    import "github.com/bytedance/sonic/ast"

    // Set
    exist, err := root.Set("key4", ast.NewBool(true)) // exist == false
    alias1 := root.Get("key4")
    println(alias1.Valid()) // true
    alias2 := root.Index(1)
    println(alias1 == alias2) // true

    // Unset
    exist, err = root.UnsetByIndex(1) // exist == true
    println(root.Get("key4").Check()) // "value not exist"

Serialize

To encode ast.Node as JSON, use MarshalJSON() or json.Marshal() (you MUST pass the node's pointer).

    import (
        "encoding/json"

        "github.com/bytedance/sonic"
    )

    buf, err := root.MarshalJSON()
    println(string(buf)) // {"key1":[{},{"key2":{"key3":[1,2,3]}}]}
    exp, err := json.Marshal(&root) // WARN: use pointer
    println(string(buf) == string(exp)) // true

APIs

  • validation: Check(), Error(), Valid(), Exist()
  • searching: Index(), Get(), IndexPair(), IndexOrGet(), GetByPath()
  • go-type casting: Int64(), Float64(), String(), Number(), Bool(), Map[UseNumber|UseNode](), Array[UseNumber|UseNode](), Interface[UseNumber|UseNode]()
  • go-type packing: NewRaw(), NewNumber(), NewNull(), NewBool(), NewString(), NewObject(), NewArray()
  • iteration: Values(), Properties(), ForEach(), SortKeys()
  • modification: Set(), SetByIndex(), Add()

Ast.Visitor

Sonic provides an advanced API for fully parsing JSON into non-standard types (neither struct nor map[string]interface{}) without using any intermediate representation (ast.Node or interface{}). For example, you might have the following types, which resemble interface{} but are not actually interface{}:

    type UserNode interface{}

    // the following types implement the UserNode interface.
    type (
        UserNull    struct{}
        UserBool    struct{ Value bool }
        UserInt64   struct{ Value int64 }
        UserFloat64 struct{ Value float64 }
        UserString  struct{ Value string }
        UserObject  struct{ Value map[string]UserNode }
        UserArray   struct{ Value []UserNode }
    )

Sonic provides the following API to return the preorder traversal of a JSON AST. The ast.Visitor is a SAX-style interface, as used in some C++ JSON libraries. You should implement ast.Visitor yourself and pass it to the ast.Preorder() method. In your visitor you can use your custom types to represent JSON values. Your visitor may need an O(n)-space container (such as a stack) to record the object/array hierarchy.

    func Preorder(str string, visitor Visitor, opts *VisitorOptions) error

    type Visitor interface {
        OnNull() error
        OnBool(v bool) error
        OnString(v string) error
        OnInt64(v int64, n json.Number) error
        OnFloat64(v float64, n json.Number) error
        OnObjectBegin(capacity int) error
        OnObjectKey(key string) error
        OnObjectEnd() error
        OnArrayBegin(capacity int) error
        OnArrayEnd() error
    }

See ast/visitor.go for detailed usage. We also implement a demo visitor for UserNode in ast/visitor_test.go.
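To show how a stack records the object/array hierarchy, here is a minimal sketch of a visitor that builds plain Go values. It is not the demo from ast/visitor_test.go: the Visitor interface is redeclared locally from the signature quoted above so the sketch compiles without sonic, the events are replayed by hand instead of coming from ast.Preorder(), and the frame, builder, and replay names are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Visitor is a local copy of the interface quoted above.
type Visitor interface {
	OnNull() error
	OnBool(v bool) error
	OnString(v string) error
	OnInt64(v int64, n json.Number) error
	OnFloat64(v float64, n json.Number) error
	OnObjectBegin(capacity int) error
	OnObjectKey(key string) error
	OnObjectEnd() error
	OnArrayBegin(capacity int) error
	OnArrayEnd() error
}

// frame is one open container on the stack.
type frame struct {
	obj map[string]interface{} // non-nil for objects
	arr []interface{}          // used for arrays
	key string                 // most recent object key
}

// builder assembles plain Go values from visitor events.
type builder struct {
	stack []*frame
	root  interface{}
}

var _ Visitor = (*builder)(nil) // compile-time interface check

// add attaches v to the innermost open container, or stores it as the root.
func (b *builder) add(v interface{}) error {
	if len(b.stack) == 0 {
		b.root = v
		return nil
	}
	top := b.stack[len(b.stack)-1]
	if top.obj != nil {
		top.obj[top.key] = v
	} else {
		top.arr = append(top.arr, v)
	}
	return nil
}

// pop closes the innermost container and attaches it to its parent.
func (b *builder) pop() error {
	if len(b.stack) == 0 {
		return fmt.Errorf("unbalanced end event")
	}
	top := b.stack[len(b.stack)-1]
	b.stack = b.stack[:len(b.stack)-1]
	if top.obj != nil {
		return b.add(top.obj)
	}
	return b.add(top.arr)
}

func (b *builder) OnNull() error                            { return b.add(nil) }
func (b *builder) OnBool(v bool) error                      { return b.add(v) }
func (b *builder) OnString(v string) error                  { return b.add(v) }
func (b *builder) OnInt64(v int64, _ json.Number) error     { return b.add(v) }
func (b *builder) OnFloat64(v float64, _ json.Number) error { return b.add(v) }
func (b *builder) OnObjectEnd() error                       { return b.pop() }
func (b *builder) OnArrayEnd() error                        { return b.pop() }

func (b *builder) OnObjectBegin(_ int) error {
	b.stack = append(b.stack, &frame{obj: map[string]interface{}{}})
	return nil
}

func (b *builder) OnArrayBegin(_ int) error {
	b.stack = append(b.stack, &frame{arr: []interface{}{}})
	return nil
}

func (b *builder) OnObjectKey(key string) error {
	if len(b.stack) == 0 {
		return fmt.Errorf("key outside of an object")
	}
	b.stack[len(b.stack)-1].key = key
	return nil
}

// replay feeds the events for {"a":[1,null]} by hand and re-encodes the result.
func replay() (string, error) {
	b := &builder{}
	steps := []error{
		b.OnObjectBegin(1), b.OnObjectKey("a"), b.OnArrayBegin(2),
		b.OnInt64(1, "1"), b.OnNull(), b.OnArrayEnd(), b.OnObjectEnd(),
	}
	for _, err := range steps {
		if err != nil {
			return "", err
		}
	}
	out, err := json.Marshal(b.root)
	return string(out), err
}

func main() {
	fmt.Println(replay())
}
```

The stack holds one frame per open container, so its depth equals the nesting depth of the JSON at any point during the traversal.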

Compatibility

For developers who want to use sonic in different scenarios, we provide some integrated configs as sonic.API:

  • ConfigDefault: sonic's default config (EscapeHTML=false, SortKeys=false...), which runs sonic fast while ensuring security.
  • ConfigStd: the std-compatible config (EscapeHTML=true, SortKeys=true...).
  • ConfigFastest: the fastest config (NoQuoteTextMarshaler=true), which runs sonic as fast as possible.

Sonic DOES NOT ensure support for all environments, due to the difficulty of developing high-performance code. In environments that sonic does not support, the implementation falls back to encoding/json, so all of the configs above behave like ConfigStd.

Tips

Pretouch

Since Sonic uses golang-asm as a JIT assembler, which is NOT very suitable for runtime compiling, the first run on a huge schema may cause a request timeout or even a process OOM. For better stability, we advise using Pretouch() for huge-schema or compact-memory applications before Marshal()/Unmarshal().

    import (
        "reflect"

        "github.com/bytedance/sonic"
        "github.com/bytedance/sonic/option"
    )

    func init() {
        var v HugeStruct

        // For most large types (nesting depth <= option.DefaultMaxInlineDepth)
        err := sonic.Pretouch(reflect.TypeOf(v))

        // with more CompileOption...
        err = sonic.Pretouch(reflect.TypeOf(v),
            // If the type is too deep nesting (nesting depth > option.DefaultMaxInlineDepth),
            // you can set compile recursive loops in Pretouch for better stability in JIT.
            option.WithCompileRecursiveDepth(loop),
            // For a large nested struct, try to set a smaller depth to reduce compiling time.
            option.WithCompileMaxInlineDepth(depth),
        )
    }

Copy string

When decoding string values without any escaped characters, sonic references them from the original JSON buffer instead of allocating a new buffer to copy into. This helps CPU performance a lot but may keep the whole JSON buffer in memory as long as the decoded objects are in use. In practice, we found the extra memory introduced by referencing the JSON buffer is usually 20% ~ 80% of the decoded objects. Once an application holds these objects for a long time (for example, caching the decoded objects for reuse), its in-use memory on the server may go up.

  • Config.CopyString/decoder.CopyString(): We provide this option for Decode()/Unmarshal() users to choose not to reference the JSON buffer, which may reduce CPU performance to some degree.
  • GetFromStringNoCopy(): For memory safety, sonic.Get()/sonic.GetFromString() now copy the returned JSON. If you want to get JSON more quickly and do not care about memory usage, you can use GetFromStringNoCopy() to return JSON directly referenced from the source.

Pass string or []byte?

For alignment with encoding/json, we provide APIs that take []byte as an argument, but for safety the string-to-bytes copy is conducted at the same time, which may cost performance when the original JSON is huge. Therefore, you can use UnmarshalString() and GetFromString() to pass a string, as long as your original data is a string or a nocopy-cast is safe for your []byte. We also provide the API MarshalString() for a convenient nocopy-cast of the encoded JSON []byte, which is safe since sonic's output bytes are always duplicated and unique.
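Whether a nocopy-cast is "safe for your []byte" comes down to aliasing: the resulting string shares memory with the slice, so any later write to the slice changes the string. A minimal sketch, assuming Go 1.20+ for unsafe.String; bytesToString and aliasDemo are hypothetical helpers and not part of sonic:

```go
package main

import (
	"fmt"
	"unsafe"
)

// bytesToString reinterprets b as a string without copying (Go 1.20+).
// The result aliases b, so it is only safe if b is never modified afterwards.
func bytesToString(b []byte) string {
	if len(b) == 0 {
		return ""
	}
	return unsafe.String(&b[0], len(b))
}

// aliasDemo returns a copied and an aliased view taken before the buffer is mutated.
func aliasDemo() (copied, aliased string) {
	buf := []byte(`{"a":1}`)
	copied = string(buf)         // independent copy
	aliased = bytesToString(buf) // shares memory with buf
	buf[2] = 'b'                 // mutate the source buffer
	return copied, aliased
}

func main() {
	copied, aliased := aliasDemo()
	fmt.Println(copied)  // {"a":1}
	fmt.Println(aliased) // {"b":1}
}
```

If the buffer is reused (for example, returned to a pool or overwritten by the next read), prefer the copying APIs.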

Accelerate encoding.TextMarshaler

To ensure data security, sonic.Encoder quotes and escapes string values from encoding.TextMarshaler interfaces by default, which may degrade performance considerably if most of your data takes that form. We provide encoder.NoQuoteTextMarshaler to skip these operations, which means you MUST ensure their output strings are escaped and quoted following RFC8259.
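To see what "quotes and escapes" means for a TextMarshaler, the sketch below uses encoding/json, which applies the same default treatment: the text returned by MarshalText is wrapped in quotes and its special characters are escaped. The label type and encodeLabel helper are hypothetical. With quoting skipped, MarshalText itself would have to return the already quoted and escaped form.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// label is a hypothetical type whose text form contains characters that need escaping.
type label struct{ text string }

// MarshalText implements encoding.TextMarshaler and returns the raw, unquoted text.
func (l label) MarshalText() ([]byte, error) { return []byte(l.text), nil }

// encodeLabel marshals a label with encoding/json.
func encodeLabel(text string) (string, error) {
	out, err := json.Marshal(label{text})
	return string(out), err
}

func main() {
	out, err := encodeLabel(`say "hi"`)
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // "say \"hi\""
}
```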

Better performance for generic data

In a fully-parsed scenario, Unmarshal() performs better than Get()+Node.Interface(). But if you only have part of the schema for a specific JSON, you can combine Get() and Unmarshal():

    import "github.com/bytedance/sonic"

    node, err := sonic.GetFromString(_TwitterJson, "statuses", 3, "user")
    var user User // your partial schema...
    err = sonic.UnmarshalString(node.Raw(), &user)

Even if you don't have any schema, use ast.Node as the container of generic values instead of map or interface:

    import "github.com/bytedance/sonic"

    root, err := sonic.GetFromString(_TwitterJson)
    user := root.GetByPath("statuses", 3, "user") // === root.Get("statuses").Index(3).Get("user")
    err = user.Check()

    // err = user.LoadAll() // only call this when you want to use 'user' concurrently...
    go someFunc(user)

Why? Because ast.Node stores its children using an array:

  • An array performs much better than a map when inserting (deserializing) and scanning (serializing) data;
  • Hashing (map[x]) is not as efficient as indexing (array[x]), which ast.Node can do on both arrays and objects;
  • Using Interface()/Map() means sonic must parse all the underlying values, while ast.Node can parse them on demand.
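The "parse on demand" idea has a rough standard-library analogue, sketched here with json.RawMessage: the outer decode keeps each element as raw bytes, and only the element that is actually needed is decoded into Go values. This is not how ast.Node is implemented (encoding/json still scans the whole input once); the envelope and user types and the userAt helper are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// envelope keeps each status as raw bytes so its fields are not decoded up front.
type envelope struct {
	Statuses []json.RawMessage `json:"statuses"`
}

// user is a hypothetical partial schema.
type user struct {
	Name string `json:"name"`
}

// userAt decodes only the "user" field of the i-th status.
func userAt(doc string, i int) (user, error) {
	var env envelope
	if err := json.Unmarshal([]byte(doc), &env); err != nil {
		return user{}, err
	}
	if i < 0 || i >= len(env.Statuses) {
		return user{}, fmt.Errorf("index %d out of range", i)
	}
	var status struct {
		User user `json:"user"`
	}
	err := json.Unmarshal(env.Statuses[i], &status)
	return status.User, err
}

func main() {
	doc := `{"statuses":[{"user":{"name":"a"}},{"user":{"name":"b"},"text":"..."}]}`
	u, err := userAt(doc, 1)
	if err != nil {
		panic(err)
	}
	fmt.Println(u.Name) // b
}
```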

CAUTION: ast.Node DOESN'T ensure concurrent safety directly, due to its lazy-load design. However, you can call Node.Load()/Node.LoadAll() to achieve that, which may reduce performance, though it still works faster than converting to map or interface{}.

Ast.Node or Ast.Visitor?

For generic data, ast.Node should be enough for your needs in most cases.

However, ast.Node is designed for partially processing a JSON string. It has some special designs, such as lazy-load, which might not be suitable for directly parsing the whole JSON string like Unmarshal(). Although ast.Node is better than map or interface{}, it is still a kind of intermediate representation if your final types are customized and you have to convert the above types to your custom types after parsing.

For better performance in that case, ast.Visitor is the better choice. It performs JSON decoding like Unmarshal(), and you can directly use your final types to represent a JSON AST without any intermediate representation.

But ast.Visitor is not a very handy API. You might need to write a lot of code to implement your visitor and carefully maintain the tree hierarchy during decoding. Please read the comments in ast/visitor.go carefully if you decide to use this API.

Buffer Size

Sonic uses a memory pool in many places, such as encoder.Encode and ast.Node.MarshalJSON, to improve performance, which may increase in-use memory when the server's load is high. See issue 614. Therefore, we introduce some options to let users control the behavior of the memory pool. See the option package.

Faster JSON Skip

For security, sonic uses an FSM algorithm to validate JSON when decoding raw JSON or encoding json.Marshaler, which is much slower (1~10x) than the SIMD-searching-pair algorithm. If you have many redundant JSON values and DO NOT NEED to strictly validate JSON correctness, you can enable the options below:

  • Config.NoValidateSkipJSON: for faster skipping of JSON when decoding, such as unknown fields, json.Unmarshaler (json.RawMessage), mismatched values, and redundant array elements
  • Config.NoValidateJSONMarshaler: avoids validating JSON when encoding json.Marshaler
  • SearchOption.ValidateJSON: indicates whether to validate the located JSON value when calling Get

JSON-Path Support (GJSON)

tidwall/gjson provides a comprehensive and popular JSON-Path API, and a lot of older code relies heavily on it. Therefore, we provide a wrapper library that combines gjson's API with sonic's SIMD algorithm to boost performance. See cloudwego/gjson.

Community

Sonic is a subproject of CloudWeGo. We are committed to building a cloud native ecosystem.

