One of the fastest alternative JSON parsers for Go that does not require a schema
buger/jsonparser
It does not require you to know the structure of the payload (e.g. to create structs), and allows accessing fields by providing the path to them. It is up to 10 times faster than the standard `encoding/json` package (depending on payload size and usage) and allocates no memory. See benchmarks below.
Originally I made this for a project that relies on a lot of 3rd party APIs that can be unpredictable and complex. I love simplicity and prefer to avoid external dependencies. `encoding/json` requires you to know your data structures exactly, or, if you prefer to use `map[string]interface{}` instead, it will be very slow and hard to manage. I investigated what's on the market and found that most libraries are just wrappers around `encoding/json`. There are a few options with their own parsers (`ffjson`, `easyjson`), but they still require you to create data structures.
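To illustrate the "hard to manage" point, here is a minimal sketch of reading one nested field through `map[string]interface{}` with only the standard library. The helpers `lookup` and `fullName` are hypothetical names made up for this illustration; they are not part of any library.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// lookup walks nested JSON objects decoded into map[string]interface{}.
// Hypothetical helper written for this illustration.
func lookup(root interface{}, keys ...string) (interface{}, bool) {
	cur := root
	for _, k := range keys {
		// Every level needs its own type assertion.
		obj, ok := cur.(map[string]interface{})
		if !ok {
			return nil, false
		}
		cur, ok = obj[k]
		if !ok {
			return nil, false
		}
	}
	return cur, true
}

// fullName decodes the whole payload, then digs out a single string.
func fullName(data []byte) (string, error) {
	var root interface{}
	if err := json.Unmarshal(data, &root); err != nil {
		return "", err
	}
	v, ok := lookup(root, "person", "name", "fullName")
	if !ok {
		return "", fmt.Errorf("path not found")
	}
	s, ok := v.(string)
	if !ok {
		return "", fmt.Errorf("fullName is %T, not a string", v)
	}
	return s, nil
}

func main() {
	data := []byte(`{"person":{"name":{"fullName":"Leonid Bugaev"}}}`)
	name, err := fullName(data)
	fmt.Println(name, err)
}
```

Note that the entire document is decoded and boxed into interface values before a single field can be read.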
The goal of this project is to push the JSON parser to its performance limits without sacrificing compliance or developer experience.
For the given JSON, our goal is to extract the user's full name, number of GitHub followers and avatar.
    import "github.com/buger/jsonparser"

    ...

    data := []byte(`{
      "person": {
        "name": {
          "first": "Leonid",
          "last": "Bugaev",
          "fullName": "Leonid Bugaev"
        },
        "github": {
          "handle": "buger",
          "followers": 109
        },
        "avatars": [
          { "url": "https://avatars1.githubusercontent.com/u/14009?v=3&s=460", "type": "thumbnail" }
        ]
      },
      "company": {
        "name": "Acme"
      }
    }`)

    // You can specify the key path by providing arguments to the Get function
    jsonparser.Get(data, "person", "name", "fullName")

    // There are `GetInt` and `GetBoolean` helpers if you know the exact key data type
    jsonparser.GetInt(data, "person", "github", "followers")

    // When you get an object, it returns a []byte slice pointing to the data containing it
    // For `company` it will be `{"name": "Acme"}`
    jsonparser.Get(data, "company")

    // If the key doesn't exist it will return an error
    var size int64
    if value, err := jsonparser.GetInt(data, "company", "size"); err == nil {
        size = value
    }

    // You can use the `ArrayEach` helper to iterate items [item1, item2 .... itemN]
    jsonparser.ArrayEach(data, func(value []byte, dataType jsonparser.ValueType, offset int, err error) {
        fmt.Println(jsonparser.Get(value, "url"))
    }, "person", "avatars")

    // Or you can access fields by index!
    jsonparser.GetString(data, "person", "avatars", "[0]", "url")

    // You can use the `ObjectEach` helper to iterate objects { "key1":object1, "key2":object2, .... "keyN":objectN }
    jsonparser.ObjectEach(data, func(key []byte, value []byte, dataType jsonparser.ValueType, offset int) error {
        fmt.Printf("Key: '%s'\n Value: '%s'\n Type: %s\n", string(key), string(value), dataType)
        return nil
    }, "person", "name")

    // The most efficient way to extract multiple keys is `EachKey`
    paths := [][]string{
        []string{"person", "name", "fullName"},
        []string{"person", "avatars", "[0]", "url"},
        []string{"company", "url"},
    }
    jsonparser.EachKey(data, func(idx int, value []byte, vt jsonparser.ValueType, err error) {
        switch idx {
        case 0: // []string{"person", "name", "fullName"}
            ...
        case 1: // []string{"person", "avatars", "[0]", "url"}
            ...
        case 2: // []string{"company", "url"}
            ...
        }
    }, paths...)

    // For more information see docs below
The library API is really simple. You just need the `Get` method to perform any operation. The rest are just helpers around it.
You can also view the API at godoc.org.
    func Get(data []byte, keys ...string) (value []byte, dataType jsonparser.ValueType, offset int, err error)
Receives a data structure and a key path to extract the value from.
Returns:
- `value` - Pointer to the original data structure containing the key value, or just an empty slice if nothing is found or on error
- `dataType` - Can be: `NotExist`, `String`, `Number`, `Object`, `Array`, `Boolean` or `Null`
- `offset` - Offset from the provided data structure where the key value ends. Used mostly internally, for example by the `ArrayEach` helper.
- `err` - If the key is not found or there is any other parsing issue, an error is returned. If the key is not found, `dataType` is also set to `NotExist`
Accepts multiple keys to specify the path to a JSON value (in case of querying nested structures). If no keys are provided, it will try to extract the closest JSON value (a simple one or an object/array), which is useful for reading streams or arrays; see the `ArrayEach` implementation.
Note that keys can be array indexes: `jsonparser.GetString(data, "person", "avatars", "[0]", "url")`, pretty cool, yeah?
    func GetString(data []byte, keys ...string) (val string, err error)

Returns strings, properly handling escaped and unicode characters. Note that this will cause additional memory allocations.
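As a rough illustration of why unescaping implies an allocation, the sketch below decodes a quoted JSON string with the standard library rather than with jsonparser. The helper `decodeJSONString` is a hypothetical name used only here. The decoded text differs from the raw bytes in the payload, so it cannot simply be a sub-slice of the input.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodeJSONString unescapes a quoted JSON string literal using the
// standard library. Hypothetical helper for illustration only.
func decodeJSONString(raw []byte) (string, error) {
	var s string
	err := json.Unmarshal(raw, &s)
	return s, err
}

func main() {
	// The bytes exactly as they would appear inside a JSON payload.
	raw := []byte(`"caf\u00e9\n"`)
	s, err := decodeJSONString(raw)
	// The escape sequences collapse into fewer bytes, so the result
	// has to live in a new buffer.
	fmt.Printf("raw=%d bytes, decoded=%d bytes, err=%v\n", len(raw), len(s), err)
}
```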
`GetUnsafeString` is for when you need a string in your app and are ready to sacrifice support for escaped symbols in favor of speed. It returns a string mapped to the existing byte slice memory, without any allocations:

    s, _ := jsonparser.GetUnsafeString(data, "person", "name", "title")
    switch s {
    case "CEO":
        ...
    case "Engineer":
        ...
    ...
    }

Note that `unsafe` here means that your string will exist only until the GC frees the underlying byte slice. In most cases this means that you can use the string only in the current context and should not pass it anywhere externally, through channels or any other way.
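To make the memory sharing concrete, here is a small standard-library sketch. It is not the library's code, and `bytesToStringNoCopy` and `demo` are hypothetical names. It shows that a string mapped onto a byte slice observes later changes to that slice, while a regular `string(b)` conversion keeps its own copy.

```go
package main

import (
	"fmt"
	"unsafe"
)

// bytesToStringNoCopy reinterprets b as a string without copying.
// Hypothetical helper for illustration; the result aliases b's memory.
func bytesToStringNoCopy(b []byte) string {
	return *(*string)(unsafe.Pointer(&b))
}

// demo returns an aliased string and a copied string after the
// underlying buffer has been modified.
func demo() (aliased, copied string) {
	buf := []byte("CEO")
	aliased = bytesToStringNoCopy(buf) // shares memory with buf
	copied = string(buf)               // allocates its own copy
	buf[0] = 'X'                       // e.g. the buffer gets reused
	return aliased, copied
}

func main() {
	aliased, copied := demo()
	fmt.Println(aliased, copied)
}
```

This is why such a string is best kept inside the scope that owns the byte slice.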
    func GetBoolean(data []byte, keys ...string) (val bool, err error)

    func GetFloat(data []byte, keys ...string) (val float64, err error)

    func GetInt(data []byte, keys ...string) (val int64, err error)

If you know the key type, you can use the helpers above. If the key data type does not match, an error is returned.
    func ArrayEach(data []byte, cb func(value []byte, dataType jsonparser.ValueType, offset int, err error), keys ...string)

Needed for iterating arrays. Accepts a callback function with the same return arguments as `Get`.
    func ObjectEach(data []byte, callback func(key []byte, value []byte, dataType ValueType, offset int) error, keys ...string) (err error)

Needed for iterating objects. Accepts a callback function. Example:

    var handler func([]byte, []byte, jsonparser.ValueType, int) error
    handler = func(key []byte, value []byte, dataType jsonparser.ValueType, offset int) error {
        // do stuff here
        return nil
    }
    jsonparser.ObjectEach(myJson, handler)
    func EachKey(data []byte, cb func(idx int, value []byte, dataType jsonparser.ValueType, err error), paths ...[]string)

When you need to read multiple keys and you are not afraid of a low-level API, `EachKey` is your friend. It reads the payload only a single time and calls the callback function once a path is found. In contrast, when you call `Get` multiple times, it has to process the payload each time you call it. Depending on the payload, `EachKey` can be multiple times faster than `Get`. Paths can use nested keys as well!

    paths := [][]string{
        []string{"uuid"},
        []string{"tz"},
        []string{"ua"},
        []string{"st"},
    }
    var data SmallPayload

    jsonparser.EachKey(smallFixture, func(idx int, value []byte, vt jsonparser.ValueType, err error) {
        switch idx {
        case 0:
            data.Uuid, _ = jsonparser.ParseString(value)
        case 1:
            v, _ := jsonparser.ParseInt(value)
            data.Tz = int(v)
        case 2:
            data.Ua, _ = jsonparser.ParseString(value)
        case 3:
            v, _ := jsonparser.ParseInt(value)
            data.St = int(v)
        }
    }, paths...)
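The single-pass, callback-by-index idea can be sketched with the standard library alone. The function below, `eachTopLevelKey`, is a hypothetical stand-in and not how jsonparser is implemented: it handles only top-level keys, uses `encoding/json`'s streaming decoder, and allocates. It does show the contract, though: the payload is scanned once, and the callback receives the index of the requested key that matched.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
)

// eachTopLevelKey scans a JSON object once and calls cb with the index of
// each requested key it encounters. Hypothetical stand-in for illustration.
func eachTopLevelKey(data []byte, cb func(idx int, raw []byte), keys ...string) error {
	// Map every requested key to its position in the argument list.
	want := make(map[string]int, len(keys))
	for i, k := range keys {
		want[k] = i
	}
	dec := json.NewDecoder(bytes.NewReader(data))
	tok, err := dec.Token()
	if err != nil {
		return err
	}
	if d, ok := tok.(json.Delim); !ok || d != '{' {
		return errors.New("expected a JSON object")
	}
	for dec.More() {
		tok, err := dec.Token()
		if err != nil {
			return err
		}
		key, _ := tok.(string) // object keys are always strings
		var raw json.RawMessage
		if err := dec.Decode(&raw); err != nil {
			return err
		}
		if idx, ok := want[key]; ok {
			cb(idx, raw)
		}
	}
	return nil
}

func main() {
	payload := []byte(`{"uuid":"de305d54","tz":-6,"ua":"curl","st":1}`)
	err := eachTopLevelKey(payload, func(idx int, raw []byte) {
		switch idx {
		case 0: // "tz"
			fmt.Printf("tz = %s\n", raw)
		case 1: // "st"
			fmt.Printf("st = %s\n", raw)
		}
	}, "tz", "st")
	if err != nil {
		fmt.Println("error:", err)
	}
}
```

Calling a `Get`-style lookup once per key would instead rescan the payload for every key, which is the cost `EachKey` avoids.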
    func Set(data []byte, setValue []byte, keys ...string) (value []byte, err error)

Receives an existing data structure, a key path to set, and the value to set at that key. This functionality is experimental.
Returns:
- `value` - Pointer to the original data structure with the updated or added key value.
- `err` - If there is any parsing issue, an error is returned.

Accepts multiple keys to specify the path to a JSON value (in case of updating or creating nested structures).
Note that keys can be array indexes: `jsonparser.Set(data, []byte("http://github.com"), "person", "avatars", "[0]", "url")`
    func Delete(data []byte, keys ...string) (value []byte)

Receives an existing data structure and a key path to delete. This functionality is experimental.
Returns:
- `value` - Pointer to the original data structure with the key path deleted if it can be found. If there is no key path, then the whole data structure is deleted.

Accepts multiple keys to specify the path to a JSON value (in case of deleting from nested structures).
Note that keys can be array indexes: `jsonparser.Delete(data, "person", "avatars", "[0]", "url")`
- It does not rely on `encoding/json`, `reflection` or `interface{}`; the only real package dependency is `bytes`.
- Operates with the JSON payload on the byte level, providing you pointers to the original data structure: no memory allocation.
- No automatic type conversions: by default everything is a `[]byte`, but it provides you the value type, so you can convert by yourself (there are a few helpers included).
- Does not parse the full record, only the keys you specified.
There are 3 benchmark types, trying to simulate real-life usage for small, medium and large JSON payloads. For each metric, a lower value is better. Time/op is in nanoseconds. Values better than standard `encoding/json` are marked in bold. Benchmarks were run on a standard Linode 1024 box.
Compared libraries:
- https://golang.org/pkg/encoding/json
- https://github.com/Jeffail/gabs
- https://github.com/a8m/djson
- https://github.com/bitly/go-simplejson
- https://github.com/antonholmquist/jason
- https://github.com/mreiferson/go-ujson
- https://github.com/ugorji/go/codec
- https://github.com/pquerna/ffjson
- https://github.com/mailru/easyjson
- https://github.com/buger/jsonparser
If you want to skip the next sections, we have 2 winners: `jsonparser` and `easyjson`. `jsonparser` is up to 10 times faster than the standard `encoding/json` package (depending on payload size and usage), and almost infinitely (literally) better in memory consumption because it operates on data at the byte level and provides direct slice pointers. `easyjson` wins in CPU in the medium tests and frankly I'm impressed with this package: these are remarkable results considering that it is an almost drop-in replacement for `encoding/json` (it requires some code generation).
It's hard to fully compare `jsonparser` and `easyjson` (or `ffjson`): they are true parsers and fully process the record, unlike `jsonparser`, which parses only the keys you specified.
If you are searching for a replacement for `encoding/json` while keeping structs, `easyjson` is an amazing choice. If you want to process dynamic JSON, have memory constraints, or want more control over your data, you should try `jsonparser`.
`jsonparser` performance heavily depends on usage, and it works best when you do not need to process the full record, only some keys. The more calls you need to make, the slower it will be. In contrast, `easyjson` (or `ffjson`, `encoding/json`) parses the record only 1 time, and then you can make as many calls as you want.
With great power comes great responsibility! :)
Each test processes 190 bytes of HTTP log as a JSON record. It should read multiple fields.
https://github.com/buger/jsonparser/blob/master/benchmark/benchmark_small_payload_test.go
| Library | time/op | bytes/op | allocs/op |
|---|---|---|---|
| encoding/json struct | 7879 | 880 | 18 |
| encoding/json interface{} | 8946 | 1521 | 38 |
| Jeffail/gabs | 10053 | 1649 | 46 |
| bitly/go-simplejson | 10128 | 2241 | 36 |
| antonholmquist/jason | 27152 | 7237 | 101 |
| github.com/ugorji/go/codec | 8806 | 2176 | 31 |
| mreiferson/go-ujson | **7008** | 1409 | 37 |
| a8m/djson | **3862** | 1249 | 30 |
| pquerna/ffjson | **3769** | **624** | **15** |
| mailru/easyjson | **2002** | **192** | **9** |
| buger/jsonparser | **1367** | **0** | **0** |
| buger/jsonparser (EachKey API) | **809** | **0** | **0** |
Winners are ffjson, easyjson and jsonparser, where jsonparser is up to 9.8x faster than encoding/json and 4.6x faster than ffjson, and slightly faster than easyjson. If you look at memory allocation, jsonparser has no rivals, as it makes no data copy and operates with raw `[]byte` structures and pointers to them.
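For readers who want to check such ratios against the table, a speedup is simply the baseline time/op divided by the candidate time/op. The sketch below recomputes a few ratios from the small-payload rows, using the `EachKey` row as the candidate; the results approximately match the figures quoted above. The function name `speedup` is just an illustrative choice.

```go
package main

import "fmt"

// speedup reports how many times faster the candidate is than the baseline,
// given two time/op measurements in the same unit.
func speedup(baselineNs, candidateNs float64) float64 {
	return baselineNs / candidateNs
}

func main() {
	// time/op of "buger/jsonparser (EachKey API)" in the small-payload table.
	const eachKeyNs = 809

	rows := []struct {
		name string
		ns   float64
	}{
		{"encoding/json struct", 7879},
		{"pquerna/ffjson", 3769},
		{"mailru/easyjson", 2002},
	}
	for _, r := range rows {
		fmt.Printf("%-22s %.2fx\n", r.name, speedup(r.ns, eachKeyNs))
	}
}
```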
Each test processes a 2.4kb JSON record (based on Clearbit API).It should read multiple nested fields and 1 array.
https://github.com/buger/jsonparser/blob/master/benchmark/benchmark_medium_payload_test.go
| Library | time/op | bytes/op | allocs/op |
|---|---|---|---|
| encoding/json struct | 57749 | 1336 | 29 |
| encoding/json interface{} | 79297 | 10627 | 215 |
| Jeffail/gabs | 83807 | 11202 | 235 |
| bitly/go-simplejson | 88187 | 17187 | 220 |
| antonholmquist/jason | 94099 | 19013 | 247 |
| github.com/ugorji/go/codec | 114719 | 6712 | 152 |
| mreiferson/go-ujson | **56972** | 11547 | 270 |
| a8m/djson | **28525** | 10196 | 198 |
| pquerna/ffjson | **20298** | **856** | **20** |
| mailru/easyjson | **10512** | **336** | **12** |
| buger/jsonparser | **15955** | **0** | **0** |
| buger/jsonparser (EachKey API) | **8916** | **0** | **0** |
The difference between ffjson and jsonparser in CPU usage is smaller, while the memory consumption difference is growing. On the other hand, `easyjson` shows remarkable performance for the medium payload.
`gabs`, `go-simplejson` and `jason` are based on `encoding/json` and `map[string]interface{}` and are actually only helpers for unstructured JSON. Their performance correlates with `encoding/json interface{}`, and they will skip the next round. `go-ujson`, while it has its own parser, shows the same performance as `encoding/json`, so it also skips the next round. It is the same situation with `ugorji/go/codec`, but it showed unexpectedly bad performance for complex payloads.
Each test processes a 24kb JSON record (based on Discourse API). It should read 2 arrays, and for each item in the array get a few fields. Basically it means processing a full JSON file.
https://github.com/buger/jsonparser/blob/master/benchmark/benchmark_large_payload_test.go
| Library | time/op | bytes/op | allocs/op |
|---|---|---|---|
| encoding/json struct | 748336 | 8272 | 307 |
| encoding/json interface{} | 1224271 | 215425 | 3395 |
| a8m/djson | **510082** | 213682 | 2845 |
| pquerna/ffjson | **312271** | **7792** | **298** |
| mailru/easyjson | **154186** | **6992** | **288** |
| buger/jsonparser | **85308** | **0** | **0** |
`jsonparser` is now the winner, but do not forget that it is a way more lightweight parser than `ffjson` or `easyjson`: they have to parse all the data, while `jsonparser` parses only what you need. `ffjson`, `easyjson` and `jsonparser` all have their own parsing code and do not depend on `encoding/json` or `interface{}`; that's one of the reasons why they are so fast. `easyjson` also uses a bit of the `unsafe` package to reduce memory consumption (in theory it can lead to some unexpected GC issues, but I have not tested it enough).
Also, the last benchmark did not include the `EachKey` test, because in this particular case we need to read a lot of array values, and using `ArrayEach` is more efficient.
All bug reports and suggestions should go through GitHub Issues.
- Fork it
- Create your feature branch (git checkout -b my-new-feature)
- Commit your changes (git commit -am 'Added some feature')
- Push to the branch (git push origin my-new-feature)
- Create new Pull Request
All my development happens using Docker, and the repo includes some Make tasks to simplify development.
- `make build` - builds the docker image, usually needs to be called only once
- `make test` - runs tests
- `make fmt` - runs go fmt
- `make bench` - runs benchmarks (if you need to run only a single benchmark, modify the `BENCHMARK` variable in the make file)
- `make profile` - runs the benchmark and generates 3 files: `cpu.out`, `mem.mprof` and the `benchmark.test` binary, which can be used with `go tool pprof`
- `make bash` - enters the container (I use it for running `go tool pprof` above)