Stack only json deserialization using generators and the System.Text.Json library
TomaszRewak/C-sharp-stack-only-json-parser
The StackOnlyJsonParser combines the System.Text.Json library with C# 9 code generators to allow for fast and GC-friendly JSON deserialization.
It's intended mostly for low-latency and real-time systems that have to process large data flows with a small memory footprint.
A short write-up of the project can be found on my blog.
This library depends on the C# 9 code generators available with the .NET 5.0.

    <PropertyGroup>
      ...
      <TargetFramework>net5.0</TargetFramework>
      ...
    </PropertyGroup>

To install the package in your project simply use the following command:

    dotnet add package StackOnlyJsonParser

The StackOnlyJsonParser will not cooperate with just any class. In fact, it requires you to define each entity as a `readonly ref partial struct` (that's a mouthful). Only this way can it ensure that the deserialization process is performed without unnecessary allocations.
This requirement implies that deserialized objects cannot be persisted and have to be "consumed" immediately (either by copying their state to a pre-allocated memory or by performing the data processing in place). This limitation should be the main factor when deciding if the StackOnlyJsonParser is a good fit for your project.
Each entity you want to be able to deserialize has to be marked with the `[StackOnlyJsonType]` attribute.

    [StackOnlyJsonType]
    internal readonly ref partial struct Product
    {
        public int Id { get; }
        public string Name { get; }
        public double Price { get; }
    }

The code generator will automatically create a corresponding `partial struct` that contains constructors used for the data deserialization:

    public Product(ReadOnlySpan<byte> jsonData);
    public Product(ReadOnlySequence<byte> jsonData);
    public Product(ref System.Text.Json.Utf8JsonReader jsonReader);

It's important to note that the StackOnlyJsonParser only supports UTF-8-encoded data sources.
With that code being auto-generated for us, we can deserialize a new object in the following way:

    ReadOnlySpan<byte> data = ...
    var product = new Product(data);

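Because the constructors accept only UTF-8 bytes, a JSON string has to be encoded before it is handed to the parser. Other snippets in this README call an `Encode` helper that is not shown; the sketch below is one plausible definition, assuming it is nothing more than a wrapper around `Encoding.UTF8.GetBytes`. The class name `JsonTestData` is hypothetical.

```csharp
using System;
using System.Text;

internal static class JsonTestData
{
    // Hypothetical helper (an assumption, not part of the library):
    // turns a JSON string into the UTF-8 bytes the generated constructors expect.
    public static byte[] Encode(string json) => Encoding.UTF8.GetBytes(json);
}
```

A `byte[]` converts implicitly to `ReadOnlySpan<byte>`, so the result can be passed straight to a generated constructor.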
By default, when deserializing the data, the StackOnlyJsonParser will only look for an exact match between the JSON field name and the model property name (case sensitive). If no match is found, the StackOnlyJsonParser will skip the field during the deserialization process.

To specify custom JSON field names one can use the `[StackOnlyJsonField]` attribute:

    [StackOnlyJsonType]
    internal readonly ref partial struct Product
    {
        [StackOnlyJsonField("product-name", "productName", "ProductName")]
        public string ProductName { get; }
        ...
    }

Entities can hold not only fields of standard types, but also fields of custom types:

    [StackOnlyJsonType]
    internal readonly ref partial struct Price
    {
        public decimal Value { get; }
        public string Currency { get; }
    }

    [StackOnlyJsonType]
    internal readonly ref partial struct Product
    {
        public int Id { get; }
        public string Name { get; }
        public Price Price { get; }
    }

It's not required for the type of a nested message to use the `[StackOnlyJsonType]` attribute. The only requirement is for that type to define a constructor that accepts a single `ref System.Text.Json.Utf8JsonReader` parameter.
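As an illustration of that constructor requirement, here is a minimal hand-written type, a sketch rather than anything the library ships. The names `Celsius` and `CelsiusDemo` are hypothetical, and the "advance once if not already on the value" step is an assumption modelled on the generated array constructor shown later in this README.

```csharp
using System;
using System.Text.Json;

// Hand-written (not generated) type with the required constructor shape.
internal readonly ref struct Celsius
{
    public double Degrees { get; }

    public Celsius(ref Utf8JsonReader jsonReader)
    {
        // Assumption: the reader may still sit on the previous token
        // (None or a property name), so advance once if it is not on a number yet.
        if (jsonReader.TokenType != JsonTokenType.Number) jsonReader.Read();

        Degrees = jsonReader.GetDouble();
    }
}

internal static class CelsiusDemo
{
    // Parses a bare JSON number through the custom constructor.
    public static double Parse(ReadOnlySpan<byte> utf8Json)
    {
        var reader = new Utf8JsonReader(utf8Json);
        return new Celsius(ref reader).Degrees;
    }
}
```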
It's even possible to combine the StackOnlyJsonParser with the `System.Text.Json` library to deserialize persistable objects, while avoiding the allocation of an underlying collection.
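One way this combination might look is a regular class whose constructor takes the `ref Utf8JsonReader` and delegates to `JsonSerializer`. This is a sketch under assumptions: the names `PriceDto`, `PersistedPrice` and `PersistedPriceDemo` are hypothetical, and the README does not show how the author wires such a type into a generated collection.

```csharp
using System;
using System.Text.Json;

// Plain DTO that System.Text.Json can populate (hypothetical name).
internal sealed class PriceDto
{
    public decimal Value { get; set; }
    public string Currency { get; set; }
}

// Heap-allocated, persistable type exposing the required constructor shape.
internal sealed class PersistedPrice
{
    public decimal Value { get; }
    public string Currency { get; }

    public PersistedPrice(ref Utf8JsonReader jsonReader)
    {
        // JsonSerializer advances the reader by itself when it sits on
        // None or on a property name, then consumes exactly one JSON value.
        var dto = JsonSerializer.Deserialize<PriceDto>(ref jsonReader);
        if (dto is null) throw new JsonException("Expected an object, but got null");

        Value = dto.Value;
        Currency = dto.Currency;
    }
}

internal static class PersistedPriceDemo
{
    public static PersistedPrice Parse(ReadOnlySpan<byte> utf8Json)
    {
        var reader = new Utf8JsonReader(utf8Json);
        return new PersistedPrice(ref reader);
    }
}
```

Each `PersistedPrice` is an ordinary object that can outlive the parsing call, so only the elements worth keeping end up on the heap.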
All fields of basic types can be made nullable by using the standard `?` notation.

    [StackOnlyJsonType]
    internal readonly ref partial struct Product
    {
        public int? Id { get; }
        ...
    }

In that case, the field will be given the default value of `null` and will be able to handle a `null` value in the deserialized data.
Unfortunately, as `ref struct`s cannot be used as generic type parameters, the language prohibits us from making them nullable. Because of that, apart from constructors, the StackOnlyJsonParser also adds a `HasValue` property to the generated `partial struct` code. If it is `false`, the field of the given type was either not present or was explicitly set to `null`.
If the `HasValue` property comes into conflict with one of the existing fields, the conflicting field should be renamed and the `[StackOnlyJsonField("HasValue")]` attribute used to assign it the proper serialization name.
As the `List<>` and the `Dictionary<,>` types do not follow the requirements mentioned before, they cannot be used as field types.
Instead, each collection type has to be defined separately using the `[StackOnlyJsonArray]` or the `[StackOnlyJsonDictionary]` attributes.

    [StackOnlyJsonType]
    internal readonly ref partial struct Price
    {
        public decimal Value { get; }
        public string Currency { get; }
    }

    [StackOnlyJsonDictionary(typeof(string), typeof(Price))]
    internal readonly ref partial struct RegionPriceDictionary
    {
    }

    [StackOnlyJsonArray(typeof(int))]
    internal readonly ref partial struct Sizes
    {
    }

    [StackOnlyJsonType]
    internal readonly ref partial struct Product
    {
        public string Name { get; }
        public RegionPriceDictionary Prices { get; }
        public Sizes Sizes { get; }
    }

Similarly to the `[StackOnlyJsonType]` attribute, the `[StackOnlyJsonDictionary]` and the `[StackOnlyJsonArray]` attributes will enrich the given types with proper constructors allowing for data deserialization.
They will also provide an implementation of the `GetEnumerator` and the `Any` methods, allowing for easy enumeration over elements using the standard `foreach` statement:

    var product = new Product(data);

    foreach (var price in product.Prices)
        Console.WriteLine($"Region: {price.Key}, Price: {price.Value.Value}{price.Value.Currency}");

    foreach (var size in product.Sizes)
        Console.WriteLine($"Size: {size}");

The collection types can also be used to directly deserialize the data, if the outer type of that data is of a collection type:

    var data = Encode("[1, 2, 3]");
    var sizes = new Sizes(data);

If limiting the number of allocations is of the utmost importance to you, you can use the `StackOnlyJsonParser.StackOnlyJsonString` type instead of the `System.String` type when defining your models. It's a non-allocating wrapper over the `Utf8JsonReader` that allows you to easily compare the stored string data with a provided value.
Considering that string values in your deserialized data will most likely be very short-lived objects, and that creating a `StackOnlyJsonString` requires making a copy of the `Utf8JsonReader` (which is a relatively big struct), using the `StackOnlyJsonString` can have a negative performance impact compared to the standard `string`. Nevertheless, it can help you achieve a truly zero-allocation memory profile.
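The members of `StackOnlyJsonString` are not listed in this README, so the sketch below does not use it. Instead it shows the underlying `System.Text.Json` primitive for comparing a string token without materializing a `string`: `Utf8JsonReader.ValueTextEquals`. Whether `StackOnlyJsonString` relies on this exact call is an assumption, and the name `Utf8StringComparisonSketch` is hypothetical.

```csharp
using System;
using System.Text.Json;

internal static class Utf8StringComparisonSketch
{
    // Counts the string tokens equal to expectedUtf8 without creating
    // any System.String instances along the way.
    public static int CountMatches(ReadOnlySpan<byte> utf8Json, ReadOnlySpan<byte> expectedUtf8)
    {
        var reader = new Utf8JsonReader(utf8Json);
        var count = 0;

        while (reader.Read())
        {
            // ValueTextEquals is only valid on String and PropertyName tokens.
            if (reader.TokenType == JsonTokenType.String && reader.ValueTextEquals(expectedUtf8))
                count++;
        }

        return count;
    }
}
```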
By definition `struct`s cannot have cycles in their layouts, as that would lead to them having an infinite size. Nevertheless, the StackOnlyJsonParser allows for defining recursive models by the use of lazy loading. It works similarly to the collections: you might think of lazy loaders as collections with only one element. To define a lazy loader use the `[StackOnlyJsonLazyLoader]` attribute:

    [StackOnlyJsonType]
    internal readonly ref partial struct RecursiveType
    {
        public int Id { get; }
        public RecursiveTypeLazyLoader Internal { get; }
    }

    [StackOnlyJsonLazyLoader(typeof(RecursiveType))]
    internal readonly ref partial struct RecursiveTypeLazyLoader
    {
    }

Now you can deserialize the data in the following way:

    internal void Process(RecursiveType model)
    {
        Console.WriteLine(model.Id);
        if (model.Internal.HasValue)
            Process(model.Internal.Load());
    }

    var data = Encode(@"{ ""Id"": 1, ""Internal"": { ""Internal"": { ""Id"": 3 }, ""Id"": 2 } }");
    Process(new RecursiveType(data));

The `Load` method creates and deserializes the new object ad hoc based on the previously cached position of the JSON tokenizer.
The deserialization of simple and custom message types is rather straightforward. The generated constructors use the provided `Utf8JsonReader` as a token provider for field deserialization.
The crux of this library lies in how it handles collections. Whenever one of them is encountered, the deserialization code skips the entire block, only remembering its bounds. The consecutive elements are deserialized ad hoc within the `foreach` loop when requested. Thanks to this, only one element of the collection is alive at a time and the entire process can be performed on the stack with no heap allocations. That can be especially important in the case of big collections which, if allocated, could travel across GC generations.
An example of a generated array deserializer:

    using System;
    using System.Buffers;
    using System.Text.Json;

    namespace StackOnlyJsonParser.Example
    {
        internal readonly ref partial struct ProductArray
        {
            // Copy of the reader positioned on the first element of the array
            private readonly Utf8JsonReader _jsonReader;

            public readonly bool HasValue { get; }

            public ProductArray(ReadOnlySpan<byte> jsonData) : this(new Utf8JsonReader(jsonData, new JsonReaderOptions { CommentHandling = JsonCommentHandling.Skip })) {}
            public ProductArray(ReadOnlySequence<byte> jsonData) : this(new Utf8JsonReader(jsonData, new JsonReaderOptions { CommentHandling = JsonCommentHandling.Skip })) {}
            private ProductArray(Utf8JsonReader jsonReader) : this(ref jsonReader) {}
            public ProductArray(ref Utf8JsonReader jsonReader)
            {
                if (jsonReader.TokenType != JsonTokenType.StartArray && jsonReader.TokenType != JsonTokenType.Null) jsonReader.Read();

                switch (jsonReader.TokenType)
                {
                    case JsonTokenType.StartArray:
                        HasValue = true;
                        _jsonReader = jsonReader; // remember where the array starts
                        _jsonReader.Read();       // move the copy onto the first element
                        jsonReader.Skip();        // move the caller's reader past the whole array
                        break;

                    case JsonTokenType.Null:
                        HasValue = false;
                        _jsonReader = default;
                        break;

                    default:
                        throw new JsonException($"Expected '[', but got {jsonReader.TokenType}");
                }
            }

            public bool Any() => HasValue && _jsonReader.TokenType != JsonTokenType.EndArray;
            public Enumerator GetEnumerator() => new Enumerator(_jsonReader);

            public ref struct Enumerator
            {
                private Utf8JsonReader _jsonReader;

                public Enumerator(in Utf8JsonReader jsonReader)
                {
                    _jsonReader = jsonReader;
                    Current = default;
                }

                public Product Current { get; private set; }

                public bool MoveNext()
                {
                    if (_jsonReader.TokenType == JsonTokenType.EndArray || _jsonReader.TokenType == JsonTokenType.None) return false;

                    // Deserialize one element, then step onto the next one
                    Current = new Product(ref _jsonReader);
                    _jsonReader.Read();

                    return true;
                }
            }
        }
    }

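The skip-and-remember idea can also be reproduced with nothing but `System.Text.Json`, which may make it easier to see why no collection is ever allocated. The sketch below is a hand-written illustration, not generated code: the type `LazyArraySketch` and the JSON property names `items` and `id` are assumptions chosen for the example. Because `Utf8JsonReader` is a struct, assigning it to another variable snapshots its position.

```csharp
using System;
using System.Text.Json;

internal static class LazyArraySketch
{
    // Reads "id" eagerly, but only remembers where "items" starts;
    // the array elements are visited later, one at a time.
    public static (int Id, int Sum) Parse(ReadOnlySpan<byte> utf8Json)
    {
        var reader = new Utf8JsonReader(utf8Json);
        Utf8JsonReader items = default;
        var hasItems = false;
        var id = 0;

        reader.Read(); // StartObject
        while (reader.Read() && reader.TokenType == JsonTokenType.PropertyName)
        {
            if (reader.ValueTextEquals("items"))
            {
                reader.Read();   // StartArray
                items = reader;  // struct copy: remembers the array's position
                hasItems = true;
                reader.Skip();   // jump to the matching EndArray
            }
            else if (reader.ValueTextEquals("id"))
            {
                reader.Read();
                id = reader.GetInt32();
            }
            else
            {
                reader.Read();   // move onto the value
                reader.Skip();   // no-op for primitives, skips nested content otherwise
            }
        }

        // Deferred pass over the remembered array: one element alive at a time.
        var sum = 0;
        while (hasItems && items.Read() && items.TokenType == JsonTokenType.Number)
            sum += items.GetInt32();

        return (id, sum);
    }
}
```

The array is tokenized twice (once by `Skip`, once by the deferred loop), which is the same trade of extra passes for zero collection allocations that the generated code makes.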
Below you can find the results of the performance tests defined in the `StackOnlyJsonParser.PerformanceTests` project.
In short, each framework was given serialized JSON data containing a list of objects with the following definition:

    internal class Product
    {
        public string Name { get; set; }
        public DateTime ProductionDate { get; set; }
        public Size BoxSize { get; set; }
        public int AvailableItems { get; set; }
        public List<string> Colors { get; set; }
        public Dictionary<string, Price> Regions { get; set; }
    }

    internal class Size
    {
        public double Width { get; set; }
        public double Height { get; set; }
        public double Depth { get; set; }
    }

    internal class Price
    {
        public string Currency { get; set; }
        public decimal Value { get; set; }
    }

In the case of the StackOnlyJsonParser and the System.Text.Json library, the data was encoded as a UTF-8 byte array. The Newtonsoft parser was provided with a string representation.
As the StackOnlyJsonParser loads the data ad hoc, the test included a simple data aggregation task that was performed on data generated by each library.
The StackOnlyJsonParser was profiled with both the standard `string` type and the `StackOnlyJsonString` type as the underlying text representation.
Please note that the processing time of small messages is higher for the StackOnlyJsonParser than for alternative libraries. After all, the StackOnlyJsonParser needs to iterate through the entire message multiple times in order to perform lazy loading of arrays and dictionaries. The performance gain when processing bigger messages comes mostly from the fact that the StackOnlyJsonParser doesn't have to perform additional allocations when creating those collections. So if performance in processing small messages is your main concern, you might want to consider using alternative parsers. But if your main focus is on the memory footprint of the deserialization process for big messages, the StackOnlyJsonParser might be a good choice for you.


