A reimagined scala-pickling in the Scala 3 world
jsuereth/sauerkraut
The library for those cabbage lovers out there who want to send data over the wire.
A revitalization of Pickling in the Scala 3 world.
When defining over-the-wire messages, do this:
```scala
import sauerkraut.core.{Buildable, Writer, given}

case class MyMessage(field: String, data: Int) derives Buildable, Writer
```
Then, when you need to serialize, pick a format and go:
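```scala
import java.io.StringWriter
import sauerkraut.format.json.{Json, given}
import sauerkraut.{pickle, read, write}

// Serialize to an in-memory writer.
val out = StringWriter()
pickle(Json).to(out).write(MyMessage("test", 1))
println(out.toString())

// Read the same message back from the JSON text.
val msg = pickle(Json).from(out.toString()).read[MyMessage]
```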
Here's a feature matrix for each format:
| Format | Reader | Writer | All Types | Evolution Friendly | Notes |
|---|---|---|---|---|---|
| Json | Yes | Yes | Yes | Yes | Uses Jawn for parsing |
| Protos | Yes | Yes | Yes | Yes | Evolution-friendly binary format |
| NBT | Yes | Yes | Yes | | For the kids. |
| XML | Yes | Yes | Yes | | Inefficient prototype. |
| Pretty | No | Yes | No | | For pretty-printing strings |
See Compliance for more details on what this means.
Everyone's favorite non-YAML web data transfer format! This uses Jawn under the covers for parsing, but can write Json without any dependencies.
Example:
```scala
import sauerkraut.{pickle, read, write}
import sauerkraut.core.{Buildable, Writer, given}
import sauerkraut.format.json.Json

case class MyWebData(value: Int, someStuff: Array[String]) derives Buildable, Writer

def read(in: java.io.InputStream): MyWebData =
  pickle(Json).from(in).read[MyWebData]

def write(out: java.io.OutputStream): Unit =
  pickle(Json).to(out).write(MyWebData(1214, Array("this", "is", "a", "test")))
```
sbt build:
```scala
libraryDependencies += "com.jsuereth.sauerkraut" %% "json" % "<version>"
```
See json project for more information.
A new encoding for protocol buffers within Scala! This supports a subset of all possible protocol buffer messages but allows full definition of the message format within your Scala code.
Example:
```scala
import sauerkraut.{pickle, write, read, Field}
import sauerkraut.core.{Writer, Buildable, given}
import sauerkraut.format.pb.{Proto, given}

// @Field(n) pins each member to protocol buffer field number n.
case class MyMessageData(value: Int @Field(3), someStuff: Array[String] @Field(2))
  derives Writer, Buildable

def write(out: java.io.OutputStream): Unit =
  pickle(Proto).to(out).write(MyMessageData(1214, Array("this", "is", "a", "test")))
```
This example serializes to the equivalent of the following protocol buffer message:
```proto
message MyMessageData {
  int32 value = 3;
  repeated string someStuff = 2;
}
```
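To make the mapping from field numbers to bytes concrete, here is a minimal, self-contained sketch that hand-encodes the message above using the standard protocol buffer wire format. `WireSketch` and its methods are hypothetical helpers written for this illustration and are not part of sauerkraut; in particular, writing fields in field-number order is an assumption of the sketch, not a confirmed property of the `pb` format.

```scala
import java.io.ByteArrayOutputStream
import java.nio.charset.StandardCharsets.UTF_8

// Hypothetical helper for illustration only; not part of sauerkraut.
object WireSketch:
  // Base-128 varint: 7 payload bits per byte, high bit set on every byte but the last.
  def writeVarint(out: ByteArrayOutputStream, value: Long): Unit =
    var v = value
    var more = true
    while more do
      val low = (v & 0x7FL).toInt
      v = v >>> 7
      more = v != 0L
      out.write(if more then low | 0x80 else low)

  // A field key is (field number << 3) | wire type.
  def writeTag(out: ByteArrayOutputStream, fieldNumber: Int, wireType: Int): Unit =
    writeVarint(out, ((fieldNumber << 3) | wireType).toLong)

  def encode(value: Int, someStuff: Array[String]): Array[Byte] =
    val out = ByteArrayOutputStream()
    // repeated string someStuff = 2: wire type 2 (length-delimited), one entry per element.
    for s <- someStuff do
      val bytes = s.getBytes(UTF_8)
      writeTag(out, 2, 2)
      writeVarint(out, bytes.length.toLong)
      out.write(bytes, 0, bytes.length)
    // int32 value = 3: wire type 0 (varint); negative values are sign-extended to 64 bits.
    writeTag(out, 3, 0)
    writeVarint(out, value.toLong)
    out.toByteArray
```

Calling `WireSketch.encode(1214, Array("this", "is", "a", "test"))` yields 22 bytes. Each string is prefixed by the key byte `0x12` and its length, and the trailing bytes `18 BE 09` are the key for field 3 followed by the varint for 1214.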
sbt build:
```scala
libraryDependencies += "com.jsuereth.sauerkraut" %% "pb" % "<version>"
```
See pb project for more information.
Named-Binary-Tags, a format popularized by Minecraft.
Example:
```scala
import sauerkraut.{pickle, read, write}
import sauerkraut.core.{Buildable, Writer, given}
import sauerkraut.format.nbt.Nbt

case class MyGameData(value: Int, someStuff: Array[String]) derives Buildable, Writer

def read(in: java.io.InputStream): MyGameData =
  pickle(Nbt).from(in).read[MyGameData]

def write(out: java.io.OutputStream): Unit =
  pickle(Nbt).to(out).write(MyGameData(1214, Array("this", "is", "a", "test")))
```
sbt build:
```scala
libraryDependencies += "com.jsuereth.sauerkraut" %% "nbt" % "<version>"
```
See nbt project for more information.
Everyone's favorite markup language for data transfer!
Example:
```scala
import sauerkraut.{pickle, read, write}
import sauerkraut.core.{Buildable, Writer, given}
import sauerkraut.format.xml.{Xml, given}

case class MySlowWebData(value: Int, someStuff: Array[String]) derives Buildable, Writer

def read(in: java.io.InputStream): MySlowWebData =
  pickle(Xml).from(in).read[MySlowWebData]

def write(out: java.io.Writer): Unit =
  pickle(Xml).to(out).write(MySlowWebData(1214, Array("this", "is", "a", "test")))
```
sbt build:
```scala
libraryDependencies += "com.jsuereth.sauerkraut" %% "xml" % "<version>"
```
See xml project for more information.
A format that is solely used to pretty-print object contents to strings. This does not have a `PickleReader`, only a `PickleWriter`.
Example:
```scala
import sauerkraut._, sauerkraut.core.{Writer, given}

case class MyAwesomeData(theBest: Int, theCoolest: String) derives Writer

scala> MyAwesomeData(1, "The Greatest").prettyPrint
val res0: String = Struct(rs$line$2.MyAwesomeData) {
  theBest: 1
  theCoolest: The Greatest
}
```
We split serialization into three layers:
- The `source` layer. It is expected these are some kind of stream.
- The `Format` layer. This is responsible for reading a raw source and converting it into the component types used in the `Shape` layer. See `PickleReader` and `PickleWriter`.
- The `Shape` layer. This is responsible for turning Primitives, Structs, Choices and Collections into component types.
It's the circle of data:
```
Source       => format       => shape      => memory => shape     => format       => Destination
[PickleData] => PickleReader => Builder[T] => T      => Writer[T] => PickleWriter => [PickleData]
```
This, hopefully, means we can reuse a lot of logic between the various formats with only a light loss of efficiency.
Note: This library is not measuring performance yet.
The Shape layer is responsible for extracting Scala types into known shapes that can be used for serialization. These shapes, currently, are `Collection`, `Structure` and `Primitive`. Custom shapes can be created in terms of these three shapes.
The Shape layer defines these three classes:
- `sauerkraut.core.Writer[T]`: Can translate a value into write* calls of Primitive, Structure or Collection.
- `sauerkraut.core.Builder[T]`: Can accept an incoming stream of collections/structures/primitives and build a value of T from them.
- `sauerkraut.core.Buildable[T]`: Can provide a `Builder[T]` when asked.
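The division of labour between these three classes can be sketched with simplified stand-ins. Everything below (`Event`, `SketchWriter`, `SketchBuilder`, `SketchBuildable`, `roundTrip`) is hypothetical and written only for this illustration; the real sauerkraut traits have richer contracts, and instances are normally derived rather than hand-written.

```scala
import scala.collection.mutable

// Hypothetical events standing in for the write* calls a format would receive.
enum Event:
  case StructStart(name: String)
  case Field(name: String)
  case IntValue(value: Int)
  case StringValue(value: String)
  case StructEnd

// Stand-in for Writer[T]: turns a value into a sequence of write calls.
trait SketchWriter[T]:
  def write(value: T, emit: Event => Unit): Unit

// Stand-in for Builder[T]: accepts pushed values, then produces a T.
trait SketchBuilder[T]:
  def putField(name: String, value: Any): Unit
  def result: T

// Stand-in for Buildable[T]: hands out a fresh builder when asked.
trait SketchBuildable[T]:
  def newBuilder: SketchBuilder[T]

case class MyMessage(field: String, data: Int)

given SketchWriter[MyMessage] with
  def write(value: MyMessage, emit: Event => Unit): Unit =
    emit(Event.StructStart("MyMessage"))
    emit(Event.Field("field"))
    emit(Event.StringValue(value.field))
    emit(Event.Field("data"))
    emit(Event.IntValue(value.data))
    emit(Event.StructEnd)

given SketchBuildable[MyMessage] with
  def newBuilder: SketchBuilder[MyMessage] =
    new SketchBuilder[MyMessage]:
      private val fields = mutable.Map.empty[String, Any]
      def putField(name: String, value: Any): Unit = fields.update(name, value)
      def result: MyMessage =
        MyMessage(fields("field").asInstanceOf[String], fields("data").asInstanceOf[Int])

// Plays the role of a format: records the writer's events, then pushes them into a builder.
def roundTrip[T](value: T)(using w: SketchWriter[T], b: SketchBuildable[T]): T =
  val events = mutable.ArrayBuffer.empty[Event]
  w.write(value, e => { events += e; () })
  val builder = b.newBuilder
  var current = ""
  events.foreach {
    case Event.Field(name)    => current = name
    case Event.IntValue(v)    => builder.putField(current, v)
    case Event.StringValue(v) => builder.putField(current, v)
    case _                    => ()
  }
  builder.result
```

Here `roundTrip(MyMessage("test", 1))` rebuilds an equal `MyMessage`. The point of the sketch is that the reading side pushes values into a builder instead of asking the target type how to read itself.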
The format layer is responsible for mapping sauerkraut shapes (`Collection`, `Structure`, `Primitive`, `Choice`) into the underlying format. Not all shapes in sauerkraut will map exactly to underlying formats, and so each format may need to adjust/tweak incoming data as appropriate.
The format layer has these primary classes:
- `sauerkraut.format.PickleReader`: Can load data and push it into a Builder of type T.
- `sauerkraut.format.PickleWriter`: Accepts pushed structures/collections/primitives and places them into a Pickle.
The `source` layer is allowed to be any type that a format wishes to support. Inputs and outputs are provided to the API via these two classes:
- `sauerkraut.format.PickleReaderSupport[Input, Format]`: A given of this instance will allow the `PickleReader` to be constructed from a type of input.
- `sauerkraut.format.PickleWriterSupport[Output, Format]`: A given of this instance will allow `PickleWriter` to be constructed from a type of output.
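The way a given instance selects a writer for a particular output type can be sketched as follows. `ToyFormat`, `ToyPickleWriter`, `ToyWriterSupport` and `writerTo` are hypothetical names invented for this illustration; they mimic the pattern only and are not the actual signatures of `PickleWriterSupport`.

```scala
import java.io.StringWriter

// Hypothetical stand-in for a format marker such as Json.
object ToyFormat

// Hypothetical stand-in for PickleWriter, reduced to a single primitive.
trait ToyPickleWriter:
  def putString(value: String): Unit

// Hypothetical stand-in for PickleWriterSupport[Output, Format].
trait ToyWriterSupport[Output, Format]:
  def writerFor(format: Format, output: Output): ToyPickleWriter

// Support for writing ToyFormat into a java.io.StringWriter.
given stringWriterSupport: ToyWriterSupport[StringWriter, ToyFormat.type] with
  def writerFor(format: ToyFormat.type, output: StringWriter): ToyPickleWriter =
    new ToyPickleWriter:
      def putString(value: String): Unit = output.write(s"<$value>")

// Support for writing ToyFormat into a StringBuilder.
given stringBuilderSupport: ToyWriterSupport[StringBuilder, ToyFormat.type] with
  def writerFor(format: ToyFormat.type, output: StringBuilder): ToyPickleWriter =
    new ToyPickleWriter:
      def putString(value: String): Unit = output.append(s"<$value>")

// Only compiles for (output, format) pairs that have a given in scope.
def writerTo[O, F](format: F, output: O)(using support: ToyWriterSupport[O, F]): ToyPickleWriter =
  support.writerFor(format, output)
```

With these definitions, `writerTo(ToyFormat, new StringBuilder)` and `writerTo(ToyFormat, StringWriter())` both resolve at compile time, while an unsupported output type such as `Int` is rejected by the compiler. A new format would add its own givens for whichever inputs and outputs it can handle.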
This layer is designed to support any type of input and output, not just an in-memory store (like a Json Ast) or a streaming input. Formats can define what types of input/output (or execution environment) they allow.
New formats are expected to provide the "format" + "source" layer implementations they require.
TODO - a bit more here.
There are a few major differences from the old scala pickling project.
- The core library is built for 100% static code generation. While we think that dynamic (i.e. runtime-reflection-based) pickling could be built using this library, it is a non-goal.
- Users are expected to rely on typeclass derivation to generate Reader/Writers, rather than using macros
- The supported types that can be pickled are limited to those supported by typeclass derivation or those that can have hand-written `Writer[_]`/`Builder[_]` instances.
- Readers are no longer driven by the Scala type. Instead we use a new `Buildable[A]`/`Builder[A]` design to allow each `PickleReader` to push values into a `Builder[A]` that will then construct the Scala class.
- There have been no runtime performance optimisations around codegen. Those will come as we test the limits of Scala 3 / Dotty.
- Format implementations are separate libraries.
- The `PickleWriter` contract has been split into several types to avoid misuse. This puts more lambdas in play, but the cost may be offset by optimisations in modern versions of Scala/JVM.
- The name is more German.
Benchmarking is still being built-out, and is pending the final design on Choice/Sum-Types within the Format/Shape layer.
You can see benchmark results via: `benchmarks/jmh:run -rf csv`.
Latest status/analysis can be found in the benchmarks directory.
- Basic comparison of all formats
- Size-of-Pickle measurement
- Well-thought out dataset for reading/writing
- Isolated read vs. write testing
- Comparison against other frameworks.
- Protos vs. protocol buffer java implementation
- Json Reading vs. raw JAWN to AST (measure overhead)
- Jackson
- Kryo
- Thrift
- Circe
- uPickle
- Automatic well-formatted graph dump in Markdown of results.
Thanks to everyone who contributed to the original pickling library for inspiration, with a few callouts.
- Heather Miller + Philipp Haller for the original idea, innovation and motivation for Scala.
- Havoc Pennington + Eugene Yokota for helping define what's important when pickling a protocol and evolving that protocol.