typelevel/jawn

"Jawn is for parsing jay-sawn."

Origin

The term "jawn" comes from the Philadelphia area. It conveys about as much information as "thing" does. I chose the name because I had moved to Montreal so I was remembering Philly fondly. Also, there isn't a better way to describe objects encoded in JSON than "things". Finally, we get a catchy slogan.

Jawn was designed to parse JSON into an AST as quickly as possible.

Overview

Jawn consists of three parts:

  1. A fast, generic JSON parser (jawn-parser)
  2. A small, somewhat anemic AST (jawn-ast)
  3. A few helpful utilities (jawn-util)

Currently Jawn is competitive with the fastest Java JSON libraries (GSON and Jackson) and in the author's benchmarks it often wins. It seems to be faster than any other Scala parser that exists (as of July 2014).

Given the plethora of really nice JSON libraries for Scala, the expectation is that you're probably here for jawn-parser or a support package.

Quick Start

Jawn supports Scala 2.12, 2.13, and 3 on the JVM and Scala.js. Scala 2.12 and 2.13 are supported on Scala Native.

Here's a build.sbt snippet that shows you how to depend on Jawn in your own sbt project:

// use this if you just want jawn's parser, and will implement your own facade
libraryDependencies += "org.typelevel" %% "jawn-parser" % "1.3.2"

// use this if you want jawn's parser and also jawn's ast
libraryDependencies += "org.typelevel" %% "jawn-ast" % "1.3.2"

If you want to use Jawn's parser with another project's AST, see the "Supporting external ASTs with Jawn" section. There are a few reasons you might want to do this:

  • The library's built-in parser is significantly slower than Jawn's.
  • Jawn supports more input types (ByteBuffer, File, etc.).
  • You need asynchronous JSON parsing.

Dependencies

jawn-parser has no dependencies other than Scala.

jawn-ast depends on jawn-parser but nothing else.

Parsing

Jawn's parser is both fast and relatively featureful. Assuming you want to get back an AST of type J and you have a Facade[J] defined, you can use the following parse signatures:

Parser.parseUnsafe[J](String) → J
Parser.parseFromString[J](String) → Try[J]
Parser.parseFromPath[J](String) → Try[J]
Parser.parseFromFile[J](File) → Try[J]
Parser.parseFromChannel[J](ReadableByteChannel) → Try[J]
Parser.parseFromByteBuffer[J](ByteBuffer) → Try[J]
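As a rough illustration of these entry points, the sketch below goes through JParser from jawn-ast, which fixes J to Jawn's own JValue so no facade has to be supplied by hand. The names ParseExample and describe are invented for this example and are not part of Jawn.

```scala
import org.typelevel.jawn.ast.{JParser, JValue}
import scala.util.{Failure, Success, Try}

object ParseExample {
  // Turn a parse result into a short human-readable message.
  def describe(result: Try[JValue]): String =
    result match {
      case Success(json) => s"parsed: ${json.render()}"
      case Failure(err)  => s"failed: ${err.getMessage}"
    }

  def main(args: Array[String]): Unit = {
    // Well-formed input yields a Success wrapping the AST.
    println(describe(JParser.parseFromString("""{"name": "jawn", "fast": true}""")))

    // Truncated input yields a Failure instead of throwing.
    println(describe(JParser.parseFromString("""{"name": """)))

    // parseUnsafe returns the value directly and throws on bad input.
    val v: JValue = JParser.parseUnsafe("[1, 2, 3]")
    println(v.get(0).asLong)
  }
}
```

The Try-returning variants are usually the safer default for untrusted input; parseUnsafe is convenient when the input is known to be valid.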

Jawn also supports asynchronous parsing, which allows users to feed the parser with data as it is available. There are three modes:

  • SingleValue waits to return a single J value once parsing is done.
  • UnwrapArray: if the top-level element is an array, return values as they become available. Set multiValue to true if you want to support multiple top-level arrays.
  • ValueStream parses one or more JSON values separated by whitespace.

Here's an example:

import org.typelevel.jawn.ast
import org.typelevel.jawn.AsyncParser
import org.typelevel.jawn.ParseException

val p = ast.JParser.async(mode = AsyncParser.UnwrapArray)

def chunks: Stream[String] = ???
def sink(j: ast.JValue): Unit = ???

def loop(st: Stream[String]): Either[ParseException, Unit] =
  st match {
    case s #:: tail =>
      p.absorb(s) match {
        case Right(js) =>
          js.foreach(sink)
          loop(tail)
        case Left(e) =>
          Left(e)
      }
    case _ =>
      p.finish().right.map(_.foreach(sink))
  }

loop(chunks)

You can also call Parser.async[J] to use async parsing with an arbitrary data type (provided you also have an implicit Facade[J]).
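ValueStream mode can be driven with the same absorb/finish calls as the example above. The sketch below is one possible way to collect every value from a list of chunks; ValueStreamExample and parseAll are hypothetical names, and the code assumes the same implicit facade resolution that the example above relies on.

```scala
import org.typelevel.jawn.AsyncParser
import org.typelevel.jawn.ParseException
import org.typelevel.jawn.ast.{JParser, JValue}

object ValueStreamExample {
  // Feed each chunk to a fresh ValueStream parser and gather the emitted values.
  // Stops at the first parse error and returns it as a Left.
  def parseAll(chunks: List[String]): Either[ParseException, Vector[JValue]] = {
    val p = JParser.async(mode = AsyncParser.ValueStream)
    val start: Either[ParseException, Vector[JValue]] = Right(Vector.empty)

    val absorbed = chunks.foldLeft(start) { (acc, chunk) =>
      // Once acc is a Left, flatMap short-circuits and the parser is not fed again.
      acc.flatMap(seen => p.absorb(chunk).map(js => seen ++ js))
    }

    // finish() flushes any value still pending at the end of the input.
    absorbed.flatMap(seen => p.finish().map(js => seen ++ js))
  }

  def main(args: Array[String]): Unit = {
    // The second object is split across two chunks on purpose.
    val result = parseAll(List("""{"a": 1} {"b""", """": 2} [3]"""))
    println(result.map(_.map(_.render())))
  }
}
```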

Supporting external ASTs with Jawn

Circe

circe is supported via its circe-parser module.

Argonaut

argonaut is supported via its argonaut-jawn module.

Do-It-Yourself Parsing

Jawn supports building any JSON AST you need via type classes. You benefit from Jawn's fast parser while still using your favorite Scala JSON library. This mechanism is also what allows Jawn to provide "support" for other libraries' ASTs.

To include Jawn's parser in your project, add the following snippet to your build.sbt file:

resolvers += Resolver.sonatypeRepo("releases")
libraryDependencies += "org.typelevel" %% "jawn-parser" % "1.3.2"

To support your AST of choice, you'll want to define a Facade[J] instance, where the J type parameter represents the base of your JSON AST. For example, here's a facade that supports Spray:

import spray.json._

object Spray extends SimpleFacade[JsValue] {
  def jnull() = JsNull
  def jfalse() = JsFalse
  def jtrue() = JsTrue
  def jnum(s: String) = JsNumber(s)
  def jint(s: String) = JsNumber(s)
  def jstring(s: String) = JsString(s)
  def jarray(vs: List[JsValue]) = JsArray(vs)
  def jobject(vs: Map[String, JsValue]) = JsObject(vs)
}

Most ASTs will be easy to define using the SimpleFacade or MutableFacade traits. However, if an AST's object or array instances do more than just wrap a Scala collection, it may be necessary to extend Facade directly.

Extend SupportParser[J], supplying your facade as the abstract facade, to get convenient methods for parsing various input types or an AsyncParser.

Using the AST

Access

For accessing atomic values, JValue supports two sets of methods: get-style methods and as-style methods.

The get-style methods return Some(_) when called on a compatible JSON value (e.g. strings can return Some[String], numbers can return Some[Double], etc.), and None otherwise:

getBoolean → Option[Boolean]
getString → Option[String]
getLong → Option[Long]
getDouble → Option[Double]
getBigInt → Option[BigInt]
getBigDecimal → Option[BigDecimal]

In contrast, the as-style methods will either return an unwrapped value (instead of returning Some(_)) or throw an exception (instead of returning None):

asBoolean → Boolean // or exception
asString → String // or exception
asLong → Long // or exception
asDouble → Double // or exception
asBigInt → BigInt // or exception
asBigDecimal → BigDecimal // or exception

To access elements of an array, call get with an Int position:

get(i: Int) → JValue // returns JNull if index is illegal

To access elements of an object, call get with a String key:

get(k: String) → JValue // returns JNull if key is not found

Both of these methods also return JNull if the value is not the appropriate container. This allows the caller to chain lookups without having to check that each level is correct:

val v: JValue = ???

// returns JNull if a problem is encountered in structure of 'v'.
val t: JValue = v.get("novels").get(0).get("title")

// if 'v' had the right structure and 't' is JString(s), then Some(s).
// otherwise, None.
val titleOrNone: Option[String] = t.getString

// equivalent to titleOrNone.getOrElse(throw ...)
val titleOrDie: String = t.asString
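To make the chained lookup concrete, here is a small sketch that runs it against parsed documents of different shapes. LookupExample, firstTitle, and the sample JSON are made up for illustration; only the get, getString, getLong, and asString calls come from the descriptions above.

```scala
import org.typelevel.jawn.ast.{JParser, JValue}
import scala.util.Try

object LookupExample {
  // Look up the title of the first novel, if the document has that shape.
  def firstTitle(doc: JValue): Option[String] =
    doc.get("novels").get(0).get("title").getString

  def main(args: Array[String]): Unit = {
    val library = JParser.parseUnsafe("""{"novels": [{"title": "Middlemarch", "year": 1871}]}""")
    val empty   = JParser.parseUnsafe("""{"novels": []}""")
    val wrong   = JParser.parseUnsafe("""["not", "an", "object"]""")

    // The full path exists, so the title is found.
    println(firstTitle(library))

    // get(0) on an empty array and get("novels") on an array both yield JNull,
    // so the chain ends in None instead of failing part-way.
    println(firstTitle(empty))
    println(firstTitle(wrong))

    // get-style and as-style accessors on the same value.
    val year = library.get("novels").get(0).get("year")
    println(year.getLong)            // numbers answer getLong
    println(year.getString)          // but are not strings
    println(Try(year.asString).isFailure)
  }
}
```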

Updating

The atomic values (JNull, JBoolean, JNum, and JString) are immutable.

Objects are fully mutable and can have items added, removed, or changed:

set(k: String, v: JValue) → Unit
remove(k: String) → Option[JValue]

If set is called on a non-object, an exception will be thrown. If remove is called on a non-object, None will be returned.

Arrays are semi-mutable. Their values can be changed, but their size is fixed:

set(i: Int, v: JValue) → Unit

If set is called on a non-array, or called with an illegal index, an exception will be thrown.

(A future version of Jawn may provide an array whose length can be changed.)
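The update rules above can be exercised with a short sketch like the following. UpdateExample and the sample document are invented for this illustration; the sketch assumes that values produced by JParser are the mutable objects and arrays described here.

```scala
import org.typelevel.jawn.ast.{JNull, JNum, JParser, JString, JValue}
import scala.util.Try

object UpdateExample {
  def main(args: Array[String]): Unit = {
    val doc: JValue = JParser.parseUnsafe("""{"name": "jawn", "scores": [1, 2, 3]}""")

    // Objects: add (or overwrite) a key, then remove another.
    doc.set("license", JString("MIT"))
    val removed: Option[JValue] = doc.remove("name")

    // Arrays: replace an element in place; the length stays fixed.
    val scores: JValue = doc.get("scores")
    scores.set(0, JNum(10L))

    // Misuse throws, so wrap calls in Try when the shape is not known up front.
    val outOfRange  = Try(scores.set(3, JNull))   // index past the end of the array
    val notAnObject = Try(scores.set("k", JNull)) // string key on an array

    println(doc.render())
    println(removed.isDefined)
    println(outOfRange.isFailure && notAnObject.isFailure)
  }
}
```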

Profiling

Jawn uses JMH along with the sbt-jmh plugin.

Running Benchmarks

The benchmarks are located in the benchmark project. You can run the benchmarks by typing benchmark/jmh:run from SBT. There are many supported arguments, so here are a few examples:

Run all benchmarks, with 10 warmups, 10 iterations, using 3 threads:

benchmark/jmh:run -wi 10 -i 10 -f1 -t3

Run just the CountriesBench test (5 warmups, 5 iterations, 1 thread):

benchmark/jmh:run -wi 5 -i 5 -f1 -t1 .*CountriesBench

Benchmark Issues

Currently, the benchmarks are a bit fiddly. The most obvious symptom is that if you compile the benchmarks, make changes, and compile again, you may see errors like:

[error] (benchmark/jmh:generateJavaSources) java.lang.NoClassDefFoundError: jawn/benchmark/Bla25Bench

The fix here is to run benchmark/clean and try again.

You will also see intermittent problems like:

[error] (benchmark/jmh:compile) java.lang.reflect.MalformedParameterizedTypeException

The solution here is easier (though frustrating): just try it again. If you continue to have problems, consider cleaning the project and trying again.

(In the future I hope to make the benchmarking here a bit more resilient. Suggestions and pull requests gladly welcome!)

Files

The benchmarks use files located in benchmark/src/main/resources. If you want to test your own files (e.g. mydata.json), you would:

  • Copy the file to benchmark/src/main/resources/mydata.json.
  • Add the following code to JmhBenchmarks.scala:

class MyDataBench extends JmhBenchmarks("mydata.json")

Jawn has been tested with much larger files, e.g. 100M - 1G, but these are obviously too large to ship with the project.

With large files, it's usually easier to comment out most of the benchmarking methods and only test one (or a few) methods. Some of the slower JSON parsers get much slower for large files.

Interpreting the results

Remember that the benchmarking results you see will vary based on:

  • Hardware
  • Java version
  • JSON file size
  • JSON file structure
  • JSON data values

I have tried to use each library in the most idiomatic and fastest way possible (to parse the JSON into a simple AST). Pull requests to update library versions and improve usage are very welcome.

Future Work

More support libraries could be added.

It's likely that some of Jawn's I/O could be optimized a bit more, and also made more configurable. The heuristics around all-at-once loading versus input chunking could definitely be improved.

In cases where the user doesn't need fast lookups into JSON objects, an even lighter AST could be used to improve parsing and rendering speeds.

Strategies to cache/intern field names of objects could pay big dividends in some cases (this might require AST changes).

If you have ideas for any of these (or other ideas) please feel free to open an issue or pull request so we can talk about it.

Disclaimers

Jawn only supports UTF-8 when parsing bytes. This might change in the future, but for now that's the target case. You can always decode your data to a string, and handle the character set decoding using Java's standard tools.
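A minimal sketch of that workaround, assuming the bytes are known to be in some other charset such as ISO-8859-1: decode them with java.nio.charset first, then hand the resulting string to the parser. CharsetExample and parseWithCharset are hypothetical names, not part of Jawn.

```scala
import java.nio.charset.{Charset, StandardCharsets}
import org.typelevel.jawn.ast.{JParser, JValue}
import scala.util.Try

object CharsetExample {
  // Decode the bytes with an explicit charset, then parse the resulting string.
  def parseWithCharset(bytes: Array[Byte], charset: Charset): Try[JValue] =
    JParser.parseFromString(new String(bytes, charset))

  def main(args: Array[String]): Unit = {
    // Simulate input that arrives encoded as Latin-1 rather than UTF-8.
    val latin1Bytes = """{"city": "Montréal"}""".getBytes(StandardCharsets.ISO_8859_1)

    val parsed = parseWithCharset(latin1Bytes, StandardCharsets.ISO_8859_1)
    println(parsed.map(_.get("city").asString))
  }
}
```

This costs an extra copy of the input, so for large UTF-8 payloads the byte-based entry points are likely the better fit.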

Jawn's AST is intended to be very lightweight and simple. It supports simple access, and limited mutable updates. It intentionally lacks the power and sophistication of many other JSON libraries.

Community

People are expected to follow the Scala Code of Conduct when discussing Jawn on GitHub or other venues.

Jawn's current maintainers are:

Copyright and License

All code is available to you under the MIT license, available at http://opensource.org/licenses/mit-license.php.

Copyright Erik Osheim, 2012-2022.

