Dirk Eddelbuettel
RcppSimdJson: Rcpp Bindings for the simdjson Header Library


Motivation

simdjson by Daniel Lemire (with contributions by Geoff Langdale, John Keiser and many others) is an engineering marvel. Through very clever use of SIMD instructions, it manages to parse JSON files faster than disk access. Wut? Yes, you read that right: parallel processing with so little overhead that the net throughput is limited only by disk speed.

Moreover, it is implemented in neat modern C++ and can be accessed as a header-only library. (Well, one library in two files, really.) Which makes R packaging easy and convenient and compelling. So here we are.

For further introduction, see the arXiv paper by Langdale and Lemire (out/to appear in VLDB Journal 28(6) as well) and/or the video of the recent talk by Daniel Lemire at QCon (voted best talk).

Example

    jsonfile <- system.file("jsonexamples", "twitter.json", package="RcppSimdJson")
    library(RcppSimdJson)
    validateJSON(jsonfile)      # validate a JSON file
    res <- fload(jsonfile)      # parse a JSON file

Comparison

A simple file-oriented parsing benchmark against the other R-accessible JSON parsers:

> print(res)
Unit: microseconds
     expr       min        lq      mean    median        uq        max neval cld
  yyjsonr   312.267   347.683   405.177    390.11   425.827    926.776   100 a
 simdjson   274.367   323.998   447.691    467.79   526.237    773.070   100 a
  jsonify  2727.874  2813.681  2952.804   2896.84  2972.852   7442.755   100  b
 jsonlite  4237.538  4435.683  4587.428   4552.38  4668.345   7082.673   100   c
  RJSONIO  9131.864  9425.515  9707.274   9599.48  9845.006  13516.616   100    d
   ndjson 91668.822 92628.357 95386.212  93192.37 94507.484 152179.095   100     e
>

Or in chart form, also including the second benchmark parsing strings.
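A comparison of this kind can be sketched with the microbenchmark package. This is a minimal sketch, not the benchmark script that produced the numbers above: restricting it to two parsers is an assumption made for brevity, and it assumes the jsonlite and microbenchmark packages are installed. Timings will differ by machine.

```r
library(RcppSimdJson)
library(microbenchmark)

# Example file shipped with the package (same file as in the Example section)
jsonfile <- system.file("jsonexamples", "twitter.json", package = "RcppSimdJson")

# Assumption: only two of the parsers are compared here, 100 runs each
res <- microbenchmark(
    simdjson = RcppSimdJson::fload(jsonfile),
    jsonlite = jsonlite::fromJSON(jsonfile),
    times = 100
)
print(res)
```

The `cld` column shown in the table above appears only when the multcomp package is available; without it, the remaining columns print as usual.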

Status

All three major OSs are supported, and JSON can be parsed from file and string under a variety of settings. A C++17 compiler is required for ease of setup (though the upstream can fall back to older compilers; one can edit src/Makevars accordingly if need be).
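As a small illustration of parsing from a string, the sketch below uses fparse(), the package's string-oriented counterpart to fload(). The JSON literals are made up for the example, and only the max_simplify_lvl setting is shown; see the package documentation for the full set of options.

```r
library(RcppSimdJson)

# Hypothetical JSON document, made up for this example
json <- '{"name": "simdjson", "tags": ["fast", "simd"], "stars": 3}'

# Parse a JSON string; a JSON object becomes a named list
parsed <- fparse(json)
parsed$name
parsed$tags    # a homogeneous array is simplified to an atomic vector

# One of the available settings: keep arrays as lists instead of simplifying
as_list <- fparse("[1, 2, 3]", max_simplify_lvl = "list")
str(as_list)
```

The same max_simplify_lvl argument is also accepted by fload() when parsing from a file.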

Contributing

Any problems, bug reports, or feature requests for the package can be submitted and handled most conveniently as GitHub issues in the repository.

Before submitting pull requests, it is frequently preferable to first discuss need and scope in such an issue ticket. See the file Contributing.md (in the Rcpp repo) for a brief discussion.

See Also

For standard JSON work on R, as well as for other nicely done C++libraries, consider these:

Author

For the R package wrapper, Dirk Eddelbuettel.

For everything pertaining to simdjson, Daniel Lemire (and many contributors).

License

GPL (>= 2)

Initially created: Wed 12 Feb 2020 07:36:17 PM CST
Last modified: Sun May 26 10:18:35 CDT 2024

