add SimdJsonParser2 base on bitindex #60


Open
heykirby wants to merge 1 commit into simdjson:main from heykirby:feature_simdjson2

Conversation

@heykirby

issue: #59

@heykirby force-pushed the feature_simdjson2 branch 5 times, most recently from 5c92d47 to 3139b2c (October 7, 2024 09:53)
@heykirby
Author

heykirby commented Oct 19, 2024

@arouel thanks very much, I have fixed the code based on your suggestions.
When the paths to parse are known in advance, SimdJsonParserWithFixPath provides better performance and supports compressing map and list type data into strings. It can quickly skip paths that do not need to be parsed, and it avoids creating a JSON node instance for every JSON node.

Benchmark results, for reference:
The environment is Species[byte, 32, S_256_BIT]

Result "org.simdjson.AParseAndSelectFixPathBenchMark.parseMultiValuesForFixPaths_Jackson":
693.528 ±(99.9%) 18.073 ops/s [Average]
(min, avg, max) = (687.806, 693.528, 699.113), stdev = 4.694
CI (99.9%): [675.455, 711.601] (assumes normal distribution)

Result "org.simdjson.ParseAndSelectFixPathBenchMark.parseMultiValuesForFixPaths_SimdJson":
2258.495 ±(99.9%) 41.596 ops/s [Average]
(min, avg, max) = (2242.400, 2258.495, 2269.942), stdev = 10.802
CI (99.9%): [2216.899, 2300.091] (assumes normal distribution)

Result "org.simdjson.ParseAndSelectFixPathBenchMark.parseMultiValuesForFixPaths_SimdJsonParserWithFixPath":
4075.984 ±(99.9%) 104.804 ops/s [Average]
(min, avg, max) = (4029.568, 4075.984, 4100.273), stdev = 27.217
CI (99.9%): [3971.180, 4180.789] (assumes normal distribution)
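
To make the comparison easier to read, the three averages above can be turned into relative speedups. The snippet below is only a small arithmetic sketch: the class and method names are made up for illustration, and the constants are the mean throughputs copied from the JMH output above.

```java
// Mean throughputs (ops/s) copied from the JMH results above.
// Class and method names are illustrative, not part of simdjson-java.
final class FixPathSpeedup {
    static final double JACKSON = 693.528;
    static final double SIMDJSON = 2258.495;
    static final double FIX_PATH = 4075.984;

    /** Ratio of two throughputs; a value above 1.0 means candidate is faster. */
    static double speedup(double candidate, double baseline) {
        return candidate / baseline;
    }

    public static void main(String[] args) {
        System.out.printf("SimdJson vs Jackson:    %.2fx%n", speedup(SIMDJSON, JACKSON));
        System.out.printf("FixPath vs Jackson:     %.2fx%n", speedup(FIX_PATH, JACKSON));
        System.out.printf("FixPath vs SimdJson:    %.2fx%n", speedup(FIX_PATH, SIMDJSON));
    }
}
```

On these numbers the fixed-path parser is roughly 5.9 times faster than Jackson and roughly 1.8 times faster than the existing SimdJson path, for this one benchmark and environment.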

arouel reacted with thumbs up emoji

@piotrrzysko
Member

How is this different from On-Demand parsing available in the C++ simdjson version?

I introduced a form of on-demand parsing in #51 (see: org.simdjson.OnDemandJsonIterator). The API requires specifying a target class to which the JSON will be parsed. However, it should be relatively easy to extend this to support a DOM-like API (JsonValue, JsonIterator, etc.), which I believe is more intuitive than introducing syntax for accessing fields and then returning an array of strings with the corresponding values.

arouel and heykirby reacted with thumbs up emoji

@arouel

@piotrrzysko I agree with you. A DOM-like API (JsonValue, JsonIterator, etc.) would be very helpful in use cases where only specific parts of the JSON are conditionally relevant, and where mapping to an object would cause allocations that you want to avoid.

Can you guide us a bit, so that we can prepare a PR?

heykirby reacted with thumbs up emoji

@arouel left a comment

@heykirby I just want to share some thoughts/questions:

With some minor API changes in simdjson-java, could we keep the SimdJsonParserWithFixPath in another codebase, or could it live in a contribution module, since it is tailored for a very specific use case?

Isn't a record JsonNode sufficient compared to using lombok?
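
As a rough illustration of the record suggestion: a Java record generates the constructor, accessors, equals, hashCode and toString that Lombok annotations would otherwise provide. The field names below are hypothetical, since the actual fields of the PR's JsonNode are not shown in this thread.

```java
import java.util.List;

/** Hypothetical node shape; the field names are illustrative only. */
record JsonNode(String key, int resultIndex, List<JsonNode> children) {
    // Compact constructor: copy the list so the record cannot be mutated from outside.
    JsonNode {
        children = List.copyOf(children);
    }
}

final class RecordDemo {
    public static void main(String[] args) {
        JsonNode leaf = new JsonNode("c", 0, List.of());
        JsonNode parent = new JsonNode("a", 1, List.of(leaf));
        // Accessors, equals and toString come for free, with no annotation processor.
        System.out.println(parent.children().get(0).key());
        System.out.println(leaf.equals(new JsonNode("c", 0, List.of())));
        System.out.println(parent);
    }
}
```

Records need Java 16 or newer and their fields are final, so this only fits if the node does not have to be mutated after construction.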

heykirby reacted with thumbs up emoji
@heykirby
Author

heykirby commented Oct 22, 2024

> @heykirby I just want to share some thoughts/questions:
>
> With some minor API changes in simdjson-java, could we keep the SimdJsonParserWithFixPath in another codebase, or could it live in a contribution module, since it is tailored for a very specific use case?
>
> Isn't a record JsonNode sufficient compared to using lombok?

@arouel Thanks arouel, the unused imports have been removed.

arouel reacted with thumbs up emoji

@heykirby
Author

> How is this different from On-Demand parsing available in the C++ simdjson version?
>
> I introduced a form of on-demand parsing in #51 (see: org.simdjson.OnDemandJsonIterator). The API requires specifying a target class to which the JSON will be parsed. However, it should be relatively easy to extend this to support a DOM-like API (JsonValue, JsonIterator, etc.), which I believe is more intuitive than introducing syntax for accessing fields and then returning an array of strings with the corresponding values.

Hello piotrrzysko, I have used on-demand parsing. It is very convenient and efficient for deserializing JSON strings into Java classes, and it is also a solution provided by many mainstream JSON SDKs.
However, this solution requires defining a Java class before a field can be parsed, which is not very convenient for users, especially for deep paths. For example, to get the field at $.a.b.c.d, we first need to define class a { class b { class c { class d } } } and only then can we parse the value. In addition, every time a JSON string is parsed, a class instance has to be created for each node, which may affect performance with large-scale data.
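
The boilerplate described above can be sketched as follows. This is only an illustration of the type nesting needed for $.a.b.c.d: the record names are made up, and the object graph is built by hand here instead of being produced by a parser. With a schema-based API that takes a target class, Root would be that target class.

```java
// One type per level of the path $.a.b.c.d (all names are illustrative).
record C(String d) {}
record B(C c) {}
record A(B b) {}
record Root(A a) {}

final class DeepPathDemo {
    /** Walks Root -> A -> B -> C to read the value at $.a.b.c.d. */
    static String readD(Root root) {
        return root.a().b().c().d();
    }

    public static void main(String[] args) {
        // Built by hand for the sketch; a schema-based parser would have to
        // allocate the same four objects for every parsed document.
        Root root = new Root(new A(new B(new C("value"))));
        System.out.println(readD(root));
    }
}
```

Four types and four allocations per document are needed to reach a single leaf, which is the overhead the comment is pointing at.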

With SimdJsonParserWithFixPath, if we want to get the values for multiple paths, e.g. [$.a.c, $.a, $.a.d, $.b], we only need to provide the JSON paths. The usage is similar to Hive's user-defined function json_tuple. It also supports obtaining the values of a container object's children while at the same time obtaining the compressed string value of the container object itself.
The path tree is created only once, during initialization, and the result array can be reused each time a JSON string is parsed. In scenarios with large amounts of data, this avoids repeatedly creating and destroying class instances, which gives some performance advantage.
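
The idea of a path tree that is built once plus a reused result array can be sketched like this. It is not the PR's implementation: the class name FixPathSketch is hypothetical, and for brevity the sketch walks an already materialized Map instead of the raw JSON bytes. It also only handles objects, strings, numbers and booleans, and does no string escaping.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

final class FixPathSketch {
    /** One node per path segment; resultIndex >= 0 marks a requested path. */
    private static final class PathNode {
        final Map<String, PathNode> children = new LinkedHashMap<>();
        int resultIndex = -1;
    }

    private final PathNode root = new PathNode();
    private final String[] results; // allocated once, reused by every extract() call

    FixPathSketch(String... paths) {
        results = new String[paths.length];
        for (int i = 0; i < paths.length; i++) {
            String[] segments = paths[i].split("\\."); // "$.a.c" -> "$", "a", "c"
            PathNode node = root;
            for (int s = 1; s < segments.length; s++) { // skip the leading "$"
                node = node.children.computeIfAbsent(segments[s], k -> new PathNode());
            }
            node.resultIndex = i;
        }
    }

    /** Fills and returns the shared array; copy any values you need to keep. */
    String[] extract(Map<?, ?> document) {
        Arrays.fill(results, null);
        walk(root, document);
        return results;
    }

    private void walk(PathNode node, Map<?, ?> object) {
        // Only keys present in the path tree are visited; everything else is skipped.
        for (Map.Entry<String, PathNode> entry : node.children.entrySet()) {
            Object value = object.get(entry.getKey());
            if (value == null) continue; // path not present in this document
            PathNode child = entry.getValue();
            if (child.resultIndex >= 0) {
                results[child.resultIndex] = value instanceof String ? (String) value : toJson(value);
            }
            if (!child.children.isEmpty() && value instanceof Map) {
                walk(child, (Map<?, ?>) value);
            }
        }
    }

    /** Compact JSON text for a container or scalar (no escaping: sketch only). */
    private static String toJson(Object value) {
        if (value instanceof String) return "\"" + value + "\"";
        if (value instanceof Map) {
            StringBuilder sb = new StringBuilder("{");
            for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
                if (sb.length() > 1) sb.append(',');
                sb.append('"').append(e.getKey()).append("\":").append(toJson(e.getValue()));
            }
            return sb.append('}').toString();
        }
        return String.valueOf(value); // numbers and booleans
    }
}
```

For the paths [$.a.c, $.a, $.a.d, $.b], the slot for $.a holds the whole container as a compact string while the slots for $.a.c and $.a.d hold its children, and every call returns the same array instance.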

@piotrrzysko
Member

Hi, sorry for the delayed reply.

@heykirby What I meant was that we can introduce on-demand parsing for a DOM-like API, which would significantly reduce the need for creating new objects. In fact, we could have a single instance of something like OnDemandJsonValue, which would be mutable and traverse a parsed JSON under the hood (likely leveraging org.simdjson.OnDemandJsonIterator).

The schema-based API you’re referring to is simply using logic that could potentially be utilized by the on-demand DOM API as well.
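
A minimal sketch of the "single mutable instance" idea, under heavy assumptions: OnDemandJsonValue is only proposed in this thread, so the Value class below is a hypothetical stand-in, and the hand-written scan handles nothing beyond a flat array of non-negative integers without whitespace. It is meant to show the flyweight pattern, not simdjson-java's API.

```java
import java.nio.charset.StandardCharsets;

final class OnDemandValueSketch {
    /** Hypothetical flyweight: a mutable view over a slice of the input buffer. */
    static final class Value {
        private byte[] buffer;
        private int start;
        private int end; // exclusive

        /** Re-points this instance at another slice instead of allocating a new one. */
        Value reset(byte[] buffer, int start, int end) {
            this.buffer = buffer;
            this.start = start;
            this.end = end;
            return this;
        }

        /** Decodes the current slice as a non-negative decimal integer. */
        long asLong() {
            long result = 0;
            for (int i = start; i < end; i++) {
                result = result * 10 + (buffer[i] - '0');
            }
            return result;
        }
    }

    /** Sums a flat array such as [10,20,30] using one Value instance. */
    static long sum(byte[] json) {
        Value value = new Value(); // the only allocation, whatever the array length
        long total = 0;
        int i = 1; // skip '['
        while (i < json.length - 1) {
            int begin = i;
            while (json[i] != ',' && json[i] != ']') i++;
            total += value.reset(json, begin, i).asLong();
            i++; // skip ',' or ']'
        }
        return total;
    }

    public static void main(String[] args) {
        byte[] json = "[10,20,30]".getBytes(StandardCharsets.UTF_8);
        System.out.println(sum(json));
    }
}
```

The trade-off of such a design is that a returned value is only valid until the cursor moves, so callers have to copy anything they want to keep.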

@arouel

> Can you guide us a bit, so that we can prepare a PR?

I’d be happy to help. Perhaps I could start by creating a skeleton of the on-demand DOM API.

heykirby reacted with thumbs up emoji

@heykirby
Author

> Hi, sorry for the delayed reply.
>
> @heykirby What I meant was that we can introduce on-demand parsing for a DOM-like API, which would significantly reduce the need for creating new objects. In fact, we could have a single instance of something like OnDemandJsonValue, which would be mutable and traverse a parsed JSON under the hood (likely leveraging org.simdjson.OnDemandJsonIterator).
>
> The schema-based API you’re referring to is simply using logic that could potentially be utilized by the on-demand DOM API as well.
>
> @arouel
>
> > Can you guide us a bit, so that we can prepare a PR?
>
> I’d be happy to help. Perhaps I could start by creating a skeleton of the on-demand DOM API.

Thanks, piotrrzysko. This is a feature we have been looking forward to for a long time.

@heykirby
Author

@piotrrzysko I submitted a new PR, could you give me some guidance? #63

