python/cpython


Explore SIMD-accelerated parsing architecture for the standard json module #142915

Open

Labels

type-feature: A feature request or enhancement

Description

ErwinCell opened on Dec 18, 2025

Feature or enhancement

Proposal:

Motivation

The standard library json module is widely used and highly stable, but its core parsing architecture is still fundamentally scalar and recursive-descent based. On modern CPUs (especially x86_64 and increasingly ARM64), JSON parsing is often dominated by:

- UTF-8 validation
- structural character detection ({ } [ ] , :)
- whitespace skipping

These stages are well known to be amenable to SIMD acceleration.
Projects such as simdjson demonstrate that a two-stage parsing pipeline (a structural scan followed by semantic parsing), driven by SIMD instructions, can deliver several-fold speedups while remaining fully compliant with RFC 8259.
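
To make the three hot spots listed above concrete, here is a minimal Python sketch of what a block-oriented scan computes. It is a scalar stand-in only: a real SIMD backend would classify all 64 bytes of a block in a handful of vector instructions. The names `classify_block` and `is_valid_utf8` are hypothetical and not part of the json module, and the sketch deliberately ignores whether a byte sits inside a string.

```python
STRUCTURAL = frozenset(b"{}[],:")   # byte values of the JSON structural characters
WHITESPACE = frozenset(b" \t\n\r")  # the four whitespace bytes allowed by RFC 8259


def is_valid_utf8(data: bytes) -> bool:
    """Scalar UTF-8 validation; a SIMD backend would check whole blocks at once."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def classify_block(block: bytes) -> tuple[int, int]:
    """Return (structural_mask, whitespace_mask) for a block of up to 64 bytes.

    Bit i of a mask is set when block[i] belongs to that class. This loop
    does not track string context, so quoted structural characters are
    flagged too; a real scanner masks them out in a later step.
    """
    if len(block) > 64:
        raise ValueError("block must be at most 64 bytes")
    structural = whitespace = 0
    for i, byte in enumerate(block):
        if byte in STRUCTURAL:
            structural |= 1 << i
        elif byte in WHITESPACE:
            whitespace |= 1 << i
    return structural, whitespace


structural, whitespace = classify_block(b'{"a": 1}')
print(bin(structural), bin(whitespace))
```

Representing each class as a bitmask is what lets later steps find the next structural character with bit operations instead of a byte-by-byte loop.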

Given the growing importance of JSON in performance-sensitive workloads (ML pipelines, telemetry, configuration at scale), I would like to propose a discussion on whether CPython’s json module could adopt a simdjson-inspired architecture, at least optionally.

Scope of the proposal

This issue is not a request to immediately replace the existing implementation. Instead, I would like to explore:

Feasibility

- Whether a SIMD-based parsing backend could coexist with the current implementation.
- Whether this would fit CPython’s portability and maintenance constraints.

Architecture

A staged parsing model similar to simdjson:
- Stage 1: SIMD structural scan (identify string boundaries, braces, commas, etc.)
- Stage 2: scalar semantic parsing using the structural index
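
The staged model can be sketched in pure Python to show how the two stages divide the work. This is an illustration under stated assumptions, not simdjson's algorithm or a proposed implementation: stage 1 here is a scalar loop where a real backend would use SIMD, the function names (`structural_index`, `parse`) are hypothetical, and leaf tokens are delegated to the existing `json.loads` to keep the example short.

```python
import json

STRUCTURAL = "{}[],:"
WHITESPACE = " \t\n\r"


def structural_index(text: str) -> list[int]:
    """Stage 1: offsets where each token starts, ignoring string contents."""
    index = []
    in_string = escaped = False
    at_boundary = True  # previous character was whitespace or structural
    for i, ch in enumerate(text):
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch in STRUCTURAL:
            index.append(i)
            at_boundary = True
        elif ch in WHITESPACE:
            at_boundary = True
        else:
            if ch == '"':
                index.append(i)
                in_string = True
            elif at_boundary:
                index.append(i)  # first character of a number or literal
            at_boundary = False
    return index


def _peek(text, index, pos):
    """Character at token `pos`, or '' once the sentinel is reached."""
    return text[index[pos]] if pos < len(index) - 1 else ""


def _parse_value(text, index, pos):
    ch = _peek(text, index, pos)
    if ch == "" or ch in "}],:":
        raise ValueError(f"unexpected token {ch!r}")
    if ch == "[":
        return _parse_container(text, index, pos + 1, "]")
    if ch == "{":
        return _parse_container(text, index, pos + 1, "}")
    # Leaf token: spans up to the next token start (assumption: reuse json.loads).
    return json.loads(text[index[pos]:index[pos + 1]]), pos + 1


def _parse_container(text, index, pos, closer):
    is_object = closer == "}"
    result = {} if is_object else []
    if _peek(text, index, pos) == closer:
        return result, pos + 1
    while True:
        if is_object:
            if _peek(text, index, pos) != '"':
                raise ValueError("expected string key")
            key, pos = _parse_value(text, index, pos)
            if _peek(text, index, pos) != ":":
                raise ValueError("expected ':'")
            result[key], pos = _parse_value(text, index, pos + 1)
        else:
            value, pos = _parse_value(text, index, pos)
            result.append(value)
        separator = _peek(text, index, pos)
        pos += 1
        if separator == closer:
            return result, pos
        if separator != ",":
            raise ValueError("expected ',' or closing bracket")


def parse(text: str):
    """Stage 2: walk the structural index and build Python objects."""
    index = structural_index(text)
    index.append(len(text))  # sentinel marking the end of the last token
    value, pos = _parse_value(text, index, 0)
    if pos != len(index) - 1:
        raise ValueError("trailing data")
    return value


doc = '{"a": [1, 2, {"b": "x, y"}], "c": null}'
assert parse(doc) == json.loads(doc)
```

Stage 2 never rescans whitespace or string contents to find token boundaries; it only jumps between offsets recorded by stage 1, which is the property that makes the first stage worth vectorizing.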

Integration options

- Optional backend selected at build time or at runtime
- Fallback to the existing implementation when SIMD is unavailable
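
One way the runtime variant of these options could look is sketched below. The module name `_json_simd` is purely hypothetical (no such module exists in CPython today), and `select_loads` is an illustrative helper, not an existing API. The pattern mirrors how `json.decoder` already imports its C accelerator from `_json` and falls back to pure Python when that import fails.

```python
import importlib
import json


def select_loads(backend_name: str = "_json_simd"):
    """Return a loads() callable, preferring an optional accelerated backend.

    `backend_name` is a hypothetical extension module. If it cannot be
    imported (SIMD unsupported, or not built), the existing json.loads is used.
    """
    try:
        backend = importlib.import_module(backend_name)
    except ImportError:
        return json.loads
    # Assumption: the backend exposes a loads() compatible with json.loads.
    return getattr(backend, "loads", json.loads)


loads = select_loads()
print(loads('{"simd": false}'))
```

Callers see a single `loads` entry point either way, so the choice of backend stays an implementation detail.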

Has this already been discussed elsewhere?

No response given

Links to previous discussion of this feature:

No response

