Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

gh-132983: Introduce_zstd bindings module#133027

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged
gpshead merged 59 commits intopython:mainfromemmatyping:3.14-zstd-c-code
May 4, 2025

Conversation

@emmatyping
Copy link
Member

@emmatypingemmatyping commentedApr 26, 2025
edited
Loading

This is part 2 (but parallel to part 1) for implementing PEP 784. This is probably the meat of the implementation making up ~half of the LOC in the overall diff.

This change:

  • Adds the pyzstd license, since this is the first code introduced based on pyzstd
  • Adds the code for the_zstd module inModules/_zstd. Tests will be included when the Pythoncompression.zstd module is added. If people think the tests should be included, I can merge thecompression.zstd changes into the PR, but I wanted to separate them to make the PRs a bit more manageable.
  • Adds unix build changes (Windows requires changes in cpython-source-deps and will be added in a follow-up PR)

I added skip news as I'd like to write a holistic NEWS/What's New entry once the entire implementation has landed. If people think each PR should have NEWS I can write something up.

TODO before merge:

  • try running tests on top of this branch
  • format PEP 7 style
  • run buildbots on this PR
  • verify_zstd gets installed

📚 Documentation preview 📚:https://cpython-previews--133027.org.readthedocs.build/

stonebig reacted with heart emoji
@emmatypingemmatyping changed the titlegh-132983: Introduce_zstd modulegh-132983: Introduce_zstd bindings moduleApr 26, 2025
@emmatyping
Copy link
MemberAuthor

emmatyping commentedApr 26, 2025
edited
Loading

Hm, I'll dig into the C global failure. Didn't show up when the tests ran on the branch :(

E: now fixed

Copy link
Member

@ZeroIntensityZeroIntensity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Nice! Some comments at a glance.

Please remove the module state macros. They make this really difficult to review, and don't really add much (isMS_MEMBER(whatever)really that much easier to type thanstate->whatever?)

@emmatyping
Copy link
MemberAuthor

Thank you for the review@ZeroIntensity! Some of the decisions are just how things were inpyzstd, so I expect they very well should change for merging the code in. I'll go through your review and read up onob_lock.

gpshead reacted with heart emoji

@ZeroIntensity
Copy link
Member

There's probably not too much to read about it other than somewhere inPEP-703. You'll just want to replace the locks withcritical sections, which have some magic to prevent deadlocks.

emmatyping reacted with thumbs up emoji

Copy link
Member

@picnixzpicnixz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Few preliminary comments, although I'm not sure it can be modified under the given license.

Copy link
Member

@picnixzpicnixz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Some other comments I managed to find. I didn't review the decompressor but I think most of my comments that I said for the compressor would apply. The most important ones are essentially function signatures where I want you to usePyObject *self and do a manual cast to the expected type inside the function:

#defineMyType_CAST(op)((MyType *)(op))staticvoidmy_method(PyObject*op) {MyType*self=MyType_CAST(op);    ...}

The use of a macro (or a static inline function, up to you) is to allow future runtime type checks.

emmatypingand others added11 commitsApril 28, 2025 17:52
This commit introduces the `_zstd` module, with bindings to libzstd fromthe pyzstd project. It also includes the unix build system configuration.Windows build system support will be integrated independently as itdepends on integration with cpython-source-deps.
Also removes module state references from the classes in the _zstdmodule and instead uses PyType_GetModuleState()
Co-authored-by: Stan Ulbrych <89152624+StanFromIreland@users.noreply.github.com>
This should avoid races and deadlocks.
The `compress`/`decompress` functions will be moved to Python code for simplicity.C implementations can always be re-added in the future.Also, mark _zstd as not requiring the GIL.
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by@emmatyping for commit30b3934 🤖

Results will be shown at:

https://buildbot.python.org/all/#/grid?branch=refs%2Fpull%2F133027%2Fmerge

If you want to schedule another build, you need to add the🔨 test-with-buildbots label again.

@bedevere-botbedevere-bot removed the 🔨 test-with-buildbotsTest PR w/ buildbots; report in status section labelMay 3, 2025
@emmatyping
Copy link
MemberAuthor

Okay, latest commit should actually work correctly on RHEL 8, I tested it in a container and verified_zstd gets disabled.

@emmatypingemmatyping added the 🔨 test-with-buildbotsTest PR w/ buildbots; report in status section labelMay 3, 2025
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by@emmatyping for commit10e2e80 🤖

Results will be shown at:

https://buildbot.python.org/all/#/grid?branch=refs%2Fpull%2F133027%2Fmerge

If you want to schedule another build, you need to add the🔨 test-with-buildbots label again.

@bedevere-botbedevere-bot removed the 🔨 test-with-buildbotsTest PR w/ buildbots; report in status section labelMay 3, 2025
@emmatyping
Copy link
MemberAuthor

Okay, configure should now work across all platforms, that was a bit more involved than I was expecting. I wish the configure feedback loop wasn't so long :(

@jasonzitao

This comment was marked as abuse.

@emmatyping
Copy link
MemberAuthor

The RHEL8 s390x LTO+PGO failure is unrelated, should be#133261 (comment) (other RHEL 8 s390x may fail too for the same reason?)

But the changes have worked on RHEL8:https://buildbot.python.org/#/builders/420/builds/1735

checking for stdlib extension module _zstd... disabled

gpshead reacted with thumbs up emoji

@hugovk
Copy link
Member

Okay, configure should now work across all platforms, that was a bit more involved than I was expecting. I wish the configure feedback loop wasn't so long :(

I normally pass--config-cache to configure which makes it a lot quicker, 3s cached vs. 39s without.

@emmatyping
Copy link
MemberAuthor

I normally pass --config-cache to configure which makes it a lot quicker, 3s cached vs. 39s without.

TIL! Thanks that is really handy.

hugovk reacted with thumbs up emoji

@gpsheadgpsheadenabled auto-merge (squash)May 4, 2025 01:24
@gpsheadgpshead merged commit3b43335 intopython:mainMay 4, 2025
42 checks passed
@emmatypingemmatyping deleted the 3.14-zstd-c-code branchMay 4, 2025 01:36
@ZeroIntensity
Copy link
Member

Yay! Congrats for getting this accepted and into 3.14 on time@emmatyping

@emmatyping
Copy link
MemberAuthor

Thank you, as well as everyone who reviewed! The feedback was incredibly valuable. Now on to merging the Python code:#133365
:)

stonebig, hugovk, ZeroIntensity, and chris-eibl reacted with hooray emojistonebig, hugovk, and ZeroIntensity reacted with heart emoji

diegorusso added a commit to diegorusso/cpython that referenced this pull requestMay 4, 2025
* origin/main: (111 commits)pythongh-91048: Add filename and line number to external inspection routines (pythonGH-133385)pythongh-131178: Add tests for `ast` command-line interface (python#133329)  Regenerate pcbuild.sln in Visual Studio 2022 (python#133394)pythongh-133042: disable HACL* HMAC on Emscripten (python#133064)pythongh-133351: Fix remote PDB's multi-line block tab completion (python#133387)pythongh-109700: Improve stress tests for interpreter creation (pythonGH-109946)pythongh-81793: Skip tests for os.link() to symlink on Android (pythonGH-133388)pythongh-126835: Rename `ast_opt.c` to `ast_preprocess.c` and related stuff after moving const folding to the peephole optimizier (python#131830)pythongh-91048: Relax test_async_global_awaited_by to fix flakyness (python#133368)pythongh-132457: make staticmethod and classmethod generic (python#132460)pythongh-132805: annotationlib: Fix handling of non-constant values in FORWARDREF (python#132812)pythongh-132426: Add get_annotate_from_class_namespace replacing get_annotate_function (python#132490)pythongh-81793: Always call linkat() from os.link(), if available (pythonGH-132517)pythongh-122559: Synchronize C and Python implementation of the io module about pickling (pythonGH-122628)pythongh-69605: Add PyREPL import autocomplete feature to 'What's New' (python#133358)  bpo-44172: Keep reference to original window in curses subwindow objects (pythonGH-26226)pythonGH-133231: Changes to executor management to support proposed `sys._jit` module (pythonGH-133287)pythongh-133363: Fix Cmd completion for lines beginning with `! ` (python#133364)pythongh-132983: Introduce `_zstd` bindings module (pythonGH-133027)pythonGH-91048: Add utils for printing the call stack for asyncio tasks (python#133284)  ...
Copy link
Member

@ned-deilyned-deily left a comment
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

@emmatyping,@gpshead,@hugovk Some showstopper problems with trying to build_zstd.


have_libzstd=no
AC_DEFUN([TEST_ZSTD_VERSION],[
AC_MSG_CHECKING([if libzstd is new enough])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

There seem to be two problems with thenew enough check:

  1. There should beAC_MSG_RESULT([yes]) andAC_MSG_RESULT([no]) calls to complete the messages:
checking for libzstd... yeschecking if libzstd is new enough... checking for hstrerror... yes

have_libzstd=yes
])

PKG_CHECK_MODULES([LIBZSTD],[libzstd],[TEST_ZSTD_VERSION()],[
Copy link
Member

@ned-deilyned-deilyMay 5, 2025
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

and 2. There's something funky going on here with the two calls toTEST_ZSTD_VERSION() such that version number test fails and the failure is sort of hidden inconfig.log:

configure:22400: checking for libzstdconfigure:22407: $PKG_CONFIG --exists --print-errors "libzstd"configure:22410: $? = 0configure:22424: $PKG_CONFIG --exists --print-errors "libzstd"configure:22427: $? = 0configure:22727: result: yesconfigure:22730: checking if libzstd is new enoughconfigure:22756: gcc -o conftest    conftest.c -ldl  >&5conftest.c:325:10: fatal error: 'zstd.h' file not found  325 | #include "zstd.h"      |          ^~~~~~~~1 error generated.configure:22756: $? = 1configure: program exited with status 1

This later results in:
checking for stdlib extension module _zstd... disabled

I suspect the problem is the call toTEST_ZSTD_VERSION() on line 5437 fails because at that pointCPPFLAGS is no longer set to include$LIBZSTD_CFLAGS and the test compile would succeed only if the header files are installed in a default system location, like/usr/include, which won't be the case, for example, on builds on macOS.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Postscript: a likely confirmation of my suspicion above. For our CI runs on macOS, we use Homebrew to supply third-party libs (likelibzstd) not supplied by Apple. If you look at CI logs for current PRs (i.e. after this PR was merged),_zstd is disabled (as above) on theghcr.io/cirruslabs/macos-runner:sonomarunner but succeeds on themacos-13 runner. The difference is that the former is Apple Silicon (arm64) hardware where Homebrew installs to a non-standard/opt/homebrew prefix, whereas themacos-13 runner is Intel hardware where Homebrew continues to install to the legacy/usr/local prefix which, no doubt, is included in the compiler's default search path.

@emmatyping
Copy link
MemberAuthor

To follow up on autoconf issues Ned raised, those were addressed in#133479

hugovk reacted with thumbs up emojistonebig reacted with heart emoji

zanieb added a commit to astral-sh/python-build-standalone that referenced this pull requestJun 4, 2025
Pranjal095 pushed a commit to Pranjal095/cpython that referenced this pull requestJul 12, 2025
* Add _zstd module forhttps://peps.python.org/pep-0784/This commit introduces the `_zstd` module, with bindings to libzstd fromthe pyzstd project. It also includes the unix build system configuration.Windows build system support will be integrated independently as itdepends on integration with cpython-source-deps.* Add _zstd to modules* Fix path for compression.zstd module* Ignore _zstd module like _io* Expand module state macros to improve code qualityAlso removes module state references from the classes in the _zstdmodule and instead uses PyType_GetModuleState()* Remove backticks suggested in reviewCo-authored-by: Stan Ulbrych <89152624+StanFromIreland@users.noreply.github.com>* Use critical sections to lock object stateThis should avoid races and deadlocks.* Remove compress/decompress and mark module as not reliant on the GILThe `compress`/`decompress` functions will be moved to Python code for simplicity.C implementations can always be re-added in the future.Also, mark _zstd as not requiring the GIL.* Lift critical section to avoid clang warning* Respond to comments by picnixz* Call out pyzstd explicitly in license descriptionCo-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>* Use a much more robust implementation...... for `get_zstd_state_from_type`Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>* Use PyList_GetItemRef for thread safety purposes* Use a macro for the minimum supported version* remove const from primivite types* Use PyMem_New in another spot* Simplify error handling in _get_frame_size* Another simplification of error handling in get_frame_info* Rename _module_state to mod_state* Rewrite comment explaining the context of the code* Add link to pyzstd* Add TODO about refactoring dict training code* Use PyModule_AddObjectRef over PyModule_AddObjectPyModule_AddObject is soft-deprecated, so we should use PyModule_AddObjectRef* Check result of OutputBufferGrow* Simplify return logic in `add_constant_to_type`Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>* Ignore return value of _zstd_clear()Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>* Remove redundant comments* Remove __reduce__ from ZstdDictWe should instead document that to pickle a dictionary a user should usethe `.dict_content` attribute.* Use PyUnicode_FromFormat instead of a buffer* Don't use C constants/types in error messages* Make error messages easier to understand for Python users* Lower minimum required version 1.4.0* Use casts and make slot function signatures correct* Be consistent with CPython on const usage* Make else clauses in line with PEP 7* Fix over-indented blocks in argument clinic* Add critical section around ZSTD_DCtx_setParameter* Add a TODO about refactoring critical sections* Use Py_UNREACHABLE* Move bytes operations out of Py_BEGIN_ALLOW_THREADS* Add TODO about ensuring a lock is held* Remove asserts that may not be correct* Add TODO to make ZstdDict and others GC objects* Make objects GC tracked* Remove unused include* Fix some memory issues* Fix refleaks on module and in ZstdDict* Update configure to check for ZDICT_finalizeDictionary* Properly check version in configure* exit(1) if check fails* Use AC_RUN_IFELSE* Use a define() to re-use version check* Actually properly set _zstd module status based on version---------Co-authored-by: Stan Ulbrych <89152624+StanFromIreland@users.noreply.github.com>Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment

Reviewers

@ned-deilyned-deilyned-deily left review comments

@AA-TurnerAA-TurnerAA-Turner left review comments

@picnixzpicnixzpicnixz left review comments

@StanFromIrelandStanFromIrelandStanFromIreland left review comments

@gpsheadgpsheadgpshead approved these changes

@ZeroIntensityZeroIntensityZeroIntensity approved these changes

@erlend-aaslanderlend-aaslandAwaiting requested review from erlend-aaslanderlend-aasland is a code owner

@corona10corona10Awaiting requested review from corona10corona10 is a code owner

@ericsnowcurrentlyericsnowcurrentlyAwaiting requested review from ericsnowcurrentlyericsnowcurrently is a code owner

Assignees

No one assigned

Labels

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

10 participants

@emmatyping@ZeroIntensity@bedevere-bot@gpshead@jasonzitao@hugovk@ned-deily@AA-Turner@picnixz@StanFromIreland

[8]ページ先頭

©2009-2025 Movatter.jp