Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

gh-132983: Convert dict_content to take Py_buffer#133924

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Open
AA-Turner wants to merge1 commit intopython:main
base:main
Choose a base branch
Loading
fromAA-Turner:zstd-dict-content-buffer

Conversation

AA-Turner
Copy link
Member

@AA-TurnerAA-Turner commentedMay 12, 2025
edited by bedevere-appbot
Loading

ZstdDict'sdict_content is just binary data, we don't need to allocate an entire PyBytesObject for it. This also lets us use thePy_buffer argument clinic converter.

We replace thedict_content member ofZstdDict withdict_buffer anddict_len.ZstdDict.dict_content is still exposed as a bytes object to Python via a new getter.

A

cc@Rogdham

Comment on lines +68 to +73
self->dict_buffer = PyMem_RawMalloc(dict_content->len);
if (!self->dict_buffer) {
return PyErr_NoMemory();
}
memcpy(self->dict_buffer, dict_content->buf, dict_content->len);
self->dict_len = dict_content->len;
Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

cc@vstinner, I thinkRawMalloc is correct here but would be grateful for a review / confirmation please

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

It seems like you told the GIL in this code, so you can use PyMem_Malloc() instead. The Raw variant is when you don't hold the GIL.

@serhiy-storchakaserhiy-storchaka self-requested a reviewMay 12, 2025 15:01
@@ -60,24 +59,28 @@ _zstd_ZstdDict_new_impl(PyTypeObject *type, PyObject *dict_content,
}

/* Check dict_content's type */
self->dict_content = PyBytes_FromObject(dict_content);
if (self->dict_content == NULL) {
if (dict_content == NULL) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

It is never NULL.

BTW, overriding arbitrary exception with a TypeError in the current code is wrong.

if (Py_SIZE(self->dict_content) < 8) {
self->dict_buffer = PyMem_RawMalloc(dict_content->len);
if (!self->dict_buffer) {
return PyErr_NoMemory();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

self is leaked here.

emmatyping reacted with thumbs up emoji
@@ -124,23 +124,33 @@ PyDoc_STRVAR(ZstdDict_dictid_doc,
"The special value '0' means a 'raw content' dictionary,"
"without any restrictions on format or content.");

PyDoc_STRVAR(ZstdDict_dictcontent_doc,
"The content of a Zstandard dictionary, as a bytes object.");

static PyObject *
ZstdDict_str(PyObject *ob)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

It is wrong to implement__str__.__repr__ should be implemented.

static PyObject *
ZstdDict_str(PyObject *ob)
{
ZstdDict *dict = ZstdDict_CAST(ob);
return PyUnicode_FromFormat("<ZstdDict dict_id=%u dict_size=%zd>",
dict->dict_id,Py_SIZE(dict->dict_content));
dict->dict_id, dict->dict_len);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Suggested change
dict->dict_id,dict->dict_len);
(unsignedint)dict->dict_id,dict->dict_len);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

dict_id is already a uint32_t, is there a reason for adding this cast?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

On every supported platform int is 32-bit, but it is not harm to be pedantic. This just goes to show that we didn't miss it.

emmatyping reacted with thumbs up emoji
self->dict_len = dict_content->len;

/* Both ordinary and "raw content" dictionaries must be 8 bytes minimum */
if (self->dict_len < 8) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

This can be checked at the beginning, before any allocations.

}

self->dict_content = NULL;
self->d_dict = NULL;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Suggested change
self->d_dict=NULL;
self->d_dict=NULL;
self->dict_buffer=NULL;

Otherwiseself->dict_buffer is not initialized in the destructor.

Comment on lines +67 to 71
char *dict_buffer = self->dict_buffer;
Py_ssize_t dict_len = self->dict_len;
Py_BEGIN_ALLOW_THREADS
self->d_dict = ZSTD_createDDict(dict_buffer,
dict_len);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Just inline them, as in the code below.

Suggested change
char*dict_buffer=self->dict_buffer;
Py_ssize_tdict_len=self->dict_len;
Py_BEGIN_ALLOW_THREADS
self->d_dict=ZSTD_createDDict(dict_buffer,
dict_len);
Py_BEGIN_ALLOW_THREADS
self->d_dict=ZSTD_createDDict(self->dict_buffer,self->dict_len);

Comment on lines +173 to 177
char *dict_buffer = self->dict_buffer;
Py_ssize_t dict_len = self->dict_len;
Py_BEGIN_ALLOW_THREADS
cdict = ZSTD_createCDict(dict_buffer,
dict_len,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Suggested change
char*dict_buffer=self->dict_buffer;
Py_ssize_tdict_len=self->dict_len;
Py_BEGIN_ALLOW_THREADS
cdict=ZSTD_createCDict(dict_buffer,
dict_len,
Py_BEGIN_ALLOW_THREADS
cdict=ZSTD_createCDict(self->dict_buffer,self->dict_len,

emmatyping reacted with thumbs up emoji
@emmatyping
Copy link
Member

It would probably be good to run the refleaks buildbots on this PR before merge.

@emmatyping
Copy link
Member

@AA-Turner would you like to pick this up again? I'm happy to take this over if you don't have time. Want to get the last of the zstd items done before b2.

Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers

@vstinnervstinnervstinner left review comments

@serhiy-storchakaserhiy-storchakaserhiy-storchaka left review comments

@emmatypingemmatypingAwaiting requested review from emmatyping

Assignees
No one assigned
Projects
None yet
Milestone
No milestone
Development

Successfully merging this pull request may close these issues.

4 participants
@AA-Turner@emmatyping@vstinner@serhiy-storchaka

[8]ページ先頭

©2009-2025 Movatter.jp