DOC: Clarify (potentially misleading) nbytes docstring #28943


Open
zvun wants to merge 5 commits into numpy:main from zvun:nbytes-doc-clarification

Conversation

@zvun

The documentation for numpy.ndarray.nbytes has the potentially misleading description that it's the "total bytes consumed by the elements of the array". The nbytes of a view, however, doesn't reflect the memory consumption of its elements, but rather what that consumption would have been if the view were a copy. This has been mentioned before in #22925, but the issue was closed before this was clarified. I have included an additional example in the docstring that demonstrates this.
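The point in the description can be illustrated with a minimal sketch. The array names and shape below are illustrative assumptions, not the example added in the PR. A transposed view reports the same nbytes as its base even though no new buffer was allocated:

```python
import numpy as np

base = np.zeros((100, 100), dtype=np.float64)  # owns a 100 * 100 * 8 byte buffer
view = base.T                                  # transposed view, no data is copied

print(base.nbytes)                   # 80000
print(view.nbytes)                   # 80000, although nothing new was allocated
print(view.base is base)             # True: the view borrows base's buffer
print(np.shares_memory(base, view))  # True

# Summing nbytes over both names counts the single shared buffer twice.
naive_total = base.nbytes + view.nbytes
print(naive_total)                   # 160000
```

Adding up nbytes across arrays can therefore overstate real memory use whenever some of them are views of the same buffer.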

@seberg (Member) left a comment

Thanks, adding a note seems good, but this is much too complicated for the little extra information it conveys.


Notes
-----
If the array is a view, this shows how much memory it *would* use
if it were copied into a separate array.
Does not include memory consumed by non-element attributes of the
array object.
Member

Maybe we can add that it also doesn't include memory indirectly held by the elements
(i.e. if you store Python objects or use the new StringDType).

>>> arr_1.nbytes
800000
>>> arr_2.nbytes
2400
Member

This introduces way too much complexity for very little gain. If anything at all, just do some slicing like arr[::2] or so.
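A slicing-based example along the lines suggested here might look like the following sketch. The array size and names are illustrative assumptions, not the text that ended up in the docstring:

```python
import numpy as np

arr = np.arange(1000, dtype=np.int64)  # 1000 elements * 8 bytes
sliced = arr[::2]                      # view of every other element

print(arr.nbytes)            # 8000
print(sliced.nbytes)         # 4000 = 500 elements * 8 bytes
print(sliced.base is arr)    # True: no new data buffer was created
print(sliced.flags.owndata)  # False
print(sliced.copy().nbytes)  # 4000: what an independent copy would occupy
```

The view reports 4000 bytes, yet it allocates no element storage of its own and keeps the full 8000-byte buffer of `arr` alive for as long as it exists.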

@@ -2698,6 +2701,17 @@
>>> np.prod(x.shape) * x.itemsize
480

>>> import numpy as np
Member

No need for this, I think, but a sentence explaining why the next example follows would help.

@ngoldbaum
Member

Another wrinkle is that it's only the memory used by the array buffer. For object arrays or StringDType arrays, it's an underestimate.
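As a rough illustration of the underestimate for object arrays (a sketch with made-up data; `sys.getsizeof` is only an approximation of what each Python object occupies), the array buffer holds just one pointer per element:

```python
import sys
import numpy as np

# Ten distinct Python strings of 500 characters each (illustrative data).
words = [str(i) * 500 for i in range(10)]
obj_arr = np.array(words, dtype=object)

buffer_bytes = obj_arr.nbytes  # size * itemsize, i.e. pointers only
# Approximate memory of the string objects the pointers refer to.
referenced_bytes = sum(sys.getsizeof(s) for s in obj_arr)

print(obj_arr.itemsize)   # pointer size, typically 8 on 64-bit platforms
print(buffer_bytes)
print(referenced_bytes)   # considerably larger than buffer_bytes
```

Here nbytes covers only the pointers stored in the array buffer, while the strings themselves live elsewhere on the Python heap.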

@zvun
Author

Thank you, I have edited the docstring based on the suggestions.

@mattip
Member

Maybe it is enough to use qualifiers like "approximately" and "at least"/"at most" rather than try to describe all the ways the number is wrong. Then point to a documentation page like https://numpy.org/devdocs/dev/internals.code-explanations.html#memory-model, and maybe add nuanced qualifications there instead of in the docstring.

@zvun
Author

I guess one future-proof way could be to describe how nbytes is calculated, and then mention some examples for different dtypes. From a quick search, it seems to be the product of the array's dimensions multiplied by the item size. For object elements the latter is probably the size of the pointers; I'm not sure how it would be for StringDType, though. What do you think? @mattip @seberg
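The calculation described here can be checked with a short sketch. The shapes and dtypes are arbitrary choices for illustration, and the comparison with the platform pointer size is printed rather than asserted because it is an assumption about typical platforms:

```python
import numpy as np

# Arbitrary example arrays, one per dtype.
arrays = {
    "float64": np.zeros((3, 4, 5), dtype=np.float64),
    "int16": np.zeros((7, 2), dtype=np.int16),
    "complex128": np.zeros(6, dtype=np.complex128),
    "object": np.empty((2, 3), dtype=object),
}

results = {}
for name, a in arrays.items():
    # Product of the dimensions multiplied by the per-element size.
    expected = int(np.prod(a.shape)) * a.itemsize
    results[name] = (a.nbytes, expected)
    print(name, a.itemsize, a.nbytes, expected)

# For object arrays the item size is that of a pointer, which typically
# matches the size of np.intp on common platforms.
print(np.dtype(object).itemsize, np.dtype(np.intp).itemsize)
```

For every dtype in this sketch the two numbers agree, which matches the formula already shown in the existing docstring example (`np.prod(x.shape) * x.itemsize`).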

@ngoldbaum
Member

What do you think?

That sounds good. Something like: "This is the memory used by the main array buffer and does not account for any memory used for array metadata or for data stored outside of the array buffer. For example, nbytes is a lower limit for the object and StringDType types because these types can store data outside the main array buffer."
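The "lower limit" wording can be seen with StringDType in the following sketch. It assumes NumPy 2.0 or later, where `np.dtypes.StringDType` is available, and skips the demonstration otherwise; the string contents are made up:

```python
import numpy as np

# StringDType was added in NumPy 2.0; fall back to None on older versions.
StringDType = getattr(getattr(np, "dtypes", None), "StringDType", None)

if StringDType is not None:
    short_strs = np.array(["a", "b", "c"], dtype=StringDType())
    long_strs = np.array(["a" * 10_000, "b" * 10_000, "c" * 10_000],
                         dtype=StringDType())

    # nbytes is size * itemsize, so it depends only on the element count.
    same_nbytes = short_strs.nbytes == long_strs.nbytes
    # Characters actually held by the long strings.
    payload_chars = sum(len(s) for s in long_strs)

    print(short_strs.nbytes, long_strs.nbytes, same_nbytes)
    print(payload_chars)
else:
    print("StringDType not available in this NumPy version")
```

Both arrays report the same nbytes even though one holds 30,000 characters of text, so the figure is best read as a floor rather than a measurement for such dtypes.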

@melissawm moved this from Awaiting a code review to Pending authors' response in NumPy first-time contributor PRs on May 23, 2025
Reviewers: @seberg left review comments
Status: Pending authors' response
4 participants: @zvun, @ngoldbaum, @mattip, @seberg
