Improve hash() builtin docstring with caveats. #125229


Open
gpshead wants to merge 2 commits into python:main from gpshead:docs/builtins/hash

Conversation

gpshead
Member

Mention its return type and that the value can be expected to change between processes (hash randomization).

Why? The `hash` builtin gets reached for and used by a lot of people whether it is the right tool or not. IDEs surface docstrings and people use pydoc and `help(hash)`.

There are more possible caveats we could go into here, such as classes implementing their own dunder methods like `__eq__` or `__hash__` naturally being able to violate the constraint stated in this docstring. But that feels like too much for a beginner-friendly docstring.
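As a minimal sketch of the caveat about user-defined dunder methods: the classes `BrokenKey` and `FixedKey` below are hypothetical and not part of this PR. They only illustrate how a `__hash__` that disagrees with `__eq__` breaks the "equal objects have equal hashes" constraint, and how that shows up in a `set`.

```python
import itertools

# Hypothetical per-instance serial numbers, used only to make the bug deterministic.
_serial = itertools.count()


class BrokenKey:
    """Illustrative class: __eq__ compares by name, but __hash__ ignores it."""

    def __init__(self, name):
        self.name = name
        self._id = next(_serial)

    def __eq__(self, other):
        return isinstance(other, BrokenKey) and self.name == other.name

    def __hash__(self):
        # BUG: two equal objects report different hash values.
        return self._id


class FixedKey(BrokenKey):
    """Same equality rule, but the hash is derived from the compared field."""

    def __hash__(self):
        return hash(self.name)


a, b = BrokenKey("x"), BrokenKey("x")
broken_set_size = len({a, b})                        # equal objects stored twice
fixed_set_size = len({FixedKey("x"), FixedKey("x")})  # deduplicated as expected
```

The broken variant stores two "equal" keys in one set, which is the kind of surprise the docstring constraint is meant to rule out.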

@gpshead added the docs (Documentation in the Doc dir), skip issue, skip news, needs backport to 3.12 (only security fixes), and needs backport to 3.13 (bugs and security fixes) labels on Oct 9, 2024
reverse is not necessarily true. Hash values may vary between Python
processes.

This hash value is used internally by Python dict and set hash tables.
Member, Author

I'm not sure if this sentence adds value.

Member

Maybe remove the "hash tables" part.

Contributor

Or say "the dict and set builtin types".

JelleZijlstra and willingc reacted with thumbs up emoji
@@ -1623,15 +1623,18 @@ hash as builtin_hash
obj: object
/

-Return the hash value for the given object.
+Return the integer hash value for the given object within this process.
Member, Author

I struggled with how to word this. Do people, especially newbies, understand what a process is? I've also seen linter wording around this say "between runs", but that is less technical: what even is a run? People come to interactive Python from so many environments that I don't know if the concept of a run or process makes sense. But "process" is at least technically accurate: hash randomization, for example, is process based, and hashes of things like `object()` are the pointer values and thus process specific.
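To make the "process based" point concrete, here is a rough sketch that evaluates `hash()` in fresh child interpreters. The helper name `hash_in_child` is hypothetical; `PYTHONHASHSEED` is the documented environment variable that seeds hash randomization. The seeds 1 and 2 are arbitrary values chosen for illustration.

```python
import os
import subprocess
import sys


def hash_in_child(expr: str, seed: int) -> int:
    """Evaluate hash(<expr>) in a new Python process with a fixed PYTHONHASHSEED."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", f"print(hash({expr}))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout)


# str hashes depend on the per-process seed; small int hashes do not (in CPython).
str_hashes = {hash_in_child("'spam'", seed) for seed in (1, 2)}
int_hashes = {hash_in_child("12345", seed) for seed in (1, 2)}
```

With default settings each process picks its own random seed, so the string case is what users usually trip over when they persist or compare `hash()` results between runs.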

Contributor

I agree with @JelleZijlstra - given the addition to the next paragraph, I don't think you need this addition here.

@@ -1623,15 +1623,18 @@ hash as builtin_hash
obj: object
/

-Return the hash value for the given object.
+Return the integer hash value for the given object within this process.
Member

Not sure the "within this process" part adds value; why did you add it?

willingc reacted with thumbs up emoji


Two objects that compare equal must also have the same hash value, but the
-reverse is not necessarily true.
+reverse is not necessarily true. Hash values may vary between Python
+processes.
Member

Worth mentioning that mutable objects aren't hashable, and that hash() on them raises a TypeError?
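A small sketch of the `TypeError` behavior, using a hypothetical helper `is_hashable`. It also shows a nuance worth keeping in mind when wording this: the built-in mutable containers are unhashable, but instances of a plain user-defined class are mutable and still hashable by default.

```python
def is_hashable(obj) -> bool:
    """Return True if hash(obj) succeeds, False if it raises TypeError."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


class Plain:
    """Ordinary user-defined class: mutable, yet hashable by default."""


checks = {
    "list": is_hashable([1, 2]),
    "dict": is_hashable({}),
    "tuple": is_hashable((1, 2)),
    "tuple containing a list": is_hashable((1, [2])),
    "frozenset": is_hashable(frozenset()),
    "plain instance": is_hashable(Plain()),
}
```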

willingc reacted with thumbs up emoji
Contributor

Something like "Equivalent objects will always give the same hash value within a single Python process, but a different Python process may report a different hash value. Not all objects are hashable."?
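For illustration, a short sketch of both halves of the rule. The first part (equal numbers hash equal across numeric types) is guaranteed by the language. The collision shown in the second part relies on a CPython implementation detail and is labeled as an assumption.

```python
# Equal objects must have equal hashes, even across numeric types.
equal_values = (1, 1.0, True)
distinct_hashes = {hash(v) for v in equal_values}

# The reverse does not hold: unequal objects may share a hash value.
# Assumption (CPython detail): -1 is reserved as an error code in the C API,
# so hash(-1) is reported as -2 and collides with hash(-2).
collision = (-1 != -2) and (hash(-1) == hash(-2))
```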

Member

Or, "The same object can have different hash values in different processes. Use hash() only within a single process."

@sobolevn (Member) left a comment

Should we explicitly say that `hash` should not be used for hashing in crypto operations?

This is something I've noticed several times.

However, sometimes adding a note telling people not to do something actually makes them do it even more 🙈
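If such a note were added, the usual alternative to point at is `hashlib`. A minimal sketch, with a hypothetical helper name `stable_digest`, of a digest that is stable across processes and suitable where a cryptographic hash is actually required:

```python
import hashlib


def stable_digest(text: str) -> str:
    """Return the SHA-256 hex digest of text; identical in every process."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


digest = stable_digest("abc")
```

Unlike `hash("abc")`, this value does not depend on hash randomization, so it can be stored or compared between runs.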

@ncoghlan (Contributor) left a comment

General idea seems sound, wordsmithing is tricky :(

@antiseebs

I like the "within this process" language. While I can see that it is also covered in another paragraph, I think this is something that is empirically very easy for people to miss or get confused by. The extra couple of words, which help someone who reads one paragraph and stops there because they think they know what the function does, are probably a very good investment in reduced user pain.

@willingc (Contributor) left a comment

Looks good. Thanks @gpshead

@hugovk removed the needs backport to 3.12 (only security fixes) label on Apr 10, 2025
@serhiy-storchaka added the needs backport to 3.14 (bugs and security fixes) label on May 8, 2025
Reviewers

@nedbat left review comments
@JelleZijlstra left review comments
@ncoghlan left review comments
@willingc left review comments
@sobolevn left review comments
@ericsnowcurrently: awaiting requested review (ericsnowcurrently is a code owner)

Assignees
No one assigned
Labels
awaiting changes, docs (Documentation in the Doc dir), needs backport to 3.13 (bugs and security fixes), needs backport to 3.14 (bugs and security fixes), skip issue, skip news
Projects
Status: Todo
Milestone
No milestone
9 participants
@gpshead, @antiseebs, @nedbat, @JelleZijlstra, @ncoghlan, @willingc, @sobolevn, @hugovk, @serhiy-storchaka
