Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Constant hash value for None to aid reproducibility #99540

Closed
Labels
type-featureA feature request or enhancement
@yonillasky

Description

@yonillasky

Feature or enhancement

Fixhash(None) to a constant value.

Pitch

(Updated 2022.11.18)

  • Under current behavior, the runtime leaks the ASLR offset, since the original address of theNone singleton is fixed and_Py_HashPointerRaw is reversible. Admittedly, there are other similar objects, likeNotImplemented orEllipsis that also have this problem, and need to be similarly fixed.

  • Because of ASLR,hash(None) changes every run; that consequently means the hash of many useful "key" types changes every run, particularly tuples, NamedTuples and frozen dataclasses that haveOptional fields.

  • The other source of hash value instability across runs in common "key" types like str or Enum, can be fixed using thePYTHONHASHSEED environment var.

  • other singletons commonly used as (or as part of) mapping keys,True andFalse already have fixed hash values.

CPython's builtin set classes, as do all other non-concurrent hash-tables, either open or closed, AFAIK, grant the user a certain stability property. Given a specific sequence of initialization and subsequent mutation (if any), and given specific inputs with certain hash values, if one were to "replay" it, the result set will be in the same observable state every time: not only have the same items (correctness), but also they would be retrieved from the set in the same order when iterated.

This property means that code that starts out with identical data, performs computations and makes decisions based on the results will behave identically between runs. For example, if based on some mathematical properties of the input, we have computed a set of N valid choices, they are given integer scores, then we pick the first choice that has maximal score. If the set guarantees the property described above, we are also guaranteed that the exact same choice will be made every time this code runs, even in case of ties. This is very helpful for reproducibility, especially in complex algorithmic code that makes a lot of combinatorial decisions of that kind.

There is a counterargument that we should simply just offerStableSet andStableFrozenSet that guarantee a specific order, the same way thatdict does.
A few things to note about that:

  • I've written such set classes as an adapter overdict[T, None], there is a substantial perf overhead to that
  • Is it worth the extra "weight" in code inside the core? That's suspect - why hasn't it been added all those years?
  • In a large codebase, it requires automated code inspection and editing tools to enforce this. It's all too easy, and natural, to add a seemingly harmless set comprehension somewhere and defeat the whole effort
  • The insertion-order-as-iteration-order guarantee is stronger than what we actually require, in order to have the "reproducability" property I've described, so we're paying extra for something we don't really need.

My PR makes a small change to CPython, inobjects.c, that sets thetp_hash descriptor ofNoneType to a function that simply returns a constant value.

Admittedly, determinism between runs isn't a concern that most users/programs care about. It is rather niche. However, I argue that still, there is no externalized cost to this change.

Previous discussion

https://discuss.python.org/t/constant-hash-for-none/21110

Linked PRs

Metadata

Metadata

Assignees

No one assigned

    Labels

    type-featureA feature request or enhancement

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions


      [8]ページ先頭

      ©2009-2025 Movatter.jp