Python Enhancement Proposals

PEP 442 – Safe object finalization

Author:
Antoine Pitrou <solipsis at pitrou.net>
BDFL-Delegate:
Benjamin Peterson <benjamin at python.org>
Status:
Final
Type:
Standards Track
Created:
18-May-2013
Python-Version:
3.4
Post-History:
18-May-2013
Resolution:
Python-Dev message

Abstract

This PEP proposes to deal with the current limitations of object finalization. The goal is to be able to define and run finalizers for any object, regardless of its position in the object graph.

This PEP doesn’t call for any change in Python code. Objects with existing finalizers will benefit automatically.

Definitions

Reference
A directional link from an object to another. The target of the reference is kept alive by the reference, as long as the source is itself alive and the reference isn’t cleared.
Weak reference
A directional link from an object to another, which doesn’t keep alive its target. This PEP focuses on non-weak references.
Reference cycle
A cyclic subgraph of directional links between objects, which keeps those objects from being collected in a pure reference-counting scheme.
Cyclic isolate (CI)
A standalone subgraph of objects in which no object is referenced from the outside, containing one or several reference cycles, and whose objects are still in a usable, non-broken state: they can access each other from their respective finalizers.
Cyclic garbage collector (GC)
A device able to detect cyclic isolates and turn them into cyclic trash. Objects in cyclic trash are eventually disposed of by the natural effect of the references being cleared and their reference counts dropping to zero.
Cyclic trash (CT)
A former cyclic isolate whose objects have started being cleared by the GC. Objects in cyclic trash are potential zombies; if they are accessed by Python code, the symptoms can vary from weird AttributeErrors to crashes.
Zombie / broken object
An object that is part of cyclic trash. The term stresses that the object is not safe: its outgoing references may have been cleared, or one of the objects it references may be a zombie. Therefore, it should not be accessed by arbitrary code (such as finalizers).
Finalizer
A function or method called when an object is intended to be disposed of. The finalizer can access the object and release any resource held by the object (for example mutexes or file descriptors). An example is a __del__ method.
Resurrection
The process by which a finalizer creates a new reference to an object in a CI. This can happen as a quirky but supported side-effect of __del__ methods.
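
The difference between a reference cycle left to pure reference counting and one handled by the cyclic GC can be observed from Python. The following is a minimal sketch, assuming CPython and its standard gc and weakref modules; the Node class and the variable names are illustrative and do not come from the PEP.

```python
import gc
import weakref


class Node:
    """Plain object used to build a two-element reference cycle."""


gc.disable()  # turn off automatic cyclic collection for the demonstration
try:
    a, b = Node(), Node()
    a.peer, b.peer = b, a          # a -> b -> a: a reference cycle
    probe = weakref.ref(a)         # weak reference: does not keep `a` alive

    del a, b                       # the cycle is now a cyclic isolate
    alive_after_del = probe() is not None   # reference counts never reach zero

    gc.collect()                   # the cyclic GC detects and disposes of it
    alive_after_gc = probe() is not None
finally:
    gc.enable()

print(alive_after_del, alive_after_gc)
```

The weak reference serves only as a probe: it reports whether its target still exists without keeping it alive.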

Impact

While this PEP discusses CPython-specific implementation details, the change in finalization semantics is expected to affect the Python ecosystem as a whole. In particular, this PEP obsoletes the current guideline that “objects with a __del__ method should not be part of a reference cycle”.
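
As an illustration of what the obsoleted guideline used to protect against, here is a minimal sketch, assuming CPython 3.4 or later. The Resource class is hypothetical. Before this PEP, such a cycle would have been left uncollected and its objects placed in gc.garbage.

```python
import gc


class Resource:
    """Hypothetical class with a finalizer; instances form a cycle below."""

    finalized = []  # names of instances whose __del__ has run

    def __init__(self, name):
        self.name = name
        self.partner = None

    def __del__(self):
        Resource.finalized.append(self.name)


x, y = Resource("x"), Resource("y")
x.partner, y.partner = y, x   # reference cycle between two finalizable objects
del x, y

gc.collect()

# Both finalizers have run, and no Resource was parked in gc.garbage.
leftovers = [obj for obj in gc.garbage if isinstance(obj, Resource)]
print(sorted(Resource.finalized), leftovers)
```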

Benefits

The primary benefits of this PEP regard objects with finalizers, such as objects with a __del__ method and generators with a finally block. Those objects can now be reclaimed when they are part of a reference cycle.
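
The generator case can be sketched as follows, assuming CPython 3.4 or later. Holder and worker are hypothetical names. The generator is suspended inside its try block and is part of a cycle, because its frame refers to the object that stores it.

```python
import gc

log = []


class Holder:
    """Hypothetical container that stores a generator referring back to it."""


def worker(owner):
    try:
        yield "running"
    finally:
        # `owner` is still a usable object when this runs.
        log.append("cleanup for " + type(owner).__name__)


holder = Holder()
holder.gen = worker(holder)   # holder -> generator -> frame -> holder
next(holder.gen)              # suspend the generator inside the try block
del holder                    # the cycle keeps both objects alive for now

gc.collect()                  # the generator is finalized, so `finally` runs
print(log)
```

A generator that was never started has not entered its try block, so it has no finally clause to run.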

The PEP also paves the way for further benefits:

  • The module shutdown procedure may not need to set global variables to None anymore. This could solve a well-known class of irritating issues.

The PEP doesn’t change the semantics of:

  • Weak references caught in reference cycles.
  • C extension types with a custom tp_dealloc function.

Description

Reference-counted disposal

In normal reference-counted disposal, an object’s finalizer is called just before the object is deallocated. If the finalizer resurrects the object, deallocation is aborted.

However, if the object was already finalized, then the finalizer isn’t called. This prevents us from finalizing zombies (see below).
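
Both rules can be observed with a finalizer that resurrects its object. This is a minimal sketch, assuming CPython 3.4 or later, where dropping the last reference triggers disposal immediately. The Phoenix class is hypothetical.

```python
class Phoenix:
    """Hypothetical class whose finalizer resurrects the instance."""

    calls = 0      # how many times __del__ has run
    saved = None   # class-level slot used to resurrect the instance

    def __del__(self):
        Phoenix.calls += 1
        Phoenix.saved = self   # new reference: deallocation is aborted


p = Phoenix()
del p                          # count hits zero, finalizer runs, object survives
calls_after_first_drop = Phoenix.calls
resurrected = Phoenix.saved is not None

Phoenix.saved = None           # count hits zero again: already finalized
calls_after_second_drop = Phoenix.calls

print(calls_after_first_drop, resurrected, calls_after_second_drop)
```

The second time the reference count reaches zero, the object is deallocated without its finalizer being called again.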

Disposal of cyclic isolates

Cyclic isolates are first detected by the garbage collector, and then disposed of. The detection phase doesn’t change and won’t be described here. Disposal of a CI traditionally works in the following order:

  1. Weakrefs to CI objects are cleared, and their callbacks called. At this point, the objects are still safe to use.
  2. The CI becomes a CT as the GC systematically breaks all known references inside it (using the tp_clear function).
  3. Nothing. All CT objects should have been disposed of in step 2 (as a side-effect of clearing references); this collection is finished.
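
Step 1 of this sequence is visible from Python when a weak reference held outside the cycle carries a callback. The following is a minimal sketch, assuming CPython; Node, watcher and events are illustrative names.

```python
import gc
import weakref


class Node:
    """Plain object used to build a cycle."""


events = []

a, b = Node(), Node()
a.peer, b.peer = b, a
# The weak reference object itself lives outside the cycle, so the GC
# clears it and invokes its callback when the cycle is collected.
watcher = weakref.ref(a, lambda ref: events.append("callback"))

del a, b
gc.collect()

print(watcher() is None, events)
```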

This PEP proposes to turn CI disposal into the following sequence (the new steps are steps 2 and 3):

  1. Weakrefs to CI objects are cleared, and their callbacks called. At this point, the objects are still safe to use.
  2. The finalizers of all CI objects are called.
  3. The CI is traversed again to determine if it is still isolated. If it is determined that at least one object in CI is now reachable from outside the CI, this collection is aborted and the whole CI is resurrected. Otherwise, proceed.
  4. The CI becomes a CT as the GC systematically breaks all known references inside it (using the tp_clear function).
  5. Nothing. All CT objects should have been disposed of in step 4 (as a side-effect of clearing references); this collection is finished.

Note

The GC doesn’t recalculate the CI after step 2 above, hence the need for step 3 to check that the whole subgraph is still isolated.
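
The effect of steps 2 and 3 can be sketched from Python, assuming CPython 3.4 or later. Member and survivors are hypothetical names. One finalizer stores its object in an outside list, so the collection must leave that object and everything it references unbroken. The sketch checks only that observable outcome, since how much of the CI a given CPython version keeps alive is an implementation detail.

```python
import gc

survivors = []   # lives outside the cycle


class Member:
    """Hypothetical cycle member; one instance rescues itself when finalized."""

    def __init__(self, name, rescue=False):
        self.name = name
        self.rescue = rescue
        self.peer = None

    def __del__(self):
        if self.rescue:
            survivors.append(self)   # reference from outside the CI


a, b = Member("a", rescue=True), Member("b")
a.peer, b.peer = b, a
del a, b

gc.collect()

rescued = survivors[0]
# The rescued object and its peer are intact: no reference was broken.
print(rescued.name, rescued.peer.name, rescued.peer.peer is rescued)
```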

C-level changes

Type objects get a new tp_finalize slot to which __del__ methods are mapped (and vice versa). Generators are modified to use this slot, rather than tp_del. A tp_finalize function is a normal C function which will be called with a valid and alive PyObject as its only argument. It doesn’t need to manipulate the object’s reference count, as this will be done by the caller. However, it must ensure that the original exception state is restored before returning to the caller.

For compatibility, tp_del is kept in the type structure. Handling of objects with a non-NULL tp_del is unchanged: when part of a CI, they are not finalized and end up in gc.garbage. However, a non-NULL tp_del is not encountered anymore in the CPython source tree (except for testing purposes).

Two new C API functions are provided to ease calling of tp_finalize, especially from custom deallocators.

On the internal side, a bit is reserved in the GC header for GC-managed objects to signal that they were finalized. This helps avoid finalizing an object twice (and, especially, finalizing a CT object after it was broken by the GC).

Note

Objects which are not GC-enabled can also have a tp_finalize slot. They don’t need the additional bit since their tp_finalize function can only be called from the deallocator: it therefore cannot be called twice, except when resurrected.

Discussion

Predictability

Following this scheme, an object’s finalizer is always called exactly once, even if it was resurrected afterwards.

For CI objects, the order in which finalizers are called (step 2 above) is undefined.

Safety

It is important to explain why the proposed change is safe. There are two aspects to be discussed:

  • Can a finalizer access zombie objects (including the object being finalized)?
  • What happens if a finalizer mutates the object graph so as to impact the CI?

Let’s discuss the first issue. We will divide possible cases in two categories:

  • If the object being finalized is part of the CI: by construction, no objects in CI are zombies yet, since CI finalizers are called before any reference breaking is done. Therefore, the finalizer cannot access zombie objects, which don’t exist.
  • If the object being finalized is not part of the CI/CT: by definition, objects in the CI/CT don’t have any references pointing to them from outside the CI/CT. Therefore, the finalizer cannot reach any zombie object (that is, even if the object being finalized was itself referenced from a zombie object).
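
The first category can be checked with a short sketch, assuming CPython 3.4 or later. Node and seen are illustrative names. Each finalizer reads an attribute of its peer, which works in whichever order the finalizers run because no reference has been broken yet.

```python
import gc

seen = []


class Node:
    """Hypothetical cycle member whose finalizer inspects its peer."""

    def __init__(self, name):
        self.name = name
        self.peer = None

    def __del__(self):
        # The peer is not a zombie: its attributes are still in place.
        seen.append((self.name, self.peer.name))


a, b = Node("a"), Node("b")
a.peer, b.peer = b, a
del a, b

gc.collect()

# Sorted, because the order in which the finalizers run is undefined.
print(sorted(seen))
```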

Now for the second issue. There are three potential cases:

  • The finalizer clears an existing reference to a CI object. The CI object may be disposed of before the GC tries to break it, which is fine (the GC simply has to be aware of this possibility).
  • The finalizer creates a new reference to a CI object. This can only happen from a CI object’s finalizer (see above why). Therefore, the new reference will be detected by the GC after all CI finalizers are called (step 3 above), and collection will be aborted without any objects being broken.
  • The finalizer clears or creates a reference to a non-CI object. By construction, this is not a problem.
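
The first of these cases can be exercised with a minimal sketch, assuming CPython 3.4 or later. Link and log are hypothetical names. One finalizer drops the only remaining reference to its peer, so the peer may be disposed of by reference counting while the GC is still running finalizers. Whichever finalizer runs first, each object is finalized exactly once.

```python
import gc

log = []


class Link:
    """Hypothetical cycle member; one instance drops its peer when finalized."""

    def __init__(self, name, drop_peer=False):
        self.name = name
        self.drop_peer = drop_peer
        self.peer = None

    def __del__(self):
        log.append(self.name)
        if self.drop_peer:
            self.peer = None   # clears an existing reference to a CI object


a, b = Link("a", drop_peer=True), Link("b")
a.peer, b.peer = b, a
del a, b

gc.collect()

# Sorted, because the order in which the finalizers run is undefined.
print(sorted(log))
```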

Implementation

An implementation is available in branch finalize of the repository at http://hg.python.org/features/finalize/.

Validation

Besides running the normal Python test suite, the implementation adds test cases for various finalization possibilities including reference cycles, object resurrection and legacy tp_del slots.

The implementation has also been checked to not produce any regressions on the following test suites:

References

Notes about reference cycle collection and weak reference callbacks: http://hg.python.org/cpython/file/4e687d53b645/Modules/gc_weakref.txt

Generator memory leak: http://bugs.python.org/issue17468

Allow objects to decide if they can be collected by GC: http://bugs.python.org/issue9141

Module shutdown procedure based on GC: http://bugs.python.org/issue812369

Copyright

This document has been placed in the public domain.


Source: https://github.com/python/peps/blob/main/peps/pep-0442.rst

Last modified: 2025-02-01 08:59:27 GMT