
Python Enhancement Proposals

PEP 768 – Safe external debugger interface for CPython

Author:
Pablo Galindo Salgado <pablogsal at python.org>, Matt Wozniski <godlygeek at gmail.com>, Ivona Stojanovic <stojanovic.i at hotmail.com>
Discussions-To:
Discourse thread
Status:
Final
Type:
Standards Track
Created:
25-Nov-2024
Python-Version:
3.14
Post-History:
11-Dec-2024
Resolution:
17-Mar-2025


Important

This PEP is a historical document. The up-to-date, canonical documentation can now be found at Remote debugging attachment protocol.


See PEP 1 for how to propose changes.

Abstract

This PEP proposes adding a zero-overhead debugging interface to CPython that allows debuggers and profilers to safely attach to running Python processes. The interface provides safe execution points for attaching debugger code without modifying the interpreter's normal execution path or adding runtime overhead.

A key application of this interface will be enabling pdb to attach to live processes by process ID, similar to gdb -p, allowing developers to inspect and debug Python applications interactively in real time without stopping or restarting them.

Motivation

Debugging Python processes in production and live environments presents unique challenges. Developers often need to analyze application behavior without stopping or restarting services, which is especially crucial for high-availability systems. Common scenarios include diagnosing deadlocks, inspecting memory usage, or investigating unexpected behavior in real time.

Very few Python tools can attach to running processes, primarily because doing so requires deep expertise in both operating system debugging interfaces and CPython internals. While C/C++ debuggers like GDB and LLDB can attach to processes using well-understood techniques, Python tools must implement all of these low-level mechanisms plus handle additional complexity. For example, when GDB needs to execute code in a target process, it:

  1. Uses ptrace to allocate a small chunk of executable memory (easier said than done)
  2. Writes a small sequence of machine code - typically a function prologue, the desired instructions, and code to restore registers
  3. Saves all the target thread’s registers
  4. Changes the instruction pointer to the injected code
  5. Lets the process run until it hits a breakpoint at the end of the injected code
  6. Restores the original registers and continues execution

Python tools face this same challenge of code injection, but with an additional layer of complexity. Not only do they need to implement the above mechanism, they must also understand and safely interact with CPython's runtime state, including the interpreter loop, garbage collector, thread state, and reference counting system. This combination of low-level system manipulation and deep, domain-specific interpreter knowledge makes implementing Python debugging tools exceptionally difficult.

The few tools that do attempt this (see for example DebugPy and Memray) resort to suboptimal and unsafe methods, using system debuggers like GDB and LLDB to forcefully inject code. This approach is fundamentally unsafe because the injected code can execute at any point during the interpreter's execution cycle - even during critical operations like memory allocation, garbage collection, or thread state management. When this happens, the results are catastrophic: attempting to allocate memory while already inside malloc() causes crashes, modifying objects during garbage collection corrupts the interpreter's state, and touching thread state at the wrong time leads to deadlocks.

Various tools attempt to minimize these risks through complex workarounds, such as spawning separate threads for injected code, carefully timing their operations, or trying to select some good points to stop the process. However, these mitigations cannot fully solve the underlying problem: without cooperation from the interpreter, there's no way to know if it's safe to execute code at any given moment. Even carefully implemented tools can crash the interpreter because they're fundamentally working against it rather than with it.

Rationale

Rather than forcing tools to work around interpreter limitations with unsafe code injection, we can extend CPython with a proper debugging interface that guarantees safe execution. By adding a few thread state fields and integrating with the interpreter's existing evaluation loop, we can ensure debugging operations only occur at well-defined safe points. This eliminates the possibility of crashes and corruption while maintaining zero overhead during normal execution.

The key insight is that we don't need to inject code at arbitrary points - we just need to signal to the interpreter that we want code executed at the next safe opportunity. This approach works with the interpreter's natural execution flow rather than fighting against it.

After describing this idea to the PyPy development team, this proposal has already been implemented in PyPy, proving both its feasibility and effectiveness. Their implementation demonstrates that we can provide safe debugging capabilities with zero runtime overhead during normal execution. The proposed mechanism not only reduces risks associated with current debugging approaches but also lays the foundation for future enhancements. For instance, this framework could enable integration with popular observability tools, providing real-time insights into interpreter performance or memory usage.

One compelling use case for this interface is enabling pdb to attach to running Python processes, similar to how gdb allows users to attach to a program by process ID (gdb -p <pid>). With this feature, developers could inspect the state of a running application, evaluate expressions, and step through code dynamically. This approach would align Python's debugging capabilities with those of other major programming languages and debugging tools that support this mode.

Specification

This proposal introduces a safe debugging mechanism that allows external processes to trigger code execution in a Python interpreter at well-defined safe points. The key insight is that rather than injecting code directly via system debuggers, we can leverage the interpreter's existing evaluation loop and thread state to coordinate debugging operations.

The mechanism works by having debuggers write to specific memory locations in the target process that the interpreter then checks during its normal execution cycle. When the interpreter detects that a debugger wants to attach, it executes the requested operations only when it's safe to do so - that is, when no internal locks are held and all data structures are in a consistent state.

Runtime State Extensions

A new structure is added to PyThreadState to support remote debugging:

typedef struct {
    int debugger_pending_call;
    char debugger_script_path[...];
} _PyRemoteDebuggerSupport;

This structure is appended to PyThreadState, adding only a few fields that are never accessed during normal execution. The debugger_pending_call field indicates when a debugger has requested execution, while debugger_script_path provides a filesystem path to a Python source file (.py) that will be executed when the interpreter reaches a safe point. The path must point to a Python source file, not compiled Python code (.pyc) or any other format.

The size of debugger_script_path is a trade-off between per-thread memory overhead and the maximum length of a debugging script's path. To limit the memory overhead per thread, it will be limited to 512 bytes. This size will also be provided as part of the debugger support structure so debuggers know how much they can write. This value can be extended in the future if we ever need to.
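A debugger therefore needs to check that the path it intends to write fits in the buffer before attaching. The following sketch shows this validation; the hard-coded size is an assumption for illustration, since a real tool should read debugger_script_path_size from the target's debug offsets instead:

```python
import os

# Assumed buffer size as described above; real debuggers should read
# debugger_script_path_size from the target process rather than hard-code it.
SCRIPT_PATH_BUFFER_SIZE = 512

def encode_script_path(path: str) -> bytes:
    """Encode a script path as the NUL-terminated, fixed-size buffer a
    debugger would write into debugger_script_path, refusing paths that
    do not fit."""
    raw = os.fsencode(path)
    if len(raw) + 1 > SCRIPT_PATH_BUFFER_SIZE:  # +1 for the trailing NUL
        raise ValueError(
            f"script path longer than {SCRIPT_PATH_BUFFER_SIZE - 1} bytes"
        )
    # Pad with NULs to the full buffer size so one write fills the field.
    return raw + b"\x00" * (SCRIPT_PATH_BUFFER_SIZE - len(raw))

buf = encode_script_path("/tmp/debug_script.py")  # len(buf) == 512
```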

Debug Offsets Table

Python 3.12 introduced a debug offsets table placed at the start of the PyRuntime structure. This section contains the _Py_DebugOffsets structure that allows external tools to reliably find critical runtime structures regardless of ASLR or how Python was compiled.

This proposal extends the existing debug offsets table with new fields for debugger support:

struct _debugger_support {
    uint64_t eval_breaker;               // Location of the eval breaker flag
    uint64_t remote_debugger_support;    // Offset to our support structure
    uint64_t debugger_pending_call;      // Where to write the pending flag
    uint64_t debugger_script_path;       // Where to write the script path
    uint64_t debugger_script_path_size;  // Size of the script path buffer
} debugger_support;

These offsets allow debuggers to locate critical debugging control structures in the target process's memory space. The eval_breaker and remote_debugger_support offsets are relative to each PyThreadState, while the debugger_pending_call and debugger_script_path offsets are relative to each _PyRemoteDebuggerSupport structure, allowing the new structure and its fields to be found regardless of where they are in memory. debugger_script_path_size informs the attaching tool of the size of the buffer.
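Since the structure above is five consecutive uint64_t fields, a tool that has read those bytes out of the target process can decode them with a single unpack. A sketch (the byte string here is fabricated for illustration; a real tool reads it from the target's _Py_DebugOffsets table):

```python
import struct

# Field names in declaration order, matching struct _debugger_support above.
FIELDS = (
    "eval_breaker",
    "remote_debugger_support",
    "debugger_pending_call",
    "debugger_script_path",
    "debugger_script_path_size",
)

def parse_debugger_support(raw: bytes) -> dict:
    """Decode the five little-endian uint64_t fields of _debugger_support."""
    return dict(zip(FIELDS, struct.unpack("<5Q", raw)))

# Fabricated sample bytes standing in for memory read from the target.
sample = struct.pack("<5Q", 0x40, 0x200, 0x0, 0x8, 512)
offsets = parse_debugger_support(sample)
```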

Attachment Protocol

When a debugger wants to attach to a Python process, it follows these steps:

  1. Locate the PyRuntime structure in the process:
    • Find the Python binary (executable or libpython) in process memory (an OS-dependent process)
    • Extract the .PyRuntime section offset from the binary's format (ELF/Mach-O/PE)
    • Calculate the actual PyRuntime address in the running process by relocating the offset to the binary's load address
  2. Access debug offset information by reading the_Py_DebugOffsets at the start of thePyRuntime structure.
  3. Use the offsets to locate the desired thread state
  4. Use the offsets to locate the debugger interface fields within that thread state
  5. Write control information:
    • Most debuggers will pause the process before writing to its memory. This is standard practice for tools like GDB, which use SIGSTOP or ptrace to pause the process. This approach prevents races when writing to process memory. Profilers and other tools that don't wish to stop the process can still use this interface, but they need to handle possible races. This is a normal consideration for profilers.
    • Write a file path to a Python source file (.py) into the debugger_script_path field in _PyRemoteDebuggerSupport.
    • Set the debugger_pending_call flag in _PyRemoteDebuggerSupport to 1
    • Set _PY_EVAL_PLEASE_STOP_BIT in the eval_breaker field

Once the interpreter reaches the next safe point, it will execute the Python code contained in the file specified by the debugger.
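The three writes in step 5 can be expressed as (address, data) pairs computed from the offsets. This sketch assumes a little-endian target with 4-byte C ints and an 8-byte eval_breaker, and uses a placeholder value for _PY_EVAL_PLEASE_STOP_BIT (the real value must come from the target interpreter's headers); the actual memory writes (process_vm_writev, mach_vm_write, WriteProcessMemory) are platform specific and omitted:

```python
import os
import struct

# Placeholder: the real _PY_EVAL_PLEASE_STOP_BIT value is interpreter-defined.
PY_EVAL_PLEASE_STOP_BIT = 1 << 5

def attachment_writes(tstate_addr: int, offsets: dict, script_path: str):
    """Return the writes from the attachment protocol as (address, data)
    pairs, in the order they should be performed."""
    support = tstate_addr + offsets["remote_debugger_support"]
    return [
        # 1. NUL-terminated script path into debugger_script_path.
        (support + offsets["debugger_script_path"],
         os.fsencode(script_path) + b"\x00"),
        # 2. debugger_pending_call = 1 (assumed 4-byte little-endian int).
        (support + offsets["debugger_pending_call"], struct.pack("<i", 1)),
        # 3. Mask to OR into eval_breaker (a read-modify-write in practice).
        (tstate_addr + offsets["eval_breaker"],
         struct.pack("<Q", PY_EVAL_PLEASE_STOP_BIT)),
    ]

# Hypothetical addresses and offsets, purely for illustration.
writes = attachment_writes(
    0x7F0000000000,
    {"remote_debugger_support": 0x200, "eval_breaker": 0x40,
     "debugger_pending_call": 0x0, "debugger_script_path": 0x8},
    "/tmp/debug.py",
)
```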

Interpreter Integration

The interpreter's regular evaluation loop already includes a check of the eval_breaker flag for handling signals, periodic tasks, and other interrupts. We leverage this existing mechanism by checking for debugger pending calls only when the eval_breaker is set, ensuring zero overhead during normal execution. Profiling with Linux perf confirms this: the debugger_pending_call branch is never taken during normal execution and is highly predictable, allowing modern CPUs to effectively speculate past it.

When a debugger has set both the eval_breaker flag and debugger_pending_call, the interpreter will execute the provided debugging code at the next safe point. This all happens in a completely safe context, since the interpreter is guaranteed to be in a consistent state whenever the eval breaker is checked.

The only valid values for debugger_pending_call will initially be 0 and 1; other values are reserved for future use.

An audit event will be raised before the code is executed, allowing this mechanism to be audited or disabled if desired by a system administrator.

// In ceval.c
if (tstate->eval_breaker) {
    if (tstate->remote_debugger_support.debugger_pending_call) {
        tstate->remote_debugger_support.debugger_pending_call = 0;
        const char *path = tstate->remote_debugger_support.debugger_script_path;
        if (*path) {
            if (0 != PySys_Audit("debugger_script", "%s", path)) {
                PyErr_Clear();
            } else {
                FILE *f = fopen(path, "r");
                if (!f) {
                    PyErr_SetFromErrno(OSError);
                } else {
                    PyRun_AnyFile(f, path);
                    fclose(f);
                }
                if (PyErr_Occurred()) {
                    PyErr_WriteUnraisable(...);
                }
            }
        }
    }
}

If the code being executed raises any Python exception it will be processed as an unraisable exception in the thread where the code was executed.
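Because execution is gated by the audit event shown above, an administrator can refuse injected scripts entirely from within the process. A minimal sketch, using the event name from the snippet above (the event name actually shipped by CPython may differ):

```python
import sys

blocked = []

def deny_remote_scripts(event, args):
    # Raising from an audit hook aborts the audited operation, so the
    # injected script is never executed; we also record the attempt.
    if event == "debugger_script":
        blocked.append(args[0])
        raise RuntimeError("remote debugger scripts are disabled by policy")

sys.addaudithook(deny_remote_scripts)
```

Audit hooks cannot be removed once installed, which makes this suitable for hardening a process at startup.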

Python API

To support safe execution of Python code in a remote process without having to re-implement all these steps in every tool, this proposal extends the sys module with a new function. This function allows debuggers or external tools to execute arbitrary Python code within the context of a specified Python process:

def remote_exec(pid: int, script: str | bytes | PathLike) -> None:
    """
    Executes a file containing Python code in a given remote Python process.

    This function returns immediately, and the code will be executed by the
    target process's main thread at the next available opportunity, similarly
    to how signals are handled. There is no interface to determine when the
    code has been executed. The caller is responsible for making sure that
    the file still exists whenever the remote process tries to read it and
    that it hasn't been overwritten.

    Args:
        pid (int): The process ID of the target Python process.
        script (str|bytes|PathLike): The path to a file containing
            the Python code to be executed.
    """

An example usage of the API would look like:

import sys
import uuid

# Execute a print statement in a remote Python process with PID 12345
script = f"/tmp/{uuid.uuid4()}.py"
with open(script, "w") as f:
    f.write("print('Hello from remote execution!')")
try:
    sys.remote_exec(12345, script)
except Exception as e:
    print(f"Failed to execute code: {e}")

Configuration API

To allow redistributors, system administrators, or users to disable thismechanism, several methods will be provided to control the behavior of theinterpreter:

A new PYTHON_DISABLE_REMOTE_DEBUG environment variable will be provided to control the behaviour at runtime. If set to any value (including an empty string), the interpreter will ignore any attempts to attach a debugger using this mechanism.

This environment variable will be added together with a new -X disable-remote-debug flag to the Python interpreter to allow users to disable this feature at runtime.

Additionally, a new --without-remote-debug flag will be added to the configure script to allow redistributors to build Python without support for remote debugging if they so desire.
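The runtime knobs compose naturally: a supervisor can disable the interface for a child process via either mechanism. A small sketch of both (note the "set to any value, including an empty string" rule above; on interpreters predating this feature the -X option is simply recorded in sys._xoptions and ignored):

```python
import os
import subprocess
import sys

def remote_debug_disabled(environ=os.environ) -> bool:
    """True if the environment disables the remote debugging interface.
    Per the specification above, mere presence of the variable disables
    it, whatever its value (including the empty string)."""
    return "PYTHON_DISABLE_REMOTE_DEBUG" in environ

# Spawn a child with the interface disabled via both mechanisms at once.
child_env = {**os.environ, "PYTHON_DISABLE_REMOTE_DEBUG": ""}
subprocess.run(
    [sys.executable, "-X", "disable-remote-debug", "-c", "pass"],
    env=child_env, check=True,
)
```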

A new flag indicating the status of remote debugging will be made available via the debug offsets so tools can query if a remote process has disabled the feature. This way, tools can offer a useful error message explaining why they won't work, instead of believing that they have attached and then never having their script run.

Multi-threading Considerations

The overall execution pattern resembles how Python handles signals internally. The interpreter guarantees that injected code only runs at safe points, never interrupting atomic operations within the interpreter itself. This approach ensures that debugging operations cannot corrupt the interpreter state while still providing timely execution in most real-world scenarios.

However, debugging code injected through this interface can execute in any thread. This behavior is different from how Python handles signals, since signal handlers can only run in the main thread. If a debugger wants to inject code into every running thread, it must inject it into every PyThreadState. If a debugger wants to run code in the first available thread, it needs to inject it into every PyThreadState, and that injected code must check whether it has already been run by another thread (likely by setting some flag in the globals of some module).
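The run-once pattern described above can be sketched as follows. Stashing the flag on sys is just one way to get a module-global shared across threads; the attribute name is hypothetical:

```python
import sys

def injected_payload() -> bool:
    """Body of a script injected into every PyThreadState. A module-global
    flag ensures the payload runs at most once, in whichever thread reaches
    a safe point first; later threads see the flag and return immediately."""
    if getattr(sys, "_debug_payload_ran", False):
        return False  # another thread already ran the payload
    sys._debug_payload_ran = True
    # ... actual debugging work would go here ...
    return True
```

Under the GIL the check-and-set above is effectively race-free for this purpose, since the injected script runs without yielding between the two lines.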

Note that the Global Interpreter Lock (GIL) continues to govern execution as normal when the injected code runs. This means if a target thread is currently executing a C extension that holds the GIL continuously, the injected code won't be able to run until that operation completes and the GIL becomes available. However, the interface introduces no additional GIL contention beyond what the injected code itself requires. Importantly, the interface remains fully compatible with Python's free-threaded mode.

After injecting code, it may be useful for a debugger to follow up by sending a pre-registered signal to the process, which can interrupt any blocking I/O or sleep states waiting on external resources and allow a safe opportunity to run the injected code.

Backwards Compatibility

This change has no impact on existing Python code or interpreter performance. The added fields are only accessed during debugger attachment, and the checking mechanism piggybacks on existing interpreter safe points.

Security Implications

This interface does not introduce new security concerns as it is only usable by processes that can already write to arbitrary memory within a given process and execute arbitrary code on the machine (in order to create the file containing the Python code to be executed).

Furthermore, the execution of the code is gated by the interpreter's audit hooks, which can be used to monitor or prevent the execution of the code in sensitive environments.

Existing operating system security mechanisms are effective for guarding against attackers gaining arbitrary memory write access. Although the PEP doesn't specify how memory should be written to the target process, in practice this will be done using standard system calls that are already being used by other debuggers and tools. Some examples are:

  • On Linux, the process_vm_readv() and process_vm_writev() system calls are used to read and write memory from another process. These operations are controlled by ptrace access mode checks - the same ones that govern debugger attachment. A process can only read from or write to another process's memory if it has the appropriate permissions (typically requiring either root or the CAP_SYS_PTRACE capability, though less security-minded distributions may allow any process running as the same uid to attach).
  • On macOS, the interface would leverage mach_vm_read_overwrite() and mach_vm_write() through the Mach task system. These operations require task_for_pid() access, which is strictly controlled by the operating system. By default, access is limited to processes running as root or those with specific entitlements granted by Apple's security framework.
  • On Windows, the ReadProcessMemory() and WriteProcessMemory() functions provide similar functionality. Access is controlled through the Windows security model - a process needs PROCESS_VM_READ and PROCESS_VM_WRITE permissions, which typically require the same user context or appropriate privileges. These are the same permissions required by debuggers, ensuring consistent security semantics across platforms.

All mechanisms ensure that:

  1. Only authorized processes can read/write memory
  2. The same security model that governs traditional debugger attachment applies
  3. No additional attack surface is exposed beyond what the OS already provides for debugging
  4. Even if an attacker can write arbitrary memory, they cannot escalate this to arbitrary code execution unless they already have filesystem access

The memory operations themselves are well-established and have been used safely for decades in tools like GDB, LLDB, and various system profilers.

It’s important to note that any attempt to attach to a Python process via this mechanism would be detectable by system-level monitoring tools as well as by Python audit hooks. This transparency provides an additional layer of accountability, allowing administrators to audit debugging operations in sensitive environments.

Further, the strict reliance on OS-level security controls ensures that existing system policies remain effective. For enterprise environments, this means administrators can continue to enforce debugging restrictions using standard tools and policies without requiring additional configuration. For instance, leveraging Linux's ptrace_scope or macOS's taskgated to restrict debugger access will equally govern the proposed interface.

By maintaining compatibility with existing security frameworks, this design ensures that adopting the new interface requires no changes to established security practices.

Security scenarios

  • For an external attacker, the ability to write to arbitrary memory in a process is already a severe security issue. This interface does not introduce any new attack surface, as the attacker would already have the ability to execute arbitrary code in the process. This interface behaves in exactly the same way as existing debuggers, and does not introduce any additional security risks.
  • For an attacker who has gained arbitrary memory write access to a process but not arbitrary code execution, this interface does not allow them to escalate. The ability to calculate and write to specific memory locations is required, which is not available without compromising other machine resources that are external to the Python process.

Additionally, the fact that the code to be executed is gated by the interpreter's audit hooks means that the execution of the code can be monitored and controlled by system administrators. This means that even if the attacker has compromised the application and the filesystem, leveraging this interface for malicious purposes is a risky proposition, as the attacker risks exposing their actions to system administrators who could not only detect the attack but also take action to prevent it.

Finally, it is important to note that if an attacker has arbitrary memory write access to a process and has compromised the filesystem, they can already escalate to arbitrary code execution using other existing mechanisms, so this interface does not introduce any new risks in this scenario.

How to Teach This

For tool authors, this interface becomes the standard way to implement debugger attachment, replacing unsafe system debugger approaches. A section in the Python Developer Guide could describe the internal workings of the mechanism, including the debugger_support offsets and how to interact with them using system APIs.

End users need not be aware of the interface, benefiting only from improved debugging tool stability and reliability.

Reference Implementation

A reference implementation with a prototype adding remote support for pdb can be found here.

Rejected Ideas

Writing Python code into the buffer

We have chosen to have debuggers write the path to a file containing Python code into a buffer in the remote process. This has been deemed more secure than writing the Python code to be executed itself into a buffer in the remote process, because it means that an attacker who has gained arbitrary writes in a process but not arbitrary code execution or file system manipulation can't escalate to arbitrary code execution through this interface.

This does require the attaching debugger to pay close attention to filesystem permissions when creating the file containing the code to be executed, however. If an attacker has the ability to overwrite the file, or to replace a symlink in the file path to point to somewhere attacker controlled, this would allow them to force their malicious code to be executed rather than the code the debugger intends to run.
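One defensive pattern for this is to create the script inside a fresh private directory and open it with flags that refuse pre-existing files and symlinks. A sketch under POSIX assumptions (O_NOFOLLOW is not available on all platforms, hence the fallback):

```python
import os
import tempfile

def write_debug_script(code: str) -> str:
    """Create the script file defensively: mkdtemp() yields a fresh mode-0700
    directory, and O_EXCL (plus O_NOFOLLOW where available) guarantees the
    open fails if an attacker pre-created the file or planted a symlink."""
    directory = tempfile.mkdtemp(prefix="pydebug-")
    path = os.path.join(directory, "script.py")
    flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL | getattr(os, "O_NOFOLLOW", 0)
    fd = os.open(path, flags, 0o600)  # owner-only permissions
    with os.fdopen(fd, "w") as f:
        f.write(code)
    return path
```

The returned path can then be handed to sys.remote_exec(); the debugger remains responsible for keeping the file in place until the target has read it.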

Using a Single Runtime Buffer

During the review of this PEP it was suggested that a single shared buffer at the runtime level be used for all debugger communications. While this appeared simpler and required less memory, we discovered it would actually prevent scenarios where multiple debuggers need to coordinate operations across different threads, or where a single debugger needs to orchestrate complex debugging operations. A single shared buffer would force serialization of all debugging operations, making it impossible for debuggers to work independently on different threads.

The per-thread buffer approach, despite its memory overhead in highly threaded applications, enables these important debugging scenarios by allowing each debugger to communicate independently with its target thread.

Thanks

We would like to thank CF Bolz-Tereick for their insightful comments and suggestions when discussing this proposal.

Copyright

This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.


Source: https://github.com/python/peps/blob/main/peps/pep-0768.rst

Last modified: 2025-10-04 13:46:29 GMT

