Bug description
_remote_debugging.RemoteUnwinder.get_async_stack_trace() reconstructs the async
call graph of a target process by recursing up the awaited_by relation. The
recursion is a three-function cycle with no depth limit, cycle detection, or
_Py_EnterRecursiveCall guard:
process_task_and_waiters → process_task_awaited_by → process_waiter_task →
process_task_and_waiters (…)
(On main these are in Modules/_remote_debugging/asyncio.c; on 3.14 they are in
the single-file Modules/_remote_debugging_module.c.) Each level also stack-allocates
a char task_obj[SIZEOF_TASK_OBJ] (SIZEOF_TASK_OBJ == 4096), so ~1900 levels
exhaust a default 8 MiB stack. When the target's running task sits at the bottom of
a sufficiently deep awaited_by chain, the debugger process (the one calling
get_async_stack_trace()) overflows its C stack and dies with SIGSEGV.
This is asymmetric with the iterative sibling path: get_all_awaited_by /
append_awaited_by_for_thread bounds its walk with MAX_ITERATIONS = 2 << 15 (65536).
Only the recursive get_async_stack_trace path is unbounded. The module already
treats the target's tables as untrusted input (debug_offsets_validation.h) and the
thread-list walk already has explicit "corrupted remote memory" cycle detection, so
bounding this traversal is consistent with the module's existing invariants.
The same class of problem (unbounded C recursion, fixed by raising RecursionError
instead of segfaulting) was treated as a bug in #137894.
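As a rough sanity check on the ~1900-level figure, the stack arithmetic can be written out. This is only a back-of-the-envelope sketch: it assumes the whole 8 MiB stack is available to the recursion and that each level holds exactly one task_obj buffer. The variable names are illustrative, and 1884 is the frame count from the gdb observation reported under Reproducer.

```python
# Back-of-the-envelope stack budget for the recursive path (illustrative only).
STACK_BYTES = 8 * 1024 * 1024   # default 8 MiB stack, as stated in the report
SIZEOF_TASK_OBJ = 4096          # size of the per-level task_obj buffer

# Upper bound on depth if the buffer were the only stack use per level.
max_levels_buffer_only = STACK_BYTES // SIZEOF_TASK_OBJ

# Depth at which gdb observed the guard-page fault.
observed_levels = 1884

# Average stack consumed per level at the observed depth, and the part of it
# that the buffer alone does not explain (frame overhead of the cycle).
bytes_per_level = STACK_BYTES / observed_levels
overhead_per_level = bytes_per_level - SIZEOF_TASK_OBJ

print(max_levels_buffer_only)      # 2048
print(round(bytes_per_level))      # 4453
print(round(overhead_per_level))   # 357
```

Under these assumptions the buffer accounts for most of the per-level cost, which is consistent with the crash appearing a little below the 2048-level ceiling.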
Reproducer
A target with a deep linear awaited_by chain whose leaf is the running task; a
second process attaches and calls get_async_stack_trace():
# target.py <N>: tN await t(N-1) await ... await leaf; leaf busy-spins (= running task)
# attacker.py <pid>:
import sys
from _remote_debugging import RemoteUnwinder
pid = int(sys.argv[1])  # PID of the running target.py process
RemoteUnwinder(pid).get_async_stack_trace()
- N = 10 → returns a stack trace cleanly (exit 0).
- N >= ~2000 → attacker process SIGSEGV (exit 139). gdb shows ~1884 stacked process_task_awaited_by frames terminating at a guard-page fault.
(Full PoC scripts available on request.)
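For readers who want to approximate the target side, one possible shape is sketched below. This is not the author's PoC: the names chain and spin_forever are hypothetical, and the sketch assumes that awaiting a Task directly is enough for asyncio to record the awaited_by edge between the two tasks.

```python
import asyncio


async def spin_forever():
    # Leaf for a real reproduction run: it never yields to the event loop,
    # so it remains the running task while a debugger attaches.
    while True:
        pass


async def chain(depth, leaf):
    # Each level runs in its own Task and awaits the Task one level down,
    # giving a linear chain: t(depth) awaits t(depth-1) ... awaits the leaf.
    if depth == 0:
        return await leaf()
    return await asyncio.create_task(chain(depth - 1, leaf))
```

A target script would call asyncio.run(chain(N, spin_forever)) with N taken from the command line. Passing a leaf that returns immediately lets the same function run to completion, which is a quick way to check that the chain itself builds without hitting Python's own recursion limit.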
Reproduced on
- CPython 3.14.6 (GA, python:3.14 image, aarch64 Linux): crashes.
- CPython 3.16 main (local --with-pydebug build): same unguarded recursion in source.
Cross-process attach uses the normal Linux ptrace/process_vm_readv path
(--cap-add=SYS_PTRACE); the crash is in the debugger, driven by the target's
graph shape. A privileged profiler/observability tool attaching to an untrusted (or
just legitimately deeply nested) workload is the realistic setting.
Expected behavior
The traversal should be bounded: on hitting its limit it should raise or propagate an
error, as the iterative path does, rather than crash the debugger process.
Proposed fix
Bound the process_task_and_waiters ↔ process_waiter_task recursion, matching the
iterative sibling. I have a PR ready that adds an explicit recursion-depth cap
(MAX_TASK_AWAITED_BY_DEPTH, mirroring the existing MAX_ITERATIONS /
MAX_SET_TABLE_SIZE constants in the module) and raises a RuntimeError on
overflow, which also handles a cyclic awaited_by graph from corrupted remote
memory. Happy to open it.
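The idea of a depth cap can be modelled in a few lines of Python. This is a sketch of the technique, not the C patch: MAX_TASK_AWAITED_BY_DEPTH is the constant named above, but its value here is an assumption, and walk_awaited_by together with the dict-shaped graph is hypothetical.

```python
MAX_TASK_AWAITED_BY_DEPTH = 200  # assumed value; the report names the constant but not its size


def walk_awaited_by(task, awaited_by, depth=0, visited=None):
    """Collect task and everything transitively awaiting it, depth-capped.

    awaited_by maps a task id to the ids of the tasks awaiting it.
    """
    if visited is None:
        visited = []
    if depth > MAX_TASK_AWAITED_BY_DEPTH:
        # Covers both a legitimately deep chain and a cycle from corrupted data.
        raise RuntimeError("awaited_by chain exceeds maximum depth")
    visited.append(task)
    for waiter in awaited_by.get(task, ()):
        walk_awaited_by(waiter, awaited_by, depth + 1, visited)
    return visited
```

A single counter handles both failure modes: a cyclic graph never terminates on its own, so it reaches the cap just as a very deep linear chain does, and no separate visited-set is needed for safety.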