test(utils): fix flaky SIGTERM-ignored terminate test (bash tail-exec) #81
Merged
…e test bash -c "trap \"\" TERM; sleep 60" tail-execs into sleep, so /proc/pid/cmdline argv0 becomes "sleep". TerminateProcess verifies argv0 == "bash" before signaling; when the exec wins the startup race against that check it no-ops, the undead sleep runs its full 60s, and the utils package trips its 120s test timeout. A trailing "; :" keeps bash resident so the cmdline stays "bash".
The `test` job on master (7e4406f) failed with a 120s package timeout in `utils`, hung on `TestTerminateProcess_SIGTERMIgnored_FallsBackToKill` (58s in). A rerun passed: it is flaky, not a regression, and not caused by #80 (which only added two fast pure-function tests to the package).

Root cause
The helper process `bash -c "trap \"\" TERM; sleep 60"` gets tail-exec-optimized by bash: with `sleep` as the last command, bash `exec`s it in place, so `/proc/<pid>/cmdline` argv0 becomes `sleep`, not `bash`. Verified on Ubuntu 24.04 bash 5.2.21 (the CI image) and bash 5.3.

`TerminateProcess(pid, "bash", "sleep", …)` calls `VerifyProcessCmdline`, which requires `filepath.Base(argv0) == "bash"` before it will signal. It is a startup race: `cmd.Start()` forks bash, which then takes a few ms to exec into sleep.

- `bash …` before the exec lands → check passes → SIGTERM (trapped) → grace → SIGKILL → fast pass.
- `sleep` → check fails → `TerminateProcess` no-ops → the undead `sleep 60` runs its full 60s → the test blocks on `cmd.Wait()` → the `utils` package exceeds its 120s timeout.

(macOS never hit this: `verifyProcessCmdline` returns `errVerifyUnsupported` there, so `VerifyProcessCmdline` falls back to `IsProcessAlive` and always proceeds.)

Fix
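Before the change itself, it may help to see the kind of check the fix has to satisfy. The following is a minimal sketch, not the project's actual `VerifyProcessCmdline`: `argv0Matches` is a hypothetical helper that takes a raw `/proc/<pid>/cmdline` snapshot (NUL-separated arguments) and compares the base name of argv0 against the expected binary. The two sample snapshots are illustrative, not captured output.

```go
package main

import (
	"bytes"
	"fmt"
	"path/filepath"
)

// argv0Matches reports whether the first NUL-separated field of a
// /proc/<pid>/cmdline snapshot has the expected base name.
// Hypothetical helper for illustration; the real check may differ.
func argv0Matches(cmdline []byte, want string) bool {
	argv0, _, _ := bytes.Cut(cmdline, []byte{0})
	if len(argv0) == 0 {
		return false // empty cmdline: nothing to compare
	}
	return filepath.Base(string(argv0)) == want
}

func main() {
	// Illustrative snapshot taken before bash execs into sleep.
	before := []byte("bash\x00-c\x00trap \"\" TERM; sleep 60\x00")
	// Illustrative snapshot after the tail-exec: same PID, argv0 is now sleep.
	after := []byte("sleep\x00" + "60\x00")

	fmt.Println(argv0Matches(before, "bash")) // true
	fmt.Println(argv0Matches(after, "bash"))  // false
}
```

The same PID yields different answers depending on when the snapshot is read, which is the race the test was losing.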
Append `; :` so `sleep` is no longer the last command and bash cannot tail-exec: the process stays resident as `bash`, keeping argv0 stable. This is the same guard `TestFindVMMByCmdline` already uses (`sleep 60 && :`). Production `TerminateProcess` callers pass real binaries (`cloud-hypervisor`) that never rename themselves, so only the test was affected.

Verification
- Original command → cmdline `sleep 60`; fixed command → cmdline `bash -c ...` (argv0 stays `bash`).
- `-race`, 50 iterations: all pass, no hang (181s total, uniform; a single flake hit would have added a ~60s outlier).
- `make lint` (linux+darwin, 0 issues), `golangci-lint fmt --diff` clean.
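
For readers who want to see the shape of the fixed helper command, here is a minimal sketch. `residentScript` is a hypothetical name introduced for illustration, not a function from this PR, and the sketch assumes the script does not already end with a command separator. It only builds the command and does not start it.

```go
package main

import (
	"fmt"
	"os/exec"
)

// residentScript appends the no-op builtin ":" so that the caller's last
// command is no longer the final command of the bash -c string.
// Hypothetical helper; assumes script has no trailing separator.
func residentScript(script string) string {
	return script + "; :"
}

func main() {
	flaky := `trap "" TERM; sleep 60`
	fixed := residentScript(flaky)

	// exec.Command only builds the command here; nothing is started.
	cmd := exec.Command("bash", "-c", fixed)
	fmt.Println(cmd.Args[2]) // trap "" TERM; sleep 60; :
}
```

With `:` as the final command, `sleep` is no longer in tail position, so the process is expected to keep reporting `bash` as argv0 for its whole lifetime.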