Add decommission command for repos transferred out of the org by dev-milos · Pull Request #58 · G-Research/github-terraformer

dev-milos · 2026-06-29T10:49:35Z

What

Adds a decommission subcommand to the importer CLI that produces the exact terraform state rm commands needed to stop managing a repository without destroying it, plus a runbook (docs/decommissioning.md).

Refs #15.

Why

When a repository is transferred out of the org but its repos/<repo>.yaml stays in GCSS, Terraform still tracks it in state. On the next run the provider can no longer read the repo (it belongs to another org), so plan/apply fails — and because the workspace state is shared, one orphaned repo blocks PRs for every repository.

Deleting the YAML on its own does not fix this: the repo is then in state but not in config, so Terraform plans to destroy it — either erroring against a repo the org no longer owns, or acting on a repo that now belongs to someone else. The correct action is to remove the resources from state (forget) rather than destroy them.

What it does

For each --repo, the command reads the YAML and derives every state address the repo owns:

module.repository["<repo>"] — the repo and everything nested in it
github_repository_ruleset.ruleset["<sha256("<repo>/<name>")>"] — one per ruleset
github_repository_custom_property.custom_property["<repo>/<name>"] — one per custom property

The ruleset key is a hash, so it can't be found by grepping the repo name in terraform state list — deriving it from the YAML is the point. It then emits a ready-to-run terraform state rm script (optionally --output <file> and --delete-yaml).
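The address derivation described above could look roughly like the following. This is a minimal sketch, not the PR's actual code: the function names `rulesetKey` and `stateAddresses`, their signatures, the output order, and the sample repo, ruleset, and property names are all assumptions made for illustration. It also assumes names contain no characters that Go's `%q` would escape differently from Terraform.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// rulesetKey mirrors Terraform's sha256("<repo>/<name>"), which yields the
// lowercase hex encoding of the SHA-256 digest.
func rulesetKey(repo, name string) string {
	sum := sha256.Sum256([]byte(repo + "/" + name))
	return hex.EncodeToString(sum[:])
}

// stateAddresses lists the state addresses one repository owns.
// Hypothetical helper: the real command reads these names from the repo YAML.
func stateAddresses(repo string, rulesets, customProps []string) []string {
	addrs := []string{fmt.Sprintf("module.repository[%q]", repo)}
	for _, name := range rulesets {
		addrs = append(addrs,
			fmt.Sprintf("github_repository_ruleset.ruleset[%q]", rulesetKey(repo, name)))
	}
	for _, name := range customProps {
		addrs = append(addrs,
			fmt.Sprintf("github_repository_custom_property.custom_property[%q]", repo+"/"+name))
	}
	return addrs
}

func main() {
	// Sample inputs are made up for the example.
	for _, a := range stateAddresses("example-repo", []string{"main-protection"}, []string{"team"}) {
		fmt.Println(a)
	}
}
```

The ruleset address contains only the 64-character hex digest, which is why a search for the repo name in `terraform state list` output cannot find it.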

Because the commands mutate state only and never Terraform configuration, they cannot break plans for other repositories.

Why state-only instead of a `removed` block

A removed { … destroy = false } block is the declarative equivalent, but it must live in the Terraform root (sourced from this repo) while the YAML lives in the config repo. A removed block for an address still declared in config (i.e. while the YAML exists) is a hard error for every plan, and there is no atomic way to land the two changes across two repos. A state-only operation sidesteps that.

Testing

Unit tests cover address derivation, including that the ruleset key matches Terraform's sha256() hex output, ordering, and YAML resolution.
Verified the mechanism end-to-end against a live org: applied a for_each module instance + repo, ran terraform state rm 'module.repository["<name>"]', and confirmed the resource left state while the GitHub repo still existed (no destroy).

Follow-up (separate, needs prod state access)

The original incident was cleaned up manually with terraform state list | grep "Pulp-manager", which would not have matched hash-keyed rulesets. Worth checking the live state for an orphaned github_repository_ruleset still pointing at Pulp-manager.

When a repository leaves GCSS management (e.g. transferred to another org), its YAML must be removed without Terraform destroying the now-foreign repo. Deleting the YAML alone makes Terraform plan a destroy, which fails against a repo the org no longer owns and blocks the shared workspace's plans for every repository. Add a `decommission` subcommand that derives every Terraform state address a repository owns from its YAML — the module instance, rulesets (keyed by sha256("<repo>/<name>"), which a `terraform state list | grep <repo>` cannot find) and custom properties — and emits the `terraform state rm` commands to forget them without destroying the underlying GitHub objects. State-only operations do not touch Terraform configuration, so they never break plans for other repositories. Includes unit tests and a runbook (docs/decommissioning.md). Refs #15

Remove --output (redundant with shell redirection) and --delete-yaml (the only file-mutating behavior). The command now only prints terraform state rm commands and makes no changes itself.

- Generated script now uses an idempotent rm_state helper (skips addresses already absent from state), so it is safe to re-run or resume after a partial run; set -euo pipefail still aborts on genuine errors.
- Runbook adds an explicit workspace-lock step covering the inconsistent window between state rm and YAML deletion, where a triggered run would otherwise plan to CREATE (and could recreate an empty repo in the org).
- Clarify that decommissioning is for repos that have left the org (transferred out / deleted), not for archiving a repo you still own.

The generated script's rm_state helper used `grep -qxF` (whole-line match) to test whether an address is in state. `terraform state list` never prints a bare `module.repository["<repo>"]` line — only the resource instances nested inside it — so the whole-line match never found the module address, the guard reported it "already absent", and `terraform state rm` was never run on it. The script silently no-opped on the repository and all its nested resources (the exact thing #15 is about). Switch the guard to `grep -qF -- "$1"` (fixed-string substring match), which matches both the nested module lines and the verbatim top-level ruleset/custom-property lines. No false positives: every emitted address ends in `"]`, so `module.repository["foo"]` cannot match `module.repository["foobar"]...`. Factor the match semantics into stateAddressPresent in Go and add a regression test exercising it (module-via-nested-lines, verbatim top-level, genuinely absent, and the foo/foobar false-positive guard), since the existing tests only covered address derivation.
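The match semantics described in this commit can be sketched as below. Only the name `stateAddressPresent` comes from the commit message. The signature, the sample state listing, and the nested resource name `github_repository.this` are assumptions for illustration and may differ from the real code and state.

```go
package main

import (
	"fmt"
	"strings"
)

// stateAddressPresent reports whether addr occurs as a fixed-string substring
// of any line of `terraform state list` output, mirroring `grep -qF -- "$1"`.
// The signature is an assumption; the PR only names the function.
func stateAddressPresent(stateList, addr string) bool {
	for _, line := range strings.Split(stateList, "\n") {
		if strings.Contains(line, addr) {
			return true
		}
	}
	return false
}

func main() {
	// Hypothetical state listing: the module appears only through a nested
	// resource line, while the custom property appears verbatim.
	state := strings.Join([]string{
		`module.repository["foobar"].github_repository.this`,
		`github_repository_custom_property.custom_property["foobar/team"]`,
	}, "\n")

	fmt.Println(stateAddressPresent(state, `module.repository["foobar"]`))
	fmt.Println(stateAddressPresent(state, `github_repository_custom_property.custom_property["foobar/team"]`))
	fmt.Println(stateAddressPresent(state, `module.repository["foo"]`))
}
```

A whole-line comparison would return false for the first lookup, which is the bug the commit fixes. The third lookup stays false because the closing `"]` in the address prevents `foo` from matching `foobar`.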

cobra's cmd.Print/Printf/Println default to OutOrStderr(), so the generated script went to stderr. `decommission ... > script.sh` produced an empty file and piping to a shell got nothing — the script was unusable for its primary purpose. Write it to cmd.OutOrStdout() explicitly, and add a test asserting the script lands on stdout (and nothing on stderr).

dev-milos marked this pull request as draft June 29, 2026 10:49

dev-milos added 5 commits June 29, 2026 13:06

Trim decommissioning runbook to operator steps only

1abb316

Drop optional flags; make decommission a pure command generator

bfbc3bf

dev-milos marked this pull request as ready for review June 29, 2026 13:31

dev-milos requested a review from pavlovic-ivan June 29, 2026 13:32

dev-milos wants to merge 6 commits into main from decommission-transferred-out-repos
