docs: final single-thread numba-vs-rust A/B — gate passed (Phase 5 W4)#259

Merged: d-laub merged 6 commits into rust-migration from phase-5-w4 on Jun 27, 2026
d-laub (Collaborator) commented Jun 27, 2026

Phase 5 W4 — final single-thread __getitem__ A/B (benchmark-only, no code)

The migration's final single-thread parity gate before the W5 consolidation (numba deletion + rayon). Gate: rust at parity-or-better single-thread → proceed to consolidation.

Methodology

The shared Carter node makes cross-session wall-clock comparisons unreliable, so rust and numba were measured single-thread, back to back in the same session, over two passes (results were stable within the session), with the ratio direction pinned (speedup = numba÷rust, higher ⇒ rust faster). The durable signal is byte-identical parity (already gated across W1–W3 + the full parity suite) plus same-session improve-or-hold. Two independent tools agreed: test_e2e.py pedantic-min and profile.py steady-state throughput.
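The measurement scheme described above can be sketched roughly as follows. This is an illustrative sketch only, not the harness in test_e2e.py or profile.py: the names `min_time`, `ab_speedup`, and `holds_or_improves`, the `rounds` and `floor` parameters, and the injectable `clock` are all assumptions made for the example. It keeps the ratio direction pinned as baseline÷candidate, so a value above 1 means the candidate (rust) is faster.

```python
import time
from typing import Callable, List


def min_time(fn: Callable[[], object], rounds: int = 5,
             clock: Callable[[], float] = time.perf_counter) -> float:
    """Return the fastest observed wall-clock time over `rounds` calls of `fn`."""
    best = float("inf")
    for _ in range(rounds):
        start = clock()
        fn()
        best = min(best, clock() - start)
    return best


def ab_speedup(baseline: Callable[[], object], candidate: Callable[[], object],
               passes: int = 2, rounds: int = 5,
               clock: Callable[[], float] = time.perf_counter) -> List[float]:
    """Measure both sides back to back in one session, once per pass.

    The direction is pinned: speedup = baseline / candidate,
    so higher means the candidate is faster.
    """
    ratios = []
    for _ in range(passes):
        t_base = min_time(baseline, rounds, clock)
        t_cand = min_time(candidate, rounds, clock)
        ratios.append(t_base / t_cand)
    return ratios


def holds_or_improves(ratios: List[float], floor: float = 1.0) -> bool:
    """Hypothetical gate: every pass must be at or above `floor` (parity-or-better)."""
    return all(r >= floor for r in ratios)
```

Taking the minimum over rounds discards runs inflated by noise from other users of a shared node, and comparing only within a session avoids attributing load drift between sessions to either backend.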

Result — GATE PASSED

Rust is parity-or-better on every mode:

| Mode | Speedup (numba÷rust) |
| --- | --- |
| haplotypes | ~1.65× |
| tracks-seqs | ~1.65× |
| annotated | ~1.4× |
| variants | ~1.4× |
| variant-windows | ~4.6× |
| tracks-only (pure) | ~1.05× (parity; fixed per-batch IO cost, not kernel-bound; rust never behind) |

Combined with byte-identical parity, there is no single-thread regression risk in removing numba. → Proceed to W5 (golden-snapshot the numba-oracle parity suites, delete numba, add rayon batch parallelism gated byte-identical to the serial golden result).
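One way to picture the "gated byte-identical to the serial golden result" check planned for W5 is the sketch below. It is an assumption-laden illustration, not the project's parity suite: `render_batch` is a hypothetical stand-in for producing one batch of output, `ThreadPoolExecutor` stands in for rayon, and the SHA-256 digest format is invented for the example.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor
from typing import List


def digest(batches: List[bytes]) -> str:
    """Hash a sequence of batch outputs into a single golden digest."""
    h = hashlib.sha256()
    for b in batches:
        # Length-prefix each batch so that batch boundaries affect the digest.
        h.update(len(b).to_bytes(8, "little"))
        h.update(b)
    return h.hexdigest()


def render_batch(i: int) -> bytes:
    """Hypothetical deterministic stand-in for one batch of __getitem__ output."""
    return bytes((i * 31 + j) % 256 for j in range(64))


def serial(n: int) -> List[bytes]:
    """Reference path: batches produced one after another."""
    return [render_batch(i) for i in range(n)]


def parallel(n: int, workers: int = 4) -> List[bytes]:
    """Parallel path: Executor.map returns results in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(render_batch, range(n)))


golden = digest(serial(16))               # snapshot taken from the serial path
assert digest(parallel(16)) == golden     # parallel path must match byte for byte
```

Comparing a digest of the raw bytes, rather than comparing values with a tolerance, means any reordering or off-by-one in the parallel path fails the gate outright.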

Full tables + methodology: docs/roadmaps/phase-5-w4-final-ab.md.

Measured on phase-5-w4 (rust-migration + W3 fusion, #258); W2 (#257) is test-only and perf-neutral. numba remains present until W5. Phase 5 🚧 (W1–W4 done; W5–W9 remain).

🤖 Generated with Claude Code

d-laub and others added 6 commits June 26, 2026 16:29
…sized plan)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…FFI crossing (Phase 5 W3)

Add reconstruct_annotated_haplotypes_spliced_fused — the annotated counterpart of
reconstruct_haplotypes_spliced_fused. Folds RC in-kernel (bytes RC'd, annotation rows
reversed) so the Python _FlatAnnotatedHaps.reverse_masked post-pass is dropped on the
rust backend. Byte-identical to the composed numba oracle (new parity backstop).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…combos now single-FFI (Phase 5 W3)

Also applies ruff formatting to _haps.py (post-Task-1 residual).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… passed

Rust parity-or-better single-thread on every __getitem__ mode (same-session,
two tools, two passes): haps/tracks-seqs ~1.65x, annotated/variants ~1.4x,
variant-windows ~4.6x, pure tracks-only ~1.05x (fixed-cost-bound, parity).
Combined with byte-identical parity (W1-W3 + full suite), no regression risk
in removing numba. Gate passed -> proceed to W5 consolidation.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
d-laub merged commit efb87ea into rust-migration on Jun 27, 2026
1 check passed