feat: fuse annotated+spliced haplotype reconstruction (Phase 5 W3)#258

Merged
d-laub merged 4 commits into rust-migration from phase-5-w3
Jun 27, 2026

@d-laub d-laub commented Jun 27, 2026

Collaborator

Phase 5 W3 — fuse the deferred annotated+spliced reconstruction path

Closes the last un-fused FFI seam in haplotype reconstruction. Three of the four annotated×spliced combinations were already fused kernels with a single FFI crossing; the fourth, annotated AND spliced, was deferred to Phase 5. Until now, on the rust backend it ran the un-fused dispatched core plus a Python reverse-complement (RC) post-pass (_FlatAnnotatedHaps.reverse_masked).

What changed

  • New Rust kernel reconstruct_annotated_haplotypes_spliced_fused (src/ffi/mod.rs) — a faithful merge of the two existing parity-proven kernels: the spliced scaffolding (reconstruct_haplotypes_spliced_fused: precomputed out_offsets, permuted ploidy-1 inputs, no get_diffs_sparse) + the annotation buffers and RC triple (reconstruct_annotated_haplotypes_fused). Registered in src/lib.rs, imported in _haps.py.
  • Python wiring (_dataset/_haps.py): the splice branch of _reconstruct_annotated_haplotypes now calls the fused kernel on the rust backend (RC folded in-kernel) and drops the Python reverse_masked post-pass; the numba branch is retained unchanged as the oracle.

RC invariant (the parity-critical detail)

For the spliced path RC is per permuted element. Rust folds RC in-kernel: rc_flat_rows_inplace on the sequence bytes (reverse+complement) and reverse_flat_rows_inplace on both annotation arrays (reverse only, no complement) — byte-identical to _FlatAnnotatedHaps.reverse_masked(mask, _COMP). Numba RCs externally in _query.py::_getitem_spliced (numba-only guard); _query.py is untouched. No double-RC, no missed-RC.
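The invariant above can be illustrated with a minimal pure-Python sketch. It is not the project's code: the two functions only borrow the names of the Rust helpers, the flat-buffer-plus-offsets layout is assumed from the description, and the complement table `_COMP` here is an assumed stand-in for the real one.

```python
# Hypothetical Python stand-ins for the in-kernel helpers named in this PR.
# Assumption: rows are stored in one flat buffer, delimited by `offsets`.
_COMP = bytes.maketrans(b"ACGTacgt", b"TGCAtgca")  # assumed complement table


def reverse_flat_rows_inplace(data, offsets, mask):
    """Reverse each masked row of a flat ragged buffer, in place (no complement)."""
    for row, flagged in enumerate(mask):
        if flagged:
            start, end = offsets[row], offsets[row + 1]
            data[start:end] = data[start:end][::-1]


def rc_flat_rows_inplace(seq, offsets, mask):
    """Reverse-complement each masked row of a flat byte buffer, in place."""
    for row, flagged in enumerate(mask):
        if flagged:
            start, end = offsets[row], offsets[row + 1]
            seq[start:end] = seq[start:end].translate(_COMP)[::-1]


# Two rows of length 4; only the second row is on the negative strand.
offsets = [0, 4, 8]
rc_mask = [False, True]
haps = bytearray(b"ACGTTTGA")
var_idxs = [-1, 2, -1, -1, 5, -1, -1, 6]
ref_coords = [100, 101, 102, 103, 200, 201, 202, 203]

rc_flat_rows_inplace(haps, offsets, rc_mask)            # bytes: reverse + complement
reverse_flat_rows_inplace(var_idxs, offsets, rc_mask)   # annotations: reverse only
reverse_flat_rows_inplace(ref_coords, offsets, rc_mask)
```

In this sketch the unmasked first row is left untouched, while the second row's bases are reverse-complemented and its two annotation rows are reversed in step, so each annotation still lines up with its base.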

Parity gate

New tests/parity/test_annotated_spliced_haplotypes_parity.py: spy proves the fused entry fires on rust only; all three arrays (haps, var_idxs, ref_coords) + offsets compared byte-identically to the composed numba oracle; a negative-strand transcript with an rc_neg True-vs-False difference check exercises the in-kernel RC path. Full parity suite green on both backends (78p/1s each); dataset+unit 632p; ruff/format/typecheck/clippy clean; 17 cargo reconstruct tests pass.
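The shape of such a parity check might look like the sketch below. Everything in it is a hypothetical stand-in (`oracle_path`, `fused_path`, `reconstruct`, `assert_byte_identical`, and the toy data); it only illustrates the three ideas in the paragraph above: a spy on the fused entry, a byte-for-byte comparison against the oracle, and an RC on-versus-off difference check.

```python
from array import array
from unittest import mock

_COMP = bytes.maketrans(b"ACGT", b"TGCA")  # assumed complement table


def oracle_path(seq, rc):
    """Stand-in for the composed oracle: build the output, then RC as a post-pass."""
    out = bytes(seq)
    if rc:
        out = out.translate(_COMP)[::-1]
    return {"haps": out, "offsets": array("q", [0, len(out)])}


def fused_path(seq, rc):
    """Stand-in for the fused kernel: RC is folded into the single pass."""
    pairs = {65: 84, 67: 71, 71: 67, 84: 65}  # A<->T, C<->G as byte values
    out = bytes(pairs[b] for b in reversed(seq)) if rc else bytes(seq)
    return {"haps": out, "offsets": array("q", [0, len(out)])}


def reconstruct(backend, seq, rc, fused=fused_path):
    """Hypothetical dispatcher: only the rust backend takes the fused entry."""
    return fused(seq, rc) if backend == "rust" else oracle_path(seq, rc)


def assert_byte_identical(got, want):
    """Compare named buffers by element format and by raw bytes."""
    assert got.keys() == want.keys()
    for name in want:
        g, w = memoryview(got[name]), memoryview(want[name])
        assert g.format == w.format, f"{name}: element format differs"
        assert g.tobytes() == w.tobytes(), f"{name}: bytes differ"


spy = mock.Mock(wraps=fused_path)  # records calls, still runs the wrapped function
rust_out = reconstruct("rust", b"TTGA", rc=True, fused=spy)
numba_out = reconstruct("numba", b"TTGA", rc=True, fused=spy)

assert spy.call_count == 1                  # fused entry fired on rust only
assert_byte_identical(rust_out, numba_out)  # parity with the oracle
# RC on vs off must differ, otherwise the RC path was never exercised.
assert rust_out["haps"] != reconstruct("rust", b"TTGA", rc=False)["haps"]
```

Comparing the element format as well as the raw bytes guards against two buffers that happen to hold the same bytes under different dtypes.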

numba remains the oracle (deletion is W5/W6). Phase 5 stays 🚧 (W4–W9 remain).

Plan: docs/superpowers/plans/2026-06-26-rust-migration-phase-5-w3.md

🤖 Generated with Claude Code

d-laub and others added 4 commits June 26, 2026 16:29
…sized plan)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…FFI crossing (Phase 5 W3)

Add reconstruct_annotated_haplotypes_spliced_fused — the annotated counterpart of
reconstruct_haplotypes_spliced_fused. Folds RC in-kernel (bytes RC'd, annotation rows
reversed) so the Python _FlatAnnotatedHaps.reverse_masked post-pass is dropped on the
rust backend. Byte-identical to the composed numba oracle (new parity backstop).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…combos now single-FFI (Phase 5 W3)

Also applies ruff formatting to _haps.py (post-Task-1 residual).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@d-laub d-laub merged commit 3172337 into rust-migration Jun 27, 2026
1 check passed