Rust migration Phase 0: dispatch registry, ffi seam, parity harness + intervals_to_tracks #241

Merged
d-laub merged 21 commits into main from feat/rust-migration-phase-0 on Jun 24, 2026
@d-laub d-laub commented Jun 24, 2026

Collaborator

Rust Migration — Phase 0: Foundation & differential-test harness

Stands up the reusable migration machinery and proves it end-to-end by migrating one real numba kernel to Rust.

What landed

  • Backend-dispatch registry (python/genvarloader/_dispatch.py) — maps each migratable kernel to {numba, rust, default}; GVL_BACKEND env override for CI parity sweeps; per-kernel default. A temporary strangler-window scaffold (to be deleted in a later phase).
  • src/ffi/ PyO3 seam — the single place new kernels touch Python. Core Rust logic lives in lazily-grown domain modules (src/intervals.rs first; no empty skeletons).
  • Proof-point kernel: intervals_to_tracks — the base-pair track-painting kernel on the live Dataset.__getitem__ read path, ported numba→Rust byte-identically (integer offset/slice math; float32 values copied, never reduced), routed through dispatch with default="rust". The numba impl is retained as the parity reference.
  • Both-layer differential-test harness (tests/parity/) — run-both-assert-byte-identical (_harness.py, return-value + in-place variants), a hypothesis per-kernel gate (100 contract-valid examples), and a meaningful dataset-level read-path backstop that spies the kernel to prove it is actually invoked + output is non-trivial (not a vacuous pass).
  • Build/test wiring — pixi cargo-test + memray-write tasks; abi3 wheel confirmed (…cp310-abi3…).
  • Baseline drivers: profile_write.py + baseline_getitem.sh. gvl.write baseline captured (1.143 s / 3.59 GB on the 1kg slice).
  • Roadmap updated (docs/roadmaps/rust-migration.md).
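
A minimal sketch of how a registry like the one described in the first bullet might resolve a backend. The real API of python/genvarloader/_dispatch.py is not shown in this PR, so `KernelEntry`, `register`, and `get` are hypothetical names, and the two stand-in implementations exist only so the example runs on its own. The only behaviour taken from the description is the {numba, rust, default} mapping and the GVL_BACKEND override.

```python
import os
from dataclasses import dataclass
from typing import Callable, Dict

VALID_BACKENDS = ("numba", "rust")


@dataclass(frozen=True)
class KernelEntry:
    """One migratable kernel: both implementations plus its default backend."""
    numba: Callable
    rust: Callable
    default: str = "numba"


# Hypothetical module-level table: kernel name -> implementations.
_REGISTRY: Dict[str, KernelEntry] = {}


def register(name: str, *, numba: Callable, rust: Callable, default: str = "numba") -> None:
    """Add a kernel to the registry, validating its per-kernel default."""
    if default not in VALID_BACKENDS:
        raise ValueError(f"unknown default backend {default!r}")
    _REGISTRY[name] = KernelEntry(numba=numba, rust=rust, default=default)


def get(name: str) -> Callable:
    """Resolve a kernel; GVL_BACKEND, when set, overrides the per-kernel default."""
    entry = _REGISTRY[name]
    backend = os.environ.get("GVL_BACKEND") or entry.default
    if backend not in VALID_BACKENDS:
        raise ValueError(f"GVL_BACKEND must be one of {VALID_BACKENDS}, got {backend!r}")
    return getattr(entry, backend)


# Stand-in implementations (hypothetical) so the sketch is self-contained.
def _numba_impl(x):
    return ("numba", x)


def _rust_impl(x):
    return ("rust", x)


register("intervals_to_tracks", numba=_numba_impl, rust=_rust_impl, default="rust")
```

Under these assumptions a CI parity sweep would run the same test suite twice, once with `GVL_BACKEND=numba` and once with `GVL_BACKEND=rust`, while ordinary users get the per-kernel default.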

Proof-point pivot (important)

The plan originally migrated splits_sum_le_value, but that kernel turned out to be dead on the default gvl.write path (_write_track routes BigWigs/Table to their own writers, bypassing the only caller). That migration was cleanly reverted and the proof-point re-pointed to intervals_to_tracks, which is verified live on the read path. The dead-path lesson is baked into the backstop, which now asserts that the kernel is actually called.

Tests

  • cargo test green (incl. 5 new intervals_to_tracks unit tests); tests/parity + tests/unit + tests/dataset = 581 passed; per-kernel parity = 100 hypothesis examples; read-path backstop spy-verified; typecheck 0 errors; abi3 wheel builds.
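
For readers unfamiliar with the run-both-assert-byte-identical pattern behind the parity counts above, here is a minimal stdlib-only sketch of the return-value and in-place variants. The contents of _harness.py are not shown in this PR, so `assert_parity`, `assert_parity_inplace`, and the two toy `fill_*` backends are hypothetical. In the real harness the inputs would come from hypothesis strategies rather than fixed values.

```python
import copy
from array import array


def _fingerprint(buf):
    """Element format, shape, and raw bytes of any buffer-protocol object."""
    view = memoryview(buf)
    return view.format, view.shape, view.tobytes()


def assert_parity(reference, candidate, *args):
    """Return-value variant: run both on independent copies, compare raw bytes."""
    ref_out = reference(*copy.deepcopy(args))
    cand_out = candidate(*copy.deepcopy(args))
    assert _fingerprint(ref_out) == _fingerprint(cand_out), "return values differ"


def assert_parity_inplace(reference, candidate, out, *args):
    """In-place variant: each backend writes into its own copy of `out`."""
    ref_buf, cand_buf = copy.deepcopy(out), copy.deepcopy(out)
    reference(ref_buf, *copy.deepcopy(args))
    candidate(cand_buf, *copy.deepcopy(args))
    assert _fingerprint(ref_buf) == _fingerprint(cand_buf), "output buffers differ"


# Two toy "backends" (hypothetical) that fill a float32 buffer differently.
def fill_loop(out, value):
    for i in range(len(out)):
        out[i] = value


def fill_slice(out, value):
    out[:] = array("f", [value]) * len(out)
```

Comparing raw bytes rather than using `==` on the values is stricter on purpose: it distinguishes `0.0` from `-0.0` and catches dtype or shape drift that an element-wise equality check could miss.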

Status / follow-up

  • Roadmap marker stays 🚧 until the getitem/tracks baseline lands — it needs the /carter (GVL_BENCH_SOURCE) corpus + macOS sudo (py-spy), handed off: build_realistic.py, memray-tracks + baseline_getitem.sh.
  • This branch also carries the Phase 0 spec + plan doc commits (not yet on origin/main).

🤖 Generated with Claude Code

d-laub and others added 21 commits June 23, 2026 16:39
…ness

Backend dispatch registry, ffi/ seam, both-layer parity harness,
all-four baselines, splits_sum_le_value proof-point (Python-entry
kernel; padded_slice rejected as njit-internal leaf).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

8 tasks: dispatch registry, ffi/ seam + Rust splits_sum_le_value,
call-site routing, both-layer parity harness, build/wheel wiring,
baselines, roadmap update.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Exercises splits_sum_le_value dispatch via _write_track_legacy with a
4-sample synthetic BigWigs track (60 regions, MAX_MEM=50 000 → 6 chunks).
Both numba and rust backends produce byte-identical intervals.npy and
offsets.npy.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Port numba intervals_to_tracks kernel to pure-ndarray Rust core
(src/intervals.rs) with byte-identical contract: zero entire out buffer,
end-clamp, break-on-start>=length, copy-not-reduce, sequential queries.
Add PyO3 ffi seam (src/ffi/mod.rs) that writes into the caller's numpy
buffer. Rename existing bigwig `intervals` pyfunction to `bigwig_intervals`
to avoid namespace collision with the new `pub mod intervals`.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
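
The contract listed in the commit message above (zero the whole buffer, clamp ends, break once a start falls past the window, copy values without reducing) can be read as the following single-query sketch. It is written in plain Python for illustration only; the real kernel lives in src/intervals.rs and its signature is not shown here, so `paint_track` and its parameters are hypothetical, and clamping negative starts to 0 is an assumption of this sketch. The real kernel handles multiple queries sequentially, which would amount to calling this once per query.

```python
from array import array


def paint_track(out, starts, ends, values, query_start):
    """Paint per-base values for one query window into `out` (float32 array).

    `starts`/`ends` are absolute coordinates sorted by start; `out` covers
    [query_start, query_start + len(out)).
    """
    length = len(out)
    # Contract: zero the entire output buffer before painting.
    for i in range(length):
        out[i] = 0.0
    for start, end, value in zip(starts, ends, values):
        rel_start = start - query_start
        rel_end = end - query_start
        # Contract: intervals are start-sorted, so stop at the first one
        # that begins at or beyond the end of the window.
        if rel_start >= length:
            break
        # Contract: clamp the end to the window. Clamping the start at 0
        # is an assumption made for this sketch.
        rel_start = max(rel_start, 0)
        rel_end = min(rel_end, length)
        # Contract: copy the value; later intervals overwrite, nothing is summed.
        for i in range(rel_start, rel_end):
            out[i] = value
    return out
```

Because values are only copied and all index math is on integers, there is no floating-point reduction whose ordering could differ between backends, which is what makes a byte-identical port a reasonable target.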
Add two convenience tasks to [tasks]:
- cargo-test: shortcut for `cargo test --release`
- memray-write: memray profile for the write path (profile_write.py created in Task 7)

Wheel genvarloader-0.35.0-cp310-abi3-macosx_11_0_arm64.whl builds cleanly
with the new pure-Rust intervals/ and ffi/ modules. The release wheel matrix
(.github/workflows/publish.yaml, py310-313 x linux-64/osx-arm64) is
release-gated and unaffected by these pure-Rust additions.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Capture the pending Phase 0 read-path baseline on the Carter HPC and close
out the foundation phase.

- profile.py now prints wall-clock + throughput (BURN_IN excluded). The
  baseline_getitem.sh comment referenced a number the script never emitted.
- Baseline (chr22_geuv, 165 regions × 5 samples, NUMBA_NUM_THREADS=1,
  AMD EPYC 7543 linux-64): tracks (intervals_to_tracks, default rust)
  169.9 batch/s / 3.531 GB peak; haplotypes 123.9 batch/s. variants mode
  blocked by a separate pre-existing bug (_FlatVariants.to_fixed).
- build_realistic.py: drop symbolic/breakend/multi-allelic variants at the
  plink2 stage (drop_unsupported_variants). The full 1kGP chr22 set carries
  symbolic SVs the prior corpus build never hit; filtering via a genoray
  reader filter trips a filtered-PGEN coordinate-space bug in gvl.write
  (var_idxs unfiltered-space vs filtered _index) — filed as d-laub/genoray#69.
  Pre-filtering keeps both spaces aligned and rebuilds the stale 0.25.0
  (truncated-tracks) corpus as clean 0.35.0.
- Roadmap: fill baseline table, flip Phase 0 🚧→✅, note update() deferred to
  Phase 4 (driver only runs a synthetic smoke annot).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

… for variants)

Variants are ragged by definition (allele lengths vary), so the variants
profiling mode requesting with_len(SEQLEN) was the error, not gvl: gvl's
fixed-length path intends to pass variant output through untouched. Query
variants variable-length instead.

Captured: variants 145.3 batch/s (6.884 ms/batch). Roadmap note corrected —
the earlier "blocked by a bug" was wrong. (A real but minor gvl gap remains:
a mixed variants+fixed-length-tracks query AttributeErrors because the
_query.py exemption checks RaggedVariants while the value is still
_FlatVariants — one-line guard, not a Phase 0 gate.)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@d-laub d-laub merged commit 1c2fd08 into main Jun 24, 2026
8 checks passed
@d-laub d-laub deleted the feat/rust-migration-phase-0 branch June 24, 2026 07:04