Skip to content

fix(intervals): clip sub-query interval starts in both kernels (#242)#244

Merged
d-laub merged 6 commits into
mainfrom
fix/issue-242-intervals-track-jitter-clip
Jun 25, 2026
Merged

fix(intervals): clip sub-query interval starts in both kernels (#242)#244
d-laub merged 6 commits into
mainfrom
fix/issue-242-intervals-track-jitter-clip

Conversation

@d-laub

@d-laub d-laub commented Jun 25, 2026

Copy link
Copy Markdown
Collaborator

Summary

Fixes #242. intervals_to_tracks (the base-pair track-painting kernel, with numba and rust backends) produced silently-wrong output (numba) or panicked (rust) on datasets with tracks + max_jitter>0.

Root cause: intervals are clipped once at write to a fixed jitter-expanded window, but the read window slides per-read, so an interval's start can fall before the per-read query_start. Both kernels assumed itv.start >= query_start — numba wrapped the resulting negative index (silent wrong output); rust hit a debug_assert/bounds panic.

Fix: clip each interval to the query window in both backends — s = max(start, 0), e = min(end, length), paint only when e > s — preserving the sorted-by-start break. This is the complete, jitter-preserving fix; the right edge was already correct. No write-path, read-path coordinate, or public-API changes.

Changes

  • fix(intervals) — clip sub-query starts in both numba (_dataset/_intervals.py) and rust (src/intervals.rs) kernels; cross-backend oracle tests + rust unit tests
  • test(parity) — widen the hypothesis parity strategy to generate first-interval starts below query_start, exercising the previously-impossible regime; numba↔rust parity holds
  • docs(roadmap) — record the contract fix in the rust migration roadmap
  • docs(intervals) — sync the rust module doc-comment with the new left-clamp

Verification

  • Full tree green: 847 passed / 21 skipped / 4 xfailed (pytest) + 20 rust tests (cargo)
  • The four issue-named end-to-end tests now pass (test_haplotypes_plus_tracks_exact, test_reference_plus_tracks_exact, test_end_to_end_set_insertion_fill, test_dummy_dataset_with_default_insertion_fill_does_not_crash)
  • Both backends byte-identical across all input regimes (start<0, end<0, end>length, start>=length, empty slice)
  • ruff format/check + pyrefly clean

🤖 Generated with Claude Code

d-laub and others added 6 commits June 24, 2026 18:55
Kernel-clip both backends (numba + rust) so intervals starting before
the per-read jittered query window paint correctly. Rejects the two
upstream fixes in the issue (write-clip breaks left-jitter; read-side
expanded-query is a redesign). Adds oracle test + widens parity strategy
to cover negative interval starts.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Tracks + max_jitter>0 store intervals against a jitter-expanded window,
but the read path queries the original chromStart, so left-edge intervals
have start < query_start. numba wrapped the negative index (silent wrong
output); rust hit debug_assert / bounds panic. Both kernels now clip to
the query window (s=max(start,0), e=min(end,length)), which is correct and
jitter-preserving. Adds cross-backend oracle tests + rust unit tests.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…242)

Widen the hypothesis strategy so the first interval may start before the
query (negative relative start), exercising the case that previously
violated the kernel contract. numba↔rust parity holds.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The module doc-comment described end-clamping but not the new left-clamp
(s = max(start, 0)) added for sub-query interval starts. Sync it with the
inline comment so the documented semantics match the code.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@d-laub d-laub merged commit b4b7c91 into main Jun 25, 2026
8 checks passed
@d-laub d-laub deleted the fix/issue-242-intervals-track-jitter-clip branch June 25, 2026 03:06
d-laub added a commit that referenced this pull request Jun 25, 2026
…pro 0.20

Design for: merge origin/main (#242/#244 clip fix + #243 splice-subset fix)
into the branch, lift the now-obsolete #242 xfails, port Reference.fetch to
rust, fuse the annotated/splice haps paths, bump seqpro 0.18->0.20 with
to_numpy(validate=False) adoption, and reconcile the roadmap honestly.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Tracks + max_jitter>0: silent wrong track output (numba) / panic (rust) from interval/query coordinate mismatch

1 participant