fix(intervals): clip sub-query interval starts in both kernels (#242)#244
Merged
Conversation
Kernel-clip both backends (numba + rust) so intervals starting before the per-read jittered query window paint correctly. Rejects the two upstream fixes in the issue (write-clip breaks left-jitter; read-side expanded-query is a redesign). Adds oracle test + widens parity strategy to cover negative interval starts. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Tracks + max_jitter>0 store intervals against a jitter-expanded window, but the read path queries the original chromStart, so left-edge intervals have start < query_start. numba wrapped the negative index (silent wrong output); rust hit debug_assert / bounds panic. Both kernels now clip to the query window (s=max(start,0), e=min(end,length)), which is correct and jitter-preserving. Adds cross-backend oracle tests + rust unit tests. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…242) Widen the hypothesis strategy so the first interval may start before the query (negative relative start), exercising the case that previously violated the kernel contract. numba↔rust parity holds. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The module doc-comment described end-clamping but not the new left-clamp (s = max(start, 0)) added for sub-query interval starts. Sync it with the inline comment so the documented semantics match the code. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
d-laub
added a commit
that referenced
this pull request
Jun 25, 2026
…pro 0.20 Design for: merge origin/main (#242/#244 clip fix + #243 splice-subset fix) into the branch, lift the now-obsolete #242 xfails, port Reference.fetch to rust, fuse the annotated/splice haps paths, bump seqpro 0.18->0.20 with to_numpy(validate=False) adoption, and reconcile the roadmap honestly. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This was referenced Jun 25, 2026
Merged
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes #242.
intervals_to_tracks(the base-pair track-painting kernel, with numba and rust backends) produced silently-wrong output (numba) or panicked (rust) on datasets with tracks +max_jitter>0.Root cause: intervals are clipped once at write to a fixed jitter-expanded window, but the read window slides per-read, so an interval's start can fall before the per-read
query_start. Both kernels assumeditv.start >= query_start— numba wrapped the resulting negative index (silent wrong output); rust hit adebug_assert/bounds panic.Fix: clip each interval to the query window in both backends —
s = max(start, 0),e = min(end, length), paint only whene > s— preserving the sorted-by-startbreak. This is the complete, jitter-preserving fix; the right edge was already correct. No write-path, read-path coordinate, or public-API changes.Changes
fix(intervals)— clip sub-query starts in both numba (_dataset/_intervals.py) and rust (src/intervals.rs) kernels; cross-backend oracle tests + rust unit teststest(parity)— widen the hypothesis parity strategy to generate first-interval starts belowquery_start, exercising the previously-impossible regime; numba↔rust parity holdsdocs(roadmap)— record the contract fix in the rust migration roadmapdocs(intervals)— sync the rust module doc-comment with the new left-clampVerification
test_haplotypes_plus_tracks_exact,test_reference_plus_tracks_exact,test_end_to_end_set_insertion_fill,test_dummy_dataset_with_default_insertion_fill_does_not_crash)🤖 Generated with Claude Code