Release/0.7.8 by maltsev-dev · Pull Request #37 · nullrunio/nullrun-sdk-python

maltsev-dev · 2026-06-28T07:39:03Z

What

Why

How

Test plan

Unit tests pass (per-repo, e.g. cd backend && cargo test, cd frontend && npm test)
Lint passes (per-repo, e.g. cd frontend && npm run lint)
Type-check passes (per-repo, e.g. cd frontend && npm run type-check)
Manually verified in dev / staging

Risk

Checklist

I have read the repo's CONTRIBUTING.md (if present)
My change does not introduce new lint warnings
I have updated the CHANGELOG (if user-visible)
I have considered backwards compatibility

Additive patch on top of the 0.7.0 thin-client refactor. No breaking changes.

Added
-----
* nullrun.integrations.fastapi: one-line FastAPI integration that turns every NullRunDecision / NullRunInfrastructureError thrown by @nullrun.protect endpoints into a clean JSON response with the right HTTP status code. No per-endpoint except blocks required.

  Response shape:

      {"error_code": "NR-B004", "user_message": "You've reached the usage limit...", "category": "decision"}

  HTTP status mapping:
  * NR-B004 (budget), NR-L001 (loop), NR-R001 (rate) -> 429 with optional Retry-After
  * NR-T001 (tool blocked), NR-X001 (generic block) -> 403
  * NR-W003 (paused) -> 503 with Retry-After
  * NR-W002 (killed) -> 503; WorkflowKilledInterrupt is a BaseException subclass so Starlette's add_exception_handler refuses it. It is handled via ASGI middleware instead (hybrid pattern, documented in module docstring).
  * NullRunInfrastructureError subclasses -> 503 (our side, not user's).

* nullrun.messages: default user-facing message catalog. Every NR-* error code has an English default message owned by NULLRUN, not customer code. Customer Support Bots hitting a budget cap show the same wording across every NullRun-backed application.
  * format_user_message(exc): render exception as user-facing string
  * set_user_message(code, text): per-process override for branded variants
  * get_user_message(code): raw lookup
  * reset_overrides(): clear all overrides (for tests)

Changed
-------
* Transport._send_batch canonical JSON serialization: route the /track/batch body through _signed_request_body for consistent compact-separator serialization. HMAC itself is unaffected, but consistent serialization removes a special-case from the wire-format contract tests.
* Transport._send_batch actions response handling: backend renamed BatchTrackResponse.actions_taken (debug names) -> BatchTrackResponse.actions (ActionTaken structs). Read both for forward-compat; per-element try/except so one malformed entry doesn't abort the whole loop.
* pyproject.toml metadata: long-form description with search keywords, Maintainer: populated via maintainers=[...], expanded classifiers (Linux / Windows / macOS, Python 3.13, CPython, Security / AI / WWW/HTTP topics), project URL expander.

Tests
-----
* tests/test_messages.py (new, 282 lines): catalog completeness (every NR-* code has a default message), override / reset behavior, render path.
* tests/test_integrations_fastapi.py (new, 289 lines): HTTP status mapping per error code, response shape, ASGI middleware path for WorkflowKilledInterrupt, hybrid composition.
* tests/test_decision_split.py (new, 199 lines): pins the decision / infrastructure error split.
* Updates to tests/test_runtime.py, tests/test_extractors.py reflecting transport canonical-JSON + actions-renamed changes.

Release plumbing
----------------
* pyproject.toml: version bumped 0.7.0 -> 0.7.6
* src/nullrun/__version__.py: __version__ = "0.7.6"
* CHANGELOG.md: full 0.7.6 entry covering additions, transport changes, metadata improvements

Tests pass locally (per session log): pytest on Windows / Python 3.14.2 is green.
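The status mapping described in this commit can be sketched as a plain lookup table. This is an illustrative, framework-free sketch and not the SDK's code: `build_error_response`, the `"infrastructure"` category value, the 403 fallback for unlisted codes, and the rule for when `Retry-After` is attached are all assumptions; only the code-to-status pairs and the response keys come from the commit message.

```python
from typing import Dict, Optional, Tuple

# Code -> HTTP status, copied from the mapping listed in the commit message.
STATUS_BY_CODE: Dict[str, int] = {
    "NR-B004": 429,  # budget
    "NR-L001": 429,  # loop
    "NR-R001": 429,  # rate
    "NR-T001": 403,  # tool blocked
    "NR-X001": 403,  # generic block
    "NR-W003": 503,  # paused
    "NR-W002": 503,  # killed
}


def build_error_response(
    error_code: str,
    user_message: str,
    category: str = "decision",
    retry_after: Optional[float] = None,
) -> Tuple[int, Dict[str, str], Dict[str, str]]:
    """Return (status, headers, body). Hypothetical helper for illustration."""
    if category == "infrastructure":  # assumed category name
        status = 503  # infrastructure errors are "our side, not user's"
    else:
        # Assumption: unlisted decision codes fall back to a generic 403.
        status = STATUS_BY_CODE.get(error_code, 403)

    headers: Dict[str, str] = {}
    # Assumption: Retry-After is only sent on 429/503 with a positive delay.
    if retry_after is not None and retry_after > 0 and status in (429, 503):
        headers["Retry-After"] = str(int(retry_after))

    body = {
        "error_code": error_code,
        "user_message": user_message,
        "category": category,
    }
    return status, headers, body
```

In a real FastAPI app the tuple would be turned into a JSON response inside an exception handler; the killed-workflow case needs the ASGI middleware route the commit describes, because handlers registered for `BaseException` subclasses are refused.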

…padding

PR #35 (release/0.7.6) failed all four CI jobs (test 3.10/3.11/3.12, coverage, codecov/patch) on the same root cause + one latent bug masked by it. This commit lands the fixes plus the last-mile tests that bring coverage above the 82% threshold.

CI failure root
---------------
* tests/test_integrations_fastapi.py does from fastapi import ... at module top-level. CI installs only pip install -e '.[dev]', and fastapi was declared as an *optional* [fastapi] extra, NOT in [dev]. Pytest collection aborted with ModuleNotFoundError: No module named 'fastapi' → all 4 jobs red.
* Fix: add fastapi>=0.100,<1.0 to [dev]. Same precedent as langchain-core (already in [dev] for the same import-time contract: nullrun.instrumentation.langgraph is eager-imported from nullrun.decorators at collection time, so the test extras must cover the import chain).

Latent bug surfaced by the first fix
------------------------------------
The same PR refactored Transport._send_batch_with_retry_info to route the /track/batch body through _signed_request_body for canonical-JSON serialization (matching /gate and /execute). The two sibling call sites use the module-level helper _signed_request_body (no self.); this one used self._signed_request_body by typo. Result: AttributeError on every batch flush, breaking 15 existing tests across test_transport.py / test_track_batch_retry.py / test_integration_contract.py / test_signal_safety.py. As long as the fastapi collection error aborted pytest, this was hidden.

Fixed to _signed_request_body(...) with a docstring noting why it is module-level and what the bug looked like.

Coverage padding (codecov/patch was failing on this too)
--------------------------------------------------------
Total coverage on the failing CI run was 81.98%, 0.02pp under the fail-under=82 gate. After the two fixes above it would have recovered to ~82.0% on the dot, so I added minimal tests for the cheapest-to-cover gaps:

* tests/test_breaker_main.py (new): covers the 5 statements in nullrun.breaker.__main__.main() (0% → 100%). The module exists so python -m nullrun.breaker exits cleanly instead of failing with No module named nullrun.breaker.__main__; the previous fix-mechanism was return 0 after a print, but no test was exercising it.
* tests/test_status.py: extends TestSummary with seven scenarios covering each conditional branch of NullRunStatus.summary() (organization_id, workflow_id, workflow_state != Normal, backend_reachable=False, ws_connected=False, recent_errors). status.py jumps 84.52% → 98.81%.
* tests/test_integrations_fastapi.py: four tests on _build_headers covering non-numeric, zero, negative, and resume_after (the WorkflowPausedException code path). integrations/fastapi.py jumps 90.22% → 94.57%.

After all three: TOTAL 81.98% → 82.46%, comfortably above the gate.

Verification
------------
* Local pytest: 997 passed, 13 skipped, 0 failed (Windows / Python 3.14.2, 8m47s, same env the original commit was validated in).
* python -m coverage report: 82.46%, no fail-under complaint.
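The latent bug is easier to see in miniature. The sketch below assumes a module-level helper that serializes with compact separators and signs the resulting bytes; the names are simplified stand-ins, and `sort_keys=True` and HMAC-SHA256 are assumptions made for the illustration, not details confirmed by the commit.

```python
import hashlib
import hmac
import json
from typing import List, Tuple


def signed_request_body(payload: dict, secret: bytes) -> Tuple[bytes, str]:
    """Module-level helper: serialize once, then sign exactly those bytes."""
    # Compact separators drop the spaces json.dumps inserts by default.
    body = json.dumps(payload, separators=(",", ":"), sort_keys=True).encode("utf-8")
    signature = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return body, signature


class Transport:
    """Simplified stand-in for the SDK transport (hypothetical shape)."""

    def __init__(self, secret: bytes) -> None:
        self._secret = secret

    def send_batch(self, events: List[dict]) -> Tuple[bytes, str]:
        # Correct: call the module-level function directly.
        # self.signed_request_body(...) would raise AttributeError here,
        # because the helper is not an attribute of Transport.
        return signed_request_body({"events": events}, self._secret)
```

Because every endpoint goes through the same helper, the bytes that are signed are the bytes that are sent, which is the property a wire-format contract test would pin.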

…ng/tools

Patch coverage on PR #35 was 62.38% against a 65% threshold (codecov target 70% / threshold 5pp). The two biggest delta-holders against master were auto.py (+286) and langgraph.py (+221), both dominated by Phase 4.1 additions:

* auto._normalize_finish_reason + _FINISH_REASON_MAP
* auto._openai_extractor second-tier fields (cache_read_tokens, cache_write_tokens, reasoning_tokens, finish_reason, tool_names)
* auto._anthropic_extractor cache_read / cache_write
* langgraph._safe_get_gen_message
* langgraph._get_finish_reason (5-source fallback chain)
* langgraph.extract_usage_from_response second-tier fields

These are pure / near-pure functions with no network or vendor SDK calls. Coverage padding is cheap: pin the canonical wire shapes once and the backend ingest contract gets a free live spec.

Local numbers:
* auto.py 63.44% -> 64.01% (file-level, +57 statements)
* langgraph.py 78.50% -> 86.01% (file-level, +32 statements)
* TOTAL 82.46% -> 83.13% (already above 82% gate)

41 tests, all green. Existing test_extractors.py and test_langgraph_callback.py left untouched. These tests deliberately target the Phase 4.1 fields (cache_read / cache_write / reasoning / finish_reason / tool_names) that the older tests didn't pin.
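To show why functions like these are cheap to pin, here is a minimal sketch of a finish-reason normalizer. The canonical vocabulary on the right-hand side and the `"other"` fallback are hypothetical; the SDK's real `_FINISH_REASON_MAP` is not shown in this PR and may differ. The left-hand keys are finish/stop reasons that OpenAI and Anthropic responses commonly report.

```python
from typing import Dict, Optional

# Vendor value -> canonical value. The canonical names are assumptions.
FINISH_REASON_MAP: Dict[str, str] = {
    "stop": "stop",
    "end_turn": "stop",
    "stop_sequence": "stop",
    "length": "length",
    "max_tokens": "length",
    "tool_calls": "tool_use",
    "tool_use": "tool_use",
    "content_filter": "content_filter",
}


def normalize_finish_reason(raw: Optional[str]) -> Optional[str]:
    """Map a vendor finish reason onto one canonical wire value."""
    if raw is None or not raw.strip():
        return None  # nothing reported: leave the field unset
    # Unknown values collapse to a single bucket instead of leaking through.
    return FINISH_REASON_MAP.get(raw.strip().lower(), "other")
```

A pure function like this needs no network or vendor SDK, so a handful of table-driven assertions covers every branch.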

Pre-0.7.7 every SDK /gate call for any workflow with a budget was hard-blocked because the runtime hard-coded the literal string "budget-precheck" as the model. The backend's PolicyEvaluationGraph treated any synthetic cost_limit rule with score > 0.8 as Block, so the pricing lookup never landed on a real model and the rule fired with the wrong score.

This commit:

* Adds nullrun.set_call_context(model=..., tools=[...]) plus get_call_model / get_call_tools helpers (and the underlying _call_model_var / _call_tools_var contextvars in nullrun.context).
* Wires the call context into check_workflow_budget: the /gate payload now carries the real model name (or None when unset) and the user-supplied tool list. tools=[] vs missing-None are distinguished on the wire per gate/internal.rs::check_tool_block.
* Transport.check forwards the tools key when set (it was silently dropped pre-fix).
* tests/conftest.py reset_runtime clears the new contextvars so a test's set_call_context(...) doesn't leak into the next test's wire payload.
* New tests/test_gate_real_path.py pins down the regression: default request allows a clean workflow, real block still honored, no policy-N residue on the wire, set_call_context flows into the body, no-context means no tools key, and the helpers are reachable from nullrun.*.

Bumps version to 0.7.7. No breaking changes - new helpers default to None / empty so existing call sites keep working.
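The per-call context can be sketched with the standard `contextvars` module. The helper and variable names below mirror the ones named in the commit, but the bodies are a guess at the shape rather than the SDK's code, and `build_gate_payload` with its `workflow_id` field is purely hypothetical.

```python
import contextvars
from typing import Any, Dict, List, Optional

_call_model_var: contextvars.ContextVar = contextvars.ContextVar(
    "call_model", default=None
)
_call_tools_var: contextvars.ContextVar = contextvars.ContextVar(
    "call_tools", default=None
)


def set_call_context(
    model: Optional[str] = None, tools: Optional[List[str]] = None
) -> None:
    """Record the model and tool list for calls made in this context."""
    _call_model_var.set(model)
    # Keep None and [] distinct: None means "not specified", [] means "no tools".
    _call_tools_var.set(list(tools) if tools is not None else None)


def get_call_model() -> Optional[str]:
    return _call_model_var.get()


def get_call_tools() -> Optional[List[str]]:
    return _call_tools_var.get()


def build_gate_payload(workflow_id: str) -> Dict[str, Any]:
    """Hypothetical payload builder showing the wire-level distinction."""
    payload: Dict[str, Any] = {"workflow_id": workflow_id, "model": get_call_model()}
    tools = get_call_tools()
    if tools is not None:  # omit the key entirely when no context was set
        payload["tools"] = tools
    return payload
```

Calling `set_call_context()` with no arguments resets both values, which is roughly what a test fixture would do between tests to stop one test's context from leaking into the next payload.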

Conflict resolution between release/0.7.7 (T4 per-call context for /gate) and origin/master (Release/0.7.6 #35, which bumped the SDK to 0.7.6):

* pyproject.toml: keep 0.7.7 (the HEAD side). 0.7.6 on master is superseded by 0.7.7 once this merges.
* CHANGELOG.md: keep BOTH the new 0.7.7 block (from HEAD) and the 0.7.6 block (from master). They document different releases and are listed in chronological order with the older 0.7.6 block below.
* src/nullrun/{__init__.py, runtime.py, transport.py}: auto-merged cleanly - master doesn't touch the T4 hunks.

Auto-merge result equals HEAD, but the merge commit is still needed to record the parent relationship and clear the conflict state on the PR.

Two silent fail-OPEN footguns are converted to explicit DeprecationWarning / RuntimeError so misconfigurations show up at SDK init instead of being diagnosed from a missing proto trace.

Deprecated:
* NullRunRuntime.start_recording() and .stop_recording() now emit DeprecationWarning. They have been silent no-op stubs since Sprint 2.1 (0.4.0) - decision history is now on the backend dashboard at /control-center/decision-history. Both methods will be removed in 0.9.0.
* NULLRUN_USE_GRPC=1 now raises RuntimeError at SDK init instead of silently falling back to HTTP with an info log. gRPC is on the roadmap but not implemented; unset the env var to use HTTP.

Hardening (init path):
* Transport._post_auth_with_retry (new): retry transient 503 / 504 + network blips during /api/v1/auth/verify. Backend emits 503 + Retry-After: 5 on transient DB errors (handlers.rs:11346-51). Pre-fix the first 503 surfaced as NR-A001 to the user as if the API key were bad. Three attempts, exponential backoff (0.5s → 1s → 2s), honors Retry-After when present. Auth-key failures (401) are NOT retried - a wrong key on attempt 1 is a wrong key on attempt 3.

Transport refactor:
* Transport._add_hmac_headers (new): pulls the HMAC header construction out of _signed_request_body so /track/batch, /gate, /check, /execute all share one source of truth for Content-Type / X-Signature / X-Signature-Timestamp / X-API-Key / Authorization headers. HMAC formula unchanged.
* generate_hmac_signature + verify_hmac_signature accept str | bytes for body. Legacy str callers (and the FastAPI integration) keep working without an explicit .encode().
* actions_taken → actions on /track/batch response. Backend renamed BatchTrackResponse.actions_taken (debug names) → actions (ActionTaken structs with human-readable strings moved to messages). Read both keys for forward-compat.

Test updates:
* tests/test_framework_patches: alignment with retry + actions rename.
* tests/test_high_reliability_fixes: re-pinned for _post_auth_with_retry.
* tests/test_hmac_signing: expanded for str/bytes body + new _add_hmac_headers helper.
* tests/test_integration_contract: backend actions rename covered.
* tests/test_transport: retry semantics.

Bumps version to 0.7.8. No breaking changes for callers who don't touch start_recording / stop_recording / NULLRUN_USE_GRPC.
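The retry policy described under "Hardening (init path)" can be sketched as follows. This is not the SDK's `_post_auth_with_retry`: the `send` callable, the `(status, headers)` return shape, and the injectable `sleep` are assumptions made so the example runs without a network. How the three listed delays map onto three attempts is also assumed; here the third attempt's result is returned without a further wait.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str]]  # (status code, response headers)

RETRYABLE_STATUSES = (503, 504)


def post_auth_with_retry(
    send: Callable[[], Response],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Retry transient failures; return anything else (including 401) at once."""
    for attempt in range(attempts):
        is_last = attempt == attempts - 1
        wait = base_delay * (2 ** attempt)  # 0.5s, then 1s, then 2s
        try:
            status, headers = send()
        except ConnectionError:
            if is_last:
                raise  # out of attempts: surface the network error
        else:
            if status not in RETRYABLE_STATUSES or is_last:
                return status, headers
            # Prefer the server's hint over our own backoff when it is usable.
            retry_after = headers.get("Retry-After", "")
            if retry_after.isdigit():
                wait = float(retry_after)
        sleep(wait)
    raise RuntimeError("attempts must be at least 1")
```

Passing `sleep` as a parameter keeps tests fast: a test can hand in `list.append` and assert on the recorded delays instead of actually waiting.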

Conflict resolution between release/0.7.8 (fail-loud on deprecated surface) and origin/master (release: 0.7.7 #36, the squash-merge of PR #36 which bumped the SDK to 0.7.7):

* pyproject.toml: keep 0.7.8 (the HEAD side). 0.7.7 on master is superseded by 0.7.8 once this merges.
* src/nullrun/__version__.py: keep 0.7.8 (same reasoning).
* CHANGELOG.md: keep BOTH the new 0.7.8 block (from HEAD) and the 0.7.7 block (from master). They document different releases and are listed in chronological order with the older 0.7.7 block below.
* src/nullrun/runtime.py and src/nullrun/transport.py: auto-merged cleanly - master doesn't touch the 0.7.8 hunks.
* Test files: auto-merged cleanly - master doesn't touch the 0.7.8 test changes either.

Auto-merge result equals HEAD, but the merge commit is still needed to record the parent relationship and clear the conflict state on the PR.

The 0.7.8 commit changed NULLRUN_USE_GRPC=1 from silent no-op + INFO log to an explicit RuntimeError, but the regression test in tests/test_grpc_removed.py still pinned the old behavior (``test_nullrun_use_grpc_does_not_crash_init`` asserting make_runtime() succeeded and an INFO line was logged). CI on PR #38 failed on this test:

    FAILED tests/test_grpc_removed.py::TestGrpcRemoved::test_nullrun_use_grpc_does_not_crash_init
    E RuntimeError: NULLRUN_USE_GRPC is set but the gRPC transport is not yet implemented. ...

This commit updates the test to pin the new 0.7.8 contract: the env var must raise RuntimeError, and the error message must name the offending variable + point at the docs page.

The test is renamed from ``test_nullrun_use_grpc_does_not_crash_init`` to ``test_nullrun_use_grpc_raises_runtime_error`` so the test name itself documents the new contract. The module docstring (point 2 in the contract list) is updated to say "raises RuntimeError" instead of "does NOT crash init — it logs an INFO line and silently falls back to HTTP". The 0.3.1 -> 0.7.8 evolution is documented in the test docstring as a contract-evolution footnote for future maintainers.

Imports: removed unused `import logging` and `caplog` parameter (no longer asserting on log records); added `import pytest` for `pytest.raises`.

No production-code change. No version bump. The fix is self-contained to tests/test_grpc_removed.py.
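The new contract can be illustrated with a standard-library-only sketch. `check_transport_env` is a hypothetical stand-in for the SDK's init-time check, and its message reuses only the prefix quoted in the CI output above. The real test uses `pytest.raises`; a plain try/except is used here so the example has no third-party dependency.

```python
from typing import Mapping


def check_transport_env(env: Mapping[str, str]) -> None:
    """Hypothetical stand-in: fail loudly when the unsupported flag is set."""
    if env.get("NULLRUN_USE_GRPC"):
        raise RuntimeError(
            "NULLRUN_USE_GRPC is set but the gRPC transport is not yet implemented."
        )


def test_nullrun_use_grpc_raises_runtime_error() -> None:
    """Pin the fail-loud contract: the flag must raise and name itself."""
    try:
        check_transport_env({"NULLRUN_USE_GRPC": "1"})
    except RuntimeError as exc:
        # The message must name the offending variable so users can find it.
        assert "NULLRUN_USE_GRPC" in str(exc)
    else:
        raise AssertionError("expected RuntimeError when NULLRUN_USE_GRPC is set")
```

Passing the environment in as a mapping, rather than reading `os.environ` inside the check, keeps the test free of global state.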

The 0.7.8 commit (fail-loud on deprecated surface) added ``import warnings`` mid-block in src/nullrun/runtime.py:34, breaking alphabetical order:

    asyncio
    logging
    os
    warnings   <-- out of order
    threading
    time
    uuid

Ruff on PR #38 CI (Run ruff check src/) flagged it as I001. Reorder to alphabetical:

    asyncio
    logging
    os
    threading
    time
    uuid
    warnings

Verified:
* ruff check src/ -> All checks passed!
* pytest tests/test_grpc_removed.py tests/test_runtime_branches.py -> 47 passed

No behavior change, no production logic touched. Pure lint fix.

codecov · 2026-06-28T08:31:29Z

Codecov Report

❌ Patch coverage is 68.18182% with 14 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
src/nullrun/runtime.py	57.57%	12 Missing and 2 partials ⚠️


maltsev-dev · 2026-06-28T08:34:43Z

Closing as duplicate of #38, which was merged into master as commit c5a8e65 release: 0.7.8 — fail-loud on deprecated surface (#38) at 2026-06-28T08:32:58Z.

Both PRs pointed at the same head SHA 8941098 on release/0.7.8 (same 9 commits). The release shipped via #38; nothing to merge here.

maltsev-dev added 9 commits June 27, 2026 12:14

maltsev-dev closed this Jun 28, 2026

maltsev-dev wants to merge 9 commits into master from release/0.7.8
