Release/0.7.8#37

Closed
maltsev-dev wants to merge 9 commits into master from release/0.7.8

Conversation

@maltsev-dev

Member

What

Why

How

Test plan

  • Unit tests pass (per-repo, e.g. cd backend && cargo test, cd frontend && npm test)
  • Lint passes (per-repo, e.g. cd frontend && npm run lint)
  • Type-check passes (per-repo, e.g. cd frontend && npm run type-check)
  • Manually verified in dev / staging

Risk

Checklist

  • I have read the repo's CONTRIBUTING.md (if present)
  • My change does not introduce new lint warnings
  • I have updated the CHANGELOG (if user-visible)
  • I have considered backwards compatibility

Additive patch on top of the 0.7.0 thin-client refactor. No
breaking changes.

Added
-----

* nullrun.integrations.fastapi — one-line FastAPI integration
  that turns every NullRunDecision / NullRunInfrastructureError
  thrown by @nullrun.protect endpoints into a clean JSON
  response with the right HTTP status code. No per-endpoint
  except blocks required.

  Response shape:
    {"error_code": "NR-B004",
     "user_message": "You've reached the usage limit...",
     "category": "decision"}

  HTTP status mapping:
    * NR-B004 (budget), NR-L001 (loop), NR-R001 (rate) -> 429
      with optional Retry-After
    * NR-T001 (tool blocked), NR-X001 (generic block) -> 403
    * NR-W003 (paused) -> 503 with Retry-After
    * NR-W002 (killed) -> 503; WorkflowKilledInterrupt is a
      BaseException subclass so Starlette's
      add_exception_handler refuses it — handled via ASGI
      middleware instead (hybrid pattern, documented in
      module docstring).
    * NullRunInfrastructureError subclasses -> 503 (our side,
      not user's).

* nullrun.messages — default user-facing message catalog.
  Every NR-* error code has an English default message owned
  by NullRun, not customer code, so customer support bots that hit
  a budget cap show the same wording across every NullRun-backed
  application.
    * format_user_message(exc) — render exception as user-facing
      string
    * set_user_message(code, text) — per-process override for
      branded variants
    * get_user_message(code) — raw lookup
    * reset_overrides() — clear all overrides (for tests)
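
For reviewers, the status mapping and response shape above can be sketched in plain Python. This is a minimal stand-in, not the module's code: build_response, STATUS_BY_CODE and DEFAULT_MESSAGES are hypothetical names, the "infrastructure" category value is an assumption, and the catalog holds only the one message quoted above.

```python
# Status codes transcribed from the mapping above.
STATUS_BY_CODE = {
    "NR-B004": 429,  # budget
    "NR-L001": 429,  # loop
    "NR-R001": 429,  # rate
    "NR-T001": 403,  # tool blocked
    "NR-X001": 403,  # generic block
    "NR-W003": 503,  # paused
    "NR-W002": 503,  # killed
}

# Hypothetical catalog entry; the real defaults live in nullrun.messages.
DEFAULT_MESSAGES = {"NR-B004": "You've reached the usage limit..."}
_overrides = {}


def set_user_message(code, text):
    """Per-process override, e.g. for a branded variant."""
    _overrides[code] = text


def get_user_message(code):
    """Raw lookup: override first, then the default catalog."""
    return _overrides.get(code, DEFAULT_MESSAGES.get(code, ""))


def reset_overrides():
    """Clear all overrides (useful between tests)."""
    _overrides.clear()


def build_response(code, category="decision", retry_after=None):
    """Return (status, headers, body) for one NR-* error code."""
    # Assumption: anything that is not a known decision code is treated
    # as an infrastructure failure and reported as 503.
    status = STATUS_BY_CODE.get(code, 503) if category == "decision" else 503
    headers = {}
    if retry_after is not None:
        headers["Retry-After"] = str(retry_after)
    body = {
        "error_code": code,
        "user_message": get_user_message(code),
        "category": category,
    }
    return status, headers, body
```

Calling build_response("NR-B004", retry_after=30) yields a 429 with a Retry-After header and the three-key JSON body shown above.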

Changed
-------

* Transport._send_batch canonical JSON serialization — route the
  /track/batch body through _signed_request_body for consistent
  compact-separator serialization. HMAC itself is unaffected,
  but consistent serialization removes a special-case from the
  wire-format contract tests.

* Transport._send_batch actions response handling — backend
  renamed BatchTrackResponse.actions_taken (debug names) ->
  BatchTrackResponse.actions (ActionTaken structs). Read both
  for forward-compat; per-element try/except so one malformed
  entry doesn't abort the whole loop.

* pyproject.toml metadata — long-form description with search
  keywords, Maintainer: populated via maintainers=[...],
  expanded classifiers (Linux / Windows / macOS, Python 3.13,
  CPython, Security / AI / WWW/HTTP topics), project URL
  expander.
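
The two transport changes above are easy to picture with a small sketch. Everything here is illustrative: canonical_body and read_actions are hypothetical names, and the "kind" field on an action entry is an assumption, since the ActionTaken struct layout is not shown in this PR.

```python
import json


def canonical_body(payload):
    """Serialize with compact separators so every endpoint emits the same bytes."""
    return json.dumps(payload, separators=(",", ":")).encode("utf-8")


def read_actions(response):
    """Read the new 'actions' key, falling back to the old 'actions_taken'."""
    raw = response.get("actions")
    if raw is None:
        raw = response.get("actions_taken", [])
    parsed = []
    for entry in raw:
        # Per-element guard: one malformed entry must not abort the loop.
        try:
            if isinstance(entry, str):
                parsed.append({"kind": entry})  # old shape: debug name
            else:
                parsed.append({"kind": entry["kind"]})  # new shape: struct
        except (KeyError, TypeError):
            continue
    return parsed
```

Reading both keys keeps one SDK build compatible with backends before and after the rename.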

Tests
-----

* tests/test_messages.py (new, 282 lines) — catalog
  completeness (every NR-* code has a default message),
  override / reset behavior, render path.
* tests/test_integrations_fastapi.py (new, 289 lines) — HTTP
  status mapping per error code, response shape, ASGI
  middleware path for WorkflowKilledInterrupt, hybrid
  composition.
* tests/test_decision_split.py (new, 199 lines) — pins the
  decision / infrastructure error split.
* Updates to tests/test_runtime.py, tests/test_extractors.py
  reflecting transport canonical-JSON + actions-renamed
  changes.
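
The ASGI middleware path for WorkflowKilledInterrupt that the FastAPI tests cover could look roughly like the sketch below. It is a pure-ASGI stand-in with no Starlette import: the exception class is redefined locally, the middleware name and the user_message text are hypothetical, and the "decision" category for NR-W002 is an assumption.

```python
import asyncio
import json


class WorkflowKilledInterrupt(BaseException):
    """Local stand-in; the SDK class is a BaseException subclass."""


class KilledInterruptMiddleware:
    """Catch the BaseException that a regular exception handler cannot see."""

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return
        try:
            await self.app(scope, receive, send)
        except WorkflowKilledInterrupt:
            # Sketch only: assumes the response has not started yet.
            body = json.dumps({
                "error_code": "NR-W002",
                "user_message": "This workflow was stopped.",  # hypothetical text
                "category": "decision",  # assumption
            }).encode("utf-8")
            await send({
                "type": "http.response.start",
                "status": 503,
                "headers": [
                    (b"content-type", b"application/json"),
                    (b"content-length", str(len(body)).encode("ascii")),
                ],
            })
            await send({"type": "http.response.body", "body": body})


async def killed_app(scope, receive, send):
    """Tiny ASGI app that simulates a killed workflow."""
    raise WorkflowKilledInterrupt()
```

Ordinary Exception subclasses would still go through the registered exception handlers; only the BaseException case needs the middleware, which is what the hybrid pattern refers to.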

Release plumbing
----------------

* pyproject.toml: version bumped 0.7.0 -> 0.7.6
* src/nullrun/__version__.py: __version__ = "0.7.6"
* CHANGELOG.md: full 0.7.6 entry covering additions,
  transport changes, metadata improvements

Tests pass locally (per session log) — pytest on Windows /
Python 3.14.2 is green.
…padding

PR #35 (release/0.7.6) failed all four CI jobs (test 3.10/3.11/3.12,
coverage, codecov/patch) from a single root cause, which also masked
one latent bug. This commit lands both fixes plus the last-mile tests
that bring coverage above the 82% threshold.

CI failure root cause
---------------------

* tests/test_integrations_fastapi.py does from fastapi import ...
  at module top-level. CI installs only pip install -e '.[dev]',
  and fastapi was declared as an *optional* [fastapi] extra,
  NOT in [dev]. Pytest collection aborted with
  ModuleNotFoundError: No module named 'fastapi' → all 4 jobs red.
* Fix: add fastapi>=0.100,<1.0 to [dev]. Same precedent as
  langchain-core (already in [dev] for the same import-time
  contract: nullrun.instrumentation.langgraph is eager-imported
  from nullrun.decorators at collection time, so the test extras
  must cover the import chain).

Latent bug surfaced by the first fix
------------------------------------

The same PR refactored Transport._send_batch_with_retry_info to
route the /track/batch body through _signed_request_body for
canonical-JSON serialization (matching /gate and /execute). The two
sibling call sites use the module-level helper _signed_request_body
(no self.); this one used self._signed_request_body by typo.
Result: AttributeError on every batch flush, breaking 15 existing
tests across test_transport.py / test_track_batch_retry.py /
test_integration_contract.py / test_signal_safety.py. As long as
the fastapi collection error aborted pytest, this was hidden. Fixed
to _signed_request_body(...) with a docstring noting why it is
module-level and what the bug looked like.
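
A stripped-down reproduction of the typo, for anyone who wants to see the failure mode in isolation. The helper body, the payload shape and the two method names are hypothetical; only the self. versus module-level distinction mirrors the fix.

```python
import json


def _signed_request_body(payload):
    """Module-level helper (hypothetical body): compact canonical JSON."""
    return json.dumps(payload, separators=(",", ":")).encode("utf-8")


class Transport:
    def send_batch_buggy(self, events):
        # Typo: the helper is not a method, so attribute lookup fails.
        return self._signed_request_body({"events": events})

    def send_batch_fixed(self, events):
        # Fix: call the module-level function directly.
        return _signed_request_body({"events": events})
```

The buggy variant raises AttributeError on the first call, which matches the failure on every batch flush described above.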

Coverage padding (codecov/patch was failing on this too)
--------------------------------------------------------

Total coverage on the failing CI run was 81.98% — 0.02pp under the
fail-under=82 gate. After the two fixes above it would have
recovered to ~82.0% on the dot, so I added minimal tests for the
cheapest-to-cover gaps:

* tests/test_breaker_main.py (new) — covers the 5 statements in
  nullrun.breaker.__main__.main() (0% → 100%). The module
  exists so python -m nullrun.breaker exits cleanly instead of
  failing with No module named nullrun.breaker.__main__; the
  previous fix-mechanism was return 0 after a print, but no
  test was exercising it.
* tests/test_status.py — extends TestSummary with seven
  scenarios covering each conditional branch of NullRunStatus.summary()
  (organization_id, workflow_id, workflow_state != Normal,
  backend_reachable=False, ws_connected=False, recent_errors).
  status.py jumps 84.52% → 98.81%.
* tests/test_integrations_fastapi.py — four tests on
  _build_headers covering non-numeric, zero, negative, and
  resume_after (the WorkflowPausedException code path).
  integrations/fastapi.py jumps 90.22% → 94.57%.

After all three: TOTAL 81.98% → 82.46%, comfortably above the gate.

Verification
------------

* Local pytest: 997 passed, 13 skipped, 0 failed
  (Windows / Python 3.14.2, 8m47s — same env the original commit
  was validated in).
* python -m coverage report — 82.46%, no fail-under complaint.
…ng/tools

Patch coverage on PR #35 was 62.38% against a 65% threshold (codecov
target 70% / threshold 5pp). The two biggest delta-holders against
master were auto.py (+286) and langgraph.py (+221), both dominated
by Phase 4.1 additions:

  * auto._normalize_finish_reason + _FINISH_REASON_MAP
  * auto._openai_extractor  second-tier fields (cache_read_tokens,
    cache_write_tokens, reasoning_tokens, finish_reason, tool_names)
  * auto._anthropic_extractor cache_read / cache_write
  * langgraph._safe_get_gen_message
  * langgraph._get_finish_reason (5-source fallback chain)
  * langgraph.extract_usage_from_response second-tier fields

These are pure / near-pure functions with no network or vendor SDK
calls. Coverage padding is cheap — pin the canonical wire shapes
once and the backend ingest contract gets a free live spec.

Local numbers:
  * auto.py        63.44% -> 64.01%   (file-level, +57 statements)
  * langgraph.py   78.50% -> 86.01%   (file-level, +32 statements)
  * TOTAL          82.46% -> 83.13%   (already above 82% gate)

41 tests, all green. The existing test_extractors.py and
test_langgraph_callback.py are left untouched: the new tests
deliberately target the Phase 4.1 fields (cache_read /
cache_write / reasoning / finish_reason / tool_names) that the
older tests didn't pin.

Pre-0.7.7 every SDK /gate call for any workflow with a budget was
hard-blocked because the runtime hard-coded the literal string
"budget-precheck" as the model. The backend's PolicyEvaluationGraph
treated any synthetic cost_limit rule with score > 0.8 as Block,
so the pricing lookup never landed on a real model and the rule
fired with the wrong score.

This commit:

* Adds nullrun.set_call_context(model=..., tools=[...]) plus
  get_call_model / get_call_tools helpers (and the underlying
  _call_model_var / _call_tools_var contextvars in
  nullrun.context).
* Wires the call context into check_workflow_budget: the /gate
  payload now carries the real model name (or None when unset)
  and the user-supplied tool list. tools=[] vs missing-None are
  distinguished on the wire per gate/internal.rs::check_tool_block.
* Transport.check forwards the tools key when set (it was
  silently dropped pre-fix).
* tests/conftest.py reset_runtime clears the new contextvars so
  a test's set_call_context(...) doesn't leak into the next
  test's wire payload.
* New tests/test_gate_real_path.py pins down the regression:
  default request allows a clean workflow, real block still
  honored, no policy-N residue on the wire, set_call_context
  flows into the body, no-context means no tools key, and the
  helpers are reachable from nullrun.*.

Bumps version to 0.7.7. No breaking changes - new helpers
default to None / empty so existing call sites keep working.
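
A minimal sketch of the per-call context, assuming the helpers are thin wrappers over two contextvars as described above. build_gate_payload and the workflow_id field are hypothetical; the point is only that tools=[] reaches the wire as an empty list while an unset context omits the key.

```python
import contextvars

# Names follow the commit message; defaults are None when unset.
_call_model_var = contextvars.ContextVar("nullrun_call_model", default=None)
_call_tools_var = contextvars.ContextVar("nullrun_call_tools", default=None)


def set_call_context(model=None, tools=None):
    """Record the model and tool list for the current context."""
    _call_model_var.set(model)
    _call_tools_var.set(None if tools is None else list(tools))


def get_call_model():
    return _call_model_var.get()


def get_call_tools():
    return _call_tools_var.get()


def build_gate_payload(workflow_id):
    """Hypothetical payload builder showing the tools=[] vs None split."""
    payload = {"workflow_id": workflow_id, "model": get_call_model()}
    tools = get_call_tools()
    if tools is not None:
        # An explicit empty list is sent as-is; None means "omit the key".
        payload["tools"] = tools
    return payload
```

Because contextvars are per-context, a test fixture has to reset them explicitly, which is why reset_runtime clears both variables.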

Conflict resolution between release/0.7.7 (T4 per-call context for
/gate) and origin/master (Release/0.7.6 #35, which bumped the SDK
to 0.7.6):

* pyproject.toml: keep 0.7.7 (the HEAD side). 0.7.6 on master is
  superseded by 0.7.7 once this merges.
* CHANGELOG.md: keep BOTH the new 0.7.7 block (from HEAD) and the
  0.7.6 block (from master). They document different releases and
  are listed in chronological order with the older 0.7.6 block below.
* src/nullrun/{__init__.py, runtime.py, transport.py}: auto-merged
  cleanly - master doesn't touch the T4 hunks.

Auto-merge result equals HEAD, but the merge commit is still
needed to record the parent relationship and clear the conflict
state on the PR.

Two silent fail-OPEN footguns are converted to explicit
DeprecationWarning / RuntimeError so misconfigurations show up at
SDK init instead of being diagnosed from a missing proto trace.

Deprecated:

* NullRunRuntime.start_recording() and .stop_recording() now emit
  DeprecationWarning. They have been silent no-op stubs since
  Sprint 2.1 (0.4.0) — decision history is now on the backend
  dashboard at /control-center/decision-history. Both methods
  will be removed in 0.9.0.

* NULLRUN_USE_GRPC=1 now raises RuntimeError at SDK init instead
  of silently falling back to HTTP with an info log. gRPC is on
  the roadmap but not implemented; unset the env var to use HTTP.
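
The fail-loud behavior can be sketched as follows. RuntimeSketch and check_transport_env are hypothetical names, the warning text and the second sentence of the error message are illustrative wording, and the check against the literal value "1" is an assumption based on the NULLRUN_USE_GRPC=1 case named above.

```python
import os
import warnings


def check_transport_env(environ=None):
    """Raise at init when the unimplemented gRPC transport is requested."""
    env = os.environ if environ is None else environ
    if env.get("NULLRUN_USE_GRPC") == "1":
        raise RuntimeError(
            "NULLRUN_USE_GRPC is set but the gRPC transport is not yet "
            "implemented. Unset the variable to use HTTP."
        )


class RuntimeSketch:
    """Stand-in for the runtime: both recording methods stay no-ops."""

    def start_recording(self):
        warnings.warn(
            "start_recording() is a no-op and will be removed in 0.9.0",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller's line
        )

    def stop_recording(self):
        warnings.warn(
            "stop_recording() is a no-op and will be removed in 0.9.0",
            DeprecationWarning,
            stacklevel=2,
        )
```

stacklevel=2 attributes the warning to the calling code, so users see their own file and line in the message rather than the SDK's.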

Hardening (init path):

* Transport._post_auth_with_retry (new) — retry transient 503 / 504
  + network blips during /api/v1/auth/verify. Backend emits 503
  + Retry-After: 5 on transient DB errors (handlers.rs:11346-51).
  Pre-fix the first 503 surfaced as NR-A001 to the user as if the
  API key were bad. Three attempts, exponential backoff
  (0.5s → 1s → 2s), honors Retry-After when present. Auth-key
  failures (401) are NOT retried — a wrong key on attempt 1 is a
  wrong key on attempt 3.
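
The retry policy reads roughly like the sketch below. retry_auth_post is a hypothetical name, post is an injected callable returning (status, headers) so the example needs no HTTP client, and treating ConnectionError as the network-blip case is an assumption. The sketch sleeps only between attempts.

```python
import time

RETRYABLE_STATUSES = {503, 504}


def retry_auth_post(post, sleep=time.sleep, attempts=3, base_delay=0.5):
    """Retry transient failures; return the final HTTP status."""
    for attempt in range(attempts):
        is_last = attempt == attempts - 1
        try:
            status, headers = post()
        except ConnectionError:
            if is_last:
                raise
            headers = {}
        else:
            # 401 and every other non-transient status returns immediately.
            if status not in RETRYABLE_STATUSES or is_last:
                return status
        delay = base_delay * (2 ** attempt)  # 0.5s, then 1s
        retry_after = headers.get("Retry-After")
        if retry_after is not None:
            try:
                delay = float(retry_after)  # server hint wins when numeric
            except ValueError:
                pass
        sleep(delay)
```

Injecting sleep keeps tests fast: a list's append method can stand in for it and record the delays that would have been used.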

Transport refactor:

* Transport._add_hmac_headers (new) — pulls the HMAC header
  construction out of _signed_request_body so /track/batch,
  /gate, /check, /execute all share one source of truth for
  Content-Type / X-Signature / X-Signature-Timestamp / X-API-Key
  / Authorization headers. HMAC formula unchanged.

* generate_hmac_signature + verify_hmac_signature accept str | bytes
  for body. Legacy str callers (and the FastAPI integration) keep
  working without an explicit .encode().

* actions_taken → actions on /track/batch response. Backend renamed
  BatchTrackResponse.actions_taken (debug names) → actions
  (ActionTaken structs with human-readable strings moved to
  messages). Read both keys for forward-compat.
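
To illustrate the str | bytes change, here is a sketch using the standard hmac module. sign_body and verify_body are hypothetical names, and the message layout (timestamp, a dot, then the body) with HMAC-SHA256 is an assumption made for the example, not the SDK's actual formula.

```python
import hashlib
import hmac


def sign_body(secret, timestamp, body):
    """Sign a request body given as either str or bytes."""
    if isinstance(body, str):
        body = body.encode("utf-8")  # legacy str callers need no .encode()
    # Hypothetical message layout, chosen only for this sketch.
    message = timestamp.encode("ascii") + b"." + body
    return hmac.new(secret.encode("utf-8"), message, hashlib.sha256).hexdigest()


def verify_body(secret, timestamp, body, signature):
    """Constant-time comparison against a freshly computed signature."""
    expected = sign_body(secret, timestamp, body)
    return hmac.compare_digest(expected, signature)
```

Normalizing to bytes at the top of the function means both input types produce the same signature for the same content.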

Test updates:

* tests/test_framework_patches — alignment with retry + actions
  rename.
* tests/test_high_reliability_fixes — re-pinned for _post_auth_with_retry.
* tests/test_hmac_signing — expanded for str/bytes body + new
  _add_hmac_headers helper.
* tests/test_integration_contract — backend actions rename covered.
* tests/test_transport — retry semantics.

Bumps version to 0.7.8. No breaking changes for callers who don't
touch start_recording / stop_recording / NULLRUN_USE_GRPC.

Conflict resolution between release/0.7.8 (fail-loud on deprecated
surface) and origin/master (release: 0.7.7 #36, the squash-merge of
PR #36 which bumped the SDK to 0.7.7):

* pyproject.toml: keep 0.7.8 (the HEAD side). 0.7.7 on master is
  superseded by 0.7.8 once this merges.
* src/nullrun/__version__.py: keep 0.7.8 (same reasoning).
* CHANGELOG.md: keep BOTH the new 0.7.8 block (from HEAD) and the
  0.7.7 block (from master). They document different releases and
  are listed in chronological order with the older 0.7.7 block below.
* src/nullrun/runtime.py and src/nullrun/transport.py: auto-merged
  cleanly - master doesn't touch the 0.7.8 hunks.
* Test files: auto-merged cleanly - master doesn't touch the 0.7.8
  test changes either.

Auto-merge result equals HEAD, but the merge commit is still
needed to record the parent relationship and clear the conflict
state on the PR.

The 0.7.8 commit changed NULLRUN_USE_GRPC=1 from silent no-op +
INFO log to an explicit RuntimeError, but the regression test
in tests/test_grpc_removed.py still pinned the old behavior
(``test_nullrun_use_grpc_does_not_crash_init`` asserting
make_runtime() succeeded and an INFO line was logged).

CI on PR #38 failed on this test:

  FAILED tests/test_grpc_removed.py::TestGrpcRemoved
    ::test_nullrun_use_grpc_does_not_crash_init
  E   RuntimeError: NULLRUN_USE_GRPC is set but the gRPC
      transport is not yet implemented. ...

This commit updates the test to pin the new 0.7.8 contract:
the env var must raise RuntimeError, and the error message
must name the offending variable + point at the docs page.

The test is renamed from
``test_nullrun_use_grpc_does_not_crash_init`` to
``test_nullrun_use_grpc_raises_runtime_error`` so the test
name itself documents the new contract.

The module docstring (point 2 in the contract list) is
updated to say "raises RuntimeError" instead of "does NOT
crash init — it logs an INFO line and silently falls back
to HTTP". The 0.3.1 -> 0.7.8 evolution is documented in the
test docstring as a contract-evolution footnote for future
maintainers.

Imports: removed unused `import logging` and `caplog`
parameter (no longer asserting on log records); added
`import pytest` for `pytest.raises`.

No production-code change. No version bump. The fix is
self-contained to tests/test_grpc_removed.py.

The 0.7.8 commit (fail-loud on deprecated surface) added
``import warnings`` mid-block in src/nullrun/runtime.py:34,
breaking alphabetical order:

    asyncio
    logging
    os
    warnings       <-- out of order
    threading
    time
    uuid

Ruff on PR #38 CI (Run ruff check src/) flagged it as I001.

Reorder to alphabetical:

    asyncio
    logging
    os
    threading
    time
    uuid
    warnings

Verified:
  * ruff check src/ -> All checks passed!
  * pytest tests/test_grpc_removed.py tests/test_runtime_branches.py
    -> 47 passed

No behavior change, no production logic touched. Pure lint fix.

@codecov

codecov Bot commented Jun 28, 2026

Codecov Report

❌ Patch coverage is 68.18182% with 14 lines in your changes missing coverage. Please review.

Files with missing lines     Patch %   Lines
src/nullrun/runtime.py       57.57%    12 Missing and 2 partials ⚠️

@maltsev-dev

Member Author

Closing as duplicate of #38, which was merged into master as commit c5a8e65 release: 0.7.8 — fail-loud on deprecated surface (#38) at 2026-06-28T08:32:58Z.

Both PRs pointed at the same head SHA 8941098 on release/0.7.8 (same 9 commits). The release shipped via #38; nothing to merge here.

1 participant