fix: 0.8.3 — close silent zero-billing across langgraph + init-ordering#42

Merged: maltsev-dev merged 2 commits into master from 0.8.3 on Jun 29, 2026
@maltsev-dev (Member)

Three coordinated defenses against the same class of bug 0.8.2 closed on the httpx path: llm_call events reaching the backend without a model field were silently recorded as ≈$0 (backend unwrap_or('default') + DEFAULT_RATE). 0.8.3 closes the langgraph callback path and the init-ordering hazard, and promotes the missing-model wire failure from WARN to fail-LOUD.

1. langgraph callback path (src/nullrun/instrumentation/langgraph.py)
_extract_model_from_response now consults response.llm_output FIRST, because that is where langchain-openai 1.x puts the date-suffixed model id (gpt-4.1-mini-2025-04-14). The previous chain led with response_metadata, which langchain 1.x leaves empty on the AIMessage inside generations[0][0].message. Without this promotion, every OpenAI-via-LangChain 1.x call was silently zero-billed. Also adds an "any key containing model" sweep inside llm_output for non-OpenAI wrappers (proxies, custom chat models).
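
A minimal sketch of the lookup order described above, not the actual body of _extract_model_from_response. The function name extract_model, the helper _non_empty, and the exact key names tried ("model_name", "model") are assumptions for illustration; SimpleNamespace objects stand in for LangChain's LLMResult and AIMessage so the snippet runs without langchain installed.

```python
from types import SimpleNamespace
from typing import Any, Optional


def _non_empty(value: Any) -> Optional[str]:
    """Return value only if it is a non-blank string (hypothetical helper)."""
    return value if isinstance(value, str) and value.strip() else None


def extract_model(response: Any) -> Optional[str]:
    """Resolve a model id, consulting llm_output before response_metadata."""
    llm_output = getattr(response, "llm_output", None) or {}
    if isinstance(llm_output, dict):
        # 1. Well-known keys first (key names are assumed here).
        for key in ("model_name", "model"):
            found = _non_empty(llm_output.get(key))
            if found:
                return found
        # 2. Sweep: any key containing "model", for proxies and custom wrappers.
        for key, value in llm_output.items():
            if "model" in str(key).lower():
                found = _non_empty(value)
                if found:
                    return found
    # 3. Fallback: response_metadata on the first generation's message.
    generations = getattr(response, "generations", None) or []
    try:
        message = generations[0][0].message
    except (IndexError, AttributeError, TypeError):
        return None
    metadata = getattr(message, "response_metadata", None) or {}
    if isinstance(metadata, dict):
        for key in ("model_name", "model"):
            found = _non_empty(metadata.get(key))
            if found:
                return found
    return None


# Usage: llm_output wins even when the message metadata is empty.
result = SimpleNamespace(
    llm_output={"model_name": "gpt-4.1-mini-2025-04-14"},
    generations=[[SimpleNamespace(message=SimpleNamespace(response_metadata={}))]],
)
print(extract_model(result))
```

Empty strings are treated as missing at every step, so a blank model_name falls through to the sweep and then to the metadata fallback rather than being returned as a model id.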

2. Init-ordering hazard (src/nullrun/instrumentation/auto.py)
patch_httpx's class-level __init__ wrap only catches Clients created AFTER it is installed. Users who build ChatOpenAI(...) before nullrun.init(api_key=...) get a pre-existing httpx.Client that the patch never sees. We now sweep gc.get_objects() once at patch install and wrap any pre-existing Client/AsyncClient whose transport isn't already a NullRun*Transport. Idempotent via the existing class-level marker.
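
The hazard and the sweep can be illustrated with stand-in classes. This is a sketch under stated assumptions, not the code in auto.py: FakeClient, TrackingTransport, patch_clients, and the marker name are hypothetical, and a plain `transport` attribute stands in for however the real client stores its transport.

```python
import gc


class FakeClient:
    """Stand-in for an HTTP client that owns a transport (hypothetical)."""

    def __init__(self):
        self.transport = "plain"


class TrackingTransport:
    """Stand-in for an instrumented transport wrapping the original one."""

    def __init__(self, inner):
        self.inner = inner


_PATCHED_MARKER = "_tracking_patched"  # class-level idempotency marker (assumed name)


def _wrap(client):
    """Wrap the client's transport unless it is already wrapped."""
    transport = getattr(client, "transport", None)
    if transport is not None and not isinstance(transport, TrackingTransport):
        client.transport = TrackingTransport(transport)


def patch_clients(cls=FakeClient):
    """Patch cls.__init__, then sweep existing instances. Returns the sweep count."""
    if getattr(cls, _PATCHED_MARKER, False):
        return 0  # already installed: re-patching is a no-op
    original_init = cls.__init__

    def patched_init(self, *args, **kwargs):
        original_init(self, *args, **kwargs)
        _wrap(self)  # catches clients created AFTER the patch

    cls.__init__ = patched_init
    setattr(cls, _PATCHED_MARKER, True)

    swept = 0
    for obj in gc.get_objects():  # catches clients created BEFORE the patch
        try:
            if not isinstance(obj, cls):
                continue
        except Exception:
            continue  # some tracked objects may fail isinstance checks
        if not isinstance(getattr(obj, "transport", None), TrackingTransport):
            _wrap(obj)
            swept += 1
    return swept


early = FakeClient()          # built before the patch is installed
print(patch_clients())        # number of pre-existing clients wrapped
late = FakeClient()           # built after the patch is installed
print(type(early.transport).__name__, type(late.transport).__name__)
```

gc.get_objects() only returns objects tracked by the collector, which covers ordinary class instances like these. Because the sweep walks every tracked object, running it once at install time keeps the cost bounded.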

3. Fail-LOUD wire tag (src/nullrun/runtime.py)
runtime.track() now escalates the missing-model warning from logger.warning to logger.error, bumps dropped_llm_call_no_model for dashboards, and tags the wire event with __missing_model: True so the backend's into_track_request gate can reject with HTTP 422 instead of silently recording a zero-cost call. The event is still sent (not fail-CLOSED) so the backend can audit; the flag is wire-private and stripped before persisting.
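
A sketch of the client-side tagging described above, assuming a dict-shaped wire event. build_wire_event, the logger name, and the in-memory Counter are hypothetical stand-ins for whatever runtime.track() uses internally; only the counter name and the __missing_model flag come from this PR. Treating an empty string as a missing model is also an assumption.

```python
import logging
from collections import Counter
from typing import Any, Dict

logger = logging.getLogger("nullrun.sketch")  # hypothetical logger name
counters: Counter = Counter()                 # stand-in for the runtime's counters


def build_wire_event(event_type: str, payload: Dict[str, Any]) -> Dict[str, Any]:
    """Build the event to send, tagging llm_call events that lack a model."""
    event = {"type": event_type, **payload}
    if event_type == "llm_call" and not event.get("model"):
        # Fail loud: ERROR log plus a counter, but still send the event
        # so the backend can reject and audit it.
        logger.error("llm_call event has no model; tagging for backend rejection")
        counters["dropped_llm_call_no_model"] += 1
        event["__missing_model"] = True
    return event


tagged = build_wire_event("llm_call", {"tokens": 120})
clean = build_wire_event("llm_call", {"model": "gpt-4.1-mini-2025-04-14", "tokens": 120})
other = build_wire_event("tool_call", {"name": "search"})
print(tagged.get("__missing_model"), "__missing_model" in clean, "__missing_model" in other)
```

The flag is only ever set to True and is absent otherwise, so events that carry a model and non-llm_call events have the same wire shape as before.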

Files

  • src/nullrun/instrumentation/langgraph.py — llm_output-first chain + non-OpenAI sweep
  • src/nullrun/instrumentation/auto.py — eager gc sweep for pre-existing Clients
  • src/nullrun/runtime.py — ERROR + counter + __missing_model flag
  • tests/contract/__init__.py (new package)
  • tests/contract/test_llm_call_model_wire.py — 12 tests pinning all three invariants

Test surface
7 unit tests for _extract_model_from_response (langchain-openai 1.x, model key, "any key containing model" sweep, llm_output-before-response_metadata ordering, response_metadata fallback, generations-message fallback, all-empty None + DEBUG log, empty-string fallthrough).
3 tests for track() missing-model handling (ERROR + counter + tag; tag silent when model set; tag silent for non-llm_call types).
2 tests for the eager gc sweep (pre-existing Client gets wrapped; idempotent on re-patch).

Branch base: master @ d4884a7 (after the 0.8.2 version bump from PR #41).

Three coordinated defenses against the same class of bug the 0.8.2
audit closed on the httpx path: llm_call events reaching the backend
without a model field were silently recorded as ≈$0 (backend
unwrap_or('default') + DEFAULT_RATE). 0.8.3 closes the langgraph
callback path and the init-ordering hazard, and promotes the
missing-model wire failure from WARN to fail-LOUD.

1. langgraph callback path (instrumentation/langgraph.py)
   _extract_model_from_response now consults response.llm_output
   FIRST — that's where langchain-openai 1.x puts the
   date-suffixed model id (e.g. 'gpt-4.1-mini-2025-04-14'). The
   previous chain led with response_metadata, which langchain 1.x
   leaves empty on the AIMessage inside generations[0][0].message.
   Without this promotion every OpenAI-via-LangChain 1.x call
   silently zero-billed. Also adds an 'any key containing model'
   sweep inside llm_output so non-OpenAI wrappers (proxies,
   custom chat models) still get attributed.

2. Init-ordering hazard (instrumentation/auto.py)
   patch_httpx's class-level __init__ wrap only catches Clients
   created AFTER it is installed. Users that build
   ChatOpenAI(...) before nullrun.init(api_key=...) get a
   pre-existing httpx.Client that the patch never sees — those
   clients keep the unpatched transport and emit nothing. We now
   sweep gc.get_objects() once at patch install time and wrap
   any pre-existing Client/AsyncClient whose transport isn't
   already a NullRun*Transport. Idempotent via the existing
   class-level marker.

3. Fail-LOUD wire tag (runtime.py)
   runtime.track() now escalates the missing-model warning from
   logger.warning to logger.error, bumps a runtime counter
   (dropped_llm_call_no_model) for dashboards, and tags the wire
   event with __missing_model: True so the backend's into_track_request
   gate can reject with HTTP 422 instead of silently recording a
   zero-cost call. The event is still sent (not fail-CLOSED) so the
   backend can audit the rejection; the flag is wire-private and
   stripped before persisting.

tests/contract/test_llm_call_model_wire.py pins all three
invariants: 7 unit tests for _extract_model_from_response
(every known langchain shape + non-OpenAI wrappers + empty-string
fallthrough), 3 tests for track()'s missing-model wire tagging
(ERROR + counter + __missing_model flag + non-llm_call silence),
and 2 tests for the eager-wrap sweep (pre-existing Client gets
wrapped, idempotent on re-patch).

codecov Bot commented Jun 29, 2026

Codecov Report

❌ Patch coverage is 73.17073% with 11 lines in your changes missing coverage. Please review.

Files with missing lines                    Patch %   Lines
src/nullrun/instrumentation/auto.py         62.50%    8 Missing and 1 partial ⚠️
src/nullrun/instrumentation/langgraph.py    85.71%    2 Missing ⚠️

`import gc` was inserted between `hashlib` and `json`; ruff's
isort rule wants `gc` before `hashlib` alphabetically.
@maltsev-dev maltsev-dev merged commit 2df68a1 into master Jun 29, 2026
5 checks passed
maltsev-dev added a commit that referenced this pull request Jun 29, 2026
…l-LOUD wire tag (#43)

Version bump after PR #42: the wire-format fixes from 0.8.2 closed
the silent zero-billing bug on the httpx path. 0.8.3 closes it on
the langgraph callback path (langchain-openai 1.x puts the
date-suffixed model on LLMResult.llm_output, not on AIMessage
response_metadata), the init-ordering hazard (ChatOpenAI(...) built
before nullrun.init() never sees the class-level __init__ wrap),
and promotes the missing-model wire failure from WARN to fail-LOUD
(ERROR + dropped_llm_call_no_model counter + __missing_model wire
flag the backend can reject with HTTP 422).

CHANGELOG entry added above 0.8.2. No public-API break.