feat(ai): sandboxed harness adapters (provider-agnostic sandbox layer) [WIP]#774

Draft
AlemTuzlak wants to merge 44 commits into main from feat/sandboxes


AlemTuzlak commented Jun 16, 2026

Contributor

Provider-agnostic sandbox layer: harness adapters (Claude Code, Codex, Gemini CLI, OpenCode) run inside isolated sandboxes — files, repos, processes — and stream back through chat(). Persistence is out of scope but every seam is persistence-ready.

Done + tested + committed

Core — @tanstack/ai-sandbox: SandboxHandle/SandboxProvider/SandboxCapabilities contracts; capability tokens (SandboxCapability + optional SandboxStore/Locks/SandboxPolicy); in-memory store/lock; defineSandbox + ensure state machine (resume→restoreSnapshot→create+bootstrap, capability-aware degradation); withSandbox; defineWorkspace; bootstrapWorkspace; defineSandboxPolicy + evaluateCommand; compound key; hardened createExecBackedGit; spawnNdjson.
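
The ensure ordering (resume, then restoreSnapshot, then create+bootstrap) can be read as a small planning function. The following is a minimal sketch, not the package's implementation: the `SandboxCaps` and `StoredRecord` shapes and the `planEnsure` name are hypothetical, and gating resume on `durableFilesystem` is an assumption drawn from the capability flags mentioned elsewhere in this PR.

```typescript
// Hypothetical shapes for illustration; the real @tanstack/ai-sandbox types may differ.
interface SandboxCaps {
  durableFilesystem: boolean // a stopped sandbox keeps its disk and can be resumed
  snapshots: boolean // the provider can restore from a saved snapshot
}

interface StoredRecord {
  sandboxId?: string
  snapshotId?: string
}

type EnsureStep = 'resume' | 'restoreSnapshot' | 'create+bootstrap'

// Returns the strategies to try, in order. Steps the provider cannot support
// are skipped (the degradation), and create+bootstrap is always the last resort.
function planEnsure(caps: SandboxCaps, record: StoredRecord | undefined): EnsureStep[] {
  const steps: EnsureStep[] = []
  if (record !== undefined && record.sandboxId !== undefined && caps.durableFilesystem) {
    steps.push('resume')
  }
  if (record !== undefined && record.snapshotId !== undefined && caps.snapshots) {
    steps.push('restoreSnapshot')
  }
  steps.push('create+bootstrap')
  return steps
}
```

Under these assumptions, a provider with an ephemeral disk and no snapshots always plans a fresh create+bootstrap, even when a previous sandbox id is on record.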

Providers

  • -local-process: host fs/child_process (dev loop, no isolation).
  • -docker: dockerode; create/resume/restoreSnapshot/destroy, exec + duplex spawn, base64 fs, commit-snapshot + fork. Integration tests verified against a real daemon.
  • -cloudflare: @cloudflare/sandbox (edge/Workers); exec/base64-fs/exposePort/setEnvVars; ephemeral-disk degradation. Compiles against real CF types; runtime verification needs a Workers runtime.

Harness adapters (all run in-sandbox, declare requires:[SandboxCapability])

  • -claude-code: `claude -p --output-format stream-json` via stdin; reuses translate; file.changed diff event; policy→CLI flag mapping; MCP tool-proxy (host Streamable-HTTP MCP server proxies chat() tools back to the host, verified via the MCP SDK client).
  • -codex: `codex exec --experimental-json` (mirrors @openai/codex-sdk's own invocation).
  • -gemini-cli: `gemini --acp` driven over ACP via a SpawnHandle→WebStream transport (ACP protocol reused).
  • -opencode: spawns `opencode serve` in-sandbox, exposes the port, connects the SDK client via baseUrl.

Core wiring — @tanstack/ai: TextOptions.capabilities; DefinedChatMiddleware/AnyChatMiddleware exports.

Also: examples/sandbox-coding-agent (runnable local e2e), docs/sandbox/overview.md + nav, ai-sandbox agent skill, changesets for every package, git-exec security hardening.

Verification: ~180 unit/integration tests across the sandbox packages (real Docker; deterministic fake-CLI tests in real local-process sandboxes for claude/codex; transport/server-helper + mock tests for gemini/opencode/cloudflare; MCP bridge via the MCP SDK client). @tanstack/ai 1033 tests still pass. types/eslint/build/knip/sherif/docs all green. Live agent-in-sandbox runs are the manual e2e (via the example; needs the agent CLIs + keys).

Remaining (documented)

  • Full client-in-the-loop interactive approvals (pause mid-run → client approves → resume): the implemented safety lever is defineSandboxPolicy → permission-mode / allowed/disallowed-tools mapping + each harness's native permission modes. The full resume loop is entangled with each harness's permission-prompt contract + chat()'s resume/persistence and needs the live CLIs to verify.
  • chat()-tools MCP bridging for codex/gemini-cli/opencode (claude-code has it; the others use their native tools).
  • E2E Playwright suite (deferred by request).

jherr and others added 10 commits June 12, 2026 07:00
…xample

New @tanstack/ai-claude-code package that runs Claude Code (via
@anthropic-ai/claude-agent-sdk) as a TanStack AI chat backend. Unlike HTTP
provider adapters, this is a harness adapter: Claude Code owns the agent
loop and executes its built-in tools (bash, file edits, search) server-side.

- Stream translator maps Agent SDK messages to AG-UI events; harness tool
  activity arrives as already-resolved TOOL_CALL_*/TOOL_CALL_RESULT pairs
  and runs always finish with stop/length (never tool_calls), so the engine
  never re-executes harness tools. Every started tool call is guaranteed a
  result (synthesized on abort) to keep the engine's pending-call scan safe.
- TanStack toolDefinition() server tools are bridged into the harness as an
  in-process MCP server (raw JSON Schema passthrough, no zod round-trip).
  Client-side/approval tools fail fast — documented v1 limitation.
- Stateful sessions: session id surfaced via a claude-code.session-id CUSTOM
  event; resume via modelOptions.sessionId (+ forkSession).
- Structured output uses the SDK's native outputFormat json_schema.
- settingSources defaults to ['project'] so servers don't inherit user-level
  ~/.claude config from the host machine.
- E2E: excluded from the aimock matrix (subprocess can't carry X-Test-Id
  isolation); covered by 44 unit tests plus a gated live smoke spec
  (CLAUDE_CODE_E2E=1).
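
The guarantee that every started tool call receives a result, synthesized on abort if necessary, can be sketched as a small tracker. This is an illustration under assumptions, not the translator's code: the event shapes are simplified, and `PendingToolCalls` with its `observe`/`drain` methods is a hypothetical name.

```typescript
// Simplified event shapes for illustration; real AG-UI events carry more fields.
type ToolEvent =
  | { type: 'TOOL_CALL_START'; toolCallId: string; toolName: string }
  | { type: 'TOOL_CALL_RESULT'; toolCallId: string; content: string }

// Tracks which tool calls have started but not yet produced a result.
class PendingToolCalls {
  private readonly pending = new Set<string>()

  observe(event: ToolEvent): void {
    if (event.type === 'TOOL_CALL_START') this.pending.add(event.toolCallId)
    else this.pending.delete(event.toolCallId)
  }

  // On abort, build a placeholder result for every call still open so that
  // no TOOL_CALL_START is left without a matching TOOL_CALL_RESULT.
  drain(reason: string): ToolEvent[] {
    const synthesized = Array.from(this.pending).map(
      (toolCallId): ToolEvent => ({
        type: 'TOOL_CALL_RESULT',
        toolCallId,
        content: JSON.stringify({ error: reason }),
      }),
    )
    this.pending.clear()
    return synthesized
  }
}
```

A translator would call `observe` on each event it forwards and emit the output of `drain` when the run is aborted.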

Also adds examples/ts-react-coding-agent: a TanStack Start app demoing
session resume, the harness tool timeline, read-only/edit permission modes,
tool bridging, and a sandboxed scratch workspace — with the agent registry
structured so future Codex/Gemini CLI harness adapters can slot in.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…ters

Add two new coding-agent harness adapters alongside Claude Code:

- @tanstack/ai-codex drives OpenAI Codex via @openai/codex-sdk with local
  tool execution, resumable sessions (modelOptions.sessionId), structured
  output, and a localhost MCP bridge for TanStack server tools.
- @tanstack/ai-gemini-cli drives `gemini --acp` over the Agent Client
  Protocol with token-level streaming, resumable sessions, a configurable
  permission policy, and headless ACP auth method selection (authMethodId)
  so runs never stall on an interactive auth picker.

Wire both into the ts-react-coding-agent example: the agent dropdown keeps
every harness selectable, and a server function (createServerFn) reports
which agents are actually configured at runtime so the UI can surface a
setup dialog for unconfigured ones. Includes adapter docs and changesets.

Co-authored-by: Cursor <cursoragent@cursor.com>
Add the @tanstack/ai-opencode package, an OpenCode harness adapter that
drives OpenCode (via @opencode-ai/sdk) as a TanStack AI chat backend with
local tool execution, token-level streaming, stateful sessions, and
TanStack tool bridging over a localhost MCP server. Wires the adapter into
the ts-react-coding-agent example, adds the OpenCode adapter docs page, and
anchors the OpenCode.md gitignore entry so it no longer shadows the docs
page on case-insensitive filesystems.

Co-authored-by: Cursor <cursoragent@cursor.com>
…e, withSandbox, workspace, policy

- @tanstack/ai-sandbox: provider-agnostic SandboxHandle/SandboxProvider/SandboxCapabilities contracts
- capability tokens (SandboxCapability + optional SandboxStore/Locks), in-memory store/lock defaults
- defineSandbox lazy controller + ensure state machine (resume->restoreSnapshot->create+bootstrap) with capability-aware degradation
- withSandbox middleware (setup provides handle; onFinish/onError snapshot+destroy)
- defineWorkspace (git/local/none + skills + secrets), provider-agnostic bootstrapWorkspace
- defineSandboxPolicy + evaluateCommand (glob, deny>ask>allow), compound sandbox key (secrets excluded)
- export DefinedChatMiddleware/AnyChatMiddleware from @tanstack/ai for portable middleware authoring
- 22 unit tests (ensure/policy/key/store); types + lint clean

Refs sandbox proposal (Phase A).
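
The deny>ask>allow precedence over glob rules can be illustrated with a short sketch. The `CommandPolicy` shape and the function names below are hypothetical, and the glob support is reduced to a single `*` wildcard; the real evaluateCommand may accept richer patterns.

```typescript
type Decision = 'allow' | 'ask' | 'deny'

// Hypothetical policy shape; the real defineSandboxPolicy options may differ.
interface CommandPolicy {
  defaultDecision: Decision
  rules: Array<{ pattern: string; decision: Decision }>
}

// Converts a glob where `*` matches any run of characters into an anchored RegExp.
function globToRegExp(glob: string): RegExp {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, '\\$&')
  return new RegExp('^' + escaped.replace(/\*/g, '.*') + '$')
}

// Strictest match wins: deny beats ask, ask beats allow, regardless of rule order.
function evaluateCommandSketch(policy: CommandPolicy, command: string): Decision {
  const matched = policy.rules
    .filter((rule) => globToRegExp(rule.pattern).test(command))
    .map((rule) => rule.decision)
  const has = (decision: Decision): boolean => matched.indexOf(decision) !== -1
  if (has('deny')) return 'deny'
  if (has('ask')) return 'ask'
  if (has('allow')) return 'allow'
  return policy.defaultDecision
}
```

Resolving by strictness rather than by rule order means a broad allow rule cannot accidentally shadow a narrower deny rule written later.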
…git helper

- @tanstack/ai-sandbox-local-process: SandboxHandle over host fs/child_process (no isolation, dev loop)
- virtual /workspace root mapped to a real host dir with path containment
- exec/spawn (duplex stdin, streamed stdout), localhost port channel, env, fork via dir copy, durable fs resume-by-dir
- core: createExecBackedGit helper (shared by providers without native git); bootstrap clones into the handle's own root
- 10 unit tests (fs/exec/spawn/lifecycle/fork/bootstrap/ensure); types + lint clean
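
Mapping the virtual /workspace root onto a host directory with path containment might look like the sketch below. It is pure string logic over POSIX-style paths and does not resolve symlinks, which a real provider would also need to consider; `toHostPath` is a hypothetical name.

```typescript
// Maps a virtual sandbox path (rooted at /workspace) onto a host directory,
// rejecting anything that would escape the root.
const VIRTUAL_ROOT = '/workspace'

function toHostPath(hostRoot: string, virtualPath: string): string {
  // Relative paths are interpreted relative to the virtual root.
  const absolute = virtualPath.charAt(0) === '/' ? virtualPath : VIRTUAL_ROOT + '/' + virtualPath

  // Normalize: drop empty and "." segments, resolve ".." against what came before.
  const segments: string[] = []
  for (const part of absolute.split('/')) {
    if (part === '' || part === '.') continue
    if (part === '..') segments.pop()
    else segments.push(part)
  }
  const normalized = '/' + segments.join('/')

  // The containment check runs after normalization, so "/workspace/../etc" is caught.
  if (normalized !== VIRTUAL_ROOT && normalized.indexOf(VIRTUAL_ROOT + '/') !== 0) {
    throw new Error('Path escapes the sandbox root: ' + virtualPath)
  }
  const relative = normalized.slice(VIRTUAL_ROOT.length) // "" or "/sub/path"
  return hostRoot.replace(/\/+$/, '') + relative
}
```

Checking the prefix with a trailing slash matters: a sibling such as /workspace-other must not pass as being inside /workspace.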
…runner

- @tanstack/ai: TextOptions.capabilities carries the middleware capability context so harness adapters can read provided capabilities (getSandbox(options.capabilities)) from chatStream; populated by the engine
- @tanstack/ai-sandbox: spawnNdjson/toLines — spawn an agent CLI in a sandbox and stream parsed NDJSON stdout (the reusable harness-execution primitive)
- tests: toLines buffering + spawnNdjson parsing (core), real spawn+NDJSON via local-process (11) — 25 core tests; types + lint clean
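
The line buffering that NDJSON parsing depends on can be sketched synchronously. This is not the toLines/spawnNdjson implementation (which streams from a spawned process); `LineSplitter` and `parseNdjsonChunks` are hypothetical names used only to show why a trailing partial line must be held back.

```typescript
// Buffers arbitrary stdout chunks and returns only complete lines.
// A chunk may end mid-line, so the trailing fragment is held until more data arrives.
class LineSplitter {
  private buffer = ''

  push(chunk: string): string[] {
    this.buffer += chunk
    const parts = this.buffer.split('\n')
    // The last element is either '' (chunk ended on a newline) or a partial line.
    const tail = parts.pop()
    this.buffer = tail === undefined ? '' : tail
    return parts.map((line) => line.replace(/\r$/, '')).filter((line) => line.length > 0)
  }

  // Call once the process exits to recover a final line with no trailing newline.
  flush(): string[] {
    const rest = this.buffer.trim()
    this.buffer = ''
    return rest.length > 0 ? [rest] : []
  }
}

// Parses each complete line as one JSON value (the NDJSON convention).
function parseNdjsonChunks(chunks: string[]): unknown[] {
  const splitter = new LineSplitter()
  const lines: string[] = []
  for (const chunk of chunks) lines.push(...splitter.push(chunk))
  lines.push(...splitter.flush())
  return lines.map((line) => JSON.parse(line) as unknown)
}
```

Parsing only after a newline is seen avoids feeding half of a JSON object to `JSON.parse` when the process writes a message across two chunks.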

coderabbitai Bot commented Jun 16, 2026

Contributor

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.



github-actions Bot commented Jun 16, 2026

Contributor

🚀 Changeset Version Preview

9 package(s) bumped directly, 31 bumped as dependents.

🟥 Major bumps

| Package | Version | Reason |
| --- | --- | --- |
| @tanstack/ai-claude-code | 0.1.0 → 1.0.0 | Changeset |
| @tanstack/ai-codex | 0.1.0 → 1.0.0 | Changeset |
| @tanstack/ai-gemini-cli | 0.1.0 → 1.0.0 | Changeset |
| @tanstack/ai-opencode | 0.1.0 → 1.0.0 | Changeset |
| @tanstack/ai-sandbox | 0.1.0 → 1.0.0 | Changeset |
| @tanstack/ai-sandbox-cloudflare | 0.1.0 → 1.0.0 | Changeset |
| @tanstack/ai-sandbox-docker | 0.1.0 → 1.0.0 | Changeset |
| @tanstack/ai-sandbox-local-process | 0.1.0 → 1.0.0 | Changeset |
| @tanstack/ai-angular | 0.1.3 → 1.0.0 | Dependent |
| @tanstack/ai-anthropic | 0.15.4 → 1.0.0 | Dependent |
| @tanstack/ai-code-mode | 0.2.8 → 1.0.0 | Dependent |
| @tanstack/ai-code-mode-skills | 0.2.8 → 1.0.0 | Dependent |
| @tanstack/ai-elevenlabs | 0.2.23 → 1.0.0 | Dependent |
| @tanstack/ai-event-client | 0.6.2 → 1.0.0 | Dependent |
| @tanstack/ai-fal | 0.8.2 → 1.0.0 | Dependent |
| @tanstack/ai-gemini | 0.16.2 → 1.0.0 | Dependent |
| @tanstack/ai-grok | 0.11.5 → 1.0.0 | Dependent |
| @tanstack/ai-groq | 0.4.5 → 1.0.0 | Dependent |
| @tanstack/ai-isolate-node | 0.1.33 → 1.0.0 | Dependent |
| @tanstack/ai-isolate-quickjs | 0.1.33 → 1.0.0 | Dependent |
| @tanstack/ai-ollama | 0.8.4 → 1.0.0 | Dependent |
| @tanstack/ai-openai | 0.14.4 → 1.0.0 | Dependent |
| @tanstack/ai-openrouter | 0.13.4 → 1.0.0 | Dependent |
| @tanstack/ai-preact | 0.9.8 → 1.0.0 | Dependent |
| @tanstack/ai-react | 0.15.8 → 1.0.0 | Dependent |
| @tanstack/ai-react-ui | 0.8.8 → 1.0.0 | Dependent |
| @tanstack/ai-solid | 0.13.8 → 1.0.0 | Dependent |
| @tanstack/ai-solid-ui | 0.7.8 → 1.0.0 | Dependent |
| @tanstack/ai-svelte | 0.13.8 → 1.0.0 | Dependent |
| @tanstack/ai-vue | 0.13.8 → 1.0.0 | Dependent |
| @tanstack/openai-base | 0.8.4 → 1.0.0 | Dependent |

🟨 Minor bumps

| Package | Version | Reason |
| --- | --- | --- |
| @tanstack/ai | 0.31.0 → 0.32.0 | Changeset |

🟩 Patch bumps

| Package | Version | Reason |
| --- | --- | --- |
| @tanstack/ai-client | 0.17.3 → 0.17.4 | Dependent |
| @tanstack/ai-devtools-core | 0.4.11 → 0.4.12 | Dependent |
| @tanstack/ai-isolate-cloudflare | 0.2.24 → 0.2.25 | Dependent |
| @tanstack/ai-mcp | 0.1.3 → 0.1.4 | Dependent |
| @tanstack/ai-vue-ui | 0.2.20 → 0.2.21 | Dependent |
| @tanstack/preact-ai-devtools | 0.1.54 → 0.1.55 | Dependent |
| @tanstack/react-ai-devtools | 0.2.54 → 0.2.55 | Dependent |
| @tanstack/solid-ai-devtools | 0.2.54 → 0.2.55 | Dependent |


socket-security Bot commented Jun 16, 2026


Review the following changes in direct dependencies. Learn more about Socket for GitHub.

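| Diff | Package | Supply Chain Security | Vulnerability | Quality | Maintenance | License |
| --- | --- | --- | --- | --- | --- | --- |
| Added | @opencode-ai/sdk@1.17.7 | 100 | 100 | 75 | 97 | 100 |
| Added | @types/dockerode@3.3.47 | 100 | 100 | 77 | 84 | 100 |
| Added | dockerode@4.0.12 | 100 | 100 | 100 | 87 | 100 |
| Added | @agentclientprotocol/sdk@0.25.1 | 100 | 100 | 90 | 98 | 100 |
| Added | @cloudflare/sandbox@0.12.1 | 96 | 100 | 100 | 100 | 100 |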

View full report


nx-cloud Bot commented Jun 16, 2026


View your CI Pipeline Execution ↗ for commit f82a981

| Command | Status | Duration | Result |
| --- | --- | --- | --- |
| nx run-many --targets=build --exclude=examples/... | ✅ Succeeded | 1m 36s | View ↗ |

☁️ Nx Cloud last updated this comment at 2026-06-17 08:36:49 UTC


pkg-pr-new Bot commented Jun 16, 2026


@tanstack/ai

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai@774

@tanstack/ai-angular

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-angular@774

@tanstack/ai-anthropic

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-anthropic@774

@tanstack/ai-claude-code

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-claude-code@774

@tanstack/ai-client

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-client@774

@tanstack/ai-code-mode

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-code-mode@774

@tanstack/ai-code-mode-skills

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-code-mode-skills@774

@tanstack/ai-codex

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-codex@774

@tanstack/ai-devtools-core

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-devtools-core@774

@tanstack/ai-elevenlabs

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-elevenlabs@774

@tanstack/ai-event-client

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-event-client@774

@tanstack/ai-fal

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-fal@774

@tanstack/ai-gemini

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-gemini@774

@tanstack/ai-gemini-cli

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-gemini-cli@774

@tanstack/ai-grok

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-grok@774

@tanstack/ai-groq

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-groq@774

@tanstack/ai-isolate-cloudflare

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-isolate-cloudflare@774

@tanstack/ai-isolate-node

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-isolate-node@774

@tanstack/ai-isolate-quickjs

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-isolate-quickjs@774

@tanstack/ai-mcp

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-mcp@774

@tanstack/ai-ollama

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-ollama@774

@tanstack/ai-openai

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-openai@774

@tanstack/ai-opencode

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-opencode@774

@tanstack/ai-openrouter

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-openrouter@774

@tanstack/ai-preact

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-preact@774

@tanstack/ai-react

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-react@774

@tanstack/ai-react-ui

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-react-ui@774

@tanstack/ai-sandbox

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-sandbox@774

@tanstack/ai-sandbox-cloudflare

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-sandbox-cloudflare@774

@tanstack/ai-sandbox-docker

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-sandbox-docker@774

@tanstack/ai-sandbox-local-process

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-sandbox-local-process@774

@tanstack/ai-solid

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-solid@774

@tanstack/ai-solid-ui

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-solid-ui@774

@tanstack/ai-svelte

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-svelte@774

@tanstack/ai-utils

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-utils@774

@tanstack/ai-vue

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-vue@774

@tanstack/ai-vue-ui

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-vue-ui@774

@tanstack/openai-base

npm i https://pkg.pr.new/TanStack/ai/@tanstack/openai-base@774

@tanstack/preact-ai-devtools

npm i https://pkg.pr.new/TanStack/ai/@tanstack/preact-ai-devtools@774

@tanstack/react-ai-devtools

npm i https://pkg.pr.new/TanStack/ai/@tanstack/react-ai-devtools@774

@tanstack/solid-ai-devtools

npm i https://pkg.pr.new/TanStack/ai/@tanstack/solid-ai-devtools@774

commit: f82a981

AlemTuzlak and others added 14 commits June 16, 2026 17:04
…ential leakage

Security review (PR #774):
- argument injection: insert '--' end-of-options separators before positionals (clone url/target, add paths) and reject url/ref/dir/path values beginning with '-' (flag-smuggling guard)
- secrets in argv: stop embedding the auth token in the clone URL (leaked via ps/logs); use a one-shot credential.helper that reads the token from the child ENV, single-quoted so the outer shell never expands it
- 4 unit tests pinning: token absent from argv + present in env, '--' separators, leading-dash rejection, quote escaping
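
The three hardening measures above (end-of-options separators, leading-dash rejection, and a credential helper that reads the token from the environment) can be sketched as follows. This is an illustration, not createExecBackedGit itself: the function names, the `SANDBOX_GIT_TOKEN` variable, and the `x-access-token` username are assumptions.

```typescript
// Rejects values that git could mistake for an option (flag smuggling).
function assertNotFlag(label: string, value: string): string {
  if (value.charAt(0) === '-') throw new Error(label + " must not start with '-': " + value)
  return value
}

// Wraps a string in single quotes for a POSIX shell, escaping embedded single quotes.
function shellSingleQuote(value: string): string {
  return "'" + value.replace(/'/g, "'\\''") + "'"
}

// Hypothetical name of the environment variable that carries the token to the child.
const TOKEN_ENV = 'SANDBOX_GIT_TOKEN'

// Builds the argv for a clone. The token itself never appears here: the one-shot
// credential helper echoes it from the child environment when git asks for it.
function buildCloneArgs(url: string, target: string, withAuth: boolean): string[] {
  const args = ['git']
  if (withAuth) {
    const helper = '!f() { echo username=x-access-token; echo "password=$' + TOKEN_ENV + '"; }; f'
    args.push('-c', 'credential.helper=' + helper)
  }
  // `--` ends option parsing, so url and target are always treated as positionals.
  args.push('clone', '--', assertNotFlag('url', url), assertNotFlag('target', target))
  return args
}

// Joins argv into one shell command line with every argument single-quoted, so the
// outer shell passes `$SANDBOX_GIT_TOKEN` through literally instead of expanding it.
function toShellCommand(argv: string[]): string {
  return argv.map(shellSingleQuote).join(' ')
}
```

With this arrangement a process listing shows only the variable reference, while the secret is supplied through the child's environment.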
- @tanstack/ai-sandbox-docker: SandboxHandle over a Docker container
- create/resume-by-id/restoreSnapshot(commit image)/destroy; durable fs across stop/start
- exec + duplex spawn via dockerode exec + stream demux; fs over base64 piping (binary-safe, no tar dep)
- commit-based snapshot + fork; host.docker.internal gateway for host MCP reachability; publishPorts -> ports.connect
- exec-backed git reused from core
- 3 integration tests (gated on a reachable daemon) — verified green against a real daemon: exec, fs+binary round-trip, snapshot, resume, spawn streaming, ensure+bootstrap
- pnpm-workspace: declare dockerode's optional native deps (cpu-features, ssh2) as not-built (JS fallback, local socket)
- claudeCodeText now declares requires:[SandboxCapability] and spawns the claude CLI INSIDE the sandbox via sandbox.process (claude -p --output-format stream-json), reusing translateSdkStream for the stdout NDJSON
- prompt fed via stdin (not argv); session id surfaced as before; emits a file.changed CUSTOM event with the git diff after the run
- permission-mode/allowed/disallowed/add-dir/max-turns/system-prompt mapped to CLI flags; default permission-mode bypassPermissions (sandbox is isolated)
- drop @anthropic-ai/claude-agent-sdk + @modelcontextprotocol/sdk deps; remove the in-process tool bridge (chat()-tools MCP proxy deferred — adapter rejects tools for now); provider-options self-contained
- spawnNdjson gains an option to feed stdin
- deterministic test via a fake claude CLI in a real local-process sandbox (24 tests); types + lint clean
Runnable demo (examples/sandbox-coding-agent) that runs Claude Code inside a sandbox to fix a bug end-to-end via chat() + withSandbox:
- bootstraps a tiny git repo with a deliberate bug, asks the agent to fix it, streams output + prints the git diff
- Docker provider by default (installs the claude CLI in setup); SANDBOX=local runs on the host process
- README with prerequisites + run instructions for manual e2e verification
…lag mapping; changesets

- SandboxPolicyCapability: withSandbox provides the definition policy (conditionally); harness adapters read it via getOptional
- claude-code maps defineSandboxPolicy (default decision + fileWrite/network caps + tool-name command rules) onto --permission-mode/--allowedTools/--disallowedTools (best-effort; fine-grained command globs await the MCP permission-prompt tool)
- changesets for the sandbox layer + updated claude-code changeset for the in-sandbox behavior
- policy-map unit tests (5)
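
A best-effort mapping from a policy onto the three CLI flags named above could look like this sketch. The flag names come from the commit message; the `PolicySketch` shape, the choice of tool names to disallow, and the permission-mode selection are assumptions and may not match the adapter's actual rules.

```typescript
type Decision = 'allow' | 'ask' | 'deny'

// Hypothetical, simplified policy shape for illustration only.
interface PolicySketch {
  defaultDecision: Decision
  fileWrite: boolean
  network: boolean
  toolRules: Array<{ tool: string; decision: Decision }>
}

function policyToClaudeFlags(policy: PolicySketch): string[] {
  const allowed: string[] = []
  const disallowed: string[] = []

  // Capability switches become blanket tool denials (tool names are assumed).
  if (!policy.fileWrite) disallowed.push('Edit', 'Write')
  if (!policy.network) disallowed.push('WebFetch', 'WebSearch')

  for (const rule of policy.toolRules) {
    if (rule.decision === 'allow') allowed.push(rule.tool)
    if (rule.decision === 'deny') disallowed.push(rule.tool)
    // 'ask' has no static flag equivalent here; it is left to the permission mode.
  }

  // Assumption: an allow-by-default policy skips prompts, anything stricter keeps them.
  const mode = policy.defaultDecision === 'allow' ? 'bypassPermissions' : 'default'
  const flags = ['--permission-mode', mode]
  if (allowed.length > 0) flags.push('--allowedTools', allowed.join(','))
  if (disallowed.length > 0) flags.push('--disallowedTools', disallowed.join(','))
  return flags
}
```

Static flags can only express per-tool decisions, which is consistent with the note that fine-grained command globs await the MCP permission-prompt tool.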
- docs/sandbox/overview.md: mental model, providers, defineWorkspace/defineSandboxPolicy, lifecycle/resume, events, the runnable example (no as-casts; latest model id)
- docs/config.json: new Sandboxes section (addedAt 2026-06-16)
- packages/ai-sandbox/skills/ai-sandbox: agent skill covering the sandbox APIs + critical rules
- ship skills in the package files
- test:docs green
…n-sandbox agent

- startHostToolBridge: host-side Streamable-HTTP MCP server exposing chat() server tools; the in-sandbox claude calls mcp__tanstack__<tool>, proxied back to the host where execute() runs (closures/DB/secrets). Per-run bearer token; binds for host.docker.internal reachability from Docker
- adapter wires --mcp-config when tools are present, picks localhost vs host.docker.internal by provider, and tears the bridge down after the run; tools no longer rejected
- 3 host-side tests via the MCP SDK client (list/call/error/auth) — verified green without needing claude
- docs + skill updated to describe the tool-proxy
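
Choosing the bridge hostname by provider and assembling the `--mcp-config` payload might be sketched as below. The `tanstack` server name follows the `mcp__tanstack__<tool>` naming above; the `/mcp` path, the provider union, the function names, and the exact JSON layout (`type`, `url`, `headers`) are assumptions to verify against the CLI's documentation.

```typescript
type ProviderKind = 'local-process' | 'docker'

// Which hostname an in-sandbox process should use to reach a server on the host.
function hostForSandboxSketch(provider: ProviderKind): string {
  return provider === 'docker' ? 'host.docker.internal' : 'localhost'
}

// Builds the JSON handed to `--mcp-config`. The per-run bearer token travels in a
// header, so requests without that run's token can be rejected by the bridge.
function buildMcpConfig(provider: ProviderKind, port: number, token: string): string {
  const url = 'http://' + hostForSandboxSketch(provider) + ':' + String(port) + '/mcp'
  return JSON.stringify({
    mcpServers: {
      tanstack: {
        type: 'http',
        url,
        headers: { Authorization: 'Bearer ' + token },
      },
    },
  })
}
```

Generating a fresh token for every run keeps a leaked config from one run from being replayed against a later bridge.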
- @tanstack/ai-sandbox-cloudflare: cloudflareSandbox() on @cloudflare/sandbox (edge, inside a Worker)
- uniform SandboxHandle: exec, base64-backed fs, exec-backed git, exposePort preview URLs (previewHostname), setEnvVars; spawn via startProcess+onOutput queue
- ephemeral disk + no GA snapshots -> durableFilesystem/snapshots false (withSandbox re-bootstraps across cold starts); background processes have no stdin (documented; stdin-fed harnesses need local-process/docker)
- compiles against the real @cloudflare/sandbox types; 7 deterministic handle tests against a mock Sandbox (fs round-trip, exec, spawn queue, stdin limitation, port). Runtime verification pending a Workers runtime
- align @cloudflare/workers-types version with the workspace (sherif)
- codexText declares requires:[SandboxCapability]; spawns 'codex exec --experimental-json' inside the sandbox (mirroring @openai/codex-sdk's own CLI invocation), prompt via stdin, JSONL thread events → existing translateThreadEvents
- sandbox mode / approval policy / reasoning effort / add-dir / skip-git-repo-check / config mapped to codex CLI flags; resume via 'resume <id>'
- drop @openai/codex-sdk + @modelcontextprotocol/sdk + the in-process tool bridge; provider-options self-contained; chat()-tools bridging deferred (rejects tools)
- deterministic fake-codex-CLI test in a real local-process sandbox (27 tests); types/lint/knip/sherif clean
autofix-ci Bot and others added 19 commits June 16, 2026 17:25
- geminiCliText declares requires:[SandboxCapability]; spawns 'gemini --acp' inside the sandbox and drives it over ACP via the sandbox's duplex process IO
- new spawnHandleToAcpTransport adapts a SpawnHandle into the Uint8Array WebStreams ndJsonStream needs; all @agentclientprotocol/sdk protocol handling reused unchanged
- drop local child_process spawn + @modelcontextprotocol/sdk + in-process tool bridge; chat()-tools bridging deferred (rejects tools); structuredOutput throws not-supported
- transport-adapter + requires-sandbox tests (36); types/lint/knip/sherif clean
- opencodeText declares requires:[SandboxCapability]; spawns 'opencode serve' inside the sandbox, waits for readiness, exposes the port, and connects @opencode-ai/sdk's HTTP client via baseUrl (reusing startOpencodeSession's connect path)
- new startOpencodeServerInSandbox helper (readiness detection + port exposure); Docker needs publishPorts:[port]
- drop @modelcontextprotocol/sdk + in-process tool bridge; chat()-tools bridging deferred (rejects tools); structuredOutput throws not-supported; permission governed by the dynamic handler
- server-helper + requires-sandbox tests (36); types/lint/knip/sherif clean
…4 harness adapters

- move startHostToolBridge + BRIDGED_MCP_SERVER_NAME + hostForSandbox into @tanstack/ai-sandbox core (shared); add @modelcontextprotocol/sdk dep there; tool-bridge test relocated to core
- claude-code: import bridge from core; build --mcp-config from the bridge (drop local bridge + dep)
- codex: bridge tools via --config mcp_servers.<name>.url + bearer_token
- gemini-cli: bridge tools via ACP newSession mcpServers (http + Authorization header)
- opencode: bridge tools via OPENCODE_CONFIG_CONTENT mcp.remote (url + bearer header) at server spawn
- all adapters no longer reject tools; bridged tool names feed the permission handlers; changesets updated
- types/eslint/lib/build/knip/sherif green across all 5 packages
- @tanstack/ai-sandbox: shared resolveApproval (policy + client approvals -> allow/deny/needs-approval), stable approvalId, buildApprovalRequestedEvent (AG-UI CUSTOM 'approval-requested')
- @tanstack/ai: TextOptions.approvals threaded from the engine's initialApprovals so harness adapters resolve ask-policy permission requests against the client's decisions (resume-based loop)
- 11 unit tests for the resolver/keying/event; @tanstack/ai 1033 tests still pass
Wire client-in-the-loop approvals through every in-sandbox harness adapter,
built on the shared approval primitives in @tanstack/ai-sandbox.

- core bridge: optional permission-prompt tool on startHostToolBridge, and
  export PermissionToolResult so adapters can type their resolver.
- claude-code: enforce the sandbox policy via --permission-prompt-tool; an
  `ask` action with no client decision yet emits an approval-requested event
  and denies, so the client approves and re-runs to continue.
- gemini-cli / opencode: resolveInteractivePermission consults policy + client
  approvals, collects approval-requested events, and yields them after the
  stream (coercing nullable ACP tool titles).
- codex: map defineSandboxPolicy onto `codex exec`'s coarse knobs (sandbox mode,
  approval_policy, network_access). codex exec is non-interactive with no
  per-action host callback, so the resume-based approval flow is not available
  for codex (documented); adds policy-map + tests.
- changesets updated to describe the interactive-approval behavior.
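
The resume-based approval loop described above reduces to a small resolver: policy decides first, and an `ask` is settled by whatever the client has already approved. The sketch below is hypothetical (names, id format, and the `Map` of client decisions are assumptions); the shared resolveApproval may key and store decisions differently.

```typescript
type PolicyDecision = 'allow' | 'ask' | 'deny'
type ApprovalOutcome = 'allow' | 'deny' | 'needs-approval'

// Derives a stable id from the action so that a re-run after the client approves
// maps to the same key. A real implementation would likely hash a canonical form;
// this sketch uses a readable string and relies on stable key order in the input.
function approvalIdFor(tool: string, input: unknown): string {
  return tool + ':' + JSON.stringify(input)
}

// clientApprovals holds decisions the client already made, keyed by approval id.
function resolveApprovalSketch(
  policyDecision: PolicyDecision,
  approvalId: string,
  clientApprovals: Map<string, boolean>,
): ApprovalOutcome {
  // allow and deny are final; only 'ask' consults the client.
  if (policyDecision !== 'ask') return policyDecision
  const decided = clientApprovals.get(approvalId)
  if (decided === undefined) return 'needs-approval'
  return decided ? 'allow' : 'deny'
}
```

On the first run an `ask` action yields `needs-approval`, which the adapter surfaces as an approval-requested event; after the client records a decision under the same id, the re-run resolves to allow or deny.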
Add provider-agnostic sandbox file-event hooks and a runnable demo that
uses them.

Hooks (@tanstack/ai-sandbox):
- watchWorkspace(handle, { onEvent }) + watchWithHooks(handle, hooks) emit
  typed FileEvents (create/change/delete). A native fs.watch fast-path is
  used when the provider advertises it; otherwise a portable `find -printf`
  mtime snapshot-diff poll runs (no extra deps / image changes). .git and
  node_modules are ignored by default.
- withSandboxFileEvents() middleware surfaces events into the chat() stream
  as CUSTOM `sandbox.file` events, interleaved with the agent's output.
- local-process gains the native fs.watch seam (Node recursive watch on
  Windows/macOS; Linux falls back to the poll).
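
The portable fallback, an mtime snapshot-diff poll, can be sketched as two pure functions: one parses `find` output into a snapshot, the other diffs two snapshots into create/change/delete events. The names and the `%T@ %p` output format are assumptions; the actual watcher may use a different `-printf` format and event shape.

```typescript
type FileEventSketch = { type: 'create' | 'change' | 'delete'; path: string }

// A snapshot maps each path to its modification time.
type Snapshot = Map<string, number>

const IGNORED_DIRS = ['.git', 'node_modules']

function isIgnored(path: string): boolean {
  return path.split('/').some((segment) => IGNORED_DIRS.indexOf(segment) !== -1)
}

// Parses lines shaped like "<mtime> <path>", e.g. from
// `find . -type f -printf '%T@ %p\n'` (GNU find; the format is an assumption).
function parseFindOutput(output: string): Snapshot {
  const snapshot: Snapshot = new Map()
  for (const line of output.split('\n')) {
    const space = line.indexOf(' ')
    if (space === -1) continue
    snapshot.set(line.slice(space + 1), Number(line.slice(0, space)))
  }
  return snapshot
}

// Compares two polls: new paths are creates, changed mtimes are changes,
// and paths missing from the newer snapshot are deletes.
function diffSnapshots(previous: Snapshot, current: Snapshot): FileEventSketch[] {
  const events: FileEventSketch[] = []
  current.forEach((mtime, path) => {
    if (isIgnored(path)) return
    const before = previous.get(path)
    if (before === undefined) events.push({ type: 'create', path })
    else if (before !== mtime) events.push({ type: 'change', path })
  })
  previous.forEach((_mtime, path) => {
    if (!isIgnored(path) && !current.has(path)) events.push({ type: 'delete', path })
  })
  return events
}
```

Because the poll only runs `find` through the sandbox's exec, it needs no extra dependencies in the image, at the cost of missing changes that happen and revert between two polls.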

Example (examples/sandbox-issue-triage):
- Fetches the first open issue on TanStack/ai, clones the repo into a
  sandbox, runs Claude Code inside it to triage the issue and write
  ISSUE-REPORT.md, reads it back via sandbox.fs, and writes a local report
  with the observed file events appended. Two entrypoints: process + docker.

Docs/skill updated; changeset added.
