Provider-neutral coding agent — 9 providers · 41 tools · 8 hooks · multi-model collaboration.
The one CLI that can do what no single-vendor CLI can: ask Claude, GPT, and Gemini the same question simultaneously, race them, or chain them as specialists. Works with any OpenAI-compatible provider.
· ✦ ·
▄██████▄ Forge v0.2.0
██████████ armature agent runtime
▀████████▀
██████ ▸ google/gemini-2.5-pro 1M ctx · 65K out [yolo]
████████ ▸ ~/Projects/my-app
41 tools · 8 hooks
npm install -g @armature/forge-cli
Any ONE of these keys gets you started:
export GOOGLE_API_KEY=... # Google Gemini
export ANTHROPIC_API_KEY=... # Anthropic Claude
export OPENAI_API_KEY=... # OpenAI GPT
export POE_API_KEY=... # Poe (aggregator: all vendors via 1 key)
export OPENROUTER_API_KEY=... # OpenRouter (aggregator)
forge chat # interactive REPL
forge chat "explain this codebase" # one-shot
forge run "fix the failing tests" # task execution
forge council "SQL or NoSQL for this?" -n 5 # 5 models + judge
forge race "write a CSV parser" # first model wins
forge pipeline "build REST API" --stages 5 # plan→code→review→fix→verify
forge stats # token usage and cost
forge session list # saved sessions
forge pr 123 # checkout + review PR
forge serve --port 9100 # headless HTTP server
forge providers # list configured providers
No single-vendor CLI can do this. Forge accesses 11 models from 9 vendors through one API key.
Ask N models the same question. A judge synthesizes the best answer.
forge council "is this code thread-safe?" -n 3
forge council "review for security issues" -n 5 -j claude-opus-4.6
╭── Council: 3 models ──╮
● claude-opus-4.6... 4.2s
● gpt-5.4... 2.1s
● gemini-3.1-pro... 3.8s
★ Verdict (claude-opus-4.6 as judge)
All three agree on the race condition in line 42...
Confidence: HIGH (3/3 agree)
─ 3 models · 12.1s · agreement: high ─
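The verdict footer above reports how many models agree. A minimal sketch of how such a label could be derived from per-model verdicts follows. The `ModelAnswer` shape, the `summarizeAgreement` helper, and the thresholds (unanimous is HIGH, strict majority is MEDIUM) are assumptions for illustration, not Forge's actual implementation.

```typescript
// Hypothetical shape: one short verdict per council member.
interface ModelAnswer {
  model: string;
  verdict: string; // e.g. "race-condition" or "thread-safe"
}

type Confidence = "HIGH" | "MEDIUM" | "LOW";

interface Agreement {
  majority: string;
  agree: number;
  total: number;
  confidence: Confidence;
}

// Count identical verdicts and label how strongly the council agrees.
// Thresholds are assumed: unanimous = HIGH, strict majority = MEDIUM.
function summarizeAgreement(answers: ModelAnswer[]): Agreement {
  if (answers.length === 0) {
    throw new Error("council needs at least one answer");
  }
  const counts: Record<string, number> = {};
  for (const a of answers) {
    counts[a.verdict] = (counts[a.verdict] ?? 0) + 1;
  }
  let majority = "";
  let agree = 0;
  for (const verdict of Object.keys(counts)) {
    const n = counts[verdict] ?? 0;
    if (n > agree) {
      majority = verdict;
      agree = n;
    }
  }
  const total = answers.length;
  const confidence: Confidence =
    agree === total ? "HIGH" : agree * 2 > total ? "MEDIUM" : "LOW";
  return { majority, agree, total, confidence };
}
```

In practice a judge model would first have to normalize free-text answers into comparable verdicts; this sketch only covers the counting step.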
N models race. First good answer wins, rest cancelled.
forge race "write a quicksort in Python" -n 5
Chain models as specialists. Each stage feeds into the next.
forge pipeline "build auth middleware" --plan claude-opus-4.6 --code gpt-5.4 --review gemini-3.1-pro
| Stage | Default Model | Role |
|---|---|---|
| Plan | claude-opus-4.6 | Architecture, data flow, API design |
| Code | gpt-5.4 | Fast implementation |
| Review | gemini-3.1-pro | Bug/security/perf review (2M context) |
| Fix | gpt-5.4 | Address review findings |
| Verify | claude-opus-4.6 | Confirm fix matches plan |
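The stage table can be read as a fold: each stage receives the previous stage's output. The sketch below uses synchronous stub functions in place of model calls. `Stage`, `runPipeline`, and the `tag` stub are hypothetical names; a real pipeline would call each provider asynchronously.

```typescript
// Hypothetical stage: a named role bound to a model and a transform.
interface Stage {
  name: string;
  model: string;
  run: (input: string) => string; // stub; a real stage would call the model
}

interface StageResult {
  stage: string;
  model: string;
  output: string;
}

// Feed the task into the first stage, then each output into the next stage.
function runPipeline(
  task: string,
  stages: Stage[],
): { final: string; trace: StageResult[] } {
  const trace: StageResult[] = [];
  let current = task;
  for (const s of stages) {
    current = s.run(current);
    trace.push({ stage: s.name, model: s.model, output: current });
  }
  return { final: current, trace };
}

// Stand-in for a model call: wraps the input in the stage label.
const tag = (label: string) => (input: string): string => `${label}(${input})`;

// Stage order and default models taken from the table above.
const defaultStages: Stage[] = [
  { name: "plan", model: "claude-opus-4.6", run: tag("plan") },
  { name: "code", model: "gpt-5.4", run: tag("code") },
  { name: "review", model: "gemini-3.1-pro", run: tag("review") },
  { name: "fix", model: "gpt-5.4", run: tag("fix") },
  { name: "verify", model: "claude-opus-4.6", run: tag("verify") },
];

const pipelineResult = runPipeline("build auth middleware", defaultStages);
console.log(pipelineResult.final);
// verify(fix(review(code(plan(build auth middleware)))))
```

Keeping the per-stage trace alongside the final output makes it possible to show which model produced which intermediate result.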
Works with any OpenAI-compatible endpoint. Configure in ~/.armature/config.json:
| Provider | Type | API Key Env |
|---|---|---|
| anthropic | Direct | ANTHROPIC_API_KEY |
| google | Direct | GOOGLE_API_KEY |
| openai | Direct | OPENAI_API_KEY |
| poe | Aggregator | POE_API_KEY |
| openrouter | Aggregator | OPENROUTER_API_KEY |
| deepseek | Direct | DEEPSEEK_API_KEY |
| groq | Direct | GROQ_API_KEY |
| xai | Direct | XAI_API_KEY |
| local | Direct | (Ollama at localhost:11434) |
Aggregators (Poe, OpenRouter) route to all vendors via one API key — ideal for council/race/pipeline. Direct providers connect to each vendor's own API.
Multi-model routing: aggregator first, direct fallback per model.
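A minimal sketch of "aggregator first, direct fallback per model", assuming the key variables from the provider table. The `resolveProvider` helper, the idea of passing the vendor name in, and the order in which aggregators are tried (Poe before OpenRouter) are assumptions, not documented behavior.

```typescript
type Env = Record<string, string | undefined>;

// Aggregator key variables from the provider table; the order is an assumption.
const AGGREGATOR_KEYS: Array<[string, string]> = [
  ["poe", "POE_API_KEY"],
  ["openrouter", "OPENROUTER_API_KEY"],
];

// Direct providers and their key variables, also from the provider table.
const DIRECT_KEYS: Record<string, string> = {
  anthropic: "ANTHROPIC_API_KEY",
  google: "GOOGLE_API_KEY",
  openai: "OPENAI_API_KEY",
  deepseek: "DEEPSEEK_API_KEY",
  groq: "GROQ_API_KEY",
  xai: "XAI_API_KEY",
};

// Pick a provider for one model: any configured aggregator wins,
// otherwise fall back to the vendor's direct API if its key is set.
function resolveProvider(vendor: string, env: Env): string | null {
  for (const [provider, keyName] of AGGREGATOR_KEYS) {
    if (env[keyName]) return provider;
  }
  const directKey = DIRECT_KEYS[vendor];
  if (directKey !== undefined && env[directKey]) return vendor;
  return null; // nothing configured for this model
}
```

Resolving per model rather than once per session is what lets a council mix aggregator-routed and directly-routed models in a single run.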
| Model | Vendor | Strength |
|---|---|---|
| claude-opus-4.6 | Anthropic | Deep reasoning, careful analysis |
| claude-sonnet-4.6 | Anthropic | Fast + capable (default) |
| gpt-5.4 | OpenAI | Fast code generation |
| gemini-3.1-pro | Google | 2M context, multimodal |
| gemini-3.1-flash-lite | Google | Ultra-fast, cheap |
| gemma-4-31b | Google/Meta | Open-source, local-friendly |
| glm-5 | Zhipu | Chinese language excellence |
| grok-4.20-multi-agent | xAI | Multi-agent native |
| qwen3.6-plus | Alibaba | Math, reasoning |
| kimi-k2.5 | Moonshot | Long-context reasoning |
| minimax-m2.7 | MiniMax | Creative generation |
Tools the model calls autonomously. Grouped by capability:
| Category | Tools | Count |
|---|---|---|
| File I/O | read, write, edit, multi_edit, patch, delete, move, copy, mkdir, file_info | 10 |
| Search | search, glob, find_definition, find_references, tree, count_lines, tool_search | 7+(1) |
| Git | status, diff, log, commit | 4 |
| Execution | run_command, run_background, check_port, sleep | 4 |
| Agent/Swarm | spawn_agent, delegate_task | 2 |
| Task Mgmt | task_create, task_update, task_list | 3 |
| Planning | create_plan, verify_plan | 2 |
| Interaction | ask_user, notify_user | 2 |
| Web | fetch_url, web_search | 2 |
| MCP | mcp_list_servers, mcp_list_resources, mcp_read_resource | 3 |
| Notebook | notebook_edit | 1 |
9 tools require confirmation in --safe mode: write, edit, multi_edit, patch, delete, move, run_command, run_background, git_commit.
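The confirmation rule above amounts to a small gate in front of tool execution. This is a sketch only: the `needsConfirmation` helper and the `Mode` type are hypothetical, and the tool names are copied from the list above.

```typescript
// The nine tool names listed above as requiring confirmation in --safe mode.
const CONFIRM_IN_SAFE_MODE: readonly string[] = [
  "write",
  "edit",
  "multi_edit",
  "patch",
  "delete",
  "move",
  "run_command",
  "run_background",
  "git_commit",
];

type Mode = "yolo" | "safe";

// YOLO auto-approves everything; safe mode asks before a listed tool runs.
function needsConfirmation(tool: string, mode: Mode): boolean {
  return mode === "safe" && CONFIRM_IN_SAFE_MODE.includes(tool);
}
```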
Configure in .armature/hooks.json. Shell commands receive JSON stdin, return JSON stdout.
| Hook | When | Can Block? |
|---|---|---|
| PreToolUse | Before tool execution | Yes (exit 1 = block) |
| PostToolUse | After tool execution | No |
| SessionStart | REPL startup | No |
| SessionEnd | Clean exit | No |
| PreCompact | Before /compact | No |
| PostCompact | After /compact | No |
| UserPromptSubmit | Before prompt to model | Yes |
| SubagentStart | Sub-agent spawn | No |
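The decision logic of a PreToolUse hook might look like the sketch below. The contract stated above is JSON on stdin, JSON on stdout, and exit code 1 to block. The payload fields (`tool`, `input.command`) and the response fields (`decision`, `reason`) are assumptions, since the actual schema is not documented in this README.

```typescript
// Assumed payload shape; the real fields Forge sends are not documented here.
interface PreToolUsePayload {
  tool?: unknown;
  input?: { command?: unknown };
}

interface HookResult {
  exitCode: number; // 1 blocks the tool call (per the table above), 0 allows it
  stdout: string; // JSON text handed back to the caller
}

const allow = (): HookResult => ({
  exitCode: 0,
  stdout: JSON.stringify({ decision: "allow" }),
});

// Block run_command calls that contain "rm -rf"; allow everything else.
function preToolUse(stdinText: string): HookResult {
  let payload: PreToolUsePayload = {};
  try {
    const parsed: unknown = JSON.parse(stdinText);
    if (typeof parsed === "object" && parsed !== null) {
      payload = parsed as PreToolUsePayload;
    }
  } catch {
    // Malformed input: allow rather than stall the session (a design choice).
    return allow();
  }
  const rawCommand = payload.input?.command;
  const command = typeof rawCommand === "string" ? rawCommand : "";
  if (payload.tool === "run_command" && command.includes("rm -rf")) {
    return {
      exitCode: 1,
      stdout: JSON.stringify({ decision: "block", reason: "rm -rf is not allowed" }),
    };
  }
  return allow();
}
```

A real hook script would read all of stdin, pass it to a function like this, print `stdout`, and exit with `exitCode`. Keeping the decision pure makes it easy to unit-test without spawning a process.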
| Command | Description |
|---|---|
| `/help` | Show all commands + tips |
| `/models` | Interactive model picker (1-11) |
| `/model set <name>` | Switch model mid-session |
| `/council <prompt>` | Multi-model council |
| `/race <prompt>` | Multi-model race |
| `/pipeline <prompt>` | Multi-model pipeline |
| `/clear` | Clear conversation |
| `/compact` | Keep last 2 turns |
| `/system <prompt>` | Set system prompt |
| `/diff` | Show git diff |
| `/git <cmd>` | Run git command |
| `/save [name]` | Save session |
| `/load [name]` | Load session |
| `/sessions` | List saved sessions |
| `/undo` | Revert last file write |
| `/hooks` | Show registered hooks |
| `/retry` | Retry last message |
| `/history`, `/tokens`, `/stats` | Session metrics |
| `/cwd` | Working directory |
| `/exit` | Exit with summary |
| Mode | Flag | Default |
|---|---|---|
| YOLO | (none) | Yes: auto-approve, actions visible |
| Safe | --safe | No: interactive y/n + diff preview |
- Code blocks: box-drawing borders (╭╮╰╯│) + syntax highlighting (JS/TS, Python, Shell, JSON)
- Inline: bold, italic, `code` (dark background)
- Lists, blockquotes, headings, links, horizontal rules
Features that close the gap between "tool" and "agent":
| Capability | What It Does | Why It Matters |
|---|---|---|
| Project Context Loader | Auto-detects type, framework, test runner, deps | Agent knows the project from turn 1 |
| Smart Output Truncation | 8K limit with summary header (line count + file list) | Prevents context pollution from large grep results |
| Error Self-Correction | Failed tools return recovery hints ("use read_file first") | Model self-corrects without human intervention |
| Shell Injection Protection | All user inputs shellEscaped before exec | Security baseline for production agent |
| Unlimited Agent Loop | Auto-continue on truncation, incomplete text detection | Tasks complete without artificial limits |
| Multi-edit Atomicity | Failed batch edits leave file unchanged | No partial corruption on error |
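The multi-edit atomicity row can be illustrated with an all-or-nothing apply. This sketch assumes a simple find/replace edit shape. `Edit`, `MultiEditResult`, and `applyEdits` are hypothetical names, and Forge's real edit format may differ.

```typescript
// Hypothetical edit: replace the first occurrence of `find` with `replace`.
interface Edit {
  find: string;
  replace: string;
}

type MultiEditResult =
  | { ok: true; content: string }
  | { ok: false; content: string; failedIndex: number };

// Apply every edit to an in-memory copy. If any edit does not match,
// return the original content untouched along with the failing index.
function applyEdits(original: string, edits: Edit[]): MultiEditResult {
  let working = original;
  let index = 0;
  for (const e of edits) {
    if (e.find === "" || !working.includes(e.find)) {
      return { ok: false, content: original, failedIndex: index };
    }
    // A replacer function keeps "$" sequences in e.replace literal.
    working = working.replace(e.find, () => e.replace);
    index++;
  }
  return { ok: true, content: working };
}
```

The caller writes `content` to disk only when `ok` is true, so a failed batch never leaves a half-edited file behind.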
Tested: 326 tests across 20 files, 10/10 SOTA benchmark.
┌─────────────────────────────────────────────────────┐
│ Forge CLI v0.2.0 │
│ 7,200+ LOC · 26 source files · 326 tests │
├─────────────────────────────────────────────────────┤
│ New in v0.2.0 │
│ providers · stats · session · pr · serve │
│ per-model routing · aggregator+direct fallback │
├─────────────────────────────────────────────────────┤
│ Multi-Model Engine │
│ council · race · pipeline │
│ 9 providers · aggregator or direct per-model │
├─────────────────────────────────────────────────────┤
│ Agent Runtime │
│ 41 tools · 8 hooks · YOLO/safe · sub-agents │
│ StreamMarkdown · session persistence · MCP client │
├─────────────────────────────────────────────────────┤
│ OpenAI-compat Provider + SQLite Usage Tracking │
│ 429 auto-retry · model-aware max_tokens · SSE │
│ headless serve mode · PR review workflow │
├─────────────────────────────────────────────────────┤
│ @armature/sdk (optional, any provider via shim) │
│ 51 tools · full MCP · OpenAI-compat shim │
└─────────────────────────────────────────────────────┘
CLI flags > ENV vars > .armature.json > ~/.armature/config.json
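The precedence chain above can be sketched as a layered merge in which higher-priority sources override lower ones. The `resolveConfig` helper and the keys `model`, `safe`, and `port` are illustrative assumptions, not Forge's actual config schema.

```typescript
type Config = Record<string, string | number | boolean>;

// Merge layers given highest priority first, so earlier layers override later ones.
function resolveConfig(layersHighToLow: Config[]): Config {
  const merged: Config = {};
  // Walk from lowest to highest priority so that later assignments win.
  for (const layer of [...layersHighToLow].reverse()) {
    Object.assign(merged, layer);
  }
  return merged;
}

const effectiveConfig = resolveConfig([
  { model: "gpt-5.4" }, // CLI flags
  { model: "gemini-3.1-pro", safe: true }, // ENV vars
  { port: 9100 }, // .armature.json (project)
  { model: "claude-sonnet-4.6", port: 8080 }, // ~/.armature/config.json (user)
]);

// model comes from the CLI flags, safe from ENV, port from the project file.
console.log(effectiveConfig);
```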
MIT
Maurice | maurice_wen@proton.me