agent-browser
Browser automation CLI designed for AI agents. Compact text output minimizes context usage. Fast Rust CLI with Node.js fallback.
npm install -g agent-browser # all platforms (fastest, native Rust CLI)
brew install agent-browser # macOS
# or try without installing
npx agent-browser open example.comFeatures
- Agent-first - Compact text output uses fewer tokens than JSON, designed for AI context efficiency
- Ref-based - Snapshot returns accessibility tree with refs for deterministic element selection
- Fast - Native Rust CLI for instant command parsing
- Complete - 50+ commands for navigation, forms, screenshots, network, storage
- Sessions - Multiple isolated browser instances with separate auth
- Cross-platform - macOS, Linux, Windows with native binaries
Works with
Claude Code, Cursor, GitHub Copilot, OpenAI Codex, Google Gemini, opencode, and any agent that can run shell commands.
Example
# Navigate and get snapshot
agent-browser open example.com
agent-browser snapshot -i
# Output:
# - heading "Example Domain" [ref=e1]
# - link "More information..." [ref=e2]
# Interact using refs
agent-browser click @e2
agent-browser screenshot page.png
agent-browser closeWhy refs?
The snapshot command returns a compact accessibility tree where each element
has a unique ref like @e1, @e2. This provides:
- Context-efficient - Text output uses ~200-400 tokens vs ~3000-5000 for full DOM
- Deterministic - Ref points to exact element from snapshot
- Fast - No DOM re-query needed
- AI-friendly - LLMs parse text output naturally
Architecture
Client-daemon architecture for optimal performance:
- Rust CLI - Parses commands, communicates with daemon
- Node.js Daemon - Manages Playwright browser instance
Daemon starts automatically and persists between commands.
Platforms
Native Rust binaries for macOS (ARM64, x64), Linux (ARM64, x64), and Windows (x64).