Browser automation MCP

A higher-level layer for browser automation built for AI agents

An MCP browser server hands your agent raw primitives to navigate, click, and snapshot, then leaves it to stitch them together. Kane CLI takes a plain-English objective, drives a real browser, and returns a verified pass or fail. Free to install.

npm install -g @testmuai/kane-cli

or read the documentation

Beyond MCP browser primitives

An MCP server for browser automation exposes low-level primitives. The agent still plans every click, tracks page structure across snapshots, recovers from stale refs, and reads raw output to judge the result. That is reasoning spent on plumbing.

Kane CLI raises the contract. You give it an objective, not instructions. It drives a real Chrome browser, asserts each step, autoheals up to 50 steps when the UI shifts, and returns a verified pass or fail with video evidence.

It installs as an agent skill with one command and emits NDJSON, so there is no per-client MCP server to host. Any validated flow still exports to native Playwright when you want the code.

Kane CLI driving a browser flow from a natural language objective

What you get above an MCP browser server

The orchestration an MCP server leaves to your agent, handled by the tool.

Objectives, not primitives

Where a browser MCP server feeds the agent navigate, click, and snapshot calls to sequence, Kane CLI accepts the full journey as one plain-English objective and works out the path itself.

Verified pass or fail

Inline assertions check each step along the way, not just the closing screen, and hand back a binary verdict. The agent never parses raw MCP tool output to guess the outcome.

Autoheal up to 50 steps

When the layout drifts, Kane CLI re-resolves intent and presses on through up to 50 steps until the objective is verified, instead of giving up on a stale ref the way primitive calls do.

Agent-native NDJSON

The --agent flag streams one JSON object per event and closes with a run_end verdict carrying status, summary, and screenshots, ready for an agent loop to read with no MCP client in between.
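A minimal sketch of how an agent loop might pick the verdict out of that stream, in Python. Only the `run_end` event and its `status`, `summary`, and `screenshots` fields come from the description above. The key that names the event type (`event` here), the `"passed"` status value, and the sample payloads are assumptions for illustration; the documentation is the authority on the actual schema.

```python
import json
from typing import Iterable, Optional


def find_verdict(lines: Iterable[str], type_key: str = "event") -> Optional[dict]:
    """Return the last run_end event found in an NDJSON stream, or None.

    type_key is the (assumed) field that names the event type.
    """
    verdict = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between events
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not valid JSON
        if isinstance(event, dict) and event.get(type_key) == "run_end":
            verdict = event
    return verdict


# Hypothetical sample stream; real events may carry different fields.
sample = [
    '{"event": "step", "summary": "opened login page"}',
    '{"event": "run_end", "status": "passed", "summary": "record created", "screenshots": []}',
]

verdict = find_verdict(sample)
print(verdict["status"] if verdict else "no verdict")
```

In a real loop, `lines` would be the stdout of a Kane CLI run started with the --agent flag, read one line at a time.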

No MCP client to configure

Nothing to register against an MCP host and no per-client server to keep running. Kane CLI installs as a skill and runs the same from a terminal, in CI, or inside an agent.

Export to Playwright

Any verified flow exports to native Playwright in one command, so choosing the objective layer over raw primitives never costs you access to the underlying code.

Build up confidence locally

Start in your terminal

Validate on the cloud

Release with confidence

An agent skill, not another MCP server to configure

Kane CLI and KaneAI share the same automation engine and dashboard.

A skill, not a pile of primitives

A browser MCP server hands the agent raw tool calls to sequence. Kane CLI installs as a skill that takes an objective and returns a verified result, so the sequencing lives in the tool rather than in the agent's context.

Verify what the agent ships

Have the agent generate the code, then send the objective to Kane CLI to confirm it works in a real browser before the PR opens. Primitive calls describe actions; Kane CLI confirms the result.

Proof attached to every run

Each objective returns a persistent video, a step trace, and a replay link to paste into a PR or bug report. No reconstructing what happened from a stream of snapshot output.

Give your agent browser automation in three steps

Install as an agent skill

One curl command drops the Kane CLI skill into Claude Code, Codex, or Gemini. Nothing to register against an MCP host and no per-client server to keep alive.

Describe the objective

Hand your agent the journey in plain English: open the app, sign in, create a record, confirm it appears. From there the objective, not a chain of primitive calls, is the input.

Read the verified result

The run streams back as NDJSON and closes with a pass or fail verdict, plus a screenshot on failure. Layout drift along the way is autohealed. Your agent reads stdout directly, with no MCP client to relay it.

Get Started With Kane CLI

🎉 Launch offer: Bonus credits for the first 3 months on paid plans

Choose the right plan for you

Local test authoring via CLI

Auto-heal & vision

View test cases on UI

Test Manager

Free
200 credits, reset every 30 days

Starter
$19/month
2,000 credits (launch offer: 4,000 credits, +100%, as a bonus for the first 3 months)

Get the technical rundown

Blog

A look at Kane CLI. What we built, what it does, and where it is headed.

Read the blog

Documentation

Everything you need to install, configure, and run Kane CLI in under 2 minutes.

View documentation

GitHub

Browse the source, file issues, and follow the roadmap on GitHub.

Open GitHub

Frequently asked questions

Browser automation MCP is a Model Context Protocol server that surfaces raw browser primitives (navigate, click, type, snapshot) to an AI client. The agent fires those tools one call at a time, keeps the page tree in its own context across snapshots, and reads the raw return value to infer what changed. It puts a browser within reach of the agent, but assertion, recovery, and reporting stay the agent's problem.

A browser MCP server gives the agent low-level primitives to sequence and interpret on its own. Kane CLI lifts that contract a level: state the goal in plain English and receive a verified pass or fail. It runs a real Chrome session, asserts every step inline, autoheals up to 50 steps when the layout moves, and streams structured NDJSON. Kane CLI ships as a CLI and an agent skill rather than an MCP server, so it sits above the primitive layer instead of inside it.

No MCP setup is needed. There is no MCP host to stand up and no per-client server to register, as a browser MCP server would require. Kane CLI runs from any shell, in your pipeline, or inside an AI coding agent. Add the --agent flag and it streams structured NDJSON that Claude Code, Codex, or Gemini reads straight off stdout. Nothing gets wired into an MCP client.

One command drops Kane CLI in as a skill. For Claude Code, run mkdir -p ~/.claude/skills/kane-cli and curl the SKILL.md from the Kane CLI repo into that folder. Codex adds a section to AGENTS.md, and Gemini reads ~/.gemini/skills/kane-cli/SKILL.md. From then on the agent spots a browser task, assembles the command, parses the verdict, and digs into any failure without being told how.
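The skill locations named above can be sketched as a small Python helper that prepares the target folder before the SKILL.md is downloaded. The function names and the `home` parameter are hypothetical, and the download itself is left out because the source URL is not given here. Codex is omitted since it uses a section in AGENTS.md rather than a skill folder.

```python
from pathlib import Path

# Skill folders as described above, relative to the user's home directory.
SKILL_DIRS = {
    "claude": Path(".claude") / "skills" / "kane-cli",
    "gemini": Path(".gemini") / "skills" / "kane-cli",
}


def skill_file(agent: str, home: Path) -> Path:
    """Return where SKILL.md is expected to live for the given agent."""
    if agent not in SKILL_DIRS:
        raise ValueError(f"no skill folder known for agent: {agent}")
    return home / SKILL_DIRS[agent] / "SKILL.md"


def prepare_skill_dir(agent: str, home: Path) -> Path:
    """Create the skill folder (like mkdir -p) and return the SKILL.md path."""
    target = skill_file(agent, home)
    target.parent.mkdir(parents=True, exist_ok=True)
    return target
```

Calling `prepare_skill_dir("claude", Path.home())` mirrors the mkdir -p step; the SKILL.md would then be fetched into the returned path.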

Kane CLI runs in CI. Sign in with your TestMu AI credentials, add --headless and --timeout per run, and branch your pipeline on the exit code: 0 for pass, 1 for fail, 2 for setup or auth errors, and 3 for timeout. The same binary you run headed on a laptop runs headless in GitHub Actions, GitLab CI, Jenkins, or Bitbucket Pipelines, with no command rewrite needed.
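Branching on those exit codes can be sketched in Python as follows. The four code meanings are the ones listed above; the `Outcome` enum, the function names, and the retry policy are hypothetical and only illustrate one way a pipeline step might react.

```python
from enum import Enum


class Outcome(Enum):
    PASS = "pass"
    FAIL = "fail"
    SETUP_OR_AUTH_ERROR = "setup or auth error"
    TIMEOUT = "timeout"
    UNKNOWN = "unknown"


# Exit codes as documented: 0 pass, 1 fail, 2 setup/auth error, 3 timeout.
_EXIT_CODES = {
    0: Outcome.PASS,
    1: Outcome.FAIL,
    2: Outcome.SETUP_OR_AUTH_ERROR,
    3: Outcome.TIMEOUT,
}


def classify_exit_code(code: int) -> Outcome:
    """Map a process exit code to an outcome; anything else is UNKNOWN."""
    return _EXIT_CODES.get(code, Outcome.UNKNOWN)


def should_retry(outcome: Outcome) -> bool:
    """Hypothetical policy: retry timeouts only, never real failures."""
    return outcome is Outcome.TIMEOUT


print(classify_exit_code(3).value)
```

Keeping setup or auth errors separate from test failures lets a pipeline flag a misconfigured runner without reporting it as a regression in the app under test.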

Installing and running the CLI is free. Local runs cost nothing; cloud runs on the TestMu AI grid draw against your TestMu AI plan. Begin on the free tier and take an objective end to end with no credit card.

Install Kane CLI as an agent skill

Skip the per-client MCP server. Point your AI coding agent at the Kane CLI guide and one curl command installs the skill, so it runs verified browser objectives on its own.

Point your agent to: testmuai.com/kane-cli/agents.md