How to Solve CAPTCHA with Vercel Agent Browser – Step-by-Step Guide Using CapSolver


Ethan Collins

Pattern Recognition Specialist

18-Mar-2026

When your AI agent hits a CAPTCHA wall, the entire workflow breaks. Navigation stops, forms can't be submitted, and data extraction fails, all because of a challenge designed to block automated access. Vercel Agent Browser is a fast, native Rust CLI for headless browser automation built specifically for AI agents. It features accessibility-first element selection, semantic locators, and a snapshot-ref workflow optimized for LLMs. But like any browser automation tool, it gets stuck on CAPTCHAs.

CapSolver changes this completely. By loading the CapSolver Chrome extension into Agent Browser using the built-in --extension flag, CAPTCHAs are resolved automatically and invisibly in the background. No manual solving. No complex API orchestration. Your CLI commands keep running as if the CAPTCHA were never there.

The best part? Agent Browser supports extensions in both headed and headless mode, unlike typical Playwright setups, which have traditionally required headed mode for extensions. This means your production pipelines, CI/CD workflows, and serverless deployments all work with zero display requirements. Your agent focuses on what it does best (navigating pages, extracting data, and automating workflows) while CapSolver handles CAPTCHAs silently.

What is Vercel Agent Browser?

Vercel Agent Browser is a headless browser automation CLI built in Rust for maximum performance. Developed by Vercel Labs, it provides a command-line interface that controls Chrome without requiring Playwright or Node.js for the browser daemon. Its accessibility-first design uses semantic locators and snapshot refs, which makes it well suited to AI agents that need to interact with web pages.

Key Features

  • Native Rust CLI: Fast, single-binary tool with no runtime dependencies for the browser daemon.
  • Snapshot-Ref Workflow: Get an accessibility tree with element refs, then interact by ref. Deterministic, fast, and AI-friendly.
  • Semantic Locators: Find elements by ARIA role, text content, label, placeholder, or alt text instead of brittle CSS selectors.
  • Headless Extension Support: Load Chrome extensions in both headed and headless mode via Chrome's --headless=new.
  • Session Management: Isolated sessions, persistent profiles, encrypted state storage, and auth vault for credential management.
  • JSON Output Mode: Machine-readable output for agent pipelines with --json.
  • Cloud Providers: Built-in support for Browserless, Browserbase, Browser Use, Kernel, and iOS Simulator.
  • Security: Domain allowlists, action policies, content boundaries, and confirmation gates for safe AI agent deployments.

Agent Browser operates on any page, including authenticated content, dynamic SPAs, and CAPTCHA-protected sites, making it ideal for AI agent workflows, data collection, and automated testing.

What is CapSolver?

CapSolver is a leading AI-powered CAPTCHA solving service that automatically resolves diverse CAPTCHA challenges. With fast response times and broad compatibility, CapSolver integrates seamlessly into automated workflows.

Supported CAPTCHA Types

  • reCAPTCHA v2 (checkbox and invisible)
  • reCAPTCHA v3 & v3 Enterprise
  • Cloudflare Turnstile
  • Cloudflare 5-second Challenge
  • AWS WAF CAPTCHA
  • More

Why This Integration is Different

Most CAPTCHA-solving integrations require you to write boilerplate code: create tasks, poll for results, inject tokens into hidden fields. That's the standard approach with raw Playwright or Puppeteer scripts.

Agent Browser + CapSolver takes a fundamentally different approach:

| Traditional (Code-Based) | Agent Browser + CapSolver Extension |
| --- | --- |
| Write a CapSolver service class | Add --extension flag to your command |
| Call createTask() / getTaskResult() | Extension handles everything automatically |
| Inject tokens via JavaScript evaluation | Token injection is invisible |
| Handle errors, retries, timeouts in code | Extension manages retries internally |
| Different code for each CAPTCHA type | Works for all types automatically |
| Headed mode required for extensions | Works in both headed AND headless mode |

The key insight: The CapSolver extension runs inside Agent Browser's Chrome instance. When Agent Browser navigates to a page with a CAPTCHA, the extension detects it, solves it in the background, and injects the token. Given a short wait, all of this finishes before your next command executes. Your automation stays clean, focused, and CAPTCHA-free.

Prerequisites

Before setting up the integration, make sure you have:

  • Vercel Agent Browser installed (npm install -g agent-browser)
  • A CapSolver account with API key (sign up at capsolver.com)
  • Node.js 16+ (for npm installation)

Note: Unlike Playwright-based tools, Agent Browser supports extensions in both headed and headless mode. No Xvfb or virtual display required on servers.

Step-by-Step Setup

Step 1: Install Agent Browser

bash Copy
npm install -g agent-browser
agent-browser install  # Download Chrome from Chrome for Testing (first time only)

Alternative installation methods:

bash Copy
# macOS via Homebrew
brew install agent-browser
agent-browser install

# Via Cargo (Rust)
cargo install agent-browser
agent-browser install

On Linux, include system dependencies:

bash Copy
agent-browser install --with-deps

Step 2: Download the CapSolver Chrome Extension

Download the CapSolver Chrome extension and extract it to a dedicated directory:

  1. Go to the CapSolver Chrome Extension v1.17.0 release
  2. Download CapSolver.Browser.Extension-chrome-v1.17.0.zip
  3. Extract the zip:
bash Copy
mkdir -p ~/capsolver-extension
unzip CapSolver.Browser.Extension-chrome-v*.zip -d ~/capsolver-extension/
  4. Verify the extraction worked:
bash Copy
ls ~/capsolver-extension/manifest.json

You should see manifest.json listed. This confirms the extension is in the right place.
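
The check above can be wrapped in a small function for reuse in scripts. This is a minimal sketch, not part of agent-browser: the name check_extension_dir is hypothetical, and the only thing it relies on is that an unpacked Chrome extension has a manifest.json declaring a manifest_version key.

```bash
# Sketch: confirm a directory looks like an unpacked Chrome extension.
# check_extension_dir is a hypothetical helper, not an agent-browser command.
check_extension_dir() {
  dir="$1"
  if [ ! -f "$dir/manifest.json" ]; then
    echo "missing: $dir/manifest.json" >&2
    return 1
  fi
  # Chrome extension manifests declare a manifest_version key.
  if ! grep -q '"manifest_version"' "$dir/manifest.json"; then
    echo "no manifest_version in $dir/manifest.json" >&2
    return 1
  fi
  echo "ok: $dir"
}
```

Calling check_extension_dir ~/capsolver-extension prints "ok: ..." on success and returns a non-zero status otherwise. If the check fails right after unzipping, one thing to look for is an archive that unpacked into a subfolder, which would leave manifest.json one level down.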

Step 3: Configure Your CapSolver API Key

Open the extension config file at ~/capsolver-extension/assets/config.js and replace the apiKey value with your own:

javascript Copy
export const defaultConfig = {
  apiKey: 'CAP-XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX', // โ† your key here
  useCapsolver: true,
  // ... rest of the config
};

You can get your API key from your CapSolver dashboard.

Step 4: Launch Agent Browser with the CapSolver Extension

Loading the extension takes a single flag, --extension:

bash Copy
agent-browser --extension ~/capsolver-extension open https://example.com/protected-page

That's it. The CapSolver extension is now active inside the browser and will auto-solve any CAPTCHA it encounters.

For headed mode (to visually see the browser):

bash Copy
agent-browser --extension ~/capsolver-extension --headed open https://example.com/protected-page

Step 5: Verify the Extension is Loaded

In headed mode, navigate to chrome://extensions to see the CapSolver extension listed and enabled:

bash Copy
agent-browser --extension ~/capsolver-extension --headed open chrome://extensions

In headless mode, check the browser console for CapSolver log messages:

bash Copy
agent-browser --extension ~/capsolver-extension open https://example.com
agent-browser console

How to Use It

Once setup is complete, using CapSolver with Agent Browser is straightforward: just add the --extension flag and a wait command.

The Golden Rule

Don't write CAPTCHA-specific logic. Just add a wait after navigating to CAPTCHA-protected pages, and let the extension do its work.

Example 1: Form Submission Behind reCAPTCHA

bash Copy
# Navigate to the page with CapSolver extension loaded
agent-browser --extension ~/capsolver-extension open https://example.com/contact

# Get a snapshot to discover form elements
agent-browser snapshot -i
# Output:
# - textbox "Name" [ref=e1]
# - textbox "Email" [ref=e2]
# - textbox "Message" [ref=e3]
# - button "Submit" [ref=e4]

# Fill in the form
agent-browser fill @e1 "John Doe"
agent-browser fill @e2 "john.doe@example.com"
agent-browser fill @e3 "Hello, I have a question about your services."

# Wait for CapSolver to resolve the CAPTCHA
agent-browser wait 30000

# Submit: the CAPTCHA token is already injected
agent-browser click @e4

Example 2: Login Page with Cloudflare Turnstile

bash Copy
# Navigate to login page
agent-browser --extension ~/capsolver-extension open https://example.com/login

# Get interactive elements
agent-browser snapshot -i

# Fill credentials
agent-browser find label "Email" fill "john.doe@example.com"
agent-browser find label "Password" fill "mypassword123"

# Wait for Turnstile to be resolved
agent-browser wait 20000

# Click login: Turnstile already handled
agent-browser find role button click --name "Log in"

Example 3: Data Extraction from Protected Pages

bash Copy
# Navigate to protected page
agent-browser --extension ~/capsolver-extension open https://example.com/data

# Wait for any CAPTCHA challenge to clear
agent-browser wait 30000

# Extract page content using snapshot
agent-browser snapshot --json

# Or get specific element text
agent-browser get text "body"

Example 4: Chained Commands (Single Line)

Agent Browser supports command chaining for efficient automation:

bash Copy
# Open, wait for CAPTCHA, fill form, and submit in one chained command
agent-browser --extension ~/capsolver-extension open https://example.com/contact && \
  agent-browser wait 30000 && \
  agent-browser snapshot -i && \
  agent-browser fill @e1 "John Doe" && \
  agent-browser fill @e2 "john.doe@example.com" && \
  agent-browser click @e4  # Submit button ref, as in the Example 1 snapshot

Example 5: Scripted Workflow with JSON Output

For AI agent pipelines, use --json for machine-readable output:

bash Copy
#!/bin/bash
EXTENSION=~/capsolver-extension

# Open page with extension
agent-browser --extension "$EXTENSION" open https://example.com/protected

# Wait for CAPTCHA to resolve
agent-browser wait 30000

# Get snapshot as JSON for AI processing
SNAPSHOT=$(agent-browser snapshot -i --json)

# Parse refs and interact
agent-browser click @e2
agent-browser get text "body" --json

Recommended wait times by CAPTCHA type:

| CAPTCHA Type | Typical Solve Time | Recommended Wait |
| --- | --- | --- |
| reCAPTCHA v2 (checkbox) | 5-15 seconds | 30-60 seconds |
| reCAPTCHA v2 (invisible) | 5-15 seconds | 30 seconds |
| reCAPTCHA v3 | 3-10 seconds | 20-30 seconds |
| Cloudflare Turnstile | 3-10 seconds | 20-30 seconds |

Tip: When in doubt, use 30 seconds. It's better to wait a bit longer than to submit too early. A few extra seconds don't affect the result.
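
If a script visits pages with different CAPTCHA types, the wait times listed above can live in one place. This is a sketch only: the function name recommended_wait_ms and the type labels are hypothetical, and each value is the upper end of the recommended range, with 30 seconds as the fallback suggested by the tip.

```bash
# Sketch: map a CAPTCHA type label to a wait in milliseconds.
# recommended_wait_ms and the labels are hypothetical, chosen for illustration.
recommended_wait_ms() {
  case "$1" in
    recaptcha-v2-checkbox)  echo 60000 ;;  # recommended 30-60 seconds
    recaptcha-v2-invisible) echo 30000 ;;  # recommended 30 seconds
    recaptcha-v3)           echo 30000 ;;  # recommended 20-30 seconds
    turnstile)              echo 30000 ;;  # recommended 20-30 seconds
    *)                      echo 30000 ;;  # "when in doubt, use 30 seconds"
  esac
}
```

The result plugs directly into the wait command, for example agent-browser wait "$(recommended_wait_ms turnstile)".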

How It Works Behind the Scenes

Here's what happens when Agent Browser runs with the CapSolver extension loaded:

Copy
Your Agent Browser Commands
───────────────────────────────────────────────────
agent-browser --extension       ──►  Chrome launches with extension
  ~/capsolver-extension
  open https://...
                                           │
                                           ▼
                               ┌───────────────────────────────┐
                               │  Page with CAPTCHA widget     │
                               │                               │
                               │  CapSolver Extension:         │
                               │  1. Content script detects    │
                               │     CAPTCHA on the page       │
                               │  2. Service worker calls      │
                               │     CapSolver API             │
                               │  3. Token received            │
                               │  4. Token injected into       │
                               │     hidden form field         │
                               └───────────────────────────────┘
                                           │
                                           ▼
agent-browser wait 30000         Extension resolves CAPTCHA...
                                           │
                                           ▼
agent-browser snapshot -i        Agent Browser reads elements
agent-browser click @e2          Form submits WITH valid token
                                           │
                                           ▼
                               "Verification successful!"

How the Extension Loads

When Agent Browser launches Chrome with the --extension flag:

  1. Chrome starts with the CapSolver extension loaded (using --headless=new in headless mode, which supports Manifest V3 extensions).
  2. The extension activates: its service worker starts and content scripts inject into every page.
  3. On pages with CAPTCHAs, the content script detects the widget, the service worker calls the CapSolver API, and the solution token is injected into the page.
  4. Agent Browser operates normally: snapshots, clicks, and data extraction work as usual, with CAPTCHAs already handled.

Full Configuration Reference

Here's a complete setup with all configuration options for the Agent Browser + CapSolver integration:

CLI Flags

bash Copy
agent-browser \
  --extension ~/capsolver-extension \
  --headed \
  --session-name my-session \
  --profile ./browser-data \
  open https://example.com

Environment Variables

bash Copy
# Set extension path as environment variable (avoids repeating --extension)
export AGENT_BROWSER_EXTENSIONS=~/capsolver-extension

# Now every command automatically loads the extension
agent-browser open https://example.com
agent-browser wait 30000
agent-browser snapshot -i

Config File (agent-browser.json)

Create an agent-browser.json in your project directory for persistent defaults:

json Copy
{
  "extension": ["~/capsolver-extension"],
  "sessionName": "my-project",
  "headed": false
}
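
The source does not say whether ~ is expanded inside agent-browser.json, so one cautious option is to generate the file with an absolute path. The sketch below uses only the keys from the example above. The helper name write_agent_browser_config is hypothetical, and paths containing double quotes or backslashes would need JSON escaping that this sketch does not do.

```bash
# Sketch: write agent-browser.json with an absolute extension path.
# write_agent_browser_config is a hypothetical helper; the JSON keys are
# the ones shown in the example above.
write_agent_browser_config() {
  ext_dir="$1"
  out="${2:-agent-browser.json}"
  # Resolve to an absolute path so the config does not rely on ~ expansion.
  abs=$(cd "$ext_dir" >/dev/null && pwd) || return 1
  cat > "$out" <<EOF
{
  "extension": ["$abs"],
  "sessionName": "my-project",
  "headed": false
}
EOF
}
```

Running write_agent_browser_config ~/capsolver-extension in the project directory lets the shell expand ~ before the path is written.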

Configuration Options

| Option | Description |
| --- | --- |
| --extension <path> | Path to unpacked CapSolver extension directory containing manifest.json. Repeatable for multiple extensions. |
| --headed | Show browser window for visual debugging. Extensions work in both modes. |
| --session-name <name> | Auto-save/restore cookies and localStorage across browser restarts. |
| --profile <path> | Persistent browser profile directory (cookies, IndexedDB, cache). |
| AGENT_BROWSER_EXTENSIONS | Environment variable alternative to --extension flag. Comma-separated paths for multiple extensions. |

The CapSolver API key is configured directly in the extension's assets/config.js file (see Step 3 above).
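
Since AGENT_BROWSER_EXTENSIONS takes comma-separated paths, a small helper can build the value and skip directories that are not unpacked extensions. This is a sketch under that one assumption from the table above. The function name join_extension_paths is hypothetical.

```bash
# Sketch: build a comma-separated value for AGENT_BROWSER_EXTENSIONS.
# join_extension_paths is a hypothetical helper, not an agent-browser command.
join_extension_paths() {
  joined=""
  for dir in "$@"; do
    if [ ! -f "$dir/manifest.json" ]; then
      echo "skipping $dir (no manifest.json)" >&2
      continue
    fi
    # Prefix a comma only when something has already been collected.
    joined="${joined:+$joined,}$dir"
  done
  echo "$joined"
}
```

For example: export AGENT_BROWSER_EXTENSIONS="$(join_extension_paths ~/capsolver-extension ~/another-extension)". Paths that themselves contain commas would not survive this format.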

Troubleshooting

Extension Not Loading

Symptom: CAPTCHAs aren't being solved automatically.

Possible causes:

  • Wrong extension path: ensure manifest.json exists in the specified directory
  • Extension not compatible: use the Chrome version of the CapSolver extension (not Firefox)

Solution: Verify the path and check that the extension loads:

bash Copy
# Verify manifest exists
ls ~/capsolver-extension/manifest.json

# Test in headed mode to visually confirm
agent-browser --extension ~/capsolver-extension --headed open chrome://extensions

CAPTCHA Not Solved (Form Fails)

Possible causes:

  • Insufficient wait time: increase to 60 seconds
  • Invalid API key: check your CapSolver dashboard
  • Insufficient balance: top up your CapSolver account
  • Extension not loaded: see "Extension Not Loading" above

Debug with console logs:

bash Copy
agent-browser --extension ~/capsolver-extension open https://example.com
agent-browser wait 30000
agent-browser console  # Check for CapSolver messages
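
An unset or placeholder API key is an easy cause to rule out before launching the browser. The sketch below assumes the apiKey line format from Step 3 and only checks whether the value is empty or still the CAP-XXXX placeholder. It does not validate the key against CapSolver. The function name check_capsolver_key is hypothetical.

```bash
# Sketch: catch an unset or placeholder API key before a run.
# Assumes a line of the form  apiKey: '...'  in config.js (see Step 3).
# check_capsolver_key is a hypothetical helper name.
check_capsolver_key() {
  config="$1"
  [ -f "$config" ] || { echo "config not found: $config" >&2; return 1; }
  # Extract whatever sits between the quotes after apiKey:
  key=$(sed -n "s/.*apiKey: *'\([^']*\)'.*/\1/p" "$config" | head -n 1)
  case "$key" in
    "")          echo "apiKey is empty or missing" >&2; return 1 ;;
    *XXXXXXXX*)  echo "apiKey is still the placeholder" >&2; return 1 ;;
    *)           echo "apiKey looks set" ;;
  esac
}
```

Running check_capsolver_key ~/capsolver-extension/assets/config.js at the top of a script makes a misconfigured key fail fast instead of showing up later as an unsolved CAPTCHA.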

Chrome Not Found

Symptom: agent-browser can't find a Chrome executable.

Solution: Run the install command to download Chrome for Testing:

bash Copy
agent-browser install

Or point to a custom Chrome executable:

bash Copy
agent-browser --executable-path /path/to/chrome open https://example.com

Multiple Extensions

You can load multiple extensions by repeating the --extension flag:

bash Copy
agent-browser \
  --extension ~/capsolver-extension \
  --extension ~/another-extension \
  open https://example.com

Best Practices

  1. Use the AGENT_BROWSER_EXTENSIONS environment variable. Set it once in your shell profile or CI config, and every agent-browser command automatically loads CapSolver without repeating the flag.

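  2. Use generous wait times. The CAPTCHA typically resolves in 5-20 seconds, but network latency, complex challenges, or retries can add time. 30-60 seconds is the sweet spot. Avoid waiting much longer than that before submitting, since solved tokens expire (reCAPTCHA tokens, for example, are valid for about two minutes).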

  3. Keep your automation scripts clean. Don't add CAPTCHA-specific logic to your commands. The extension handles everything, so your scripts can focus purely on navigation, interaction, and data extraction.

  4. Monitor your CapSolver balance. Each CAPTCHA resolution costs credits. Check your balance at capsolver.com/dashboard regularly to avoid interruptions.

  5. Use session persistence for repeat visits. Use --session-name or --profile to preserve cookies across runs. This can reduce CAPTCHA frequency since the site may recognize returning sessions.

  6. Leverage headless mode in production. Unlike Playwright, Agent Browser supports extensions in headless mode. No need for Xvfb or virtual displays on servers. Just run your commands directly.

Conclusion

The Vercel Agent Browser + CapSolver integration brings invisible CAPTCHA solving to the fastest, most AI-optimized browser automation CLI available. Instead of writing complex CAPTCHA-handling code, you simply:

  1. Download the CapSolver extension and configure your API key
  2. Add --extension ~/capsolver-extension to your Agent Browser commands
  3. Add a wait command before interacting with CAPTCHA-protected forms

The CapSolver Chrome extension handles the rest: detecting CAPTCHAs, solving them via the CapSolver API, and injecting tokens into the page. Your Agent Browser commands never need to know about CAPTCHAs at all.

And unlike Playwright-based solutions that require headed mode and virtual displays, Agent Browser supports extensions in headless mode out of the box, making it the simplest path to CAPTCHA-free automation in production.

Ready to get started? Sign up for CapSolver and use the bonus code AGENTBROWSER to get an extra 6% on your first top-up!

FAQ

Do I need to write CAPTCHA-specific code?

No. The CapSolver extension works entirely in the background within Agent Browser's Chrome instance. Just add an agent-browser wait 30000 before submitting forms, and the extension handles detection, solving, and token injection automatically.

Can I run this in headless mode?

Yes! This is a major advantage over Playwright-based solutions. Agent Browser uses Chrome's --headless=new mode, which supports Manifest V3 extensions. No Xvfb or virtual display required.

Do I need Playwright or Node.js?

No. Agent Browser is a standalone Rust binary. You only need Node.js for the npm install step. The browser daemon runs natively without any JavaScript runtime.

What CAPTCHA types does CapSolver support?

CapSolver supports reCAPTCHA v2 (checkbox and invisible), reCAPTCHA v3, Cloudflare Turnstile, AWS WAF CAPTCHA, and more. The extension automatically detects the CAPTCHA type and resolves it accordingly.

How much does CapSolver cost?

CapSolver offers competitive pricing based on CAPTCHA type and volume. Visit capsolver.com for current pricing.

Is Vercel Agent Browser free?

Yes. Agent Browser is open source under the Apache 2.0 license. The CLI and all features are free to use. Visit the GitHub repository for more details.

How long should I wait for the CAPTCHA to be solved?

For most CAPTCHAs, 30-60 seconds is sufficient. The actual solve time is typically 5-20 seconds, but adding extra buffer ensures reliability. When in doubt, use 30 seconds via agent-browser wait 30000.

Can I use this with AI agents?

Absolutely. Agent Browser was built specifically for AI agents. Use --json for machine-readable output, the snapshot-ref workflow for deterministic element selection, and command chaining for efficient multi-step automation. The CapSolver extension runs transparently alongside your agent's commands.

Compliance Disclaimer: The information provided on this blog is for informational purposes only. CapSolver is committed to compliance with all applicable laws and regulations. The use of the CapSolver network for illegal, fraudulent, or abusive activities is strictly prohibited and will be investigated. Our captcha-solving solutions enhance user experience while ensuring 100% compliance in helping solve captcha difficulties during public data crawling. We encourage responsible use of our services. For more information, please visit our Terms of Service and Privacy Policy.
