Mar 17, 2026

How to Solve CAPTCHA with Vercel Agent Browser – Step-by-Step Guide Using CapSolver

Ethan Collins

Pattern Recognition Specialist

When your AI agent hits a CAPTCHA wall, the entire workflow breaks. Navigation stops, forms can't be submitted, and data extraction fails — all because of a challenge designed to block automated access. Vercel Agent Browser is a fast, native Rust CLI for headless browser automation built specifically for AI agents. It features accessibility-first element selection, semantic locators, and a snapshot-ref workflow optimized for LLMs. But like any browser automation tool, it gets stuck on CAPTCHAs.

CapSolver changes this completely. By loading the CapSolver Chrome extension into Agent Browser using the built-in --extension flag, CAPTCHAs are resolved automatically and invisibly in the background. No manual solving. No complex API orchestration. Your CLI commands keep running as if the CAPTCHA was never there.

The best part? Agent Browser supports extensions in both headed and headless mode — unlike Playwright, which requires headed mode for extensions. This means your production pipelines, CI/CD workflows, and serverless deployments all work with zero display requirements. Your agent focuses on what it does best — navigating pages, extracting data, and automating workflows — while CapSolver handles CAPTCHAs silently.

What is Vercel Agent Browser?

Vercel Agent Browser is a headless browser automation CLI built in Rust for maximum performance. Developed by Vercel Labs, it provides a command-line interface that controls Chrome without requiring Playwright or Node.js for the browser daemon. Its accessibility-first design uses semantic locators and snapshot refs — making it the ideal tool for AI agents that need to interact with web pages.

Key Features

Native Rust CLI: Fast, single-binary tool with no runtime dependencies for the browser daemon.
Snapshot-Ref Workflow: Get an accessibility tree with element refs, then interact by ref — deterministic, fast, and AI-friendly.
Semantic Locators: Find elements by ARIA role, text content, label, placeholder, or alt text — no brittle CSS selectors.
Headless Extension Support: Load Chrome extensions in both headed and headless mode via Chrome's --headless=new.
Session Management: Isolated sessions, persistent profiles, encrypted state storage, and auth vault for credential management.
JSON Output Mode: Machine-readable output for agent pipelines with --json.
Cloud Providers: Built-in support for Browserless, Browserbase, Browser Use, Kernel, and iOS Simulator.
Security: Domain allowlists, action policies, content boundaries, and confirmation gates for safe AI agent deployments.

Agent Browser operates on any page — including authenticated content, dynamic SPAs, and CAPTCHA-protected sites — making it ideal for AI agent workflows, data collection, and automated testing.

What is CapSolver?

CapSolver is a leading AI-powered CAPTCHA solving service that automatically resolves diverse CAPTCHA challenges. With fast response times and broad compatibility, CapSolver integrates seamlessly into automated workflows.

Supported CAPTCHA Types

reCAPTCHA v2 (checkbox and invisible)
reCAPTCHA v3 & v3 Enterprise
Cloudflare Turnstile
Cloudflare 5-second Challenge
AWS WAF CAPTCHA
More

Why This Integration is Different

Most CAPTCHA-solving integrations require you to write boilerplate code: create tasks, poll for results, inject tokens into hidden fields. That's the standard approach with raw Playwright or Puppeteer scripts.
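
For contrast, the code-based flow described above might look roughly like the sketch below. It assumes CapSolver's public `createTask` and `getTaskResult` HTTP endpoints with a reCAPTCHA v2 task, that `curl` is installed, and that `CAPSOLVER_API_KEY` is set; the helper names are hypothetical and exist only for illustration. None of this is needed when the extension is loaded.

```shell
# Sketch of the manual flow the extension replaces (not needed with --extension).
# Assumptions: CAPSOLVER_API_KEY is set; helper names are illustrative only.
API="https://api.capsolver.com"

# Build the JSON body for createTask (reCAPTCHA v2, no proxy).
# $1 = API key, $2 = page URL, $3 = site key
build_create_task_payload() {
  printf '{"clientKey":"%s","task":{"type":"ReCaptchaV2TaskProxyLess","websiteURL":"%s","websiteKey":"%s"}}' \
    "$1" "$2" "$3"
}

# Build the JSON body for getTaskResult.
# $1 = API key, $2 = task id
build_get_result_payload() {
  printf '{"clientKey":"%s","taskId":"%s"}' "$1" "$2"
}

# Submit a task; prints the raw JSON response (it carries a taskId on success).
# $1 = page URL, $2 = site key
create_task() {
  curl -s -X POST "$API/createTask" \
    -H 'Content-Type: application/json' \
    -d "$(build_create_task_payload "$CAPSOLVER_API_KEY" "$1" "$2")"
}

# Fetch the task state; the caller still has to loop until the status is
# "ready" and then inject the returned token into the page by hand.
# $1 = task id
get_task_result() {
  curl -s -X POST "$API/getTaskResult" \
    -H 'Content-Type: application/json' \
    -d "$(build_get_result_payload "$CAPSOLVER_API_KEY" "$1")"
}
```

Even in this reduced form, the polling loop, error handling, and token injection are left to the caller, which is the boilerplate the extension approach avoids.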

Agent Browser + CapSolver takes a fundamentally different approach:

Traditional (Code-Based)	Agent Browser + CapSolver Extension
Write a CapSolver service class	Add `--extension` flag to your command
Call `createTask()` / `getTaskResult()`	Extension handles everything automatically
Inject tokens via JavaScript evaluation	Token injection is invisible
Handle errors, retries, timeouts in code	Extension manages retries internally
Different code for each CAPTCHA type	Works for all types automatically
Headed mode required for extensions	Works in both headed AND headless mode

The key insight: The CapSolver extension runs inside Agent Browser's Chrome instance. When Agent Browser navigates to a page with a CAPTCHA, the extension detects it, solves it in the background, and injects the token — all before your next command executes. Your automation stays clean, focused, and CAPTCHA-free.

Prerequisites

Before setting up the integration, make sure you have:

Vercel Agent Browser installed (npm install -g agent-browser)
A CapSolver account with API key (sign up here)
Node.js 16+ (for npm installation)

Note: Unlike Playwright-based tools, Agent Browser supports extensions in both headed and headless mode. No Xvfb or virtual display required on servers.

Step-by-Step Setup

Step 1: Install Agent Browser

bash

npm install -g agent-browser
agent-browser install  # Download Chrome from Chrome for Testing (first time only)

Alternative installation methods:

bash

# macOS via Homebrew
brew install agent-browser
agent-browser install

# Via Cargo (Rust)
cargo install agent-browser
agent-browser install

On Linux, include system dependencies:

bash

agent-browser install --with-deps

Step 2: Download the CapSolver Chrome Extension

Download the CapSolver Chrome extension and extract it to a dedicated directory:

Go to the CapSolver Chrome Extension v1.17.0 release
Download CapSolver.Browser.Extension-chrome-v1.17.0.zip
Extract the zip:

bash

mkdir -p ~/capsolver-extension
unzip CapSolver.Browser.Extension-chrome-v*.zip -d ~/capsolver-extension/

Verify the extraction worked:

bash

ls ~/capsolver-extension/manifest.json

You should see manifest.json — this confirms the extension is in the right place.
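
A slightly more defensive version of this check can be sketched as a small shell function. The function name `check_extension_dir` is hypothetical; the one assumption it encodes is that the most common extraction mistake is the zip unpacking into a subfolder, which leaves `manifest.json` one level too deep.

```shell
# Returns 0 if the directory looks like an unpacked Chrome extension.
check_extension_dir() {
  dir="$1"
  if [ -f "$dir/manifest.json" ]; then
    echo "ok: $dir"
    return 0
  fi
  # Common mistake: the zip unpacked into a subfolder, so manifest.json sits one level down.
  nested=$(find "$dir" -mindepth 2 -maxdepth 2 -name manifest.json 2>/dev/null | head -n 1)
  if [ -n "$nested" ]; then
    echo "manifest.json found in a subfolder; use: $(dirname "$nested")" >&2
  else
    echo "missing: $dir/manifest.json" >&2
  fi
  return 1
}
```

Usage: `check_extension_dir ~/capsolver-extension`. If it reports a subfolder, pass that subfolder to `--extension` instead.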

Step 3: Configure Your CapSolver API Key

Open the extension config file at ~/capsolver-extension/assets/config.js and replace the apiKey value with your own:

javascript

export const defaultConfig = {
  apiKey: 'CAP-XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX', // ← your key here
  useCapsolver: true,
  // ... rest of the config
};

You can get your API key from your CapSolver dashboard.
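
For CI or scripted setups, the same edit can be made without opening an editor. The sketch below assumes the `apiKey` line uses single quotes exactly as shown in the snippet above, and that the key contains no characters that are special to `sed`; the function name `set_capsolver_key` is hypothetical.

```shell
# Replace the apiKey value in the extension's config.js.
# Assumes a single-quoted value and a key with no sed-special characters.
# $1 = path to config.js, $2 = API key
set_capsolver_key() {
  config="$1"
  key="$2"
  tmp="$config.tmp"
  # Write to a temp file first so the original survives if sed fails.
  sed "s/apiKey: *'[^']*'/apiKey: '$key'/" "$config" > "$tmp" && mv "$tmp" "$config"
  # Confirm the new key is actually present.
  grep -q "apiKey: '$key'" "$config"
}
```

Usage: `set_capsolver_key ~/capsolver-extension/assets/config.js "$CAPSOLVER_API_KEY"`. Reading the key from an environment variable keeps it out of shell history and version control.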

Step 4: Launch Agent Browser with the CapSolver Extension

Loading the extension is a single flag — --extension:

bash

agent-browser --extension ~/capsolver-extension open https://example.com/protected-page

That's it. The CapSolver extension is now active inside the browser and will auto-solve any CAPTCHA it encounters.

For headed mode (to visually see the browser):

bash

agent-browser --extension ~/capsolver-extension --headed open https://example.com/protected-page

Step 5: Verify the Extension is Loaded

In headed mode, navigate to chrome://extensions to see the CapSolver extension listed and enabled:

bash

agent-browser --extension ~/capsolver-extension --headed open chrome://extensions

In headless mode, check the browser console for CapSolver log messages:

bash

agent-browser --extension ~/capsolver-extension open https://example.com
agent-browser console

How to Use It

Once setup is complete, using CapSolver with Agent Browser is straightforward — just add the --extension flag and a wait command.

The Golden Rule

Don't write CAPTCHA-specific logic. Just add a wait after navigating to CAPTCHA-protected pages, and let the extension do its work.

Example 1: Form Submission Behind reCAPTCHA

bash

# Navigate to the page with CapSolver extension loaded
agent-browser --extension ~/capsolver-extension open https://example.com/contact

# Get a snapshot to discover form elements
agent-browser snapshot -i
# Output:
# - textbox "Name" [ref=e1]
# - textbox "Email" [ref=e2]
# - textbox "Message" [ref=e3]
# - button "Submit" [ref=e4]

# Fill in the form
agent-browser fill @e1 "John Doe"
agent-browser fill @e2 "john@example.com"
agent-browser fill @e3 "Hello, I have a question about your services."

# Wait for CapSolver to resolve the CAPTCHA
agent-browser wait 30000

# Submit — the CAPTCHA token is already injected
agent-browser click @e4

Example 2: Login Behind Cloudflare Turnstile

bash

# Navigate to login page
agent-browser --extension ~/capsolver-extension open https://example.com/login

# Get interactive elements
agent-browser snapshot -i

# Fill credentials
agent-browser find label "Email" fill "me@example.com"
agent-browser find label "Password" fill "mypassword123"

# Wait for Turnstile to be resolved
agent-browser wait 20000

# Click login — Turnstile already handled
agent-browser find role button click --name "Log in"

Example 3: Data Extraction from Protected Pages

bash

# Navigate to protected page
agent-browser --extension ~/capsolver-extension open https://example.com/data

# Wait for any CAPTCHA challenge to clear
agent-browser wait 30000

# Extract page content using snapshot
agent-browser snapshot --json

# Or get specific element text
agent-browser get text "body"

Example 4: Chained Commands (Single Line)

Agent Browser supports command chaining for efficient automation:

bash

# Open, wait for CAPTCHA, fill form, and submit in one chained command
# (refs match the snapshot from Example 1: e1 Name, e2 Email, e3 Message, e4 Submit)
agent-browser --extension ~/capsolver-extension open https://example.com/contact && \
  agent-browser wait 30000 && \
  agent-browser snapshot -i && \
  agent-browser fill @e1 "John Doe" && \
  agent-browser fill @e2 "john@example.com" && \
  agent-browser fill @e3 "Hello" && \
  agent-browser click @e4

Example 5: Scripted Workflow with JSON Output

For AI agent pipelines, use --json for machine-readable output:

bash

#!/bin/bash
EXTENSION=~/capsolver-extension

# Open page with extension
agent-browser --extension "$EXTENSION" open https://example.com/protected

# Wait for CAPTCHA to resolve
agent-browser wait 30000

# Get snapshot as JSON for AI processing
SNAPSHOT=$(agent-browser snapshot -i --json)
echo "$SNAPSHOT"  # pass this to your agent, which chooses the ref to act on

# Interact using the ref the agent chose (@e2 here is only an example)
agent-browser click @e2
agent-browser get text "body" --json

Recommended Wait Times

CAPTCHA Type	Typical Solve Time	Recommended Wait
reCAPTCHA v2 (checkbox)	5-15 seconds	30-60 seconds
reCAPTCHA v2 (invisible)	5-15 seconds	30 seconds
reCAPTCHA v3	3-10 seconds	20-30 seconds
Cloudflare Turnstile	3-10 seconds	20-30 seconds

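Tip: When in doubt, use 30 seconds. It's better to wait a bit longer than to submit too early. Within this range the extra time doesn't affect the result, but avoid waits of several minutes: reCAPTCHA tokens expire about two minutes after they are issued.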
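
If a fixed wait feels wasteful, one optional refinement is to poll until a token appears and fall back to a timeout. The sketch below is generic: `wait_until_true` runs whatever probe command it is given until the output contains `true`. The probe shown relies on the standard reCAPTCHA v2 response field (`g-recaptcha-response`); it assumes that `agent-browser eval` prints the expression result as `true` or `false`, which should be verified against your installed version. The function and variable names are hypothetical.

```shell
# Poll a probe command until its output contains "true" or the timeout elapses.
# Usage: wait_until_true <timeout_seconds> <interval_seconds> <command...>
wait_until_true() {
  timeout="$1"; interval="$2"; shift 2
  elapsed=0
  while [ "$elapsed" -lt "$timeout" ]; do
    if "$@" 2>/dev/null | grep -q "true"; then
      return 0
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  return 1
}

# JavaScript probe: is the standard reCAPTCHA v2 response field filled in?
TOKEN_PROBE='!!(document.querySelector("[name=g-recaptcha-response]") || {}).value'
```

Usage: `wait_until_true 60 2 agent-browser eval "$TOKEN_PROBE" && agent-browser click @e4`. For Cloudflare Turnstile the field name is `cf-turnstile-response`. Challenge types that do not fill a form field in advance still need the plain fixed wait, which remains the simplest option.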

How It Works Behind the Scenes

Here's what happens when Agent Browser runs with the CapSolver extension loaded:

Your Agent Browser Commands
───────────────────────────────────────────────────
agent-browser --extension       ──►  Chrome launches with extension
  ~/capsolver-extension
  open https://...
                                           │
                                           ▼
                               ┌─────────────────────────────┐
                               │  Page with CAPTCHA widget     │
                               │                               │
                               │  CapSolver Extension:         │
                               │  1. Content script detects    │
                               │     CAPTCHA on the page       │
                               │  2. Service worker calls      │
                               │     CapSolver API             │
                               │  3. Token received            │
                               │  4. Token injected into       │
                               │     hidden form field         │
                               └─────────────────────────────┘
                                           │
                                           ▼
agent-browser wait 30000         Extension resolves CAPTCHA...
                                           │
                                           ▼
agent-browser snapshot -i        Agent Browser reads elements
agent-browser click @e2          Form submits WITH valid token
                                           │
                                           ▼
                               "Verification successful!"

How the Extension Loads

When Agent Browser launches Chrome with the --extension flag:

Chrome starts with the CapSolver extension loaded (using --headless=new in headless mode, which supports Manifest V3 extensions)
The extension activates: its service worker starts and content scripts inject into every page
On pages with CAPTCHAs: the content script detects the widget, the service worker calls the CapSolver API, and the solution token is injected into the page
Agent Browser operates normally: snapshots, clicks, and data extraction work as usual, with CAPTCHAs already handled

Full Configuration Reference

Here's a complete setup with all configuration options for the Agent Browser + CapSolver integration:

CLI Flags

bash

agent-browser \
  --extension ~/capsolver-extension \
  --headed \
  --session-name my-session \
  --profile ./browser-data \
  open https://example.com

Environment Variables

bash

# Set extension path as environment variable (avoids repeating --extension)
export AGENT_BROWSER_EXTENSIONS=~/capsolver-extension

# Now every command automatically loads the extension
agent-browser open https://example.com
agent-browser wait 30000
agent-browser snapshot -i
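
When more than one extension is involved, the variable takes comma-separated paths. A small helper can build that value from a list of directories; this is only a sketch, and both `join_extensions` and the `another-extension` directory are hypothetical.

```shell
# Join extension directories into a comma-separated list.
join_extensions() {
  joined=""
  for dir in "$@"; do
    # Add a comma only between entries, never before the first one.
    joined="${joined:+$joined,}$dir"
  done
  printf '%s\n' "$joined"
}

AGENT_BROWSER_EXTENSIONS=$(join_extensions "$HOME/capsolver-extension" "$HOME/another-extension")
export AGENT_BROWSER_EXTENSIONS
```

Using `$HOME` rather than `~` avoids relying on tilde expansion, which the shell does not perform inside quoted strings.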

Config File (`agent-browser.json`)

Create an agent-browser.json in your project directory for persistent defaults:

json

{
  "extension": ["~/capsolver-extension"],
  "sessionName": "my-project",
  "headed": false
}

Configuration Options

Option	Description
`--extension <path>`	Path to unpacked CapSolver extension directory containing `manifest.json`. Repeatable for multiple extensions.
`--headed`	Show browser window for visual debugging. Extensions work in both modes.
`--session-name <name>`	Auto-save/restore cookies and localStorage across browser restarts.
`--profile <path>`	Persistent browser profile directory (cookies, IndexedDB, cache).
`AGENT_BROWSER_EXTENSIONS`	Environment variable alternative to `--extension` flag. Comma-separated paths for multiple extensions.

The CapSolver API key is configured directly in the extension's assets/config.js file (see Step 3 above).

Troubleshooting

Extension Not Loading

Symptom: CAPTCHAs aren't being solved automatically.

Possible causes:

Wrong extension path — ensure manifest.json exists in the specified directory
Extension not compatible — use the Chrome version of the CapSolver extension (not Firefox)

Solution: Verify the path and check that the extension loads:

bash

# Verify manifest exists
ls ~/capsolver-extension/manifest.json

# Test in headed mode to visually confirm
agent-browser --extension ~/capsolver-extension --headed open chrome://extensions

CAPTCHA Not Solved (Form Fails)

Possible causes:

Insufficient wait time — Increase to 60 seconds
Invalid API key — Check your CapSolver dashboard
Insufficient balance — Top up your CapSolver account
Extension not loaded — See "Extension Not Loading" above

Debug with console logs:

bash

agent-browser --extension ~/capsolver-extension open https://example.com
agent-browser wait 30000
agent-browser console  # Check for CapSolver messages
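
Before digging through console output, it can help to rule out the simplest cause: an API key that was never filled in. The sketch below reads the key from the config file described in Step 3 and rejects an empty value or an all-X placeholder. It assumes the single-quoted `apiKey` format and the `CAP-` prefix shown in that step; the function names are hypothetical.

```shell
# Print the apiKey value from config.js (empty output if none is found).
read_capsolver_key() {
  sed -n "s/.*apiKey: *'\([^']*\)'.*/\1/p" "$1" | head -n 1
}

# Returns 0 only if a key is set and is not the all-X placeholder.
key_looks_configured() {
  key=$(read_capsolver_key "$1")
  case "$key" in
    "")         echo "no apiKey found in $1" >&2; return 1 ;;
    CAP-*[!X]*) return 0 ;;  # at least one character after the prefix is not X
    *)          echo "apiKey is still a placeholder or has an unexpected format" >&2; return 1 ;;
  esac
}
```

Usage: `key_looks_configured ~/capsolver-extension/assets/config.js || exit 1`. This only checks the shape of the key; whether it is valid and funded still has to be confirmed in the CapSolver dashboard.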

Chrome Not Found

Symptom: agent-browser can't find a Chrome executable.

Solution: Run the install command to download Chrome for Testing:

bash

agent-browser install

Or point to a custom Chrome executable:

bash

agent-browser --executable-path /path/to/chrome open https://example.com

Multiple Extensions

You can load multiple extensions by repeating the --extension flag:

bash

agent-browser \
  --extension ~/capsolver-extension \
  --extension ~/another-extension \
  open https://example.com

Best Practices

Use the AGENT_BROWSER_EXTENSIONS environment variable. Set it once in your shell profile or CI config, and every agent-browser command automatically loads CapSolver without repeating the flag.
Use generous but bounded wait times. The CAPTCHA typically resolves in 5-20 seconds, but network latency, complex challenges, or retries can add time. 30-60 seconds is the sweet spot; much longer waits risk the token expiring before you submit (reCAPTCHA tokens are valid for about two minutes).
Keep your automation scripts clean. Don't add CAPTCHA-specific logic to your commands. The extension handles everything — your scripts should focus purely on navigation, interaction, and data extraction.
Monitor your CapSolver balance. Each CAPTCHA resolution costs credits. Check your balance at capsolver.com/dashboard regularly to avoid interruptions.
Use session persistence for repeat visits. Use --session-name or --profile to preserve cookies across runs. This can reduce CAPTCHA frequency since the site may recognize returning sessions.
Leverage headless mode in production. Unlike Playwright, Agent Browser supports extensions in headless mode. No need for Xvfb or virtual displays on servers — just run your commands directly.

Conclusion

The Vercel Agent Browser + CapSolver integration brings invisible CAPTCHA solving to a fast, AI-oriented browser automation CLI. Instead of writing complex CAPTCHA-handling code, you simply:

Download the CapSolver extension and configure your API key
Add --extension ~/capsolver-extension to your Agent Browser commands
Add a wait command before interacting with CAPTCHA-protected forms

The CapSolver Chrome extension handles the rest — detecting CAPTCHAs, solving them via the CapSolver API, and injecting tokens into the page. Your Agent Browser commands never need to know about CAPTCHAs at all.

And unlike Playwright-based solutions that require headed mode and virtual displays, Agent Browser supports extensions in headless mode out of the box — making it the simplest path to CAPTCHA-free automation in production.

Ready to get started? Sign up for CapSolver and use the bonus code AGENTBROWSER to get an extra 6% on your first top-up!

FAQ

Do I need to write CAPTCHA-specific code?

No. The CapSolver extension works entirely in the background within Agent Browser's Chrome instance. Just add an agent-browser wait 30000 before submitting forms, and the extension handles detection, solving, and token injection automatically.

Can I run this in headless mode?

Yes! This is a major advantage over Playwright-based solutions. Agent Browser uses Chrome's --headless=new mode, which supports Manifest V3 extensions. No Xvfb or virtual display required.

Do I need Playwright or Node.js?

No. Agent Browser is a standalone Rust binary. You only need Node.js for the npm install step. The browser daemon runs natively without any JavaScript runtime.

What CAPTCHA types does CapSolver support?

CapSolver supports reCAPTCHA v2 (checkbox and invisible), reCAPTCHA v3, Cloudflare Turnstile, AWS WAF CAPTCHA, and more. The extension automatically detects the CAPTCHA type and resolves it accordingly.

How much does CapSolver cost?

CapSolver offers competitive pricing based on CAPTCHA type and volume. Visit capsolver.com for current pricing.

Is Vercel Agent Browser free?

Yes. Agent Browser is open source under the Apache 2.0 license. The CLI and all features are free to use. Visit the GitHub repository for more details.

How long should I wait for the CAPTCHA to be solved?

For most CAPTCHAs, 30-60 seconds is sufficient. The actual solve time is typically 5-20 seconds, but adding extra buffer ensures reliability. When in doubt, use 30 seconds via agent-browser wait 30000.

Can I use this with AI agents?

Absolutely. Agent Browser was built specifically for AI agents (there are other options to compare). Use --json for machine-readable output, the snapshot-ref workflow for deterministic element selection, and command chaining for efficient multi-step automation. The CapSolver extension runs transparently alongside your agent's commands.

AIApr 28, 2026

AI Agents in Web Scraping & Competitive Intelligence Guide

Discover how AI agents transform web scraping and competitive intelligence. Learn about automated data collection, anti-bot challenges, and CAPTCHA solutions for scalable workflows.

Sora Fujimoto

AIApr 24, 2026

AI Agent vs Chatbot: Key Differences in Automation Capabilities

Discover the key differences between AI agent vs chatbot. Learn how agentic AI outperforms traditional AI in automation, decision-making, and complex workflows.

Mar17, 2026

How to Solve CAPTCHA with Vercel Agent Browser – Step-by-Step Guide Using CapSolver

Ethan Collins

Pattern Recognition Specialist

What is Vercel Agent Browser?

Key Features

Native Rust CLI: Fast, single-binary tool with no runtime dependencies for the browser daemon.
Snapshot-Ref Workflow: Get an accessibility tree with element refs, then interact by ref — deterministic, fast, and AI-friendly.
Semantic Locators: Find elements by ARIA role, text content, label, placeholder, or alt text — no brittle CSS selectors.
Headless Extension Support: Load Chrome extensions in both headed and headless mode via Chrome's --headless=new.
Session Management: Isolated sessions, persistent profiles, encrypted state storage, and auth vault for credential management.
JSON Output Mode: Machine-readable output for agent pipelines with --json.
Cloud Providers: Built-in support for Browserless, Browserbase, Browser Use, Kernel, and iOS Simulator.
Security: Domain allowlists, action policies, content boundaries, and confirmation gates for safe AI agent deployments.

Agent Browser operates on any page — including authenticated content, dynamic SPAs, and CAPTCHA-protected sites — making it ideal for AI agent workflows, data collection, and automated testing.

What is CapSolver?

Supported CAPTCHA Types

reCAPTCHA v2 (checkbox and invisible)
reCAPTCHA v3 & v3 Enterprise
Cloudflare Turnstile
Cloudflare 5-second Challenge
AWS WAF CAPTCHA
More

Why This Integration is Different

Agent Browser + CapSolver takes a fundamentally different approach:

Traditional (Code-Based)	Agent Browser + CapSolver Extension
Write a CapSolver service class	Add `--extension` flag to your command
Call `createTask()` / `getTaskResult()`	Extension handles everything automatically
Inject tokens via JavaScript evaluation	Token injection is invisible
Handle errors, retries, timeouts in code	Extension manages retries internally
Different code for each CAPTCHA type	Works for all types automatically
Headed mode required for extensions	Works in both headed AND headless mode

Prerequisites

Before setting up the integration, make sure you have:

Vercel Agent Browser installed (npm install -g agent-browser)
A CapSolver account with API key (sign up here)
Node.js 16+ (for npm installation)

Note: Unlike Playwright-based tools, Agent Browser supports extensions in both headed and headless mode. No Xvfb or virtual display required on servers.

Step-by-Step Setup

Step 1: Install Agent Browser

bash Copy

npm install -g agent-browser
agent-browser install  # Download Chrome from Chrome for Testing (first time only)

Alternative installation methods:

bash Copy

# macOS via Homebrew
brew install agent-browser
agent-browser install

# Via Cargo (Rust)
cargo install agent-browser
agent-browser install

On Linux, include system dependencies:

bash Copy

agent-browser install --with-deps

Step 2: Download the CapSolver Chrome Extension

Download the CapSolver Chrome extension and extract it to a dedicated directory:

Go to the CapSolver Chrome Extension v1.17.0 release
Download CapSolver.Browser.Extension-chrome-v1.17.0.zip
Extract the zip:

bash Copy

mkdir -p ~/capsolver-extension
unzip CapSolver.Browser.Extension-chrome-v*.zip -d ~/capsolver-extension/

Verify the extraction worked:

bash Copy

ls ~/capsolver-extension/manifest.json

You should see manifest.json — this confirms the extension is in the right place.

Step 3: Configure Your CapSolver API Key

Open the extension config file at ~/capsolver-extension/assets/config.js and replace the apiKey value with your own:

javascript Copy

export const defaultConfig = {
  apiKey: 'CAP-XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX', // ← your key here
  useCapsolver: true,
  // ... rest of the config
};

You can get your API key from your CapSolver dashboard.

Step 4: Launch Agent Browser with the CapSolver Extension

Loading the extension is a single flag — --extension:

bash Copy

agent-browser --extension ~/capsolver-extension open https://example.com/protected-page

That's it. The CapSolver extension is now active inside the browser and will auto-solve any CAPTCHA it encounters.

For headed mode (to visually see the browser):

bash Copy

agent-browser --extension ~/capsolver-extension --headed open https://example.com/protected-page

Step 5: Verify the Extension is Loaded

In headed mode, navigate to chrome://extensions to see the CapSolver extension listed and enabled:

bash Copy

agent-browser --extension ~/capsolver-extension --headed open chrome://extensions

In headless mode, check the browser console for CapSolver log messages:

bash Copy

agent-browser --extension ~/capsolver-extension open https://example.com
agent-browser console

How to Use It

Once setup is complete, using CapSolver with Agent Browser is straightforward — just add the --extension flag and a wait command.

The Golden Rule

Don't write CAPTCHA-specific logic. Just add a wait after navigating to CAPTCHA-protected pages, and let the extension do its work.

Example 1: Form Submission Behind reCAPTCHA

bash Copy

# Navigate to the page with CapSolver extension loaded
agent-browser --extension ~/capsolver-extension open https://example.com/contact

# Get a snapshot to discover form elements
agent-browser snapshot -i
# Output:
# - textbox "Name" [ref=e1]
# - textbox "Email" [ref=e2]
# - textbox "Message" [ref=e3]
# - button "Submit" [ref=e4]

# Fill in the form
agent-browser fill @e1 "John Doe"
agent-browser fill @e2 "john@example.com"
agent-browser fill @e3 "Hello, I have a question about your services."

# Wait for CapSolver to resolve the CAPTCHA
agent-browser wait 30000

# Submit — the CAPTCHA token is already injected
agent-browser click @e4

bash Copy

# Navigate to login page
agent-browser --extension ~/capsolver-extension open https://example.com/login

# Get interactive elements
agent-browser snapshot -i

# Fill credentials
agent-browser find label "Email" fill "me@example.com"
agent-browser find label "Password" fill "mypassword123"

# Wait for Turnstile to be resolved
agent-browser wait 20000

# Click login — Turnstile already handled
agent-browser find role button click --name "Log in"

Example 3: Data Extraction from Protected Pages

bash Copy

# Navigate to protected page
agent-browser --extension ~/capsolver-extension open https://example.com/data

# Wait for any CAPTCHA challenge to clear
agent-browser wait 30000

# Extract page content using snapshot
agent-browser snapshot --json

# Or get specific element text
agent-browser get text "body"

Example 4: Chained Commands (Single Line)

Agent Browser supports command chaining for efficient automation:

bash Copy

# Open, wait for CAPTCHA, fill form, and submit — all in one line
agent-browser --extension ~/capsolver-extension open https://example.com/contact && \
  agent-browser wait 30000 && \
  agent-browser snapshot -i && \
  agent-browser fill @e1 "John Doe" && \
  agent-browser fill @e2 "john@example.com" && \
  agent-browser click @e3

Example 5: Scripted Workflow with JSON Output

For AI agent pipelines, use --json for machine-readable output:

bash Copy

#!/bin/bash
EXTENSION=~/capsolver-extension

# Open page with extension
agent-browser --extension "$EXTENSION" open https://example.com/protected

# Wait for CAPTCHA to resolve
agent-browser wait 30000

# Get snapshot as JSON for AI processing
SNAPSHOT=$(agent-browser snapshot -i --json)

# Parse refs and interact
agent-browser click @e2
agent-browser get text "body" --json

Recommended Wait Times

CAPTCHA Type	Typical Solve Time	Recommended Wait
reCAPTCHA v2 (checkbox)	5-15 seconds	30-60 seconds
reCAPTCHA v2 (invisible)	5-15 seconds	30 seconds
reCAPTCHA v3	3-10 seconds	20-30 seconds
Cloudflare Turnstile	3-10 seconds	20-30 seconds

Tip: When in doubt, use 30 seconds. It's better to wait a bit longer than to submit too early. The extra time doesn't affect the result.

How It Works Behind the Scenes

Here's what happens when Agent Browser runs with the CapSolver extension loaded:

Copy

Your Agent Browser Commands
───────────────────────────────────────────────────
agent-browser --extension       ──►  Chrome launches with extension
  ~/capsolver-extension
  open https://...
                                           │
                                           ▼
                               ┌─────────────────────────────┐
                               │  Page with CAPTCHA widget     │
                               │                               │
                               │  CapSolver Extension:         │
                               │  1. Content script detects    │
                               │     CAPTCHA on the page       │
                               │  2. Service worker calls      │
                               │     CapSolver API             │
                               │  3. Token received            │
                               │  4. Token injected into       │
                               │     hidden form field         │
                               └─────────────────────────────┘
                                           │
                                           ▼
agent-browser wait 30000         Extension resolves CAPTCHA...
                                           │
                                           ▼
agent-browser snapshot -i        Agent Browser reads elements
agent-browser click @e2          Form submits WITH valid token
                                           │
                                           ▼
                               "Verification successful!"

How the Extension Loads

When Agent Browser launches Chrome with the --extension flag:

Chrome starts with the CapSolver extension loaded (using --headless=new in headless mode, which supports Manifest V3 extensions)
The extension activates — its service worker starts and content scripts inject into every page
On pages with CAPTCHAs — the content script detects the widget, calls the CapSolver API, and injects the solution token into the page
Agent Browser operates normally — snapshots, clicks, and data extraction work as usual, with CAPTCHAs already handled

Full Configuration Reference

Here's a complete setup with all configuration options for the Agent Browser + CapSolver integration:

CLI Flags

bash Copy

agent-browser \
  --extension ~/capsolver-extension \
  --headed \
  --session-name my-session \
  --profile ./browser-data \
  open https://example.com

Environment Variables

bash Copy

# Set extension path as environment variable (avoids repeating --extension)
export AGENT_BROWSER_EXTENSIONS=~/capsolver-extension

# Now every command automatically loads the extension
agent-browser open https://example.com
agent-browser wait 30000
agent-browser snapshot -i

Config File (`agent-browser.json`)

Create an agent-browser.json in your project directory for persistent defaults:

json Copy

{
  "extension": ["~/capsolver-extension"],
  "sessionName": "my-project",
  "headed": false
}

Configuration Options

Option	Description
`--extension <path>`	Path to unpacked CapSolver extension directory containing `manifest.json`. Repeatable for multiple extensions.
`--headed`	Show browser window for visual debugging. Extensions work in both modes.
`--session-name <name>`	Auto-save/restore cookies and localStorage across browser restarts.
`--profile <path>`	Persistent browser profile directory (cookies, IndexedDB, cache).
`AGENT_BROWSER_EXTENSIONS`	Environment variable alternative to `--extension` flag. Comma-separated paths for multiple extensions.

The CapSolver API key is configured directly in the extension's assets/config.js file (see Step 3 above).

Troubleshooting

Extension Not Loading

Symptom: CAPTCHAs aren't being solved automatically.

Possible causes:

Wrong extension path — ensure manifest.json exists in the specified directory
Extension not compatible — use the Chrome version of the CapSolver extension (not Firefox)

Solution: Verify the path and check that the extension loads:

bash

# Verify manifest exists
ls ~/capsolver-extension/manifest.json

# Test in headed mode to visually confirm
agent-browser --extension ~/capsolver-extension --headed open chrome://extensions
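
In a script or CI job, the manifest check can run as a preflight step so a bad path fails fast instead of surfacing later as an unsolved CAPTCHA. The function below is a minimal sketch: `check_extension_dir` is a hypothetical helper, and it assumes only that an unpacked Chrome extension has a `manifest.json` containing a `manifest_version` key.

```bash
# check_extension_dir DIR
# Hypothetical helper (not part of agent-browser): returns 0 if DIR looks like
# an unpacked Chrome extension; otherwise prints a reason to stderr and
# returns 1.
check_extension_dir() {
  dir=$1
  if [ ! -d "$dir" ]; then
    echo "not a directory: $dir" >&2
    return 1
  fi
  if [ ! -f "$dir/manifest.json" ]; then
    echo "manifest.json missing in: $dir" >&2
    return 1
  fi
  # A cheap sanity check that the file is an extension manifest rather than
  # an unrelated or truncated JSON file. It does not validate the JSON.
  if ! grep -q '"manifest_version"' "$dir/manifest.json"; then
    echo "manifest.json has no manifest_version key: $dir" >&2
    return 1
  fi
  return 0
}
```

Used as a guard, this looks like `check_extension_dir "$HOME/capsolver-extension" && agent-browser --extension "$HOME/capsolver-extension" open https://example.com`.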

CAPTCHA Not Solved (Form Fails)

Possible causes:

Insufficient wait time — Increase to 60 seconds
Invalid API key — Check your CapSolver dashboard
Insufficient balance — Top up your CapSolver account
Extension not loaded — See "Extension Not Loading" above

Debug with console logs:

bash

agent-browser --extension ~/capsolver-extension open https://example.com
agent-browser wait 30000
agent-browser console  # Check for CapSolver messages
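
On a busy page the console output can be long, so it may help to filter it. The sketch below assumes that `agent-browser console` prints one message per line to standard output and that the extension's messages contain the word "CapSolver". This guide does not specify the exact message format, so treat both as assumptions. `filter_capsolver_logs` is a hypothetical helper name.

```bash
# filter_capsolver_logs
# Hypothetical helper: reads console output on stdin and prints only the
# lines that mention "capsolver" (case-insensitive). It exits non-zero when
# nothing matches, which suggests the extension has not logged anything.
filter_capsolver_logs() {
  grep -i 'capsolver'
}
```

Typical use would be `agent-browser console | filter_capsolver_logs || echo "no CapSolver messages found"`.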

Chrome Not Found

Symptom: agent-browser can't find a Chrome executable.

Solution: Run the install command to download Chrome for Testing:

bash

agent-browser install

Or point to a custom Chrome executable:

bash

agent-browser --executable-path /path/to/chrome open https://example.com
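
If you are not sure where Chrome lives on a machine, a short lookup can supply the path. This is a sketch under stated assumptions: `find_chrome` is a hypothetical helper, and the candidate names are common Linux package names for Chrome and Chromium, not a list that Agent Browser itself uses.

```bash
# find_chrome
# Hypothetical helper (not part of agent-browser): prints the path of the
# first Chrome or Chromium binary found on PATH, or returns 1 if none exists.
# The candidate names below are an assumption; adjust them for your system.
find_chrome() {
  for name in google-chrome google-chrome-stable chromium chromium-browser; do
    if chrome_bin=$(command -v "$name" 2>/dev/null); then
      printf '%s\n' "$chrome_bin"
      return 0
    fi
  done
  return 1
}
```

The result can be passed straight to the flag shown above: `agent-browser --executable-path "$(find_chrome)" open https://example.com`.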

Multiple Extensions

You can load multiple extensions by repeating the --extension flag:

bash

agent-browser \
  --extension ~/capsolver-extension \
  --extension ~/another-extension \
  open https://example.com

Best Practices

Use the AGENT_BROWSER_EXTENSIONS environment variable. Set it once in your shell profile or CI config, and every agent-browser command automatically loads CapSolver without repeating the flag.
Use generous wait times. A CAPTCHA typically resolves in 5-20 seconds, but network latency, complex challenges, or retries can add time, so a wait of 30-60 seconds is a safe default.
Keep your automation scripts clean. Don't add CAPTCHA-specific logic to your commands. The extension handles everything — your scripts should focus purely on navigation, interaction, and data extraction.
Monitor your CapSolver balance. Each CAPTCHA resolution costs credits. Check your balance at capsolver.com/dashboard regularly to avoid interruptions.
Use session persistence for repeat visits. Use --session-name or --profile to preserve cookies across runs. This can reduce CAPTCHA frequency since the site may recognize returning sessions.
Leverage headless mode in production. Unlike Playwright, Agent Browser supports extensions in headless mode. No need for Xvfb or virtual displays on servers — just run your commands directly.

Conclusion

Solving CAPTCHAs in Agent Browser with CapSolver comes down to three steps:

Download the CapSolver extension and configure your API key
Add --extension ~/capsolver-extension to your Agent Browser commands
Add a wait command before interacting with CAPTCHA-protected forms

Ready to get started? Sign up for CapSolver and use the bonus code AGENTBROWSER to get an extra 6% on your first top-up!

FAQ

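Do I need to write CAPTCHA-specific code?

No. The extension detects and solves CAPTCHAs in the background, so your scripts only need a wait command before they interact with a CAPTCHA-protected form.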

Can I run this in headless mode?

Yes! This is a major advantage over Playwright-based solutions. Agent Browser uses Chrome's --headless=new mode, which supports Manifest V3 extensions. No Xvfb or virtual display required.

Do I need Playwright or Node.js?

No. Agent Browser is a standalone Rust binary. You only need Node.js for the npm install step. The browser daemon runs natively without any JavaScript runtime.

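What CAPTCHA types does CapSolver support?

The examples in this guide cover reCAPTCHA and Cloudflare Turnstile. See the Supported CAPTCHA Types section above for the full list.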

How much does CapSolver cost?

CapSolver offers competitive pricing based on CAPTCHA type and volume. Visit capsolver.com for current pricing.

Is Vercel Agent Browser free?

Yes. Agent Browser is open source under the Apache 2.0 license. The CLI and all features are free to use. Visit the GitHub repository for more details.

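How long should I wait for the CAPTCHA to be solved?

A CAPTCHA typically resolves in 5-20 seconds. Waiting 30-60 seconds leaves room for network latency, complex challenges, and retries. See Recommended Wait Times above.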

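Can I use this with AI agents?

Yes. Agent Browser is built for AI agents, and because the extension needs no CAPTCHA-specific commands, an agent keeps using the same snapshot, click, and extraction commands it already uses.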
