How to Solve CAPTCHA Challenges for AI Agents: Data Extraction with n8n, CapSolver, and OpenClaw

Ethan Collins

Pattern Recognition Specialist

20-Mar-2026

Enable your AI assistant to trigger automated, server-side data extraction: no browser injection, no code.

The Challenge: CAPTCHAs Block Your AI Agent's Efficiency

When your AI Agent navigates the web, CAPTCHAs are the primary obstacle. Protected pages block the agent, forms cannot be submitted, and tasks stall, awaiting human intervention. This significantly limits the efficiency and autonomy of AI Agents in automated data scraping and information processing.

To address this core issue, we offer two powerful solutions combining OpenClaw and CapSolver:

Approach 1: Browser Extension Integration

Load the CapSolver Chrome extension into OpenClaw's browser environment. The extension invisibly detects and solves CAPTCHAs client-side, without n8n's involvement, allowing the AI Agent to seamlessly bypass verification while navigating pages. (See our full guide on the extension approach)

Approach 2: Server-side n8n Automation Pipeline (Focus of this Guide)

OpenClaw triggers a single webhook request, and n8n then solves the CAPTCHA via the CapSolver API, submits the form, and returns clean page content to your AI Agent. In this process, the AI Agent never directly handles CAPTCHA verification.

What you'll build:

A server-side CAPTCHA automation pipeline that OpenClaw triggers via webhook. n8n will leverage CapSolver to solve the CAPTCHA, submit the form, and return processed page content to your AI Agent, ensuring smooth execution of data extraction tasks.


Prerequisites

Before you begin, ensure you have the following environment and tools:

  1. OpenClaw installed and the gateway running (openclaw gateway start)
  2. n8n running locally (see the installation guide)
  3. CapSolver account with an API key (sign up here)
  4. CapSolver node available in n8n (official integration, already built in)

Setting Up CapSolver in n8n

CapSolver is available as an official integration in n8n, requiring no additional community node installation. You can find it directly in the node panel when building your workflows. To enable the CapSolver node to authenticate with your account, you need to create a credential in n8n.

Open your n8n canvas, click + to add a node, and search for CapSolver. This node handles task creation, polling, and token retrieval in a single unit.

Steps to add your credentials:

  1. In n8n, go to Credentials → New Credential
  2. Search for CapSolver
  3. Paste your API key from the CapSolver dashboard
  4. Save

Important: Every CapSolver node in your workflows references this credential. You only need to create it once; all your CAPTCHA-solving workflows share it. CapSolver also maintains an official GitHub Skill repository, where you can explore more integrations and use cases to extend your AI Agent's capabilities.


Workflow: OpenClaw CAPTCHA Automation Pipeline

Everything below is an example. The URLs, field names, CAPTCHA types, success conditions, and response structure are all specific to the demo site used here. Your real target will be different. Treat each node config as a starting point, not a finished setup.

How It Works

  1. Webhook: Receives a POST request from OpenClaw (or any HTTP client).
  2. Schedule Trigger: Fires automatically on a cron schedule (e.g., daily at 09:00).
  3. CapSolver (named "Scrape site" in this workflow): Solves the CAPTCHA using the configured task type.
  4. HTTP Request: Submits the solved token to the target site.
  5. If: Checks whether the response indicates success or failure.
  6. Edit Fields: Extracts pageText from the response.
  7. Save Result: Persists the result to storage (optional).
  8. Respond to Webhook: Returns the result to the caller.

Webhook ──┐
           ├──► Scrape site ──► HTTP Request ──► If ──► Edit Fields ──► Save Result ──► Respond to Webhook
Schedule ──┘                                           Edit Fields1 ─┘

Node Configuration Details

Create a new workflow called "OpenClaw/Capsolver/n8n Scraper" with the following nodes:

1. Webhook Node

  • Type: Webhook
  • HTTP Method: POST
  • Path: openclaw/scrape
  • Respond: Response Node (makes the call synchronous: the caller waits for the result)

2. Schedule Trigger Node

  • Type: Schedule Trigger
  • Cron: 0 9 * * * (daily at 09:00)
  • Connect to: Scrape site (same as Webhook)

Having two trigger nodes on one workflow is valid in n8n. Both feed into the same processing pipeline.

3. CapSolver Node

  • Type: CapSolver
  • Task Type: ReCaptchaV2TaskProxyless
  • Website URL: https://example.com/protected-page
  • Website Key: YOUR_SITE_KEY (find it in the page source; look for data-sitekey)
  • Credentials: your CapSolver API key

Using reCAPTCHA v3? Switch Task Type to ReCaptchaV3TaskProxyless and add a Page Action field (e.g., login, submit, homepage). This is required for v3: it's the action name the site registers with Google. You'll find it in the page source near the grecaptcha.execute(...) call.

Keep in mind that each CAPTCHA type has its own set of parameters: some fields that are optional in v2 become required in v3, and v3 may expose fields that don't exist in v2 at all (like minScore). Always check the CapSolver docs for the exact parameters required by your Task Type.
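To make the v2/v3 difference concrete, here is a minimal Python sketch of how the task object might be assembled outside n8n. The helper name build_task is hypothetical, and the field names (websiteURL, websiteKey, pageAction) are assumptions based on CapSolver's public API as commonly documented; confirm them, and the exact task type spelling, against the CapSolver docs before relying on this.

```python
def build_task(task_type, website_url, website_key, page_action=None, **extra):
    """Assemble a reCAPTCHA task object (hypothetical helper, not part of n8n)."""
    task = {
        "type": task_type,
        "websiteURL": website_url,
        "websiteKey": website_key,
    }
    # v3 needs the action name the site registers with Google.
    is_v3 = "v3" in task_type.lower()
    if is_v3 and not page_action:
        raise ValueError("reCAPTCHA v3 tasks need a page action")
    if page_action:
        task["pageAction"] = page_action
    # Any type-specific extras are passed through untouched.
    task.update(extra)
    return task
```

A v2 call needs only the type, URL, and site key; a v3 call fails fast if the page action is missing, which mirrors the "required for v3" rule described above.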

This node calls the CapSolver API, waits for the solve (typically 5–20 seconds), and returns the token in $json.data.solution.gRecaptchaResponse.
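For readers who want to see what "task creation, polling, and token retrieval" amounts to, the following Python sketch approximates that loop. It is not the n8n node's actual implementation. The endpoints (createTask, getTaskResult) and response fields are assumptions drawn from CapSolver's public API, and the poll interval and poll limit are arbitrary illustrative values.

```python
import json
import time
import urllib.request

CAPSOLVER_API = "https://api.capsolver.com"  # assumed base URL


def post_json(url, payload, timeout=30):
    """POST a JSON payload and return the decoded JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def solve_recaptcha(api_key, task_type, website_url, website_key,
                    post=post_json, poll_interval=2.0, max_polls=60,
                    sleep=time.sleep):
    """Create a task, poll until it is ready, and return the token."""
    created = post(f"{CAPSOLVER_API}/createTask", {
        "clientKey": api_key,
        "task": {
            "type": task_type,
            "websiteURL": website_url,
            "websiteKey": website_key,
        },
    })
    if created.get("errorId"):
        raise RuntimeError(f"createTask failed: {created.get('errorDescription')}")

    task_id = created["taskId"]
    for _ in range(max_polls):
        sleep(poll_interval)
        result = post(f"{CAPSOLVER_API}/getTaskResult", {
            "clientKey": api_key,
            "taskId": task_id,
        })
        if result.get("errorId"):
            raise RuntimeError(f"getTaskResult failed: {result.get('errorDescription')}")
        if result.get("status") == "ready":
            return result["solution"]["gRecaptchaResponse"]
    raise TimeoutError("CapSolver did not return a token in time")
```

The post and sleep parameters exist only so the loop can be exercised without network access; in n8n none of this is needed, because the CapSolver node does the equivalent work internally.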

4. HTTP Request Node

  • Method: POST
  • URL: https://example.com/protected-page
  • Body: form-urlencoded
    • g-recaptcha-response = ={{ $json.data.solution.gRecaptchaResponse }}
  • Headers: standard browser headers (User-Agent, Accept, Referer, Origin, etc.)

This submits the form with the solved token, exactly as a browser would.

Heads up: How the token is submitted varies by site. Most forms expect it in the request body as g-recaptcha-response, but some sites send it as a JSON field, a custom header, a cookie, or a field with a different name. Use your browser's DevTools (Network tab) to inspect what a real submission looks like and mirror that in your HTTP Request node.
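As a rough equivalent of the HTTP Request node configuration above, this Python sketch builds (but does not send) a form-urlencoded POST carrying the token. The helper name build_form_submission and the User-Agent string are placeholders; copy the real headers from your own browser session.

```python
import urllib.parse
import urllib.request


def build_form_submission(url, token, extra_fields=None,
                          token_field="g-recaptcha-response"):
    """Build a form-urlencoded POST that carries the solved token."""
    fields = dict(extra_fields or {})
    fields[token_field] = token
    body = urllib.parse.urlencode(fields).encode("ascii")
    parts = urllib.parse.urlsplit(url)
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            # Placeholder value; mirror what your browser actually sends.
            "User-Agent": "Mozilla/5.0 (X11; Linux x86_64)",
            "Referer": url,
            "Origin": f"{parts.scheme}://{parts.netloc}",
        },
    )
```

Passing the returned object to urllib.request.urlopen would send it. Keeping construction separate from sending makes it easy to compare the request with what DevTools shows for a real submission.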

5. If Node (Success Check)

  • Condition: $json.data contains "recaptcha-success"
  • True branch → Edit Fields (success)
  • False branch → Edit Fields1 (failure)

6. Edit Fields / Edit Fields1 Nodes

Both branches set a single field:

  • pageText = {{ $json.data }}

The success and failure branches both pass pageText, so the caller can inspect the HTML to determine the outcome.

Adapt this to your page: How you parse and use the response data depends entirely on what you want and what the target site returns. Some pages return JSON, others return HTML, some redirect on success. You may want to extract a specific field, parse a table, check for a session cookie, or strip the HTML entirely. The success condition ("recaptcha-success") is also just an example; your site will have its own indicator. These nodes are a starting point; expect to customize them for your use case.
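Because both branches return the same pageText field, the caller has to decide what counts as success. A minimal caller-side sketch in Python, assuming the demo marker "recaptcha-success"; the function name and the extra success flag are illustrative additions, not something the workflow emits:

```python
def classify_page(page_text, success_marker="recaptcha-success"):
    """Mirror the If node: success means the marker appears in the response."""
    return {
        "success": success_marker in page_text,
        "pageText": page_text,
    }
```

Swap success_marker for whatever your target site returns on a successful submission, such as a redirect URL fragment or a JSON status string.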

7. Save Result Node

This node passes { pageText, savedAt } to the webhook response and optionally persists the result to storage.

Note: n8n's Code node runs in a sandboxed VM that, by default, blocks Node.js built-ins like require('fs'). Use an Execute Command node instead to write to disk, or replace this node entirely with any n8n integration that fits your stack.

Option A: Local JSON File (Execute Command Node)

Use two nodes chained together:

Node 7a: Prepare Data (Code node):

// Take the first incoming item (the output of Edit Fields).
const item = $input.first().json;
const savedAt = new Date().toISOString();
const data = { pageText: item.pageText || '', savedAt };
// Base64-encode the JSON so it travels as a single shell-safe argument.
const encoded = Buffer.from(JSON.stringify(data)).toString('base64');
const cmd = 'python3 /path/to/save-result.py ' + encoded;
return [{ json: { cmd, pageText: data.pageText, savedAt } }];

Node 7b: Save Result (Execute Command node):

  • Command: ={{ $json.cmd }}

Where save-result.py reads the base64 argument and appends to a local JSON file.
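The guide does not include save-result.py itself, so the following is only one possible sketch of it. It assumes the results are kept as a single JSON array; the output path results.json is a hypothetical placeholder.

```python
import base64
import json
import sys
from pathlib import Path

RESULTS_FILE = Path("results.json")  # hypothetical output path


def append_result(encoded, path=RESULTS_FILE):
    """Decode one base64 JSON record and append it to a JSON array file."""
    record = json.loads(base64.b64decode(encoded).decode("utf-8"))
    if path.exists():
        records = json.loads(path.read_text(encoding="utf-8"))
    else:
        records = []
    records.append(record)
    path.write_text(json.dumps(records, indent=2), encoding="utf-8")
    return len(records)


# Run only when invoked as: python3 save-result.py <base64-json>
if __name__ == "__main__" and len(sys.argv) == 2:
    print(f"saved record {append_result(sys.argv[1])}")
```

Rewriting the whole array on every run is simple but slows down as the file grows. For long-lived pipelines, a JSON Lines file (one record per line, opened in append mode) or one of the storage nodes in Option B may be a better fit.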

Option B: Any n8n-Supported Storage

n8n has native nodes for virtually every storage system. Replace Node 7 with any of these:

| Storage | What the n8n node does |
| --- | --- |
| Google Sheets | Append a row with pageText + timestamp |
| Airtable | Create a record |
| Notion | Create a database entry |
| PostgreSQL / MySQL | INSERT into a table |
| AWS S3 / Cloudflare R2 | Upload a JSON file |
| Slack / Telegram | Post the result to a channel |

Just connect the node between Edit Fields and Respond to Webhook, and configure it to store $json.pageText and a timestamp.

8. Respond to Webhook Node

  • Respond With: JSON
  • Response Body: ={{ JSON.stringify($json) }}
  • Continue on Fail: enabled (required: when triggered by Schedule, there's no active webhook to respond to)

Activate the workflow once it's built. The webhook path will be live at:

POST http://127.0.0.1:3005/webhook/openclaw/scrape

OpenClaw Integration

To connect OpenClaw to this workflow, create a trigger script and register it.

Create the trigger script:

# Make sure the scripts directory exists before writing the file.
mkdir -p ~/.openclaw/scripts
cat > ~/.openclaw/scripts/extract-data << 'EOF'
#!/usr/bin/env bash
curl -s -X POST http://127.0.0.1:3005/webhook/openclaw/scrape
EOF
chmod +x ~/.openclaw/scripts/extract-data

This is the only thing OpenClaw runs. No arguments, no site key, no URL: the workflow knows what to scrape.

How OpenClaw gets the data: The script waits for n8n to finish (CapSolver solve + form submission), then receives { pageText, savedAt } directly in the Webhook response. No file reading is involved; the data comes back synchronously over HTTP. The response shape is just what this workflow returns. If you need different fields (e.g., a parsed price, a login status, a structured JSON object), modify the Edit Fields and Save Result nodes to return whatever your use case requires.
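If you would rather call the webhook from Python than from curl, a sketch along these lines should behave the same way. The function names and the 120-second timeout are assumptions chosen to leave headroom for the solve; the URL is the one used throughout this guide.

```python
import json
import urllib.request

WEBHOOK_URL = "http://127.0.0.1:3005/webhook/openclaw/scrape"


def parse_result(raw_body):
    """Decode the webhook response and return (pageText, savedAt)."""
    payload = json.loads(raw_body)
    return payload.get("pageText", ""), payload.get("savedAt")


def trigger_extraction(url=WEBHOOK_URL, timeout=120):
    """POST to the webhook and block until n8n returns the result."""
    req = urllib.request.Request(url, data=b"", method="POST")
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return parse_result(resp.read().decode("utf-8"))
```

The call is synchronous by design: the Webhook node is set to respond via the Respond to Webhook node, so the HTTP connection stays open until the pipeline finishes.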

Register the command in TOOLS.md:

Open ~/.openclaw/workspace/TOOLS.md and add the following entry so OpenClaw knows about the command:

### extract-data

Run: `/root/.openclaw/scripts/extract-data`
Returns fresh `{ pageText, savedAt }` from the live pipeline. Return the `pageText` field from the JSON response.

Test Your AI Agent Automation Flow

Trigger from OpenClaw: send this command to your AI Agent (via Discord, Telegram, WhatsApp, or any channel):

extract data

OpenClaw runs the extract-data script, which fires the webhook and waits. n8n solves the CAPTCHA, submits the form, and returns { pageText, savedAt } directly in the HTTP response. OpenClaw receives and summarizes the result, typically within 10–40 seconds.

Test from the terminal:

curl -s -X POST http://127.0.0.1:3005/webhook/openclaw/scrape

Scheduled runs: the workflow also runs automatically every day at 09:00 via the Schedule Trigger node.


Adapting the Workflow to Your Target Site

This guide's workflow is built for a specific demo site. For your actual target, every part of the pipeline may require adjustment. Here's what to look at:

1. CAPTCHA Type

Not all sites use reCAPTCHA v2. Change the CapSolver node's Task Type to match what the target uses:

| What you see on the site | n8n Node Operation |
| --- | --- |
| "I'm not a robot" checkbox | reCAPTCHA v2 |
| Invisible reCAPTCHA (auto-fires) | reCAPTCHA v2 |
| reCAPTCHA v3 score | reCAPTCHA v3 |
| Cloudflare Turnstile widget | Cloudflare Turnstile |
| Cloudflare Challenge (5s page) | Cloudflare Challenge |
| GeeTest puzzle (v3) | GeeTest V3 |
| GeeTest puzzle (v4) | GeeTest V4 |
| DataDome bot protection | DataDome |
| AWS WAF CAPTCHA | AWS WAF |
| MTCaptcha | MTCaptcha |

Also update Website URL and Website Key to match your target. You can find the site key in the page source (look for the data-sitekey attribute), or let the CapSolver browser extension auto-detect it.

2. How the Token Gets Submitted

This is the part that varies the most between sites. The demo site uses a simple form POST with the token in a body field. Your target might be different:

As a form field (most common)

POST /submit
Content-Type: application/x-www-form-urlencoded

g-recaptcha-response=TOKEN&other_field=value

In a JSON body

POST /api/login
Content-Type: application/json

{ "username": "...", "password": "...", "captchaToken": "TOKEN" }

In a header

POST /api/action
X-Captcha-Token: TOKEN

As a cookie

POST /submit
Cookie: cf_clearance=TOKEN

In the URL as a query parameter

GET /search?q=query&token=TOKEN

Inspect the network tab in your browser's dev tools when you manually solve the CAPTCHA on your target site. Look for the request that fires immediately after the solve; it shows you exactly where the token goes.
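The five placements above can be summarized in one small Python helper. This is an illustrative sketch only: place_token is a hypothetical name, and the default field, header, and cookie names are simply the ones used in the examples in this section, not values every site will accept.

```python
import json
import urllib.parse

# Default names taken from the examples in this section.
DEFAULT_NAMES = {
    "form": "g-recaptcha-response",
    "json": "captchaToken",
    "header": "X-Captcha-Token",
    "cookie": "cf_clearance",
    "query": "token",
}


def place_token(token, style, name=None):
    """Return headers, body, and query string for one token placement style."""
    if style not in DEFAULT_NAMES:
        raise ValueError(f"unknown placement style: {style}")
    name = name or DEFAULT_NAMES[style]
    request = {"headers": {}, "body": None, "query": ""}
    if style == "form":
        request["headers"]["Content-Type"] = "application/x-www-form-urlencoded"
        request["body"] = urllib.parse.urlencode({name: token})
    elif style == "json":
        request["headers"]["Content-Type"] = "application/json"
        request["body"] = json.dumps({name: token})
    elif style == "header":
        request["headers"][name] = token
    elif style == "cookie":
        request["headers"]["Cookie"] = f"{name}={token}"
    else:  # query
        request["query"] = urllib.parse.urlencode({name: token})
    return request
```

Whichever style your target uses, the matching settings in the HTTP Request node are the body type, the header list, or the query parameters.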

3. The HTTP Request Node

Once you know how the token is submitted, configure the HTTP Request node accordingly:

  • Method: match the site (POST, GET, PUT)
  • URL: the exact endpoint that receives the form or API call
  • Headers: copy the browser headers from your network tab; User-Agent, Referer, Origin, Accept, and Content-Type are usually required
  • Body: use form-urlencoded, JSON, or multipart depending on the endpoint
  • Cookies: if the site uses session cookies, either pass them as headers or use a prior HTTP Request node to obtain them via a login step

4. Extracting the Data You Need

The workflow currently passes the full HTML of the response as pageText. Depending on your use case, you may want to post-process it:

  • Add a Code node after the HTTP Request to parse the HTML and extract specific fields (product name, price, status)
  • Use n8n's HTML Extract node to pull data from specific CSS selectors without writing code
  • Store structured fields instead of raw HTML; they are easier to query and compare across runs
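As one example of post-processing, the sketch below strips pageText down to visible text using only Python's standard library. It illustrates the idea rather than the n8n Code node itself (which runs JavaScript); the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> content."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.chunks.append(text)


def html_to_text(page_text):
    """Return the visible text of an HTML document as one string."""
    parser = TextExtractor()
    parser.feed(page_text)
    parser.close()
    return " ".join(parser.chunks)
```

For selector-based extraction (a specific price or table cell), a dedicated HTML parsing library or n8n's own HTML node is usually a better choice than extending this class.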

5. Multi-Step Flows

Some targets require more than one request:

  1. GET the page to obtain a CSRF token or session cookie
  2. Solve the CAPTCHA
  3. POST the form with CSRF token + captcha token + credentials

Chain multiple HTTP Request nodes in n8n to handle this. Pass values between nodes using $json expressions.
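The three-step flow can be sketched in Python as follows. The hidden field name csrf_token in the usage note and the helper names are hypothetical; real sites name their CSRF fields differently, and some deliver the token in a cookie or meta tag instead.

```python
from html.parser import HTMLParser


class HiddenInputFinder(HTMLParser):
    """Collect name/value pairs from <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        if attr.get("type") == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def build_login_form(page_html, captcha_token, credentials,
                     token_field="g-recaptcha-response"):
    """Merge hidden fields from step 1 with the token from step 2 for step 3."""
    finder = HiddenInputFinder()
    finder.feed(page_html)
    form = dict(finder.fields)         # CSRF token and other hidden fields
    form.update(credentials)           # username, password, ...
    form[token_field] = captcha_token  # solved CAPTCHA token
    return form
```

For a page containing a hidden input named csrf_token, the returned dictionary holds that value alongside the credentials and the CAPTCHA token, ready to be form-encoded for the final POST. In n8n, the same hand-off happens through $json expressions between the chained HTTP Request nodes.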


Troubleshooting

"Failed to reach n8n scraper"

{"success": false, "error": "Failed to reach n8n scraper. Is the OpenClaw CAPTCHA Scraper workflow active?"}

Check: Is n8n running? Is the workflow activated? Open n8n and verify the workflow is Active (green toggle).

CapSolver Timeout / No Token

Possible causes:

  • Invalid API key: verify the CapSolver credential saved in n8n (Credentials)
  • Insufficient balance: top up at capsolver.com/dashboard
  • Network issue between n8n server and CapSolver API

pageText is Empty or Contains an Error Page

  • The HTTP Request URL or form field name may be wrong for your target
  • Check the g-recaptcha-response field name; some sites use a different one
  • Enable fullResponse: true in the HTTP Request node to see the status code

Scheduled Run Fails at "Respond to Webhook"

This is expected and safe. When triggered by Schedule, there's no active webhook context. Make sure Continue on Fail is enabled on the Respond to Webhook node.


Complete Configuration Reference

n8n Workflow Nodes Summary

| Node | Type | Key Config |
| --- | --- | --- |
| Webhook | n8n-nodes-base.webhook | POST, path: openclaw/scrape, responseMode: responseNode |
| Schedule Trigger | n8n-nodes-base.scheduleTrigger | Cron: 0 9 * * * |
| Scrape site | n8n-nodes-capsolver.capSolver | Task: ReCaptchaV2TaskProxyless |
| HTTP Request | n8n-nodes-base.httpRequest | POST to target URL with token in body |
| If | n8n-nodes-base.if | Check $json.data contains "recaptcha-success" |
| Edit Fields | n8n-nodes-base.set | pageText = $json.data |
| Save Result | n8n-nodes-base.executeCommand or any storage node | Persist result (file, DB, Sheets, etc.) |
| Respond to Webhook | n8n-nodes-base.respondToWebhook | JSON, continueOnFail: true |

CAPTCHA Task Types

| CAPTCHA | n8n Node Operation |
| --- | --- |
| reCAPTCHA v2 (checkbox) | reCAPTCHA v2 |
| reCAPTCHA v2 (invisible) | reCAPTCHA v2 |
| reCAPTCHA v3 | reCAPTCHA v3 |
| Cloudflare Turnstile | Cloudflare Turnstile |
| Cloudflare Challenge | Cloudflare Challenge |
| GeeTest V3 | GeeTest V3 |
| GeeTest V4 | GeeTest V4 |
| DataDome | DataDome |
| AWS WAF | AWS WAF |
| MTCaptcha | MTCaptcha |

Conclusion

The OpenClaw + n8n + CapSolver pipeline provides a production-grade data extraction setup that:

  • Runs on demand when your AI Agent requests.
  • Runs automatically on a schedule without any human input.
  • Never requires a browser or display.
  • Keeps CAPTCHA handling completely invisible, both to you and to the AI Agent.

The AI Agent simply issues an "extract data" command and receives clean page content. CapSolver handles the difficult part, n8n orchestrates the flow, and OpenClaw serves as the interface.


Ready to get started? Sign up for CapSolver and use bonus code OPENCLAW for an extra 6% bonus on your first recharge!


Frequently Asked Questions

Do I need to tell OpenClaw about CapSolver or CAPTCHAs?

No. OpenClaw simply runs a script that fires an HTTP request. n8n handles everything else. Your AI Agent has no knowledge of CAPTCHAs; it just triggers a job and reads the result.

Can I point this at a different site?

Yes, but you'll likely need to adjust more than just the URL. Every site submits the CAPTCHA token differently: some use form fields, some JSON bodies, some headers or cookies. See the "Adapting the Workflow to Your Target Site" section above for a full breakdown of what to check and change.

What if my target uses Turnstile instead of reCAPTCHA?

Change the CapSolver node's Task Type to AntiTurnstileTaskProxyless. Then inspect your target's network requests to find where the Turnstile token gets submitted. It's often in a hidden form field called cf-turnstile-response, but some implementations pass it in a JSON body, a header, or a cookie instead.

How many results are stored?

That depends on your storage choice. With a local JSON file, you can keep as many as you like. With Google Sheets or a database, every run appends a row indefinitely. Configure the Save Result node to match your retention needs.

Can I trigger this from a cron job instead of OpenClaw?

Yes. The webhook endpoint is just an HTTP POST. Anything that can make an HTTP request can trigger it:

curl -s -X POST http://127.0.0.1:3005/webhook/openclaw/scrape

How much does each extraction cost?

Each run costs one CapSolver credit for the CAPTCHA solve. reCAPTCHA v2 is among the cheapest types. Check current pricing at capsolver.com.

Is OpenClaw free?

OpenClaw is open-source and free to self-host. You'll need API credits for your AI model provider and CapSolver for CAPTCHA solving.

Compliance Disclaimer: The information provided on this blog is for informational purposes only. CapSolver is committed to compliance with all applicable laws and regulations. The use of the CapSolver network for illegal, fraudulent, or abusive activities is strictly prohibited and will be investigated. Our captcha-solving solutions enhance user experience while ensuring 100% compliance in helping solve captcha difficulties during public data crawling. We encourage responsible use of our services. For more information, please visit our Terms of Service and Privacy Policy.
