How to Solve CAPTCHA Challenges for AI Agents: Data Extraction with n8n, CapSolver, and OpenClaw

Ethan Collins
Pattern Recognition Specialist
20-Mar-2026

Enable your AI assistant to trigger automated, server-side data extraction: no browser injection, no code.
The Challenge: CAPTCHAs Block Your AI Agent's Efficiency
When your AI Agent navigates the web, CAPTCHAs are the primary obstacle. Protected pages block the agent, forms cannot be submitted, and tasks stall, awaiting human intervention. This significantly limits the efficiency and autonomy of AI Agents in automated data scraping and information processing.
To address this core issue, we offer two powerful solutions combining OpenClaw and CapSolver:
Approach 1: Browser Extension Integration
Load the CapSolver Chrome extension into OpenClaw's browser environment. The extension invisibly detects and solves CAPTCHAs client-side, without n8n's involvement, allowing the AI Agent to seamlessly bypass verification while navigating pages. (See our full guide on the extension approach)
Approach 2: Server-side n8n Automation Pipeline (Focus of this Guide)
OpenClaw triggers a single webhook request, and n8n then solves the CAPTCHA via the CapSolver API, submits the form, and returns clean page content to your AI Agent. In this process, the AI Agent never directly handles CAPTCHA verification.
What you'll build:
A server-side CAPTCHA automation pipeline that OpenClaw triggers via webhook. n8n will leverage CapSolver to solve the CAPTCHA, submit the form, and return processed page content to your AI Agent, ensuring smooth execution of data extraction tasks.
Prerequisites
Before you begin, ensure you have the following environment and tools:
- OpenClaw installed and the gateway running (openclaw gateway start)
- n8n running locally (installation guide)
- CapSolver account with API key (sign up here)
- CapSolver node available in n8n (official integration, already built in)
Setting Up CapSolver in n8n
CapSolver is available as an official integration in n8n, requiring no additional community node installation. You can find it directly in the node panel when building your workflows. To enable the CapSolver node to authenticate with your account, you need to create a credential in n8n.
Open your n8n canvas, click + to add a node, and search for CapSolver. This node handles task creation, polling, and token retrieval in a single unit.
Steps to add your credentials:
- In n8n, go to Credentials → New Credential
- Search for CapSolver
- Paste your API key from the CapSolver dashboard
- Save
Important: Every CapSolver node in your workflows will reference this credential. You only need to create it once; all your CAPTCHA-solving workflows will share it. CapSolver also maintains a GitHub Skill repository where you can explore more integrations and use cases that extend your AI Agent's capabilities.
Workflow: OpenClaw CAPTCHA Automation Pipeline
Everything below is an example. The URLs, field names, CAPTCHA types, success conditions, and response structure are all specific to the demo site used here. Your real target will be different. Treat each node config as a starting point, not a finished setup.
How It Works
- Webhook: Receives a POST request from OpenClaw (or any HTTP client).
- Schedule Trigger: Fires automatically on a cron schedule (e.g., daily at 09:00).
- CapSolver (named "Scrape site" in this workflow): Solves the CAPTCHA using the configured task type.
- HTTP Request: Submits the solved token to the target site.
- If: Checks whether the response indicates success or failure.
- Edit Fields: Extracts pageText from the response.
- Save Result: Persists the result to storage (optional).
- Respond to Webhook: Returns the result to the caller.
Webhook ---+
           +--> Scrape site --> HTTP Request --> If --> Edit Fields ---+--> Save Result --> Respond to Webhook
Schedule --+                                     +----> Edit Fields1 --+
Node Configuration Details
Create a new workflow called "OpenClaw/Capsolver/n8n Scraper" with the following nodes:
1. Webhook Node
- Type: Webhook
- HTTP Method: POST
- Path: openclaw/scrape
- Respond: Response Node (makes the call synchronous; the caller waits for the result)
2. Schedule Trigger Node
- Type: Schedule Trigger
- Cron: 0 9 * * * (daily at 09:00)
- Connect to: Scrape site (same as Webhook)
Having two trigger nodes on one workflow is valid in n8n. Both feed into the same processing pipeline.
3. CapSolver Node
- Type: CapSolver
- Task Type: ReCaptchaV2TaskProxyless
- Website URL: https://example.com/protected-page
- Website Key: YOUR_SITE_KEY (find it in the page source; look for data-sitekey)
- Credentials: your CapSolver API key
Using reCAPTCHA v3? Switch Task Type to ReCaptchaV3TaskProxyless and add a Page Action field (e.g., login, submit, homepage). This is required for v3: it's the action name the site registers with Google. You'll find it in the page source near the grecaptcha.execute(...) call.
Keep in mind that each CAPTCHA type has its own set of parameters: some fields that are optional in v2 become required in v3, and v3 may expose fields that don't exist in v2 at all (like minScore). Always check the CapSolver docs for the exact parameters required by your Task Type.
This node calls the CapSolver API, waits for the solve (typically 5 to 20 seconds), and returns the token in $json.data.solution.gRecaptchaResponse.
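For readers who want to see what "calls the API and waits" amounts to, here is a minimal sketch of the create-then-poll flow against CapSolver's REST endpoints (createTask and getTaskResult). It is an illustration only, not the node's actual source: the function names, the polling interval, and the timeout are assumptions, and the task type string is passed through as written in this guide, so confirm its exact spelling in the CapSolver docs. It assumes Node 18 or later for the global fetch.

```javascript
const CAPSOLVER_API = 'https://api.capsolver.com';

// Build the createTask payload. pageAction is only included when provided
// (it is the extra field reCAPTCHA v3 needs).
function buildCreateTaskPayload({ clientKey, type, websiteURL, websiteKey, pageAction }) {
  const task = { type, websiteURL, websiteKey };
  if (pageAction) task.pageAction = pageAction;
  return { clientKey, task };
}

// Small helper: POST a JSON body and parse the JSON reply.
async function postJson(path, body) {
  const res = await fetch(`${CAPSOLVER_API}${path}`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(body),
  });
  return res.json();
}

// Create a task, then poll until it is ready or the deadline passes.
// intervalMs and timeoutMs are illustrative defaults, not CapSolver recommendations.
async function solveCaptcha(options, { intervalMs = 3000, timeoutMs = 120000, post = postJson } = {}) {
  const created = await post('/createTask', buildCreateTaskPayload(options));
  if (created.errorId) throw new Error(`createTask failed: ${created.errorDescription}`);

  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
    const result = await post('/getTaskResult', {
      clientKey: options.clientKey,
      taskId: created.taskId,
    });
    if (result.errorId) throw new Error(`getTaskResult failed: ${result.errorDescription}`);
    if (result.status === 'ready') return result.solution.gRecaptchaResponse;
  }
  throw new Error('Timed out waiting for CapSolver');
}
```

In the workflow itself none of this is needed; the CapSolver node performs the equivalent steps and exposes the token on its output item.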
4. HTTP Request Node
- Method: POST
- URL: https://example.com/protected-page
- Body: form-urlencoded, with the field g-recaptcha-response set to ={{ $json.data.solution.gRecaptchaResponse }}
- Headers: standard browser headers (User-Agent, Accept, Referer, Origin, etc.)
This submits the form with the solved token, exactly as a browser would.
Heads up: How the token is submitted varies by site. Most forms expect it in the request body as g-recaptcha-response, but some sites send it as a JSON field, a custom header, or a cookie, or use a different field name. Use your browser's DevTools (Network tab) to inspect what a real submission looks like and mirror that in your HTTP Request node.
5. If Node (Success Check)
- Condition: $json.data contains "recaptcha-success"
- True branch → Edit Fields (success)
- False branch → Edit Fields1 (failure)
6. Edit Fields / Edit Fields1 Nodes
Both branches set a single field:
pageText={{ $json.data }}
The success and failure branches both pass pageText, so the caller can inspect the HTML to determine the outcome.
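To make the branching concrete, the If node and the two Edit Fields nodes can be read as one small function. This is a sketch under assumptions: routeResponse is a hypothetical name, and the extra success flag is an addition for illustration (the workflow as described only passes pageText).

```javascript
// Mirrors the If node plus both Edit Fields nodes.
// successMarker is site-specific; "recaptcha-success" is the demo site's value.
function routeResponse(body, successMarker = 'recaptcha-success') {
  // Normalise the HTTP response body to a string.
  let text = '';
  if (typeof body === 'string') text = body;
  else if (body != null) text = JSON.stringify(body);

  const success = text.includes(successMarker);
  return {
    branch: success ? 'Edit Fields' : 'Edit Fields1', // which node the item would reach
    success,                                          // illustrative extra field
    pageText: text,                                   // what both branches set
  };
}
```

Returning an explicit flag alongside pageText is one way to spare the caller from re-inspecting the HTML; whether that is worth adding depends on how your agent consumes the result.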
Adapt this to your page: How you parse and use the response data depends entirely on what you want and what the target site returns. Some pages return JSON, others return HTML, some redirect on success. You may want to extract a specific field, parse a table, check for a session cookie, or strip the HTML entirely. The success condition ("recaptcha-success") is also just an example; your site will have its own indicator. These nodes are a starting point; expect to customize them for your use case.
7. Save Result Node
This node passes { pageText, savedAt } to the webhook response and optionally persists the result to storage.
Note: n8n's Code node runs in a sandboxed VM that by default blocks Node.js built-in modules, so require('fs') fails. Use an Execute Command node instead to write to disk, or replace this node entirely with any n8n integration that fits your stack.
Option A โ Local JSON File (Execute Command Node):
Use two nodes chained together:
Node 7a โ Prepare Data (Code node):
javascript
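// Take the item produced by the Edit Fields branch.
const item = $input.first().json;
const now = new Date();
const savedAt = now.toISOString();
const data = { pageText: item.pageText || '', savedAt };
// Base64 keeps the JSON safe to pass as a single shell argument.
const encoded = Buffer.from(JSON.stringify(data)).toString('base64');
const cmd = 'python3 /path/to/save-result.py ' + encoded;
// Expose the command for Node 7b and keep the payload for the webhook response.
return [{ json: { cmd, pageText: data.pageText, savedAt } }];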
Node 7b โ Save Result (Execute Command node):
- Command: ={{ $json.cmd }}
Where save-result.py reads the base64 argument and appends to a local JSON file.
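The guide does not show save-result.py itself. As a rough illustration of the decode-and-append logic it describes, here is the same idea sketched in JavaScript (kept in JavaScript only for consistency with the Code node above). It is not the author's script: the function names and the choice of a JSON array as the file format are assumptions.

```javascript
// Decode the base64 argument produced by the Prepare Data node.
function decodePayload(encoded) {
  return JSON.parse(Buffer.from(encoded, 'base64').toString('utf8'));
}

// Append one record to a JSON array held as text.
// An empty or unreadable file is treated as an empty list (assumed behaviour).
function appendRecord(existingText, record) {
  let records = [];
  try {
    const parsed = JSON.parse(existingText);
    if (Array.isArray(parsed)) records = parsed;
  } catch {
    // Nothing usable on disk yet: start a new list.
  }
  records.push(record);
  return JSON.stringify(records, null, 2);
}
```

A standalone script would read the current file contents, pass them through appendRecord together with decodePayload(process.argv[2]), and write the returned text back. Note that a very large pageText can exceed the operating system's command-line length limit, which is one reason to prefer a storage node for big pages.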
Option B โ Any n8n-Supported Storage:
n8n has native nodes for virtually every storage system. Replace Node 7 with any of these:
| Storage | What the n8n node does |
|---|---|
| Google Sheets | Append a row with pageText + timestamp |
| Airtable | Create a record |
| Notion | Create a database entry |
| PostgreSQL / MySQL | INSERT into a table |
| AWS S3 / Cloudflare R2 | Upload a JSON file |
| Slack / Telegram | Post the result to a channel |
Just connect the node between Edit Fields and Respond to Webhook, and configure it to store $json.pageText and a timestamp.
8. Respond to Webhook Node
- Respond With: JSON
- Response Body: ={{ JSON.stringify($json) }}
- Continue on Fail: enabled (required: when triggered by Schedule, there's no active webhook to respond to)
Activate the workflow once it's built. The webhook path will be live at:
POST http://127.0.0.1:3005/webhook/openclaw/scrape
OpenClaw Integration
To connect OpenClaw to this workflow, create a trigger script and register it.
Create the trigger script:
bash
mkdir -p ~/.openclaw/scripts
cat > ~/.openclaw/scripts/extract-data << 'EOF'
#!/usr/bin/env bash
curl -s -X POST http://127.0.0.1:3005/webhook/openclaw/scrape
EOF
chmod +x ~/.openclaw/scripts/extract-data
This is the only thing OpenClaw runs. No arguments, no site key, no URL: the workflow knows what to scrape.
How OpenClaw gets the data: The script waits for n8n to finish (CapSolver solve + form submission), then receives { pageText, savedAt } directly in the Webhook response. No file reading is involved; the data comes back synchronously over HTTP. This response shape is simply what this workflow returns. If you need different fields (e.g., a parsed price, a login status, a structured JSON object), modify the Edit Fields and Save Result nodes to return whatever your use case requires.
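If you would rather call the webhook from a script than from curl, the synchronous exchange can be sketched as below. This is a hedged example, not part of the OpenClaw setup: parseScrapeResult and triggerScrape are hypothetical names, the 60-second timeout is an assumption, and it relies on Node 18 or later for the global fetch and AbortSignal.timeout.

```javascript
const WEBHOOK_URL = 'http://127.0.0.1:3005/webhook/openclaw/scrape';

// Validate the shape this workflow returns: { pageText, savedAt }.
function parseScrapeResult(rawBody) {
  const data = JSON.parse(rawBody);
  if (!data || typeof data.pageText !== 'string' || typeof data.savedAt !== 'string') {
    throw new Error('Unexpected response shape from the workflow');
  }
  return { pageText: data.pageText, savedAt: new Date(data.savedAt) };
}

// POST to the webhook and wait for the reply; the request stays open
// while n8n solves the CAPTCHA and submits the form.
async function triggerScrape(timeoutMs = 60000) {
  const res = await fetch(WEBHOOK_URL, {
    method: 'POST',
    signal: AbortSignal.timeout(timeoutMs),
  });
  if (!res.ok) throw new Error(`Webhook returned HTTP ${res.status}`);
  return parseScrapeResult(await res.text());
}
```

Validating the shape up front makes a changed workflow fail loudly instead of handing the agent an undefined pageText.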
Register the command in TOOLS.md:
Open ~/.openclaw/workspace/TOOLS.md and add the following entry so OpenClaw knows about the command:
markdown
### extract-data
Run: `/root/.openclaw/scripts/extract-data`
Returns fresh `{ pageText, savedAt }` from the live pipeline. Return the `pageText` field from the JSON response.
Test Your AI Agent Automation Flow
Trigger from OpenClaw: send this command to your AI Agent (via Discord, Telegram, WhatsApp, or any channel):
extract data
OpenClaw runs the extract-data script, which fires the webhook and waits. n8n solves the CAPTCHA, submits the form, and returns { pageText, savedAt } directly in the HTTP response. OpenClaw receives and summarizes the result, typically within 10 to 40 seconds.
Test from the terminal:
bash
curl -s -X POST http://127.0.0.1:3005/webhook/openclaw/scrape
Scheduled runs: the workflow also runs automatically every day at 09:00 via the Schedule Trigger node.
Adapting the Workflow to Your Target Site
This guide's workflow is built for a specific demo site. For your actual target, every part of the pipeline may require adjustment. Here's what to look at:
1. CAPTCHA Type
Not all sites use reCAPTCHA v2. Change the CapSolver node's Task Type to match what the target uses:
| What you see on the site | n8n Node Operation |
|---|---|
| "I'm not a robot" checkbox | reCAPTCHA v2 |
| Invisible reCAPTCHA (auto-fires) | reCAPTCHA v2 |
| reCAPTCHA v3 score | reCAPTCHA v3 |
| Cloudflare Turnstile widget | Cloudflare Turnstile |
| Cloudflare Challenge (5s page) | Cloudflare Challenge |
| GeeTest puzzle (v3) | GeeTest V3 |
| GeeTest puzzle (v4) | GeeTest V4 |
| DataDome bot protection | DataDome |
| AWS WAF CAPTCHA | AWS WAF |
| MTCaptcha | MTCaptcha |
Also update Website URL and Website Key to match your target. You can find the site key in the page source (look for the data-sitekey attribute), or let the CapSolver browser extension detect it for you.
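Finding the site key by hand is a simple text search, which the following sketch automates for HTML you have already fetched. The function name findSiteKeys is hypothetical, and the approach only covers keys written into the markup as a data-sitekey attribute; keys injected by JavaScript at runtime will not be found this way.

```javascript
// Return every distinct data-sitekey value present in the page source.
function findSiteKeys(html) {
  const pattern = /data-sitekey\s*=\s*["']([^"']+)["']/gi;
  const keys = new Set();
  for (const match of html.matchAll(pattern)) {
    keys.add(match[1]);
  }
  return [...keys];
}

// Example: a page with one reCAPTCHA widget.
const sample = '<div class="g-recaptcha" data-sitekey="YOUR_SITE_KEY"></div>';
console.log(findSiteKeys(sample));
```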
2. How the Token Gets Submitted
This is the part that varies the most between sites. The demo site uses a simple form POST with the token in a body field. Your target might be different:
As a form field (most common)
POST /submit
Content-Type: application/x-www-form-urlencoded
g-recaptcha-response=TOKEN&other_field=value
In a JSON body
POST /api/login
Content-Type: application/json
{ "username": "...", "password": "...", "captchaToken": "TOKEN" }
In a header
POST /api/action
X-Captcha-Token: TOKEN
As a cookie
POST /submit
Cookie: cf_clearance=TOKEN
In the URL as a query parameter
GET /search?q=query&token=TOKEN
Inspect the network tab in your browser's dev tools when you manually solve the CAPTCHA on your target site. Look for the request that fires immediately after the solve; it shows you exactly where the token goes.
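The five placements listed above can be summarised as one helper that builds request options for whichever variant your target uses. Treat it as a sketch: buildTokenRequest is a hypothetical name, and the field, header, cookie, and parameter names are the illustrative ones from the examples in this section, not values any particular site is known to use.

```javascript
// Build fetch-style request options for one token placement.
function buildTokenRequest(placement, url, token) {
  switch (placement) {
    case 'form':
      return {
        url,
        method: 'POST',
        headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
        body: new URLSearchParams({ 'g-recaptcha-response': token }).toString(),
      };
    case 'json':
      return {
        url,
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ captchaToken: token }),
      };
    case 'header':
      return { url, method: 'POST', headers: { 'X-Captcha-Token': token } };
    case 'cookie':
      return { url, method: 'POST', headers: { Cookie: `cf_clearance=${token}` } };
    case 'query': {
      // Append the token to any query string already on the URL.
      const target = new URL(url);
      target.searchParams.set('token', token);
      return { url: target.toString(), method: 'GET', headers: {} };
    }
    default:
      throw new Error(`Unknown placement: ${placement}`);
  }
}
```

In n8n the same decision maps onto the HTTP Request node's Method, Headers, Query Parameters, and Body settings.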
3. The HTTP Request Node
Once you know how the token is submitted, configure the HTTP Request node accordingly:
- Method: match the site (POST, GET, PUT)
- URL: the exact endpoint that receives the form or API call
- Headers: copy the browser headers from your network tab; User-Agent, Referer, Origin, Accept, and Content-Type are usually required
- Body: use form-urlencoded, JSON, or multipart depending on the endpoint
- Cookies: if the site uses session cookies, either pass them as headers or use a prior HTTP Request node to obtain them via a login step
4. Extracting the Data You Need
The workflow currently passes the full HTML of the response as pageText. Depending on your use case, you may want to post-process it:
- Add a Code node after the HTTP Request to parse the HTML and extract specific fields (product name, price, status)
- Use n8n's HTML Extract node to pull data from specific CSS selectors without writing code
- Store structured fields instead of raw HTML, which are easier to query and compare across runs
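As one possible shape for the Code-node option above, the sketch below turns raw HTML into plain text and pulls the text of the first element carrying a given class. Both helper names and the class name "price" are hypothetical. The regular expressions are a deliberate simplification: they do not handle nested elements of the same tag, unusual attribute quoting, or class names that share a hyphenated prefix, so a real HTML parser or the HTML Extract node is the safer route for anything non-trivial.

```javascript
// Strip scripts, styles, and tags, then collapse whitespace.
function htmlToText(html) {
  return html
    .replace(/<(script|style)\b[^>]*>[\s\S]*?<\/\1>/gi, ' ')
    .replace(/<[^>]+>/g, ' ')
    .replace(/&nbsp;/g, ' ')
    .replace(/&amp;/g, '&')
    .replace(/\s+/g, ' ')
    .trim();
}

// Return the text of the first element whose class list contains className,
// or null when nothing matches. className is assumed to be a plain word.
function textByClass(html, className) {
  const pattern = new RegExp(
    `<([a-z0-9]+)[^>]*class=["'][^"']*\\b${className}\\b[^"']*["'][^>]*>([\\s\\S]*?)</\\1>`,
    'i'
  );
  const match = html.match(pattern);
  return match ? htmlToText(match[2]) : null;
}

// Example: extract a structured field instead of keeping the whole page.
const page = '<div><h1>Widget</h1><span class="price big">$19.99</span></div>';
console.log({ name: textByClass(page, 'missing'), price: textByClass(page, 'price') });
```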
5. Multi-Step Flows
Some targets require more than one request:
- GET the page to obtain a CSRF token or session cookie
- Solve the CAPTCHA
- POST the form with CSRF token + captcha token + credentials
Chain multiple HTTP Request nodes in n8n to handle this. Pass values between nodes using $json expressions.
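The values that travel between those steps can be sketched as three small helpers: one to read a hidden CSRF field from the first response, one to turn Set-Cookie values into a Cookie header, and one to assemble the final form body. All names here are hypothetical, including the csrf_token, username, and password field names; your target will use its own.

```javascript
// Step 1: find a hidden input's value by name in the fetched HTML.
function extractHiddenField(html, fieldName) {
  const tag = html.match(new RegExp(`<input\\b[^>]*name=["']${fieldName}["'][^>]*>`, 'i'));
  if (!tag) return null;
  const value = tag[0].match(/value=["']([^"']*)["']/i);
  return value ? value[1] : null;
}

// Step 1 (continued): reduce Set-Cookie values to a Cookie request header,
// keeping only the name=value part of each one.
function toCookieHeader(setCookieValues) {
  return setCookieValues.map((line) => line.split(';')[0].trim()).join('; ');
}

// Step 3: combine the CSRF token, the CAPTCHA token from step 2, and credentials.
function buildLoginBody({ csrfToken, captchaToken, username, password }) {
  return new URLSearchParams({
    csrf_token: csrfToken,
    'g-recaptcha-response': captchaToken,
    username,
    password,
  }).toString();
}
```

In n8n, each helper corresponds to an expression or a small Code node placed between the HTTP Request nodes.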
Troubleshooting
"Failed to reach n8n scraper"
json
{"success": false, "error": "Failed to reach n8n scraper. Is the OpenClaw CAPTCHA Scraper workflow active?"}
Check: Is n8n running? Is the workflow activated? Open n8n and verify the workflow is Active (green toggle).
CapSolver Timeout / No Token
Possible causes:
- Invalid API key: check the CapSolver credential saved in n8n (Credentials)
- Insufficient balance: top up at capsolver.com/dashboard
- Network issue between n8n server and CapSolver API
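One way to tell these causes apart is to query CapSolver's getBalance endpoint directly from the n8n host: a network failure throws, a rejected key returns a non-zero errorId, and an empty account shows a zero balance. The sketch below assumes Node 18 or later for fetch; the function names and the wording of the messages are illustrative, and the exact error codes CapSolver returns are not listed here, so they are printed as received.

```javascript
// Interpret a CapSolver JSON reply. A non-zero errorId signals a failure.
function describeCapSolverReply(reply) {
  if (reply.errorId) {
    const code = reply.errorCode || 'unknown';
    const detail = reply.errorDescription || 'no description';
    return `CapSolver error ${code}: ${detail}`;
  }
  if (typeof reply.balance === 'number') {
    return reply.balance > 0
      ? `Key accepted, balance ${reply.balance}`
      : 'Key accepted, but the balance is zero';
  }
  return 'Reply carried no error and no balance field';
}

// Ask CapSolver for the account balance. A thrown error here points to a
// network problem between this machine and the API.
async function checkBalance(clientKey) {
  const res = await fetch('https://api.capsolver.com/getBalance', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ clientKey }),
  });
  return describeCapSolverReply(await res.json());
}
```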
pageText is Empty or Contains an Error Page
- The HTTP Request URL or form field name may be wrong for your target
- Check the g-recaptcha-response field name; some sites use a different field name
- Enable fullResponse: true in the HTTP Request node to see the status code
Scheduled Run Fails at "Respond to Webhook"
This is expected and safe. When triggered by Schedule, there's no active webhook context. Make sure Continue on Fail is enabled on the Respond to Webhook node.
Complete Configuration Reference
n8n Workflow Nodes Summary
| Node | Type | Key Config |
|---|---|---|
| Webhook | n8n-nodes-base.webhook | POST, path: openclaw/scrape, responseMode: responseNode |
| Schedule Trigger | n8n-nodes-base.scheduleTrigger | Cron: 0 9 * * * |
| Scrape site | n8n-nodes-capsolver.capSolver | Task: ReCaptchaV2TaskProxyless |
| HTTP Request | n8n-nodes-base.httpRequest | POST to target URL with token in body |
| If | n8n-nodes-base.if | Check $json.data contains "recaptcha-success" |
| Edit Fields | n8n-nodes-base.set | pageText = $json.data |
| Save Result | n8n-nodes-base.executeCommand or any storage node | Persist result (file, DB, Sheets, etc.) |
| Respond to Webhook | n8n-nodes-base.respondToWebhook | JSON, continueOnFail: true |
CAPTCHA Task Types
| CAPTCHA | n8n Node Operation |
|---|---|
| reCAPTCHA v2 (checkbox) | reCAPTCHA v2 |
| reCAPTCHA v2 (invisible) | reCAPTCHA v2 |
| reCAPTCHA v3 | reCAPTCHA v3 |
| Cloudflare Turnstile | Cloudflare Turnstile |
| Cloudflare Challenge | Cloudflare Challenge |
| GeeTest V3 | GeeTest V3 |
| GeeTest V4 | GeeTest V4 |
| DataDome | DataDome |
| AWS WAF | AWS WAF |
| MTCaptcha | MTCaptcha |
Conclusion
The OpenClaw + n8n + CapSolver pipeline provides a production-grade data extraction setup that:
- Runs on demand when your AI Agent requests.
- Runs automatically on a schedule without any human input.
- Never requires a browser or display.
- Keeps CAPTCHA handling completely invisible, both to you and to the AI Agent.
The AI Agent simply issues an "extract data" command and receives clean page content. CapSolver handles the difficult part, n8n orchestrates the flow, and OpenClaw serves as the interface.
Ready to get started? Sign up for CapSolver and use bonus code OPENCLAW for an extra 6% bonus on your first recharge!
Frequently Asked Questions
Do I need to tell OpenClaw about CapSolver or CAPTCHAs?
No. OpenClaw simply runs a script that fires an HTTP request. n8n handles everything else. Your AI Agent has no knowledge of CAPTCHAs; it just triggers a job and reads the result.
Can I point this at a different site?
Yes, but you'll likely need to adjust more than just the URL. Every site submits the CAPTCHA token differently: some use form fields, some JSON bodies, some headers or cookies. See the "Adapting the Workflow to Your Target Site" section above for a full breakdown of what to check and change.
What if my target uses Turnstile instead of reCAPTCHA?
Change the CapSolver node's Task Type to AntiTurnstileTaskProxyless. Then inspect your target's network requests to find where the Turnstile token gets submitted. It's often in a hidden form field called cf-turnstile-response, but some implementations pass it in a JSON body, a header, or a cookie instead.
How many results are stored?
That depends on your storage choice. With a local JSON file, you can keep as many as you like. With Google Sheets or a database, every run appends a row indefinitely. Configure the Save Result node to match your retention needs.
Can I trigger this from a cron job instead of OpenClaw?
Yes. The webhook endpoint is just an HTTP POST. Anything that can make an HTTP request can trigger it:
bash
curl -s -X POST http://127.0.0.1:3005/webhook/openclaw/scrape
How much does each extraction cost?
Each run costs one CapSolver credit for the CAPTCHA solve. reCAPTCHA v2 is among the cheapest types. Check current pricing at capsolver.com.
Is OpenClaw free?
OpenClaw is open-source and free to self-host. You'll need API credits for your AI model provider and CapSolver for CAPTCHA solving.
Compliance Disclaimer: The information provided on this blog is for informational purposes only. CapSolver is committed to compliance with all applicable laws and regulations. The use of the CapSolver network for illegal, fraudulent, or abusive activities is strictly prohibited and will be investigated. Our captcha-solving solutions enhance user experience while ensuring 100% compliance in helping solve captcha difficulties during public data crawling. We encourage responsible use of our services. For more information, please visit our Terms of Service and Privacy Policy.