PicoClaw Automation: A Guide to Integrating CapSolver API

Ethan Collins
Pattern Recognition Specialist
26-Feb-2026

When your AI assistant automates web tasks, CAPTCHAs are the number one blocker. Protected pages refuse to submit, login flows stall, and the entire automation pipeline halts waiting for a human to click a checkbox or identify traffic lights.
PicoClaw is an ultra-lightweight personal AI assistant written in Go that runs on $10 hardware with under 10MB of RAM. It connects to the messaging platforms you already use, and includes a built-in exec tool that lets the agent write and run scripts autonomously.
CapSolver provides an AI-powered CAPTCHA solving API. By combining PicoClaw's script execution capabilities with CapSolver's REST API, your agent can detect CAPTCHAs, solve them, inject tokens, and submit forms — all without human intervention.
The best part? You just tell the agent what you want done in plain language. It writes a Playwright script, extracts the sitekey, calls CapSolver, injects the token, and submits the form — all autonomously. And because PicoClaw is compiled Go, the entire orchestration layer fits inside 10MB of RAM on a $10 RISC-V board.
What is PicoClaw?
PicoClaw is an ultra-lightweight personal AI assistant built in Go 1.25.7 through a remarkable self-bootstrapping process: the AI agent itself drove the entire architectural migration from Python, producing 95% of the core code autonomously with human-in-the-loop refinement.
The Numbers
| Metric | PicoClaw | Typical AI Assistants |
|---|---|---|
| Language | Go | Python / TypeScript |
| RAM | < 10MB | 100MB – 1GB+ |
| Boot Time (0.8GHz core) | < 1 second | 30 – 500+ seconds |
| Hardware Cost | As low as $10 | $50 – $599 |
| Binary | Single static binary | Runtime + dependencies |
PicoClaw's tagline says it all: $10 Hardware. 10MB RAM. 1s Boot.
Key Features

- Ultra-lightweight: Under 10MB memory footprint — 99% smaller than comparable TypeScript agents
- True portability: Single self-contained binary across RISC-V, ARM64, and x86_64 architectures
- Built-in tools: The agent can read/write files, execute shell commands, search the web, fetch pages, send cross-channel messages, schedule cron jobs, and even interact with I2C/SPI hardware peripherals
- Provider-agnostic: Works with OpenAI, Anthropic, DeepSeek, Gemini, Qwen, Moonshot, Groq, vLLM, Ollama, Cerebras, Mistral, NVIDIA, and gateway providers like OpenRouter
- Skill system: Extend capabilities with SKILL.md files using JSON or YAML frontmatter
- Memory system: Daily notes and persistent long-term memory across conversations
- Hardware tools: I2C and SPI tools for direct embedded device interaction — unique to PicoClaw
The ExecTool
PicoClaw's ExecTool (defined in pkg/tools/shell.go) is what makes browser automation possible. It's a carefully sandboxed shell execution environment with 27+ security deny patterns compiled as Go regexps, a 60-second default timeout, workspace path restriction, and path traversal detection.
When you ask the agent to interact with a web page, it:
- Writes a Playwright script via the write_file tool
- Executes it via the exec tool (which calls sh -c on Linux)
- Reads the output (stdout + stderr, truncated to 10KB)
- Reports the results back to you on your chat channel
The tool's guardCommand() method checks every command against compiled regexp deny patterns before execution, enforces workspace path restrictions, and detects path traversal attempts. Think of it as sandboxed command-line access — the agent can run Node.js scripts and local package installs, but cannot rm -rf, sudo, or docker run.
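To make the guard idea concrete, here is a minimal JavaScript sketch of deny-pattern matching. It is not PicoClaw's implementation, which is written in Go in pkg/tools/shell.go. The three patterns and the `isCommandAllowed` name are illustrative assumptions chosen to mirror the examples in the paragraph above.

```javascript
// Illustrative only: PicoClaw's real guard is written in Go and has 27+ patterns.
// These three patterns are assumptions that mirror the examples in the text.
const DENY_PATTERNS = [
  /\brm\s+-rf\b/,      // recursive forced delete
  /\bsudo\b/,          // privilege escalation
  /\bdocker\s+run\b/,  // starting containers
];

// A command is allowed only if no deny pattern matches it.
function isCommandAllowed(command) {
  return !DENY_PATTERNS.some((pattern) => pattern.test(command));
}

console.log(isCommandAllowed('node solve.js'));       // true
console.log(isCommandAllowed('sudo apt install vim')); // false
```

A deny list like this is easy to audit but only as strong as its patterns, which is presumably why the real tool also restricts paths to the workspace.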
The Agent Loop
The core logic lives in pkg/tools/toolloop.go — a tight cycle: LLM Call -> Extract Tool Calls -> Execute Tools -> Append Results -> repeat until a final text response (or MaxIterations, default 20). This loop is shared between the main agent (pkg/agent/loop.go) and background subagents via spawn.
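The shape of that cycle can be sketched in a few lines of JavaScript. This is a rough model, not a port of pkg/tools/toolloop.go: `callLLM`, the `tools` map, the message shapes, and the choice to throw when the iteration cap is reached are all assumptions made for illustration.

```javascript
// Sketch of the loop shape: call the model, run any requested tools,
// append their results, and repeat until the model answers with plain text.
// `callLLM` and `tools` are hypothetical stand-ins supplied by the caller.
async function runToolLoop(callLLM, tools, messages, maxIterations = 20) {
  const history = [...messages];
  for (let i = 0; i < maxIterations; i++) {
    const reply = await callLLM(history);
    history.push({ role: 'assistant', content: reply.content, toolCalls: reply.toolCalls });
    if (!reply.toolCalls || reply.toolCalls.length === 0) {
      return reply.content; // a reply without tool calls ends the loop
    }
    for (const call of reply.toolCalls) {
      const tool = tools[call.name];
      const output = tool ? await tool(call.args) : `unknown tool: ${call.name}`;
      history.push({ role: 'tool', name: call.name, content: String(output) });
    }
  }
  // Assumption: this sketch treats hitting the cap as an error.
  throw new Error(`no final response after ${maxIterations} iterations`);
}
```

The iteration cap matters for CAPTCHA work: a script that keeps failing would otherwise let the model retry indefinitely.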
What is CapSolver?
CapSolver is a leading CAPTCHA solving service that provides AI-powered solutions for bypassing various CAPTCHA challenges. With support for multiple CAPTCHA types and fast response times, CapSolver integrates seamlessly into automated workflows.
Supported CAPTCHA Types
- reCAPTCHA v2 (image-based & invisible)
- reCAPTCHA v3 & v3 Enterprise
- Cloudflare Turnstile
- Cloudflare 5-second Challenge
- AWS WAF CAPTCHA
Why PicoClaw's Approach Is Different
Most CAPTCHA-solving integrations fall into two camps: code-level API integration where you write a dedicated service class, or browser extension where a Chrome extension handles everything invisibly. PicoClaw takes a third approach: agent-driven API integration on edge hardware.
The AI agent itself orchestrates the entire solve flow autonomously — writing a Playwright script, extracting the sitekey, calling the CapSolver API, and injecting the solution token — all through scripts it writes and executes on the fly. And critically, the Go-based orchestrator doing all of this coordination consumes under 10MB of RAM.
The Edge-Device Advantage
You can run CAPTCHA-busting automation on hardware that costs less than a coffee. A $9.90 LicheeRV-Nano running PicoClaw can receive a Telegram message, coordinate with CapSolver's cloud API, inject the token, and submit the form — all while using a fraction of the board's 64MB RAM. The heavy lifting (CAPTCHA recognition) happens on CapSolver's servers; PicoClaw just orchestrates. Always-on, 24/7, on a device the size of a postage stamp.
| Browser Extension Approach | PicoClaw's Agent-Driven Approach |
|---|---|
| Requires Chrome extension installed | No extension needed — just an API key |
| Needs a compatible Chrome build | Works with any headless browser |
| Extension detects CAPTCHAs automatically | Agent extracts sitekey from page DOM |
| Extension calls API in the background | Agent calls CapSolver REST API directly |
| Requires a display (Xvfb on servers) | Runs fully headless, no display needed |
| Heavy runtime (1GB+ RAM) | Ultra-light orchestrator (< 10MB RAM) |
| Requires x86_64 or ARM64 desktop | Runs on RISC-V, ARM, x86 — even $10 boards |
The key insight: PicoClaw's Go binary is so lightweight it runs on hardware most frameworks can't even boot on — yet it can orchestrate the full CAPTCHA-solving pipeline through Playwright scripts and CapSolver's REST API.
Prerequisites
Note: The examples below are tested on Ubuntu 22.04 / 24.04. Commands use apt and bash; adjust for your distro if needed. For edge devices (RISC-V, ARM), cross-compile PicoClaw on your build machine or download a prebuilt binary from the releases page.
Before setting up the integration, make sure you have:
- Ubuntu 22.04+ (or any Linux distribution — PicoClaw's single binary runs anywhere)
- Go 1.25.7+ installed (only needed for building from source)
- PicoClaw installed and running (prebuilt binary or make build)
- A CapSolver account with an API key (sign up at capsolver.com)
- Node.js 18+ installed (for running Playwright scripts via the exec tool)
- Playwright installed in your workspace
Step-by-Step Setup
Step 1: Install PicoClaw
Option A: Prebuilt Binary (Fastest)
bash
# Download the latest release for your platform
# Replace v0.1.1 with the latest version from the Releases page
wget https://github.com/sipeed/picoclaw/releases/download/v0.1.1/picoclaw-linux-amd64
chmod +x picoclaw-linux-amd64
sudo mv picoclaw-linux-amd64 /usr/local/bin/picoclaw
# Run the interactive onboarding wizard
picoclaw onboard
Option B: Build From Source
bash
git clone https://github.com/sipeed/picoclaw.git
cd picoclaw
make deps
make build
make install
# Initialize config and workspace
picoclaw onboard
This creates ~/.picoclaw/config.json and ~/.picoclaw/workspace/ (which holds scripts, skills, and memory).
Step 2: Set Your CapSolver API Key
Add your CapSolver API key as an environment variable:
bash
export CAPSOLVER_API_KEY="CAP-XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
You can get your API key from your CapSolver dashboard.
For persistent configuration, add it to ~/.bashrc or ~/.zshrc.
Step 3: Install Browser Automation Tools
Install Playwright and its system dependencies on Ubuntu:
bash
# Install Playwright browser dependencies (Ubuntu)
sudo apt install -y libnss3 libatk-bridge2.0-0 libdrm2 libxcomposite1 \
libxdamage1 libxrandr2 libgbm1 libpango-1.0-0 libasound2t64
# Install Playwright in your PicoClaw workspace
cd ~/.picoclaw/workspace
npm init -y
npm install playwright
npx playwright install chromium
Edge device note: On resource-constrained boards, you may want to install Chromium on a more powerful machine and point PicoClaw to a remote browser via Playwright's browserType.connect(). The PicoClaw agent itself needs only ~10MB RAM; the browser is the heavy part.
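One way a generated script could switch between a local and a remote browser is sketched below. The `PLAYWRIGHT_WS_ENDPOINT` variable name and both helper functions are hypothetical, introduced only for this example; `chromium.connect()` and `chromium.launch()` are the Playwright calls doing the work.

```javascript
// Sketch: use a remote browser when an endpoint is configured, else launch locally.
// PLAYWRIGHT_WS_ENDPOINT is a hypothetical variable name, not a PicoClaw or Playwright setting.
function browserTarget(env) {
  const endpoint = (env.PLAYWRIGHT_WS_ENDPOINT || '').trim();
  return endpoint ? { mode: 'remote', endpoint } : { mode: 'local' };
}

async function openBrowser(env = process.env) {
  // Loaded lazily so browserTarget() can be used without Playwright installed.
  const { chromium } = require('playwright');
  const target = browserTarget(env);
  if (target.mode === 'remote') {
    // connect() expects the WebSocket endpoint of a Playwright server.
    return chromium.connect(target.endpoint);
  }
  return chromium.launch({ headless: true });
}
```

The remote machine would need to run a Playwright server whose version is compatible with the client library in the workspace, so it is worth pinning both to the same release.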
Step 4: Configure ExecTool for Browser Automation
PicoClaw's ExecTool has built-in deny patterns for safety. The defaults work well for CAPTCHA automation: node, npx, and local npm install are all allowed. Only npm install -g, sudo, docker run, and similar dangerous commands are blocked. No configuration changes are needed for the standard workflow.
Step 5: Start the Gateway
bash
# Start channel services (Telegram, Discord, etc.)
picoclaw gateway
# Or for interactive testing
picoclaw agent
Step 6: Verify the Setup
Send a test message to your agent through any connected channel:
What tools do you have available?
The agent should list exec among its tools — this is what it uses to run browser automation scripts. You can also verify Node.js access:
Run: node --version
The agent should execute this via the exec tool and return the Node.js version.
The Built-in CapSolver Skill
PicoClaw uses a skill system based on SKILL.md files with frontmatter metadata. Skills are loaded from three locations in priority order (defined in pkg/skills/loader.go):
- Workspace skills: ~/.picoclaw/workspace/skills/{name}/SKILL.md (project-level, highest priority)
- Global skills: ~/.picoclaw/skills/{name}/SKILL.md (user-level)
- Built-in skills: skills/{name}/SKILL.md (bundled with the binary)
Workspace skills override global skills, which override built-in ones. The loader reads frontmatter as JSON or YAML and builds an XML summary for the system prompt.
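The priority order amounts to "first existing file wins". The sketch below models that lookup in JavaScript; it is not the Go loader, and the `builtinDir` default and function names are assumptions for illustration.

```javascript
const path = require('path');

// Candidate locations in priority order, as described above.
// `builtinDir` stands in for wherever the bundled skills live; that path is an assumption.
function skillCandidates(name, home, builtinDir = 'skills') {
  return [
    path.join(home, '.picoclaw', 'workspace', 'skills', name, 'SKILL.md'), // workspace
    path.join(home, '.picoclaw', 'skills', name, 'SKILL.md'),              // global
    path.join(builtinDir, name, 'SKILL.md'),                               // built-in
  ];
}

// Returns the first candidate that exists, or null. `exists` is injectable for testing.
function resolveSkill(name, home, exists = require('fs').existsSync) {
  return skillCandidates(name, home).find((p) => exists(p)) || null;
}
```

In practice this means a capsolver skill placed in the workspace shadows any global or bundled skill of the same name.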
The Skill File (~/.picoclaw/workspace/skills/capsolver/SKILL.md)
markdown
---
name: capsolver
description: "Solve CAPTCHAs using the CapSolver REST API with Playwright. Supports reCAPTCHA v2, reCAPTCHA v3, and Cloudflare Turnstile. Use when automating form submissions or page interactions that encounter CAPTCHA challenges."
metadata: {"picoclaw":{"emoji":"🔓","always":true,"requires":{"bins":["node"],"env":["CAPSOLVER_API_KEY"]}}}
---
# CapSolver
## IMPORTANT: Execution Required
When a user asks you to solve a CAPTCHA, you MUST immediately call a tool. Do NOT respond with text like "Running..." or "I'll start..." or "I've started the automation...".
Your FIRST action must be one of:
1. Call `write_file` to save a Node.js script, then call `exec` to run it
2. Call `spawn` with a detailed task description for background execution
If you respond with only text and no tool call, the user will see nothing happen. Always execute.
## API Endpoints
- **Create task**: `POST https://api.capsolver.com/createTask`
- **Get result**: `POST https://api.capsolver.com/getTaskResult`
## Task Types
| CAPTCHA | Task Type | Sitekey Location |
|---|---|---|
| reCAPTCHA v2 | `ReCaptchaV2TaskProxyLess` | `data-sitekey` attribute |
| reCAPTCHA v3 | `ReCaptchaV3TaskProxyLess` | `grecaptcha.execute` call or page source |
| Cloudflare Turnstile | `AntiTurnstileTaskProxyLess` | `data-sitekey` on Turnstile div |
Enterprise variants: `ReCaptchaV2EnterpriseTaskProxyLess`, `ReCaptchaV3EnterpriseTaskProxyLess`.
## Workflow
1. Navigate to the page with Playwright (headless Chromium)
2. Extract the sitekey from the DOM (`[data-sitekey]` attribute)
3. Call `createTask` with the sitekey and page URL
4. Poll `getTaskResult` every 2 seconds until `status: "ready"`
5. Inject the token into the page (hidden form field)
6. Submit the form
## Core Code Pattern
```javascript
const CAPSOLVER_API_KEY = process.env.CAPSOLVER_API_KEY;
// Step 1: Create task
const createRes = await fetch('https://api.capsolver.com/createTask', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
clientKey: CAPSOLVER_API_KEY,
task: {
type: 'ReCaptchaV2TaskProxyLess', // or ReCaptchaV3TaskProxyLess, AntiTurnstileTaskProxyLess
websiteURL: pageUrl,
websiteKey: siteKey
}
})
});
const { taskId } = await createRes.json();
// Step 2: Poll for result
let token;
while (true) {
await new Promise(r => setTimeout(r, 2000));
const res = await fetch('https://api.capsolver.com/getTaskResult', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ clientKey: CAPSOLVER_API_KEY, taskId })
});
const result = await res.json();
if (result.status === 'ready') { token = result.solution.gRecaptchaResponse || result.solution.token; break; }
if (result.status === 'failed') throw new Error('Solve failed');
}
// Step 3: Inject token (reCAPTCHA)
await page.evaluate((t) => {
document.querySelectorAll('textarea[name="g-recaptcha-response"]')
.forEach(el => { el.value = t; el.innerHTML = t; });
}, token);
```
For Turnstile, the token field is typically `input[name="cf-turnstile-response"]` and the solution is in `result.solution.token`.
## API Reference
All task types require `type`, `websiteURL`, `websiteKey`. Optional fields vary by type:
- **reCAPTCHA v2**: `isInvisible`, `pageAction`, `enterprisePayload`, `apiDomain`
- **reCAPTCHA v3**: `pageAction` (from `grecaptcha.execute(key, {action: "..."})`)
- **Cloudflare Turnstile**: `metadata.action`, `metadata.cdata`
Key points:
- Frontmatter uses JSON or YAML (pkg/skills/loader.go tries JSON first, falls back to YAML)
- metadata contains PicoClaw-specific config: emoji for display, always to auto-load, requires for dependency checks
- SkillsLoader.BuildSkillsSummary() generates XML summaries injected into the system prompt
- The "Execution Required" section forces tool calls instead of text-only responses
After creating the skill, verify with picoclaw skills — you should see capsolver listed.
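The "JSON first, YAML as fallback" behavior can be illustrated with a small parser. This is a simplified sketch, not the loader's code: the fallback only understands flat `key: value` lines, which happens to be enough for the frontmatter shown above, and `parseFrontmatter` is a name made up for the example.

```javascript
// Sketch of "JSON first, then YAML" frontmatter parsing.
// The fallback handles only flat `key: value` lines; it is not a YAML parser.
function parseFrontmatter(markdown) {
  const match = /^---\r?\n([\s\S]*?)\r?\n---/.exec(markdown);
  if (!match) return null;
  const raw = match[1];
  try {
    return JSON.parse(raw); // frontmatter written as a single JSON object
  } catch {
    const meta = {};
    for (const line of raw.split(/\r?\n/)) {
      const idx = line.indexOf(':');
      if (idx === -1) continue;
      const key = line.slice(0, idx).trim();
      const value = line.slice(idx + 1).trim();
      try {
        meta[key] = JSON.parse(value); // quoted strings and inline JSON objects
      } catch {
        meta[key] = value; // bare scalar such as `name: capsolver`
      }
    }
    return meta;
  }
}
```

Run against the skill file above, this would yield `name`, `description`, and a nested `metadata.picoclaw` object with the `always` and `requires` settings.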
How It Works
When you ask PicoClaw to interact with a CAPTCHA-protected page, here's the complete flow from message to result:
Your message PicoClaw Agent (Go, ~10MB RAM)
─────────────────────────────────────────────────────────────
"Go to that page, ──► Agent receives via MessageBus
fill the form, │ (pkg/bus/bus.go)
solve the captcha, ▼
and submit it" ContextBuilder injects skills
│ (pkg/agent/context.go)
▼
RunToolLoop starts
│ (pkg/tools/toolloop.go)
▼
Agent writes Node.js script
│ via write_file tool
▼
ExecTool runs the script
┌────────────────────────────┐
│ pkg/tools/shell.go │
│ guardCommand() → 27+ checks │
│ sh -c "node script.js" │
│ │
│ Headless Chromium │
│ 1. Navigate to page │
│ 2. Extract sitekey │
│ 3. POST /createTask ────────── CapSolver API
│ 4. Poll /getTaskResult ─────── (cloud)
│ 5. Inject token │
│ 6. Submit form │
│ 7. Screenshot │
└────────────────────────────┘
│
▼ stdout returned (max 10KB)
Agent reads output
│
▼
"Form submitted successfully!
Verification Success!"
The CapSolver API Flow
The core of the integration is two API calls followed by a token injection:
1. Create a task — Send the CAPTCHA sitekey and page URL to CapSolver:
javascript
const response = await fetch('https://api.capsolver.com/createTask', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
clientKey: CAPSOLVER_API_KEY,
task: {
type: 'ReCaptchaV2TaskProxyLess',
websiteURL: pageUrl,
websiteKey: siteKey
}
})
});
const { taskId } = await response.json(); // taskId identifies the solve job in later calls
2. Poll for the result — Check every 2 seconds until CapSolver returns the solved token:
javascript
const res = await fetch('https://api.capsolver.com/getTaskResult', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    clientKey: CAPSOLVER_API_KEY,
    taskId: taskId
  })
});
const result = await res.json(); // parse the JSON body before reading any fields
// Once result.status === 'ready', result.solution.gRecaptchaResponse contains the token
3. Inject the token — Set it in the hidden form field that reCAPTCHA expects:
javascript
await page.evaluate((token) => {
const textarea = document.querySelector('textarea[name="g-recaptcha-response"]');
if (textarea) {
textarea.value = token;
textarea.innerHTML = token;
}
}, captchaToken);
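The three steps above use reCAPTCHA v2 values throughout. To adapt them to another CAPTCHA type, the task type, token field, and solution key all change together, so it can help to keep them in one table. The sketch below collects the values listed in the skill file; the `CAPTCHA_PROFILES` structure and function names are assumptions, and the reCAPTCHA v3 selector is assumed to match v2.

```javascript
// Per-CAPTCHA settings collected from the skill file: CapSolver task type,
// the form field that receives the token, and where the token sits in the solution.
const CAPTCHA_PROFILES = {
  recaptchaV2: {
    type: 'ReCaptchaV2TaskProxyLess',
    selector: 'textarea[name="g-recaptcha-response"]',
    solutionField: 'gRecaptchaResponse',
  },
  recaptchaV3: {
    type: 'ReCaptchaV3TaskProxyLess',
    selector: 'textarea[name="g-recaptcha-response"]', // assumed same field as v2
    solutionField: 'gRecaptchaResponse',
  },
  turnstile: {
    type: 'AntiTurnstileTaskProxyLess',
    selector: 'input[name="cf-turnstile-response"]',
    solutionField: 'token',
  },
};

// Builds the createTask request body; `extra` carries optional fields such as pageAction.
function buildCreateTaskBody(kind, apiKey, pageUrl, siteKey, extra = {}) {
  const profile = CAPTCHA_PROFILES[kind];
  if (!profile) throw new Error(`unsupported CAPTCHA kind: ${kind}`);
  return {
    clientKey: apiKey,
    task: { type: profile.type, websiteURL: pageUrl, websiteKey: siteKey, ...extra },
  };
}

// Reads the token out of a getTaskResult response for the given kind.
function tokenFromResult(kind, result) {
  return result.solution[CAPTCHA_PROFILES[kind].solutionField];
}
```

For a reCAPTCHA v3 page, for example, the call might be `buildCreateTaskBody('recaptchaV3', key, url, siteKey, { pageAction: 'login' })`, with the action taken from the page's `grecaptcha.execute` call.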
Complete Working Example
Here's the actual Node.js script that PicoClaw's agent generates and executes to solve reCAPTCHA on the Google demo page. The agent writes this via write_file, then runs it with exec — all autonomously from a single Telegram message:
javascript
const { chromium } = require('playwright');
const https = require('https');
const CAPSOLVER_API_KEY = process.env.CAPSOLVER_API_KEY;
const PAGE_URL = ''; // set this to the URL of the CAPTCHA-protected page
function httpsPost(url, data) {
return new Promise((resolve, reject) => {
const req = https.request(url, {
method: 'POST',
headers: { 'Content-Type': 'application/json' }
}, (res) => {
let body = '';
res.on('data', chunk => body += chunk);
res.on('end', () => resolve(JSON.parse(body)));
});
req.on('error', reject);
req.write(JSON.stringify(data));
req.end();
});
}
async function solveRecaptcha(siteKey, pageUrl) {
console.log('Creating CapSolver task...');
const createRes = await httpsPost('https://api.capsolver.com/createTask', {
clientKey: CAPSOLVER_API_KEY,
task: {
type: 'ReCaptchaV2TaskProxyLess',
websiteURL: pageUrl,
websiteKey: siteKey
}
});
if (createRes.errorId) {
throw new Error(`CapSolver error: ${createRes.errorDescription}`);
}
const { taskId } = createRes;
console.log(`Task ID: ${taskId}`);
let token;
while (true) {
await new Promise(r => setTimeout(r, 2000));
const res = await httpsPost('https://api.capsolver.com/getTaskResult', {
clientKey: CAPSOLVER_API_KEY,
taskId
});
if (res.status === 'ready') {
token = res.solution.gRecaptchaResponse;
console.log(`Token received! Length: ${token.length}`);
break;
}
if (res.status === 'failed') {
throw new Error(`CapSolver task failed: ${res.errorDescription}`);
}
console.log('Polling... status:', res.status);
}
if (!token) throw new Error('Failed to get token');
return token;
}
async function main() {
const browser = await chromium.launch({
headless: true,
args: ['--no-sandbox', '--disable-setuid-sandbox']
});
const page = await browser.newPage();
try {
await page.goto(PAGE_URL, { waitUntil: 'domcontentloaded', timeout: 30000 });
const siteKey = await page.locator('[data-sitekey]').getAttribute('data-sitekey');
console.log(`Sitekey: ${siteKey}`);
const token = await solveRecaptcha(siteKey, PAGE_URL);
await page.evaluate((t) => {
document.querySelectorAll('textarea[name="g-recaptcha-response"]')
.forEach(el => { el.value = t; el.innerHTML = t; });
}, token);
await page.locator('input[type="submit"]').click();
await page.waitForTimeout(3000);
const body = await page.textContent('body');
console.log(body.includes('Success') ? 'SUCCESS!' : 'Result:', body.slice(0, 200));
await page.screenshot({ path: 'recaptcha_result.png' });
} finally {
await browser.close();
}
}
main().catch(err => {
console.error('Error:', err.message);
process.exit(1);
});
Run it directly:
bash
CAPSOLVER_API_KEY=CAP-XXX node solve_recaptcha.js
Or let PicoClaw's agent handle everything — just send a message on Telegram:
Solve the reCAPTCHA at https://example.com and submit the form.
The agent reads its capsolver skill, writes the script, runs it via exec, reads the output, and reports back.
How to Use It
Once the setup is complete, using CapSolver with PicoClaw is as simple as sending a message on any connected channel.
Example 1: Solve a reCAPTCHA Demo
Send this to your agent via Telegram, Discord, WhatsApp, or any connected channel:
Go to https://example.com and solve
the reCAPTCHA using the CapSolver API, then submit the form
and tell me if it succeeded.
What happens: The agent reads the capsolver skill, writes a Playwright script, and runs it via exec, which passes the guardCommand() checks and executes with a 60s timeout. The script navigates to the page, extracts the sitekey, calls CapSolver, injects the token, and submits the form. The result flows back to you through the MessageBus.
Example 2: Login to a Protected Site
Go to https://example.com/login, fill in the email with
"user@example.com" and password with "mypassword", detect and
solve any CAPTCHA on the page, then click Sign In and tell me
what happens.
Example 3: Submit a Contact Form
Open https://example.com/contact, fill in the name, email, and
message fields, solve the CAPTCHA, submit the form, and tell me
the confirmation message.
Example 4: Background Automation via Spawn
For longer-running tasks, use the spawn tool (pkg/tools/spawn.go) to delegate to a background subagent:
In the background, go to https://example.com/register, create
an account with my details, solve any CAPTCHAs you encounter,
and let me know when it's done.
Example 5: Edge Device Monitoring (Telegram on a $10 Board)
If PicoClaw is running on a LicheeRV-Nano or similar edge device, combine with the cron tool:
Every hour, check https://example.com/status — if there's a
CAPTCHA gate, solve it and report the status page content.
Why This Works
PicoClaw's agent has all the tools needed for autonomous CAPTCHA solving:
- exec (pkg/tools/shell.go): sandboxed shell execution with 27+ security deny patterns
- write_file / read_file (pkg/tools/filesystem.go): script management in the workspace
- spawn (pkg/tools/spawn.go): background subagent delegation for long tasks
- web_fetch (pkg/tools/web.go): page content fetching for DOM analysis
- Skill system (pkg/skills/loader.go): capsolver skill provides API docs in context
- Memory (pkg/agent/memory.go): persists successful approaches across sessions
Performance Results
We tested the integration on Google's reCAPTCHA v2 demo page via a live Discord bot on Ubuntu 24.04. The PicoClaw agent (using glm-4.7 via z.ai) received a Discord message, autonomously wrote a Playwright script, solved the CAPTCHA, and reported back — all without human intervention:
| Metric | Value |
|---|---|
| PicoClaw agent memory usage | ~8 MB |
| LLM model | glm-4.7 (Zhipu AI via z.ai) |
| Agent iterations | 5 (understand → write script → execute → screenshot → encode) |
| Script generation (write_file) | < 1 second |
| Script execution (Playwright + CapSolver) | 24.2 seconds |
| Screenshot capture + base64 encoding | 16ms |
| Generated artifacts | solve_recaptcha_random.js (6KB), before_submit.png (22KB), after_submit.png (6KB) |
| End-to-end (Discord message to response) | ~30 seconds |
| Result | Verification Success |
Edge device note: On boards with limited RAM (e.g., the $9.90 LicheeRV-Nano with 64MB), PicoClaw itself fits easily (~8MB) but Chromium needs 100-300MB. Use Playwright's connect() to offload the browser to a more capable machine while keeping PicoClaw's lightweight agent on the edge device.
Troubleshooting
"Cannot find module 'playwright'"
Playwright isn't installed in the workspace. Run:
bash
cd ~/.picoclaw/workspace && npm install playwright && npx playwright install chromium
Missing browser libraries on Ubuntu
If Chromium fails to launch with errors about missing shared libraries, install the system dependencies:
bash
sudo apt install -y libnss3 libatk-bridge2.0-0 libdrm2 libxcomposite1 \
libxdamage1 libxrandr2 libgbm1 libpango-1.0-0 libasound2t64
ExecTool deny patterns blocking npm install
PicoClaw's deny patterns block npm install -g (global installs), sudo, and apt install, but allow local npm install, node script.js, and npx playwright install. If you see "Command blocked by safety guard", you can either disable deny patterns or provide custom ones in ~/.picoclaw/config.json:
json
{ "tools": { "exec": { "enable_deny_patterns": false } } }
Or supply a custom deny list that contains only the patterns you want blocked.
CAPTCHA solve timeout
- Check your CapSolver API key is valid
- Check your CapSolver account balance at capsolver.com/dashboard
- The script polls every 2 seconds until CapSolver returns ready or failed
- If the exec tool's 60-second timeout is not enough, the script will be killed. You can increase it programmatically or use the spawn tool for longer tasks (subagents have their own timeout)
ExecTool 60-second timeout too short
The default timeout in pkg/tools/shell.go is 60 seconds. For CAPTCHA automation, this can be tight. Use the spawn tool for longer tasks (subagents run independently), or modify the timeout in NewExecToolWithConfig() in the source (timeout: 120 * time.Second).
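Independently of the ExecTool setting, the script itself can stop polling before it gets killed, so that the agent sees a clear error instead of a truncated run. The sketch below bounds the polling loop with a deadline. The `pollUntilReady` name and the 50-second default are assumptions, picked to leave some margin under the 60-second limit; `getResult` is a caller-supplied function that wraps the getTaskResult request.

```javascript
// Polls `getResult` until it reports ready or failed, or until the deadline passes.
// `getResult` is a caller-supplied async function wrapping the getTaskResult call.
async function pollUntilReady(getResult, { intervalMs = 2000, deadlineMs = 50000 } = {}) {
  const startedAt = Date.now();
  while (Date.now() - startedAt < deadlineMs) {
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
    const result = await getResult();
    if (result.status === 'ready') return result.solution;
    if (result.status === 'failed') {
      throw new Error(`CapSolver task failed: ${result.errorDescription}`);
    }
  }
  throw new Error(`no solution within ${deadlineMs} ms`);
}
```

With this in place, the unbounded `while (true)` loop in the earlier examples could be replaced by a single `await pollUntilReady(...)` call.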
Sitekey not found
The script extracts the sitekey from the data-sitekey attribute. If no element is found, the agent can adapt and extract it from iframe URLs or page source.
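A fallback of that kind might look like the sketch below, which works on the page HTML (for example, the string returned by Playwright's `page.content()`). It is a best-effort, regex-based illustration rather than what the agent necessarily generates; `extractSiteKey` is a hypothetical name, and the iframe branch relies on reCAPTCHA iframe URLs carrying the sitekey in their `k` query parameter.

```javascript
// Tries the data-sitekey attribute first, then the `k` query parameter of a
// reCAPTCHA iframe URL. Regex-based on raw HTML, so treat it as best effort.
function extractSiteKey(html) {
  const attr = /data-sitekey\s*=\s*["']([^"']+)["']/i.exec(html);
  if (attr) return attr[1];

  const iframe = /<iframe[^>]+src\s*=\s*["']([^"']*recaptcha[^"']*)["']/i.exec(html);
  if (iframe) {
    // Serialized HTML encodes "&" as "&amp;" inside attribute values.
    const src = iframe[1].replace(/&amp;/g, '&');
    const key = new URL(src, 'https://www.google.com').searchParams.get('k');
    if (key) return key;
  }
  return null;
}
```

Returning `null` rather than throwing lets the calling script decide whether to retry after waiting for the widget to load or to report the failure.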
Browser crashes in Docker/containers
Add --no-sandbox, --disable-setuid-sandbox, and --disable-dev-shm-usage to the Playwright launch args.
Agent doesn't use CapSolver
Verify: (1) CAPSOLVER_API_KEY env var is set before starting PicoClaw, (2) skill file exists at ~/.picoclaw/workspace/skills/capsolver/SKILL.md, (3) picoclaw skills shows it listed.
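The first two checks are easy to script. The sketch below is a small preflight helper you could run with Node.js in the same shell that starts PicoClaw; the function name and message wording are made up for the example, and it does not cover the third check, which needs the picoclaw CLI.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Returns a list of human-readable problems; an empty list means both checks passed.
// `exists` is injectable so the check can be exercised without touching the disk.
function checkCapsolverSetup(env = process.env, home = os.homedir(), exists = fs.existsSync) {
  const problems = [];
  if (!env.CAPSOLVER_API_KEY) {
    problems.push('CAPSOLVER_API_KEY is not set in this environment');
  }
  const skillFile = path.join(home, '.picoclaw', 'workspace', 'skills', 'capsolver', 'SKILL.md');
  if (!exists(skillFile)) {
    problems.push(`skill file not found: ${skillFile}`);
  }
  return problems;
}

console.log(checkCapsolverSetup());
```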
Best Practices
1. Set the API Key as an Environment Variable
Don't hardcode the key in scripts. Use process.env.CAPSOLVER_API_KEY so the agent can pick it up automatically. PicoClaw passes the parent process's environment to all exec tool invocations.
2. Use Headless Mode on Servers
PicoClaw's API-based approach works in fully headless environments — no Xvfb or virtual display needed. This is a significant advantage over extension-based approaches, especially on edge devices where display hardware doesn't exist.
3. Monitor Your CapSolver Balance
Each CAPTCHA solve costs credits. Check your balance at capsolver.com/dashboard regularly.
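The balance check could also be scripted so the agent can run it before a batch of solves. The sketch below assumes CapSolver exposes a `getBalance` endpoint that accepts `{ clientKey }` and returns `errorId` and `balance` fields; that endpoint and its response shape are not described in this article, so confirm them against the official API documentation. The function names and the `minimum` threshold are illustrative.

```javascript
// Assumption: CapSolver exposes POST /getBalance taking { clientKey } and
// returning { errorId, balance, ... }. Check the official API docs before relying on it.
function interpretBalance(response, minimum) {
  if (response.errorId) {
    return { ok: false, reason: response.errorDescription || 'API error' };
  }
  return response.balance >= minimum
    ? { ok: true, balance: response.balance }
    : { ok: false, reason: `balance ${response.balance} is below ${minimum}` };
}

// Uses the global fetch available in Node.js 18+.
async function checkBalance(apiKey, minimum = 1) {
  const res = await fetch('https://api.capsolver.com/getBalance', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ clientKey: apiKey }),
  });
  return interpretBalance(await res.json(), minimum);
}
```

Keeping the interpretation separate from the network call makes the threshold logic testable without spending credits or needing a live key.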
4. Keep Playwright Updated
CAPTCHA providers evolve. Keep Playwright and Chromium updated:
bash
cd ~/.picoclaw/workspace && npm update playwright && npx playwright install chromium
5. Use the Spawn Tool for Long-Running Tasks
Browser automation can take 30-60 seconds. Use spawn instead of relying on the agent's primary loop to avoid timeouts and keep the main agent responsive to other messages.
6. Leverage PicoClaw's Memory System
After a successful CAPTCHA solve, the agent saves the approach to ~/.picoclaw/workspace/memory/MEMORY.md. Next time, it recalls the exact pattern that worked.
7. Edge Device Deployment: Offload the Browser
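On $10 boards with limited RAM, connect to a remote browser instead of launching one locally: chromium.connect('ws://server:9222') when the remote side runs a Playwright server, or chromium.connectOverCDP('http://server:9222') when it is a Chromium instance started with --remote-debugging-port=9222. This keeps PicoClaw's ~8MB footprint on the edge while the browser runs elsewhere.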
8. Configure Workspace Restriction Carefully
PicoClaw's restrict_to_workspace setting limits file and exec operations to the workspace directory. Ensure your scripts and Playwright installation are within ~/.picoclaw/workspace/.
Conclusion
The PicoClaw + CapSolver integration represents a fundamentally different approach to CAPTCHA solving. Instead of heavy browser extensions on desktop machines, a Go-compiled agent running on $10 hardware orchestrates the entire solve flow:
- Navigate to the target page with Playwright
- Extract the sitekey from the data-sitekey attribute
- Solve by calling CapSolver's REST API directly
- Inject the solution token into the hidden form field
- Submit the form and verify success
This gives you:
- No Chrome extension dependency — works with any headless browser
- Headless server support — no display or Xvfb needed
- Natural language control — just tell the agent what you want done via Telegram, Discord, or any of 12+ channels
- Edge-device deployment — run 24/7 on a $10 RISC-V board with under 10MB RAM
- Security by default — 27+ deny patterns in the ExecTool prevent dangerous commands
Bonus: Quick-Start Script
Save the complete working example from above to ~/.picoclaw/workspace/solve_captcha.js and run:
bash
CAPSOLVER_API_KEY=CAP-XXX node ~/.picoclaw/workspace/solve_captcha.js
Or simply send a Telegram message to your PicoClaw agent and let it handle everything autonomously.
Ready to get started? Sign up for CapSolver and use bonus code PICOCLAW for an extra 6% bonus on your first recharge!

FAQ
How does PicoClaw solve CAPTCHAs differently from browser extensions?
PicoClaw uses the CapSolver REST API directly. The agent writes and executes Node.js/Playwright scripts that call createTask and getTaskResult to obtain solution tokens, then injects them into the page DOM. No browser extension is needed. The entire orchestration happens through PicoClaw's ExecTool (pkg/tools/shell.go), which runs sh -c "node script.js" with 27+ security deny patterns, workspace path restriction, and a configurable timeout.
Do I need a special Chrome version?
No. Unlike extension-based approaches that require Chrome for Testing (since branded Chrome 137+ disabled extension loading), PicoClaw works with any Chromium build — including Playwright's bundled Chromium, standard Chromium packages, or headless Chrome. This is especially important on edge devices where you may only have access to distro-packaged Chromium.
Can PicoClaw really run on a $10 board?
Yes. PicoClaw uses under 10MB RAM and boots in under 1 second on a 0.6GHz core. It supports RISC-V, ARM64, and x86_64. CapSolver's cloud API handles the heavy work; PicoClaw just coordinates. Note: Chromium needs 100-300MB RAM, so sub-256MB boards should connect to a remote browser.
What CAPTCHA types does CapSolver support?
CapSolver supports reCAPTCHA v2 (checkbox and invisible), reCAPTCHA v3, reCAPTCHA Enterprise, Cloudflare Turnstile, AWS WAF CAPTCHA, and more. The PicoClaw integration uses ReCaptchaV2TaskProxyLess in the example, but the skill file documents all task types. The agent can adapt to any supported CAPTCHA type by modifying the task type parameter.
Can I use this on a headless server?
Yes — and this is where PicoClaw's approach shines. Since there's no browser extension involved, you don't need Xvfb or a virtual display. Playwright runs in fully headless mode out of the box. Combined with PicoClaw's tiny footprint, this makes it ideal for always-on server deployments.
How much does CapSolver cost?
CapSolver offers competitive pricing based on CAPTCHA type and volume. Visit capsolver.com for current pricing. Use bonus code PICOCLAW for an extra 6% on your first recharge.
Is PicoClaw free?
PicoClaw is open-source (MIT license) and free to run on your own hardware. You'll need API keys for the AI model provider of your choice and, for CAPTCHA solving, a CapSolver account with credits. The PicoClaw binary itself has zero runtime cost.
How long does CAPTCHA solving take?
In our Discord bot integration test with reCAPTCHA v2, the agent's Playwright script (including CapSolver API polling) executed in 24.2 seconds. The full end-to-end time from Discord message to response was ~30 seconds, including 5 LLM iterations for script generation, execution, and visual verification.
Will PicoClaw's deny patterns block my automation scripts?
No. The deny patterns in pkg/tools/shell.go block dangerous system commands (rm -rf, sudo, docker run), not regular Node.js execution. Running node script.js and local npm install are fully allowed. Among package commands, only global installs (npm install -g) and system package managers such as apt install are blocked.
Can I run multiple CAPTCHA solves in parallel?
Yes. Use PicoClaw's spawn tool to create multiple background subagents, each handling a different CAPTCHA task. The SubagentManager (pkg/tools/subagent.go) runs each independently and reports results back through the MessageBus.
How does PicoClaw compare to Nanobot for CAPTCHA solving?
PicoClaw was inspired by Nanobot (Python), rewritten in Go for extreme efficiency. Both use agent-driven CAPTCHA solving — the key difference is resources. Nanobot needs 100MB+ RAM and Python; PicoClaw needs under 10MB and ships as a single binary. For edge devices, PicoClaw is the clear choice.
Compliance Disclaimer: The information provided on this blog is for informational purposes only. CapSolver is committed to compliance with all applicable laws and regulations. The use of the CapSolver network for illegal, fraudulent, or abusive activities is strictly prohibited and will be investigated. Our captcha-solving solutions enhance user experience while ensuring 100% compliance in helping solve captcha difficulties during public data crawling. We encourage responsible use of our services. For more information, please visit our Terms of Service and Privacy Policy.
