How to Solve Captcha in Browser4 with CapSolver Integration

Lucas Mitchell

Automation Engineer

21-Jan-2026

For web automation, Browser4 (from PulsarRPA) is a fast, coroutine-safe browser engine designed for AI-powered data extraction. It supports 100k-200k complex page visits per machine per day, so it is built for serious scale. When you extract data from protected websites, however, CAPTCHA challenges become a significant barrier.

CapSolver provides the perfect complement to Browser4's automation capabilities, enabling your agents to navigate through CAPTCHA-protected pages seamlessly. This integration combines Browser4's high-throughput browser automation with industry-leading CAPTCHA solving.

What is Browser4?

Browser4 is a high-performance, coroutine-safe browser automation framework built in Kotlin. It's designed for AI applications requiring autonomous agent capabilities, extreme throughput, and hybrid data extraction combining LLM, machine learning, and selector-based approaches.

Key Features of Browser4

Extreme Throughput: 100k-200k complex page visits per machine per day
Coroutine-Safe: Built with Kotlin coroutines for efficient parallel processing
AI-Powered Agents: Autonomous browser agents capable of reasoning and executing multi-step tasks
Hybrid Extraction: Combines LLM intelligence, ML algorithms, and CSS/XPath selectors
X-SQL Queries: Extended SQL syntax for complex data extraction
Anti-Bot Features: Profile rotation, proxy support, and resilient scheduling
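
To put the throughput figure in perspective, here is a back-of-the-envelope conversion. It is only a sketch: it assumes visits are spread evenly over 24 hours, which real workloads rarely are, and `parallelism` is a hypothetical knob used for illustration, not a Browser4 setting.

```kotlin
// Converts a pages-per-day figure into an average pages-per-second rate.
// Assumes an even spread over 24 hours; an illustration, not a benchmark.
fun pagesPerSecond(pagesPerDay: Int): Double = pagesPerDay / 86_400.0

// Average time budget per page when `parallelism` pages are in flight at once.
// `parallelism` is a hypothetical parameter, not a Browser4 option.
fun secondsPerPage(pagesPerDay: Int, parallelism: Int): Double =
    parallelism / pagesPerSecond(pagesPerDay)
```

Under these assumptions, 100k visits per day averages out to roughly 1.2 pages per second and 200k to roughly 2.3, so any per-page CAPTCHA wait has to overlap with other work rather than block it.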

Core API Methods

Method	Description
`session.open(url)`	Loads a page and returns a PageSnapshot
`session.parse(page)`	Converts snapshot to in-memory document
`driver.selectFirstTextOrNull(selector)`	Retrieves text from live DOM
`driver.evaluate(script)`	Executes JavaScript in the browser
`session.extract(document, fieldMap)`	Maps CSS selectors to structured fields

What is CapSolver?

CapSolver is a leading CAPTCHA solving service that provides AI-powered solutions for bypassing various CAPTCHA challenges. With support for multiple CAPTCHA types and lightning-fast response times, CapSolver integrates seamlessly into automated workflows.

Supported CAPTCHA Types

The examples in this guide cover reCAPTCHA v2, reCAPTCHA v3, and Cloudflare Turnstile.

Why Integrate CapSolver with Browser4?

When building Browser4 automation that interacts with protected websites—whether for data extraction, price monitoring, or market research—CAPTCHA challenges become a significant obstacle. Here's why the integration matters:

Uninterrupted High-Throughput Extraction: Maintain 100k+ daily page visits without CAPTCHA blocks
Scalable Operations: Handle CAPTCHAs across parallel coroutine executions
Seamless Workflow: Solve CAPTCHAs as part of your extraction pipeline
Cost-Effective: Pay only for successfully solved CAPTCHAs
High Success Rates: Industry-leading accuracy for all supported CAPTCHA types

Installation

Prerequisites

Java 17 or higher
Maven 3.6+ or Gradle
A CapSolver API key

Adding Dependencies

Maven (pom.xml):

xml Copy

<dependencies>
    <!-- Browser4/PulsarRPA -->
    <dependency>
        <groupId>ai.platon.pulsar</groupId>
        <artifactId>pulsar-boot</artifactId>
        <version>2.2.0</version>
    </dependency>

    <!-- HTTP Client for CapSolver -->
    <dependency>
        <groupId>com.squareup.okhttp3</groupId>
        <artifactId>okhttp</artifactId>
        <version>4.12.0</version>
    </dependency>

    <!-- JSON Parsing -->
    <dependency>
        <groupId>com.google.code.gson</groupId>
        <artifactId>gson</artifactId>
        <version>2.10.1</version>
    </dependency>

    <!-- Kotlin Coroutines -->
    <dependency>
        <groupId>org.jetbrains.kotlinx</groupId>
        <artifactId>kotlinx-coroutines-core</artifactId>
        <version>1.8.0</version>
    </dependency>
</dependencies>

Gradle (build.gradle.kts):

kotlin Copy

dependencies {
    implementation("ai.platon.pulsar:pulsar-boot:2.2.0")
    implementation("com.squareup.okhttp3:okhttp:4.12.0")
    implementation("com.google.code.gson:gson:2.10.1")
    implementation("org.jetbrains.kotlinx:kotlinx-coroutines-core:1.8.0")
}

Environment Setup

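Create an application.properties file with the keys below. The code samples in this guide read these values with System.getenv, so also export them as environment variables (or load the file yourself) before running the examples: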

properties Copy

# CapSolver Configuration
CAPSOLVER_API_KEY=your_capsolver_api_key

# LLM Configuration (optional, for AI extraction)
OPENROUTER_API_KEY=your_openrouter_api_key

# Proxy Configuration (optional)
PROXY_ROTATION_URL=your_proxy_url
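
One way to bridge the properties file and the `System.getenv` calls used later is a small lookup helper. The sketch below is an assumption about how you might wire this up; `loadSetting` is a hypothetical name and is not part of Browser4 or CapSolver.

```kotlin
import java.io.File
import java.util.Properties

// Looks up a setting in the environment first, then in a properties file.
// `loadSetting` is a hypothetical helper, not a Browser4 or CapSolver API.
fun loadSetting(
    name: String,
    env: Map<String, String> = System.getenv(),
    propertiesFile: File = File("application.properties")
): String? {
    // Environment variables win so deployments can override the file
    env[name]?.takeIf { it.isNotBlank() }?.let { return it }

    if (!propertiesFile.isFile) return null
    val props = Properties()
    propertiesFile.inputStream().use { props.load(it) }
    return props.getProperty(name)?.takeIf { it.isNotBlank() }
}
```

With this in place, `loadSetting("CAPSOLVER_API_KEY")` returns the key from either source, or null when neither defines it.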

Creating a CapSolver Service for Browser4

Here's a reusable Kotlin service that integrates CapSolver with Browser4:

Basic CapSolver Service

kotlin Copy

import com.google.gson.Gson
import com.google.gson.JsonObject
import okhttp3.MediaType.Companion.toMediaType
import okhttp3.OkHttpClient
import okhttp3.Request
import okhttp3.RequestBody.Companion.toRequestBody
import kotlinx.coroutines.delay
import java.util.concurrent.TimeUnit

data class TaskResult(
    val gRecaptchaResponse: String? = null,
    val token: String? = null,
    val cookies: List<Map<String, String>>? = null,
    val userAgent: String? = null
)

class CapSolverService(private val apiKey: String) {
    private val client = OkHttpClient.Builder()
        .connectTimeout(30, TimeUnit.SECONDS)
        .readTimeout(30, TimeUnit.SECONDS)
        .build()

    private val gson = Gson()
    private val baseUrl = "https://api.capsolver.com"
    private val jsonMediaType = "application/json".toMediaType()

    private suspend fun createTask(taskData: Map<String, Any>): String {
        val payload = mapOf(
            "clientKey" to apiKey,
            "task" to taskData
        )

        val request = Request.Builder()
            .url("$baseUrl/createTask")
            .post(gson.toJson(payload).toRequestBody(jsonMediaType))
            .build()

        // Blocking call; `use` closes the response so the connection is released
        val body = client.newCall(request).execute().use { it.body?.string() }
        val result = gson.fromJson(body, JsonObject::class.java)

        if (result.get("errorId").asInt != 0) {
            throw Exception("CapSolver error: ${result.get("errorDescription")?.asString}")
        }

        return result.get("taskId").asString
    }

    private suspend fun getTaskResult(taskId: String, maxAttempts: Int = 60): TaskResult {
        val payload = mapOf(
            "clientKey" to apiKey,
            "taskId" to taskId
        )

        repeat(maxAttempts) {
            delay(2000)

            val request = Request.Builder()
                .url("$baseUrl/getTaskResult")
                .post(gson.toJson(payload).toRequestBody(jsonMediaType))
                .build()

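            val body = client.newCall(request).execute().use { it.body?.string() }
            val result = gson.fromJson(body, JsonObject::class.java)

            // Surface API errors immediately instead of polling until the timeout
            if ((result.get("errorId")?.asInt ?: 0) != 0) {
                throw Exception("CapSolver error: ${result.get("errorDescription")?.asString}")
            }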

            when (result.get("status")?.asString) {
                "ready" -> {
                    val solution = result.getAsJsonObject("solution")
                    return TaskResult(
                        gRecaptchaResponse = solution.get("gRecaptchaResponse")?.asString,
                        token = solution.get("token")?.asString,
                        userAgent = solution.get("userAgent")?.asString
                    )
                }
                "failed" -> throw Exception("Task failed: ${result.get("errorDescription")?.asString}")
            }
        }

        throw Exception("Timeout waiting for CAPTCHA solution")
    }

    suspend fun solveReCaptchaV2(websiteUrl: String, websiteKey: String): String {
        val taskId = createTask(mapOf(
            "type" to "ReCaptchaV2TaskProxyLess",
            "websiteURL" to websiteUrl,
            "websiteKey" to websiteKey
        ))

        val result = getTaskResult(taskId)
        return result.gRecaptchaResponse ?: throw Exception("No gRecaptchaResponse in solution")
    }

    suspend fun solveReCaptchaV3(
        websiteUrl: String,
        websiteKey: String,
        pageAction: String = "submit"
    ): String {
        val taskId = createTask(mapOf(
            "type" to "ReCaptchaV3TaskProxyLess",
            "websiteURL" to websiteUrl,
            "websiteKey" to websiteKey,
            "pageAction" to pageAction
        ))

        val result = getTaskResult(taskId)
        return result.gRecaptchaResponse ?: throw Exception("No gRecaptchaResponse in solution")
    }

    suspend fun solveTurnstile(
        websiteUrl: String,
        websiteKey: String,
        action: String? = null,
        cdata: String? = null
    ): String {
        // Explicit <String, Any> so the nested metadata map below type-checks
        val taskData = mutableMapOf<String, Any>(
            "type" to "AntiTurnstileTaskProxyLess",
            "websiteURL" to websiteUrl,
            "websiteKey" to websiteKey
        )

        // Add optional metadata
        if (action != null || cdata != null) {
            val metadata = mutableMapOf<String, String>()
            action?.let { metadata["action"] = it }
            cdata?.let { metadata["cdata"] = it }
            taskData["metadata"] = metadata
        }

        val taskId = createTask(taskData)
        val result = getTaskResult(taskId)
        return result.token ?: throw Exception("No token in solution")
    }

    suspend fun checkBalance(): Double {
        val payload = mapOf("clientKey" to apiKey)

        val request = Request.Builder()
            .url("$baseUrl/getBalance")
            .post(gson.toJson(payload).toRequestBody(jsonMediaType))
            .build()

        val body = client.newCall(request).execute().use { it.body?.string() }
        val result = gson.fromJson(body, JsonObject::class.java)

        return result.get("balance")?.asDouble ?: 0.0
    }
}
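
The service above follows a create-then-poll pattern: submit a task, then ask for its result every two seconds, up to 60 times (about 120 seconds in total). The sketch below isolates just the polling loop so it can be exercised without a network. `PollStatus`, `pollUntilReady`, and the injectable `sleep` are hypothetical names introduced for illustration; they are not part of the CapSolver API.

```kotlin
// Outcome of one status check. Ready and Failed mirror the "ready" and "failed"
// statuses handled in getTaskResult; Pending stands for any other status.
sealed class PollStatus {
    object Pending : PollStatus()
    data class Ready(val token: String) : PollStatus()
    data class Failed(val reason: String) : PollStatus()
}

// Polls `fetchStatus` until it reports Ready, reports Failed, or attempts run out.
// `sleep` is injectable so tests can skip the real wait.
fun pollUntilReady(
    maxAttempts: Int = 60,
    intervalMs: Long = 2000,
    sleep: (Long) -> Unit = { Thread.sleep(it) },
    fetchStatus: () -> PollStatus
): String {
    repeat(maxAttempts) {
        sleep(intervalMs)
        when (val status = fetchStatus()) {
            is PollStatus.Ready -> return status.token
            is PollStatus.Failed -> throw IllegalStateException("Task failed: ${status.reason}")
            PollStatus.Pending -> Unit // keep waiting
        }
    }
    throw IllegalStateException("Timed out after ${maxAttempts * intervalMs} ms")
}
```

Keeping the loop separate from the HTTP call makes the timeout budget explicit and lets you unit-test retry behaviour with a fake `fetchStatus`.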

Solving Different CAPTCHA Types

reCAPTCHA v2 with Browser4

kotlin Copy

import ai.platon.pulsar.context.PulsarContexts
import ai.platon.pulsar.skeleton.session.PulsarSession
import kotlinx.coroutines.runBlocking

class ReCaptchaV2Extractor(
    private val capSolver: CapSolverService
) {
    suspend fun extractWithCaptcha(targetUrl: String, siteKey: String): Map<String, Any?> {
        println("Solving reCAPTCHA v2...")

        // Solve the CAPTCHA first
        val token = capSolver.solveReCaptchaV2(targetUrl, siteKey)
        println("CAPTCHA solved, token length: ${token.length}")

        // Create session and open the page
        val session = PulsarContexts.createSession()
        val page = session.open(targetUrl)
        val driver = session.getOrCreateBoundDriver()

        // Inject the token into the hidden textarea using value property (safe)
        driver?.evaluate("""
            (function() {
                var el = document.querySelector('#g-recaptcha-response');
                if (el) el.value = arguments[0];
            })('$token');
        """)

        // Submit the form
        driver?.evaluate("document.querySelector('form').submit();")

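        // Wait for navigation without blocking the thread
        kotlinx.coroutines.delay(3000)

        // Extract data. Note: `page` is the snapshot taken by session.open()
        // before the form was submitted, so re-open the URL if you need the
        // post-submit content.
        val document = session.parse(page)

        return mapOf(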
            "title" to document.selectFirstTextOrNull("h1"),
            "content" to document.selectFirstTextOrNull(".content"),
            "success" to (document.body().text().contains("success", ignoreCase = true))
        )
    }
}

fun main() = runBlocking {
    val apiKey = System.getenv("CAPSOLVER_API_KEY") ?: "your_api_key"
    val capSolver = CapSolverService(apiKey)

    val extractor = ReCaptchaV2Extractor(capSolver)

    val result = extractor.extractWithCaptcha(
        targetUrl = "https://example.com/protected-page",
        siteKey = "6LcxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxABC"
    )

    println("Extraction result: $result")
}
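
The injection scripts in this guide splice the token into JavaScript source with string interpolation. Solver tokens typically consist of plain URL-safe characters, but escaping the value first removes any doubt. The following is a minimal sketch; `toJsStringLiteral` and `buildTokenInjectionScript` are hypothetical helpers, not Browser4 APIs.

```kotlin
// Escapes a value so it can be embedded in a single-quoted JavaScript string.
// `toJsStringLiteral` is a hypothetical helper, not a Browser4 API.
fun toJsStringLiteral(value: String): String {
    val escaped = buildString {
        for (ch in value) {
            when (ch) {
                '\\' -> append("\\\\") // backslash must be escaped first
                '\'' -> append("\\'")  // would otherwise close the literal
                '\n' -> append("\\n")
                '\r' -> append("\\r")
                else -> append(ch)
            }
        }
    }
    return "'$escaped'"
}

// Builds the same injection script as the reCAPTCHA v2 example,
// but with the token escaped before it is spliced in.
fun buildTokenInjectionScript(token: String): String = """
    (function(tokenValue) {
        var el = document.querySelector('#g-recaptcha-response');
        if (el) el.value = tokenValue;
    })(${toJsStringLiteral(token)});
""".trimIndent()
```

The resulting string can be passed to `driver.evaluate` in place of the inline script.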

reCAPTCHA v3 with Browser4

kotlin Copy

class ReCaptchaV3Extractor(
    private val capSolver: CapSolverService
) {
    suspend fun extractWithCaptchaV3(
        targetUrl: String,
        siteKey: String,
        action: String = "submit"
    ): Map<String, Any?> {
        println("Solving reCAPTCHA v3 with action: $action")

        // Solve reCAPTCHA v3 with custom page action
        val token = capSolver.solveReCaptchaV3(
            websiteUrl = targetUrl,
            websiteKey = siteKey,
            pageAction = action
        )

        println("Token obtained successfully")

        // Create session and open the page
        val session = PulsarContexts.createSession()
        val page = session.open(targetUrl)
        val driver = session.getOrCreateBoundDriver()

        // Inject token into hidden input (using safe value assignment)
        driver?.evaluate("""
            (function(tokenValue) {
                var input = document.querySelector('input[name="g-recaptcha-response"]');
                if (input) {
                    input.value = tokenValue;
                } else {
                    var hidden = document.createElement('input');
                    hidden.type = 'hidden';
                    hidden.name = 'g-recaptcha-response';
                    hidden.value = tokenValue;
                    var form = document.querySelector('form');
                    if (form) form.appendChild(hidden);
                }
            })('$token');
        """)

        // Click submit button
        driver?.evaluate("document.querySelector('#submit-btn').click();")

        // Wait for the submission to complete without blocking the thread
        kotlinx.coroutines.delay(3000)

        val document = session.parse(page)

        return mapOf(
            "result" to document.selectFirstTextOrNull(".result-data"),
            "status" to "success"
        )
    }
}

Cloudflare Turnstile with Browser4

kotlin Copy

class TurnstileExtractor(
    private val capSolver: CapSolverService
) {
    suspend fun extractWithTurnstile(targetUrl: String, siteKey: String): Map<String, Any?> {
        println("Solving Cloudflare Turnstile...")

        // Solve with optional metadata (action and cdata)
        val token = capSolver.solveTurnstile(
            targetUrl,
            siteKey,
            action = "login",  // optional
            cdata = "0000-1111-2222-3333-example"  // optional
        )
        println("Turnstile solved!")

        val session = PulsarContexts.createSession()
        val page = session.open(targetUrl)
        val driver = session.getOrCreateBoundDriver()

        // Inject Turnstile token (using safe value assignment)
        driver?.evaluate("""
            (function(tokenValue) {
                var input = document.querySelector('input[name="cf-turnstile-response"]');
                if (input) input.value = tokenValue;
            })('$token');
        """)

        // Submit
        driver?.evaluate("document.querySelector('form').submit();")

        // Wait for the submission to complete without blocking the thread
        kotlinx.coroutines.delay(3000)

        val document = session.parse(page)

        return mapOf(
            "title" to document.selectFirstTextOrNull("title"),
            "content" to document.selectFirstTextOrNull("body")?.take(500)
        )
    }
}
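
The Turnstile task differs from the reCAPTCHA tasks in that `action` and `cdata` travel in a nested `metadata` object and are sent only when present. A minimal sketch of that payload construction, using the field names from `solveTurnstile` above; the function `buildTurnstileTask` itself is illustrative and makes no network call.

```kotlin
// Builds the task payload for a Turnstile solve, adding metadata only when present.
// Field names follow solveTurnstile above; the function itself is illustrative.
fun buildTurnstileTask(
    websiteUrl: String,
    websiteKey: String,
    action: String? = null,
    cdata: String? = null
): Map<String, Any> {
    // Explicit <String, Any> so a nested map can sit next to plain strings
    val task = mutableMapOf<String, Any>(
        "type" to "AntiTurnstileTaskProxyLess",
        "websiteURL" to websiteUrl,
        "websiteKey" to websiteKey
    )

    val metadata = buildMap<String, String> {
        action?.let { put("action", it) }
        cdata?.let { put("cdata", it) }
    }
    if (metadata.isNotEmpty()) task["metadata"] = metadata

    return task
}
```

Declaring the map as `Map<String, Any>` up front matters: a map inferred from string-only entries cannot later hold the nested metadata map.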

Integration with Browser4 X-SQL

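Browser4's X-SQL provides powerful extraction capabilities. The example below does not run an X-SQL query itself; it solves the CAPTCHA first and then collects up to 50 product rows with CSS selectors, mirroring the LIMIT 50 noted in the code: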

kotlin Copy

class XSqlCaptchaExtractor(
    private val capSolver: CapSolverService
) {
    suspend fun extractProductsWithCaptcha(
        targetUrl: String,
        siteKey: String
    ): List<Map<String, Any?>> {
        // Pre-solve CAPTCHA
        val token = capSolver.solveReCaptchaV2(targetUrl, siteKey)

        // Create session and establish authenticated session
        val session = PulsarContexts.createSession()
        val page = session.open(targetUrl)
        val driver = session.getOrCreateBoundDriver()

        driver?.evaluate("""
            (function(tokenValue) {
                var el = document.querySelector('#g-recaptcha-response');
                if (el) el.value = tokenValue;
                document.querySelector('form').submit();
            })('$token');
        """)

        kotlinx.coroutines.delay(3000)

        // Now parse the page and extract product data
        val document = session.parse(page)

        // Extract product data using built-in session methods
        val products = mutableListOf<Map<String, Any?>>()
        val productElements = document.select(".product-item")

        for ((index, element) in productElements.withIndex()) {
            if (index >= 50) break // LIMIT 50
            products.add(mapOf(
                "name" to element.selectFirstTextOrNull(".product-name"),
                "price" to element.selectFirstTextOrNull(".price")?.let {
                    """(\d+\.?\d*)""".toRegex().find(it)?.groupValues?.get(1)?.toDoubleOrNull() ?: 0.0
                },
                "rating" to element.selectFirstTextOrNull(".rating")
            ))
        }

        return products
    }
}
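
One detail worth checking in the price extraction: a pattern like `(\d+\.?\d*)` stops at the first comma, so a label such as "$1,299.99" would parse as 1.0. Below is a sketch of a slightly more tolerant parser. It assumes `.` is the decimal separator and `,` a thousands separator, which does not hold for every locale; `parsePrice` is a hypothetical helper.

```kotlin
// Pulls the first number out of a price label such as "$1,299.99" or "USD 15".
// Assumes '.' is the decimal separator and ',' a thousands separator.
fun parsePrice(label: String): Double? {
    val match = Regex("""\d[\d,]*(?:\.\d+)?""").find(label) ?: return null
    return match.value.replace(",", "").toDoubleOrNull()
}
```

Returning null for labels without a number (instead of 0.0) keeps missing prices distinguishable from free items.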

Pre-Authentication Pattern

For sites requiring CAPTCHA before accessing content, use a pre-authentication workflow:

kotlin Copy

import okhttp3.Cookie
import okhttp3.CookieJar
import okhttp3.HttpUrl
import okhttp3.OkHttpClient
import okhttp3.Request

class PreAuthenticator(
    private val capSolver: CapSolverService
) {
    data class AuthSession(
        val cookies: Map<String, String>,
        val userAgent: String?
    )

    suspend fun authenticateWithCaptcha(
        loginUrl: String,
        siteKey: String
    ): AuthSession {
        // Solve CAPTCHA
        val captchaToken = capSolver.solveReCaptchaV2(loginUrl, siteKey)

        // Submit CAPTCHA to get session cookies
        val client = OkHttpClient.Builder()
            .cookieJar(object : CookieJar {
                private val cookies = mutableListOf<Cookie>()

                override fun saveFromResponse(url: HttpUrl, cookieList: List<Cookie>) {
                    cookies.addAll(cookieList)
                }

                override fun loadForRequest(url: HttpUrl): List<Cookie> = cookies
            })
            .build()

        val formBody = okhttp3.FormBody.Builder()
            .add("g-recaptcha-response", captchaToken)
            .build()

        val request = Request.Builder()
            .url(loginUrl)
            .post(formBody)
            .build()

        val response = client.newCall(request).execute()

        // Extract cookies from response
        val responseCookies = response.headers("Set-Cookie")
            .associate { cookie ->
                val parts = cookie.split(";")[0].split("=", limit = 2)
                parts[0] to (parts.getOrNull(1) ?: "")
            }

        return AuthSession(
            cookies = responseCookies,
            userAgent = response.request.header("User-Agent")
        )
    }
}

class AuthenticatedExtractor(
    private val preAuth: PreAuthenticator,
    private val capSolver: CapSolverService
) {
    suspend fun extractWithAuth(
        loginUrl: String,
        targetUrl: String,
        siteKey: String
    ): Map<String, Any?> {
        // Pre-authenticate
        val authSession = preAuth.authenticateWithCaptcha(loginUrl, siteKey)
        println("Session established with ${authSession.cookies.size} cookies")

        // Create Browser4 session
        val session = PulsarContexts.createSession()

        // Build one document.cookie assignment per cookie; a single assignment
        // such as 'a=1;b=2' would set only the first cookie.
        // Note: HttpOnly cookies cannot be set from JavaScript.
        val cookieScript = authSession.cookies.entries.joinToString(" ") { (k, v) ->
            "document.cookie = '$k=$v; path=/';"
        }

        val page = session.open(targetUrl)
        val driver = session.getOrCreateBoundDriver()

        // Set cookies
        driver?.evaluate(cookieScript)

        // Reload with authenticated session
        driver?.evaluate("location.reload();")
        kotlinx.coroutines.delay(2000)

        // Extract data
        val document = session.parse(page)

        return mapOf(
            "authenticated" to true,
            "content" to document.selectFirstTextOrNull(".protected-content"),
            "userData" to document.selectFirstTextOrNull(".user-profile")
        )
    }
}
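
The cookie hand-off has two steps that are easy to get subtly wrong: taking the name/value pair from a `Set-Cookie` header, and writing cookies back through `document.cookie`, which accepts only one cookie per assignment. The sketch below shows both steps as pure functions. The helper names are hypothetical, and the code assumes cookie values contain no single quotes.

```kotlin
// Parses the name/value pair from one Set-Cookie header, ignoring attributes
// such as Path or HttpOnly. Returns null when there is no "name=" part.
fun parseSetCookie(header: String): Pair<String, String>? {
    val pair = header.substringBefore(';').trim()
    val name = pair.substringBefore('=', missingDelimiterValue = "").trim()
    if (name.isEmpty()) return null
    return name to pair.substringAfter('=').trim()
}

// Emits one document.cookie assignment per cookie.
// Assumes values contain no single quotes; escape them if yours might.
fun toCookieStatements(cookies: Map<String, String>): String =
    cookies.entries.joinToString("\n") { (name, value) ->
        "document.cookie = '$name=$value; path=/';"
    }
```

Cookies marked HttpOnly cannot be written from page JavaScript at all, so this approach only carries over cookies the site allows scripts to set.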

OpenRouter Integration for LLM-Powered Extraction

Browser4's AI capabilities can be enhanced with OpenRouter, a unified API gateway for accessing various LLM models. This enables intelligent content extraction that adapts to different page structures.

OpenRouter Service

kotlin Copy

import com.google.gson.Gson
import com.google.gson.JsonObject
import okhttp3.MediaType.Companion.toMediaType
import okhttp3.OkHttpClient
import okhttp3.Request
import okhttp3.RequestBody.Companion.toRequestBody
import java.util.concurrent.TimeUnit

data class ChatMessage(val role: String, val content: String)
data class ChatCompletion(val content: String, val model: String, val usage: TokenUsage)
data class TokenUsage(val promptTokens: Int, val completionTokens: Int, val totalTokens: Int)

class OpenRouterService(private val apiKey: String) {
    private val client = OkHttpClient.Builder()
        .connectTimeout(60, TimeUnit.SECONDS)
        .readTimeout(60, TimeUnit.SECONDS)
        .build()

    private val gson = Gson()
    private val baseUrl = "https://openrouter.ai/api/v1"
    private val jsonMediaType = "application/json".toMediaType()

    fun chat(
        messages: List<ChatMessage>,
        model: String = "openai/gpt-4o-mini"
    ): ChatCompletion {
        val payload = mapOf(
            "model" to model,
            "messages" to messages.map { mapOf("role" to it.role, "content" to it.content) }
        )

        val request = Request.Builder()
            .url("$baseUrl/chat/completions")
            .header("Authorization", "Bearer $apiKey")
            .post(gson.toJson(payload).toRequestBody(jsonMediaType))
            .build()

        // `use` closes the response so the connection is released
        val body = client.newCall(request).execute().use { it.body?.string() }
        val result = gson.fromJson(body, JsonObject::class.java)

        val choice = result.getAsJsonArray("choices")?.get(0)?.asJsonObject
        val content = choice?.getAsJsonObject("message")?.get("content")?.asString ?: ""

        val usage = result.getAsJsonObject("usage")

        return ChatCompletion(
            content = content,
            model = result.get("model")?.asString ?: model,
            usage = TokenUsage(
                promptTokens = usage?.get("prompt_tokens")?.asInt ?: 0,
                completionTokens = usage?.get("completion_tokens")?.asInt ?: 0,
                totalTokens = usage?.get("total_tokens")?.asInt ?: 0
            )
        )
    }

    fun extractStructuredData(html: String, schema: String): String {
        val prompt = """
            Extract the following data from this HTML content.
            Return ONLY valid JSON matching this schema: $schema

            HTML:
            ${html.take(4000)}
        """.trimIndent()

        return chat(listOf(ChatMessage("user", prompt))).content
    }

    fun listModels(): List<String> {
        val request = Request.Builder()
            .url("$baseUrl/models")
            .header("Authorization", "Bearer $apiKey")
            .build()

        val body = client.newCall(request).execute().use { it.body?.string() }
        val result = gson.fromJson(body, JsonObject::class.java)

        return result.getAsJsonArray("data")?.mapNotNull {
            it.asJsonObject.get("id")?.asString
        } ?: emptyList()
    }
}

LLM-Powered Extraction with CAPTCHA Solving

Combine CAPTCHA solving with intelligent content extraction:

kotlin Copy

class SmartExtractor(
    private val capSolver: CapSolverService,
    private val openRouter: OpenRouterService
) {
    suspend fun extractWithAI(
        targetUrl: String,
        siteKey: String?,
        extractionPrompt: String
    ): Map<String, Any?> {
        // Step 1: Solve CAPTCHA if needed
        val captchaToken = siteKey?.let {
            println("Solving CAPTCHA...")
            capSolver.solveReCaptchaV2(targetUrl, it)
        }

        // Step 2: Create session and open page
        val session = PulsarContexts.createSession()
        val page = session.open(targetUrl)
        val driver = session.getOrCreateBoundDriver()

        captchaToken?.let { token ->
            driver?.evaluate("""
                (function(tokenValue) {
                    var el = document.querySelector('#g-recaptcha-response');
                    if (el) el.value = tokenValue;
                    var form = document.querySelector('form');
                    if (form) form.submit();
                })('$token');
            """)
            kotlinx.coroutines.delay(3000)
        }

        // Step 3: Extract page content
        val document = session.parse(page)
        val pageContent = document.body().text().take(8000)

        // Step 4: Use LLM to extract structured data
        val llmResponse = openRouter.chat(listOf(
            ChatMessage("system", "You are a data extraction assistant. Extract structured data from web pages."),
            ChatMessage("user", """
                $extractionPrompt

                Page content:
                $pageContent
            """.trimIndent())
        ))

        println("LLM used ${llmResponse.usage.totalTokens} tokens")

        return mapOf(
            "url" to targetUrl,
            "captchaSolved" to (captchaToken != null),
            "extractedData" to llmResponse.content,
            "tokensUsed" to llmResponse.usage.totalTokens
        )
    }
}

// Usage
fun main() = runBlocking {
    val capSolver = CapSolverService(System.getenv("CAPSOLVER_API_KEY")!!)
    val openRouter = OpenRouterService(System.getenv("OPENROUTER_API_KEY")!!)

    val extractor = SmartExtractor(capSolver, openRouter)

    val result = extractor.extractWithAI(
        targetUrl = "https://example.com/products",
        siteKey = "6LcxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxABC",
        extractionPrompt = """
            Extract all products with:
            - name
            - price (as number)
            - availability (in_stock/out_of_stock)
            - rating (1-5)
            Return as JSON array.
        """.trimIndent()
    )

    println("Extraction result: ${result["extractedData"]}")
}

Adaptive Selector Generation

Use LLM to generate CSS selectors for unknown page structures:

kotlin Copy

class AdaptiveExtractor(
    private val capSolver: CapSolverService,
    private val openRouter: OpenRouterService
) {
    suspend fun extractWithAdaptiveSelectors(
        targetUrl: String,
        siteKey: String?,
        dataFields: List<String>
    ): Map<String, Any?> {
        // Solve CAPTCHA first
        val token = siteKey?.let { capSolver.solveReCaptchaV2(targetUrl, it) }

        val session = PulsarContexts.createSession()
        val page = session.open(targetUrl)
        val driver = session.getOrCreateBoundDriver()

        token?.let { t ->
            driver?.evaluate("""
                (function(tokenValue) {
                    var el = document.querySelector('#g-recaptcha-response');
                    if (el) el.value = tokenValue;
                })('$t');
            """)
        }

        // Get page HTML structure
        val htmlSample = driver?.evaluate("document.body.innerHTML")?.toString()?.take(5000) ?: ""

        // Ask LLM to generate selectors
        val selectorPrompt = """
            Analyze this HTML and provide CSS selectors for these fields: ${dataFields.joinToString(", ")}

            HTML sample:
            $htmlSample

            Return JSON like: {"fieldName": "css-selector", ...}
        """.trimIndent()

        val selectorsJson = openRouter.chat(listOf(ChatMessage("user", selectorPrompt))).content
        val selectors = Gson().fromJson(selectorsJson, Map::class.java) as Map<String, String>

        // Extract using generated selectors
        val document = session.parse(page)
        val extractedData = selectors.mapValues { (_, selector) ->
            document.selectFirstTextOrNull(selector)
        }

        return mapOf(
            "url" to targetUrl,
            "selectors" to selectors,
            "data" to extractedData
        )
    }
}
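
LLM replies are not guaranteed to be bare JSON; models often wrap the object in a Markdown code fence or add a sentence around it, which would make the Gson parse above fail. A minimal sketch of a pre-processing step that keeps only the outermost JSON object follows. `extractJsonObject` is a hypothetical helper, and it does not validate that the result is well-formed JSON.

```kotlin
// Returns the text between the first '{' and the last '}' of an LLM reply,
// or null when no such span exists. Does not validate the JSON itself.
fun extractJsonObject(reply: String): String? {
    val start = reply.indexOf('{')
    val end = reply.lastIndexOf('}')
    if (start == -1 || end <= start) return null
    return reply.substring(start, end + 1)
}
```

Feeding `extractJsonObject(selectorsJson)` to the parser, and falling back to fixed selectors when it returns null, makes the adaptive path more forgiving of chatty model output.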

Parallel Extraction with Coroutines

Browser4's coroutine-safe design enables efficient parallel CAPTCHA handling:

kotlin Copy

import kotlinx.coroutines.*
import kotlinx.coroutines.channels.Channel

data class ExtractionJob(
    val url: String,
    val siteKey: String?
)

data class ExtractionResult(
    val url: String,
    val data: Map<String, Any?>?,
    val captchaSolved: Boolean,
    val error: String?,
    val duration: Long
)

class ParallelExtractor(
    private val capSolver: CapSolverService,
    private val concurrency: Int = 5
) {
    suspend fun extractAll(jobs: List<ExtractionJob>): List<ExtractionResult> = coroutineScope {
        val channel = Channel<ExtractionJob>(Channel.UNLIMITED)

        // Send all jobs to channel
        jobs.forEach { channel.send(it) }
        channel.close()

        // Process with limited concurrency
        val workers = (1..concurrency).map { workerId ->
            async {
                val workerResults = mutableListOf<ExtractionResult>()
                // Each worker creates its own session for thread safety
                val workerSession = PulsarContexts.createSession()

                for (job in channel) {
                    val startTime = System.currentTimeMillis()
                    var captchaSolved = false

                    try {
                        // Solve CAPTCHA if site key provided
                        // Mark the CAPTCHA as solved only after the solver returns
                        val token = job.siteKey?.let {
                            capSolver.solveReCaptchaV2(job.url, it).also { captchaSolved = true }
                        }

                        // Extract data
                        val page = workerSession.open(job.url)

                        token?.let { t ->
                            val driver = workerSession.getOrCreateBoundDriver()
                            driver?.evaluate("""
                                (function(tokenValue) {
                                    var el = document.querySelector('#g-recaptcha-response');
                                    if (el) el.value = tokenValue;
                                })('$t');
                            """)
                        }

                        val document = workerSession.parse(page)

                        workerResults.add(ExtractionResult(
                            url = job.url,
                            data = mapOf(
                                "title" to document.selectFirstTextOrNull("title"),
                                "h1" to document.selectFirstTextOrNull("h1")
                            ),
                            captchaSolved = captchaSolved,
                            error = null,
                            duration = System.currentTimeMillis() - startTime
                        ))
                    } catch (e: Exception) {
                        workerResults.add(ExtractionResult(
                            url = job.url,
                            data = null,
                            captchaSolved = captchaSolved,
                            error = e.message,
                            duration = System.currentTimeMillis() - startTime
                        ))
                    }
                }

                workerResults
            }
        }

        workers.awaitAll().flatten()
    }
}

// Usage
fun main() = runBlocking {
    val capSolver = CapSolverService(System.getenv("CAPSOLVER_API_KEY")!!)

    val extractor = ParallelExtractor(capSolver, concurrency = 5)

    val jobs = listOf(
        ExtractionJob("https://site1.com/data", "6Lc..."),
        ExtractionJob("https://site2.com/data", null),
        ExtractionJob("https://site3.com/data", "6Lc..."),
    )

    val results = extractor.extractAll(jobs)

    val solved = results.count { it.captchaSolved }
    println("Completed ${results.size} extractions, solved $solved CAPTCHAs")

    results.forEach { r ->
        println("${r.url}: ${r.duration}ms - ${r.error ?: "success"}")
    }
}
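The usage example prints one line per result. For larger batches an aggregate view is often easier to read. The following is a minimal sketch of such a summary, not part of Browser4 or CapSolver: `ResultRow`, `RunSummary`, and `summarize` are hypothetical names, and `ResultRow` is a local stand-in that mirrors only the `ExtractionResult` fields used above.

```kotlin
// Hypothetical stand-in mirroring the ExtractionResult fields used in the example above.
data class ResultRow(
    val url: String,
    val captchaSolved: Boolean,
    val error: String?,
    val duration: Long
)

// Hypothetical aggregate view of one extraction run.
data class RunSummary(
    val total: Int,
    val failed: Int,
    val captchasSolved: Int,
    val averageDurationMs: Double
)

fun summarize(results: List<ResultRow>): RunSummary {
    // A result counts as failed when it carries an error message.
    val failed = results.count { it.error != null }
    // average() on an empty list would be NaN, so report 0.0 instead.
    val average = if (results.isEmpty()) 0.0 else results.map { it.duration }.average()
    return RunSummary(
        total = results.size,
        failed = failed,
        captchasSolved = results.count { it.captchaSolved },
        averageDurationMs = average
    )
}
```

Mapping each `ExtractionResult` to a `ResultRow` (or adapting `summarize` to take `ExtractionResult` directly) gives a single line of run statistics suitable for logging.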

Best Practices

1. Error Handling with Retries


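import kotlinx.coroutines.CancellationException
import kotlinx.coroutines.delay

suspend fun <T> withRetry(
    maxRetries: Int = 3,
    initialDelay: Long = 1000,
    block: suspend () -> T
): T {
    var lastException: Exception? = null

    repeat(maxRetries) { attempt ->
        try {
            return block()
        } catch (e: CancellationException) {
            // Never retry cancellation; rethrow so the coroutine can stop promptly
            throw e
        } catch (e: Exception) {
            lastException = e
            println("Attempt ${attempt + 1} failed: ${e.message}")
            // Linear backoff; skip the wait after the final attempt
            if (attempt < maxRetries - 1) delay(initialDelay * (attempt + 1))
        }
    }

    throw lastException ?: Exception("Max retries exceeded")
}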

// Usage
val token = withRetry(maxRetries = 3) {
    capSolver.solveReCaptchaV2(url, siteKey)
}
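The retry helper above waits a linearly growing delay between attempts. When many workers fail at the same moment, a common alternative is exponential backoff with random jitter, so that retries do not all fire together. The sketch below only computes the delay; `backoffDelayMs` and its parameters are hypothetical names, and the 30-second cap is an arbitrary assumption rather than a CapSolver recommendation.

```kotlin
import kotlin.random.Random

// Hypothetical helper: exponential backoff with "full jitter".
// attempt is zero-based: 0 for the first retry, 1 for the second, and so on.
fun backoffDelayMs(
    attempt: Int,
    initialDelayMs: Long = 1_000,
    maxDelayMs: Long = 30_000, // assumed cap, tune for your workload
    random: Random = Random.Default
): Long {
    require(attempt >= 0) { "attempt must be non-negative" }
    require(initialDelayMs >= 0 && maxDelayMs >= 0) { "delays must be non-negative" }
    // Limit the shift so the multiplication stays far from Long overflow.
    val exponential = initialDelayMs * (1L shl attempt.coerceAtMost(20))
    val capped = exponential.coerceAtMost(maxDelayMs)
    // Pick uniformly from 0..capped (nextLong's upper bound is exclusive).
    return random.nextLong(capped + 1)
}
```

Inside `withRetry`, the call `delay(initialDelay * (attempt + 1))` could be swapped for `delay(backoffDelayMs(attempt, initialDelay))` to get this behavior.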

2. Balance Management


suspend fun ensureSufficientBalance(
    capSolver: CapSolverService,
    minBalance: Double = 1.0
) {
    val balance = capSolver.checkBalance()

    if (balance < minBalance) {
        throw Exception("Insufficient CapSolver balance: $${"%.2f".format(balance)}. Please recharge.")
    }

    println("CapSolver balance: $${"%.2f".format(balance)}")
}
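Before a large batch it can also help to estimate how many solves the current balance covers. The sketch below is illustrative arithmetic only: `affordableSolves` is a hypothetical helper, and `costPerSolve` must be supplied by the caller from CapSolver's current price list, since no price is assumed here.

```kotlin
import kotlin.math.floor

// Hypothetical helper: how many solves fit into the balance while keeping a reserve.
// costPerSolve is caller-supplied; this sketch does not assume any actual price.
fun affordableSolves(balance: Double, costPerSolve: Double, reserve: Double = 1.0): Long {
    require(costPerSolve > 0.0) { "costPerSolve must be positive" }
    val spendable = balance - reserve
    // Nothing is affordable once the balance is at or below the reserve.
    return if (spendable <= 0.0) 0L else floor(spendable / costPerSolve).toLong()
}
```

Comparing the result with the number of jobs that carry a site key gives an early warning before any workers start.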

3. Token Caching


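// Note: a reCAPTCHA response token can be verified only once, so a cached token
// is useful only if it has not been submitted yet.
class TokenCache(private val ttlMs: Long = 90_000) {
    private data class CachedToken(val token: String, val timestamp: Long)

    // ConcurrentHashMap keeps the cache safe when several coroutines share it
    private val cache = java.util.concurrent.ConcurrentHashMap<String, CachedToken>()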

    private fun getKey(domain: String, siteKey: String) = "$domain:$siteKey"

    fun get(domain: String, siteKey: String): String? {
        val key = getKey(domain, siteKey)
        val cached = cache[key] ?: return null

        if (System.currentTimeMillis() - cached.timestamp > ttlMs) {
            cache.remove(key)
            return null
        }

        return cached.token
    }

    fun set(domain: String, siteKey: String, token: String) {
        val key = getKey(domain, siteKey)
        cache[key] = CachedToken(token, System.currentTimeMillis())
    }
}

// Usage with caching
class CachedCapSolver(
    private val capSolver: CapSolverService,
    private val cache: TokenCache = TokenCache()
) {
    suspend fun solveReCaptchaV2Cached(websiteUrl: String, websiteKey: String): String {
        // URI avoids the java.net.URL constructors deprecated in recent JDKs
        val domain = java.net.URI(websiteUrl).host

        cache.get(domain, websiteKey)?.let {
            println("Using cached token")
            return it
        }

        val token = capSolver.solveReCaptchaV2(websiteUrl, websiteKey)
        cache.set(domain, websiteKey, token)

        return token
    }
}
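Because a reCAPTCHA response token is normally accepted only once by the verifying site, a cache that hands the same token to several callers can produce rejected submissions. One way to handle this is a store whose read operation also removes the token. The following is a minimal sketch under that assumption: `SingleUseTokenStore`, `put`, and `take` are hypothetical names, and the injectable `clock` exists only to make expiry testable.

```kotlin
import java.util.concurrent.ConcurrentHashMap

// Hypothetical store for pre-solved tokens: each token is handed out at most once.
class SingleUseTokenStore(
    private val ttlMs: Long = 90_000,
    private val clock: () -> Long = { System.currentTimeMillis() }
) {
    private data class Entry(val token: String, val storedAt: Long)

    private val entries = ConcurrentHashMap<String, Entry>()

    // Stores a token under the given key, replacing any previous one.
    fun put(key: String, token: String) {
        entries[key] = Entry(token, clock())
    }

    // Removes and returns the token for the key, or null if absent or expired.
    fun take(key: String): String? {
        // remove() is atomic, so two concurrent callers cannot receive the same token.
        val entry = entries.remove(key) ?: return null
        return if (clock() - entry.storedAt > ttlMs) null else entry.token
    }
}
```

A key such as `"$domain:$siteKey"`, as in `TokenCache`, works here as well. The difference is that `take` consumes the entry, which suits tokens solved ahead of time and used exactly once.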

Configuration Options

Setting	Description	Default
`CAPSOLVER_API_KEY`	Your CapSolver API key	-
`OPENROUTER_API_KEY`	OpenRouter API key for LLM features	-
`PROXY_ROTATION_URL`	Proxy rotation service URL	-

Browser4 also reads additional configuration from `application.properties`.
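The environment variables in the table can be read once at startup and passed around as a typed object. The sketch below is one possible shape, not a Browser4 or CapSolver API: `IntegrationConfig` and `loadConfig` are hypothetical names, and treating only `CAPSOLVER_API_KEY` as mandatory is an assumption.

```kotlin
// Hypothetical typed view of the environment variables listed above.
data class IntegrationConfig(
    val capSolverApiKey: String,
    val openRouterApiKey: String?,
    val proxyRotationUrl: String?
)

// Reads settings from the given map, defaulting to the process environment.
// Assumption: only CAPSOLVER_API_KEY is required; the other two are optional.
fun loadConfig(env: Map<String, String> = System.getenv()): IntegrationConfig {
    // Blank values are treated the same as missing ones.
    fun optional(name: String): String? = env[name]?.takeIf { it.isNotBlank() }

    val apiKey = optional("CAPSOLVER_API_KEY") ?: error("CAPSOLVER_API_KEY is not set")
    return IntegrationConfig(
        capSolverApiKey = apiKey,
        openRouterApiKey = optional("OPENROUTER_API_KEY"),
        proxyRotationUrl = optional("PROXY_ROTATION_URL")
    )
}
```

Passing the environment as a parameter keeps the function easy to test with a plain `mapOf(...)`.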

Conclusion

Integrating CapSolver with Browser4 creates a powerful combination for high-throughput web data extraction. Browser4's coroutine-safe architecture and extreme performance capabilities, combined with CapSolver's reliable CAPTCHA solving, enable extraction at scale.

Key integration patterns:

Direct Token Injection: Inject solved tokens via JavaScript evaluation
Pre-Authentication: Solve CAPTCHAs to establish sessions before extraction
Parallel Processing: Leverage coroutines for concurrent CAPTCHA handling
X-SQL Integration: Combine CAPTCHA solving with Browser4's powerful query language

Whether you're building price monitoring systems, market research pipelines, or data aggregation platforms, the Browser4 + CapSolver combination provides the reliability and scalability needed for production environments.

Ready to get started? Sign up for CapSolver and use bonus code BROWSER4 for an extra 6% bonus on your first recharge!

FAQ

What is Browser4?

Browser4 is a high-performance, coroutine-safe browser automation framework from PulsarRPA. It's built in Kotlin and designed for AI-powered data extraction, supporting 100k-200k complex page visits per machine per day.

How does CapSolver integrate with Browser4?

CapSolver integrates with Browser4 through a service class that solves CAPTCHAs via the CapSolver API. The solved tokens are then injected into pages using Browser4's JavaScript evaluation capabilities (`driver.evaluate()`).
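Since the token is spliced into a JavaScript string literal before evaluation, it is prudent to escape it first, even though real tokens rarely contain quotes. The following sketch builds the injection script for the `#g-recaptcha-response` field used in the examples above; `buildTokenInjectionScript` is a hypothetical helper, not part of Browser4.

```kotlin
// Hypothetical helper: builds the script that writes a solved token into the
// reCAPTCHA response field, escaping characters that would break the JS literal.
fun buildTokenInjectionScript(token: String): String {
    val escaped = buildString {
        for (ch in token) {
            when (ch) {
                '\\' -> append("\\\\") // backslash
                '\'' -> append("\\'")  // single quote closes the JS literal
                '\n' -> append("\\n")
                '\r' -> append("\\r")
                else -> append(ch)
            }
        }
    }
    return """
        (function(tokenValue) {
            var el = document.querySelector('#g-recaptcha-response');
            if (el) el.value = tokenValue;
        })('$escaped');
    """.trimIndent()
}
```

The resulting string can be passed to `driver.evaluate(...)` in place of the inline template shown in the earlier examples.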

What types of CAPTCHAs can CapSolver solve?

CapSolver supports reCAPTCHA v2, reCAPTCHA v3, Cloudflare Turnstile, Cloudflare Challenge (5s), AWS WAF, GeeTest v3/v4, and many more.

How much does CapSolver cost?

CapSolver offers competitive pricing based on the type and volume of CAPTCHAs solved. Visit capsolver.com for current pricing. Use code BROWSER4 for a 6% bonus.

What programming language does Browser4 use?

Browser4 is built in Kotlin and runs on the JVM (Java 17+). It can also be used from Java applications.

Can Browser4 handle parallel CAPTCHA solving?

Yes! Browser4's coroutine-safe design enables efficient parallel processing. Combined with CapSolver's API, you can solve multiple CAPTCHAs concurrently across different extraction jobs.

How do I find the CAPTCHA site key?

The site key is typically found in the page's HTML source:

reCAPTCHA: `data-sitekey` attribute on the `.g-recaptcha` element
Turnstile: `data-sitekey` attribute on the `.cf-turnstile` element
Or check network requests for the key in API calls
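When the key is present in the static HTML, a simple pattern match is often enough to pull it out. The sketch below is a rough illustration rather than a robust HTML parser: `findSiteKeys` is a hypothetical helper, and it assumes the attribute value is quoted. Keys that are set from JavaScript at runtime will not be found this way.

```kotlin
// Matches data-sitekey="..." or data-sitekey='...' and captures the value.
val siteKeyPattern = Regex("""data-sitekey\s*=\s*["']([^"']+)["']""")

// Hypothetical helper: returns the distinct site keys found in the markup,
// in order of first appearance.
fun findSiteKeys(html: String): List<String> =
    siteKeyPattern.findAll(html)
        .map { it.groupValues[1] }
        .distinct()
        .toList()
```

For example, running it over markup that contains both a `.g-recaptcha` and a `.cf-turnstile` element returns both keys, which can then be passed to the matching solver method.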

Compliance Disclaimer: The information provided on this blog is for informational purposes only. CapSolver is committed to compliance with all applicable laws and regulations. The use of the CapSolver network for illegal, fraudulent, or abusive activities is strictly prohibited and will be investigated. Our captcha-solving solutions enhance user experience while ensuring 100% compliance in helping solve captcha difficulties during public data crawling. We encourage responsible use of our services. For more information, please visit our Terms of Service and Privacy Policy.
