How to Solve Cloudflare with Playwright in 2024

Ethan Collins

Pattern Recognition Specialist

12-Sep-2024

You know, there's a certain thrill in outsmarting obstacles—especially when those obstacles are digital gatekeepers like Cloudflare. If you’ve ever found yourself staring at a Cloudflare challenge while trying to automate a web task, you’re in good company. I’ve been there, many times. But in 2024, the game has changed, and so have the tools. Let me walk you through how I’ve been tackling Cloudflare with Playwright, and yeah, we’ll also talk about the sneaky newcomer on the block, Cloudflare Turnstile.

What is Cloudflare and Why It Matters

Before we dive into the nitty-gritty of solving Cloudflare challenges, let’s take a moment to understand what we’re up against. Cloudflare is a robust security service used by millions of websites to protect against malicious traffic, DDoS attacks, and a variety of other threats. When it detects unusual behavior—like an automated script trying to access a page—it throws up a challenge, often in the form of a CAPTCHA, to verify that you’re a human and not a bot.

But here’s the kicker: Cloudflare isn’t just about throwing up simple CAPTCHAs anymore. Its Turnstile widget, announced in 2022 and a common sight by 2024, is a more sophisticated and adaptive challenge system that’s designed to be more resilient against automation, and it often verifies visitors without showing a puzzle at all. It’s a tough nut to crack, but with the right approach, you can still come out on top.

Why Playwright is the Tool of Choice in 2024

You might be wondering, “Why Playwright? Why not stick with good ol’ Selenium or Puppeteer?” And that’s a fair question. The answer is that Playwright has emerged as a powerhouse for web automation, offering features that make it particularly effective against modern challenges like those posed by Cloudflare.

Playwright supports multiple browser contexts, which means you can simulate different users more effectively. It also provides more control over browser behavior, making it easier to mimic real user interactions—something that’s crucial when dealing with Cloudflare’s advanced security measures.
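To make the browser-context point concrete, here is a minimal sketch of running one isolated context per simulated user. The profiles array, the contextOptionsFor helper, and the openIsolatedSessions function are hypothetical names made up for illustration, and the locale and viewport values are arbitrary assumptions rather than recommended settings.

```javascript
// Sketch: one isolated browser context per simulated user.
// The profile values are illustrative assumptions, not recommended settings.
const profiles = [
  { name: "alice", locale: "en-US", viewport: { width: 1280, height: 720 } },
  { name: "bob", locale: "de-DE", viewport: { width: 1920, height: 1080 } },
];

// Map a profile to options accepted by browser.newContext()
function contextOptionsFor(profile) {
  return { locale: profile.locale, viewport: profile.viewport };
}

async function openIsolatedSessions(url) {
  // Required lazily so the helper above works without Playwright installed
  const { chromium } = require("playwright");
  const browser = await chromium.launch();
  try {
    for (const profile of profiles) {
      // Each context keeps its own cookies, cache, and local storage
      const context = await browser.newContext(contextOptionsFor(profile));
      const page = await context.newPage();
      await page.goto(url);
      console.log(profile.name, "loaded", await page.title());
      await context.close();
    }
  } finally {
    await browser.close();
  }
}
```

Because the contexts share nothing, a cookie set for one simulated user never leaks into another, which is what makes them useful for simulating separate visitors from a single browser process.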

Getting Started: Setting Up Playwright

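First things first, if you haven’t already, you’ll need to install Playwright, along with axios, which the snippets below use to call the CapSolver API. Setting it up is straightforward; the second command downloads the Chromium build that Playwright drives:

npm install playwright axios
npx playwright install chromium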

Once installed, you’re ready to start automating your web tasks. But if your goal is to get past Cloudflare challenges, especially the Turnstile CAPTCHA, we’ll need to take a few extra steps. We’ll be leveraging CapSolver, a third-party API designed to solve CAPTCHAs like Turnstile, and integrating it with Playwright to access sites protected by Cloudflare.

Step 1: Grabbing the SiteKey

The first obstacle you’ll face with Turnstile CAPTCHA is obtaining the siteKey from the webpage. This key is essential for CapSolver to process the CAPTCHA and give you a valid token.

You can extract the siteKey by inspecting the webpage’s source or, to make life easier, you can use the CapSolver Extension, which automatically detects CAPTCHA parameters on the page. For a detailed guide on how to set this up, check out our blog post, Identify Cloudflare Turnstile Parameters.
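If you would rather pull the key out in code, the following is a minimal sketch of the "inspect the source" route. It assumes the page renders the widget implicitly, meaning the HTML contains an element with a data-sitekey attribute; pages that create the widget from JavaScript may not expose the key this way. The extractSiteKey helper and the sample markup are hypothetical.

```javascript
// Sketch: pull the Turnstile siteKey out of raw page HTML.
// Assumes the markup contains an element with a data-sitekey attribute.
// extractSiteKey is a hypothetical helper, not part of any library.
function extractSiteKey(html) {
  // Match data-sitekey="..." or data-sitekey='...'
  const match = html.match(/data-sitekey\s*=\s*["']([^"']+)["']/i);
  return match ? match[1] : null;
}

// Example with made-up markup and a placeholder key
const sampleHtml = '<div class="cf-turnstile" data-sitekey="0x4AAAAAAAexample"></div>';
console.log(extractSiteKey(sampleHtml)); // 0x4AAAAAAAexample
```

With Playwright, the HTML could come from page.content() after navigating to the target URL. A null result suggests the key is set some other way and needs to be found manually or with the extension.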

Once you have the siteKey, you’re ready to move to the next step.

Step 2: Calling CapSolver API to Solve the CAPTCHA

With the siteKey in hand, it’s time to use CapSolver’s API to solve the Turnstile CAPTCHA and retrieve a valid token. This token will allow us to bypass the challenge and proceed with our web scraping or automation tasks.

Here’s a sample code snippet using axios and Playwright to interact with CapSolver:

const axios = require('axios');
const playwright = require("playwright");

const api_key = "YOUR_API_KEY"; // Your CapSolver API Key
const site_key = "0xxxxxx"; // The siteKey you retrieved
const site_url = "https://xxx.xxx.xxx/xxx"; // The target website URL
const proxy = "http://username:password@host:port"; // Optional placeholder; not used by the proxyless task type below

async function solveCaptcha() {
  const payload = {
    clientKey: api_key,
    task: {
      type: 'AntiTurnstileTaskProxyLess',
      websiteKey: site_key,
      websiteURL: site_url,
      metadata: {
        action: '', // Optional, specify if needed
        type: "turnstile"
      }
    }
  };

  try {
    const res = await axios.post("https://api.capsolver.com/createTask", payload);
    const task_id = res.data.taskId;
    if (!task_id) {
      console.log("Failed to create task:", res.data);
      return;
    }

    console.log("Task created, waiting for token...");

    while (true) {
      await new Promise(resolve => setTimeout(resolve, 1000)); // Wait for 1 second before checking again
      const getResultPayload = {clientKey: api_key, taskId: task_id};
      const resp = await axios.post("https://api.capsolver.com/getTaskResult", getResultPayload);
      
      if (resp.data.status === "ready") {
        console.log("CAPTCHA solved, token received:", resp.data.solution.token);
        return resp.data.solution.token;
      }

      if (resp.data.status === "failed" || resp.data.errorId) {
        console.log("CAPTCHA solving failed! Response:", resp.data);
        return;
      }
    }
  } catch (error) {
    console.error("Error solving CAPTCHA:", error);
  }
}

In this code, we create a task by sending a POST request to CapSolver’s API, passing the siteKey and the URL of the website we want to access. Once the task is created, we continuously check the status until CapSolver returns a solution token. This token is what we’ll use to prove to Cloudflare that we’re human.
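The loop above keeps polling until the task is ready or fails, so a task that never leaves the processing state would keep it running forever. One way to bound it, sketched here under assumptions, is to separate the response check from the loop and add a deadline. The names classifyTaskResult and pollForToken, and the default interval and timeout, are hypothetical choices; the response fields (status, errorId, solution.token) are the ones the snippet above already reads.

```javascript
// Classify a getTaskResult response body into one of three states.
function classifyTaskResult(data) {
  if (data.errorId || data.status === "failed") return "failed";
  if (data.status === "ready") return "ready";
  return "pending"; // anything else: keep polling
}

// Poll until the task is ready, fails, or the deadline passes.
// fetchResult is any async function that returns the response body.
async function pollForToken(fetchResult, { intervalMs = 1000, timeoutMs = 120000 } = {}) {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const data = await fetchResult();
    const state = classifyTaskResult(data);
    if (state === "ready") return data.solution.token;
    if (state === "failed") {
      throw new Error("Task failed: " + JSON.stringify(data));
    }
    // Still pending: wait before asking again
    await new Promise(resolve => setTimeout(resolve, intervalMs));
  }
  throw new Error("Timed out waiting for the CAPTCHA token");
}
```

In the script above, fetchResult would be a small async function that posts clientKey and taskId to the getTaskResult endpoint with axios and returns res.data. Passing it in as a parameter also makes the loop easy to exercise with a fake response.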

Step 3: Injecting the CAPTCHA Token with Playwright

Now that we have the CAPTCHA token, we need to inject it into the session as a cookie using Playwright. This will allow us to navigate the site without being blocked by Cloudflare’s protection. Here’s how to do that:

async function accessSiteWithToken() {
  // Solve the CAPTCHA first; solveCaptcha() resolves to undefined on failure
  const token = await solveCaptcha();
  if (!token) {
    console.error("No token received, aborting.");
    return;
  }

  const browser = await playwright.chromium.launch();
  try {
    const context = await browser.newContext();

    // Playwright sets cookies on the browser context, not on the page
    // (page.setCookie is Puppeteer's API). Pass either `url` or
    // `domain` plus `path`, not both.
    await context.addCookies([{
      name: "cf_clearance",
      value: token,
      url: site_url // Must match the target site
    }]);

    const page = await context.newPage();

    // Navigate to the website after setting the cookie
    const response = await page.goto(site_url);

    // Check the status instead of assuming the challenge was passed
    console.log("Page loaded with status:", response ? response.status() : "unknown");

    // From here you can scrape the content or interact with the page
  } finally {
    await browser.close();
  }
}

// Run the script to access the site
accessSiteWithToken().catch(console.error);
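Not every Turnstile-protected page gates on a cookie. When the widget sits inside a form, it normally adds a hidden input named cf-turnstile-response, and the site reads the token from that field when the form is submitted. Here is a minimal sketch of that route, assuming the page has already loaded and rendered the widget; fillTurnstileFields and injectTokenIntoForm are hypothetical helper names, and whether the target site accepts a token supplied this way depends on its setup.

```javascript
// Runs inside the page: copy the token into every Turnstile response field.
// Returns how many fields were filled.
function fillTurnstileFields(token) {
  const fields = document.querySelectorAll('[name="cf-turnstile-response"]');
  for (const field of fields) field.value = token;
  return fields.length;
}

// Hypothetical wrapper: `page` is a Playwright Page already on the target URL.
async function injectTokenIntoForm(page, token) {
  // page.evaluate serializes the function and runs it in the browser
  const filled = await page.evaluate(fillTurnstileFields, token);
  if (filled === 0) {
    throw new Error("No cf-turnstile-response field found on this page");
  }
  return filled;
}
```

After the field is filled, the form can be submitted the usual way, for example by clicking its submit button with page.click().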

Final Thoughts

Cloudflare has undoubtedly made it harder to scrape websites or automate tasks in 2024, but with tools like Playwright and CapSolver, the challenge is far from impossible. Playwright’s ability to simulate real user interactions combined with CapSolver’s CAPTCHA-solving API provides a powerful way to bypass these barriers without breaking a sweat.

Of course, it’s always a good idea to ensure you’re staying within the bounds of legal and ethical scraping practices. Some websites have strict policies regarding automated access, so make sure you’re aware of them before proceeding.

In the ever-evolving world of web automation, it’s all about staying ahead of the curve—and with Playwright and CapSolver, you’re equipped to do just that.
