How to solve AWS Captcha / Challenge with PHP

CapSolver Blogger

How to use capsolver

14-Sep-2023

How to solve AWS Captcha / Challenge using PHP

This guide explains what AWS CAPTCHA is and shows how to implement a PHP solver for it.

What is AWS CAPTCHA?

Overview

AWS CAPTCHA is a feature offered by AWS WAF (Web Application Firewall) that allows you to configure rules to run CAPTCHA or Challenge actions against web requests. These actions are designed to verify whether the incoming web requests are from legitimate human users or bots.

Types of Actions

  • CAPTCHA: Requires the end user to solve a CAPTCHA puzzle to prove they are human. You can customize the behavior and placement of the puzzle in your client application. When a CAPTCHA must be solved, the URL returns a 405 status code.

  • Challenge: Runs a silent challenge that verifies that the client session is a browser and not a bot. This verification runs in the background, without involving the end user. When a challenge must be solved, the URL returns a 202 status code.
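As a rough sketch of how a client might tell the two actions apart, the helper below maps the status codes listed above to an action name. The function name detectWafAction is hypothetical and is not part of AWS or CapSolver; it only assumes the 405 / 202 convention described in this article.

```php
<?php
// Hypothetical helper: maps the HTTP status code of a response
// to the AWS WAF action described in this article.
function detectWafAction(int $httpCode): string {
    switch ($httpCode) {
        case 405:
            return 'captcha';   // a visible puzzle must be solved
        case 202:
            return 'challenge'; // a silent browser check runs in the background
        default:
            return 'none';      // the status code alone does not indicate a WAF action
    }
}

echo detectWafAction(405), "\n";
echo detectWafAction(202), "\n";
echo detectWafAction(200), "\n";
```

Any other status code is treated as "no action" here, which is a simplification: a page can be blocked for reasons unrelated to CAPTCHA or Challenge.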

How It Works

Once the captcha / challenge is solved, the result is a token that is used as the value of a cookie called aws-waf-token.

🔎 Solving AWS Captcha / Challenge using PHP

Below is a step-by-step guide to implementing a PHP-based solution to solve AWS CAPTCHA / Challenges.

📕 Requirements

  • PHP
  • cURL
  • CAPSOLVER API Key
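Before running the solver, it can help to confirm that these requirements are met. The following is a minimal sketch; the function name checkRequirements and its messages are made up for illustration, and it only uses the built-in extension_loaded function.

```php
<?php
// Hypothetical pre-flight check. Returns a list of problems;
// an empty list means the basic requirements look satisfied.
function checkRequirements(string $apiKey): array {
    $problems = [];
    if (!extension_loaded('curl')) {
        $problems[] = 'The PHP cURL extension is not loaded.';
    }
    if (trim($apiKey) === '') {
        $problems[] = 'The CapSolver API key is empty.';
    }
    return $problems;
}

foreach (checkRequirements('YourPayPerUsage') as $problem) {
    echo $problem, "\n";
}
```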

🛠️ Step 1: Setting up the variables

<?php
$PROXY = "http://username:password@host:port"; // Replace with your proxy
$PAGE_URL = "https://norway-meetup.aws.wslab.no/";  // Replace with the URL of the protected page
$CLIENT_KEY = "YourPayPerUsage";  // Replace with your CAPSOLVER API Key

Replace the values of these variables with your own.
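If you would rather not hardcode credentials in the script, one option is to read them from environment variables. This is only a sketch: the helper envOr and the variable names CAPSOLVER_PROXY, TARGET_PAGE_URL and CAPSOLVER_API_KEY are assumptions chosen for this example and are not required by CapSolver.

```php
<?php
// Hypothetical helper: reads a setting from the environment,
// falling back to a default when it is unset or empty.
function envOr(string $name, string $default): string {
    $value = getenv($name);
    return ($value === false || $value === '') ? $default : $value;
}

// The environment variable names are illustrative only.
$PROXY = envOr('CAPSOLVER_PROXY', 'http://username:password@host:port');
$PAGE_URL = envOr('TARGET_PAGE_URL', 'https://norway-meetup.aws.wslab.no/');
$CLIENT_KEY = envOr('CAPSOLVER_API_KEY', 'YourPayPerUsage');
```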

⚡ Step 2: Integrate Capsolver

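// Submit a task to CapSolver and return the decoded JSON response
function createTask($payload) {
    global $CLIENT_KEY;
    $ch = curl_init();
    echo("Creating task...\n");
    curl_setopt($ch, CURLOPT_URL, 'https://api.capsolver.com/createTask');
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_POST, true);
    // Disables TLS certificate verification; avoid this in production
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
    curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode(['clientKey' => $CLIENT_KEY, 'task' => $payload]));
    curl_setopt($ch, CURLOPT_HTTPHEADER, ['Content-Type: application/json']);
    $response = curl_exec($ch);
    curl_close($ch);
    $data = json_decode($response, true);
    // Stop early if the request failed or the API reported an error
    if(!is_array($data) || !empty($data['errorId'])) {
        throw new RuntimeException("createTask failed: " . ($data['errorDescription'] ?? 'no valid response'));
    }
    return $data;
}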

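// Poll CapSolver once per second until the task is solved or fails
function getTaskResult($taskId) {
    global $CLIENT_KEY;
    do {
        echo("Waiting for solution...\n");
        sleep(1);
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL, 'https://api.capsolver.com/getTaskResult');
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_POST, true);
        // Disables TLS certificate verification; avoid this in production
        curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
        curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode(['clientKey' => $CLIENT_KEY, 'taskId' => $taskId]));
        curl_setopt($ch, CURLOPT_HTTPHEADER, ['Content-Type: application/json']);
        $response = curl_exec($ch);
        curl_close($ch);
        $data = json_decode($response, true);
        // Without this check a failed task would keep the loop running forever
        if(!is_array($data) || !empty($data['errorId']) || ($data['status'] ?? '') == "failed") {
            throw new RuntimeException("getTaskResult failed: " . ($data['errorDescription'] ?? 'no valid response'));
        }
        if(($data['status'] ?? '') == "ready") {
            return $data;
        }
    } while(true);
}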
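The polling loop above runs until the status is "ready". A bounded variant is sketched below: it gives up after a fixed number of attempts. The function waitForResult and the stand-in fetcher are hypothetical; the fetcher is injected so the loop can be tried without network access, and the sketch assumes that a failed task reports a "failed" status or a non-zero errorId.

```php
<?php
// Hypothetical bounded poller. $fetch is any callable that returns the
// decoded getTaskResult response for a task id.
function waitForResult(callable $fetch, string $taskId, int $maxAttempts = 60, int $delaySeconds = 1): array {
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        $data = $fetch($taskId);
        $status = $data['status'] ?? '';
        if ($status === 'ready') {
            return $data;
        }
        // Assumption: failures are signalled by status "failed" or a non-zero errorId.
        if ($status === 'failed' || !empty($data['errorId'])) {
            throw new RuntimeException('Task failed: ' . ($data['errorDescription'] ?? 'unknown error'));
        }
        sleep($delaySeconds);
    }
    throw new RuntimeException("No solution after $maxAttempts attempts.");
}

// Stand-in fetcher that reports "processing" twice and then "ready".
$calls = 0;
$fakeFetch = function (string $taskId) use (&$calls): array {
    $calls++;
    return $calls < 3
        ? ['status' => 'processing']
        : ['status' => 'ready', 'solution' => ['cookie' => 'example-token']];
};

$result = waitForResult($fakeFetch, 'demo-task', 5, 0);
echo $result['solution']['cookie'], "\n";
```

In real use, the callable would wrap the cURL request to the getTaskResult endpoint.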

With createTask and getTaskResult in place, the next step is to handle the two scenarios (CAPTCHA / Challenge) with CAPSOLVER.
Each one requires different values to be submitted, so let's make a function for each of them.

function solveAwsChallenge($awsChallengeJS) {
    global $PAGE_URL, $PROXY;
    $payload = [
        'type' => "AntiAwsWafTask",
        'websiteURL' => $PAGE_URL,
        'awsKey' => "",
        'awsIv' => "",
        'awsContext' => "",
        'awsChallengeJS' => $awsChallengeJS,
        'proxy' => $PROXY
    ];
    $taskData = createTask($payload);
    return getTaskResult($taskData['taskId']);
}

function solveAwsCaptcha($awsChallengeJS, $key, $iv, $context) {
    global $PAGE_URL, $PROXY;
    $payload = [
        'type' => "AntiAwsWafTask", // Same task type as the challenge, with key, iv and context filled in
        'websiteURL' => $PAGE_URL,
        'awsKey' => $key,
        'awsIv' => $iv,
        'awsContext' => $context,
        'awsChallengeJS' => $awsChallengeJS,
        'proxy' => $PROXY
    ];
    $taskData = createTask($payload);
    return getTaskResult($taskData['taskId']);
}
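To see what is actually sent to the API, it can be useful to print the request body before posting it. The sketch below builds the challenge payload with a hypothetical helper, buildTaskPayload, and prints the JSON that would go to the createTask endpoint. The challenge script URL is a placeholder, not a real address.

```php
<?php
// Hypothetical helper: assembles the "task" object for createTask.
// Key, iv and context default to empty strings, as in the challenge payload.
function buildTaskPayload(string $type, string $pageUrl, string $challengeJs, string $proxy,
                          string $key = '', string $iv = '', string $context = ''): array {
    return [
        'type' => $type,
        'websiteURL' => $pageUrl,
        'awsKey' => $key,
        'awsIv' => $iv,
        'awsContext' => $context,
        'awsChallengeJS' => $challengeJs,
        'proxy' => $proxy,
    ];
}

// Placeholder values for illustration only.
$task = buildTaskPayload(
    'AntiAwsWafTask',
    'https://norway-meetup.aws.wslab.no/',
    'https://example.token.awswaf.com/challenge.js',
    'http://username:password@host:port'
);

// JSON body as it would be posted to the createTask endpoint.
$body = json_encode(
    ['clientKey' => 'YourPayPerUsage', 'task' => $task],
    JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES
);
echo $body, "\n";
```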

With the Capsolver integration and the functions for handling the captcha / challenge done, let's move to the next step.

🍃 Step 3: Handling AWS Captcha / Challenge on the website

Since there are 2 types (CAPTCHA / CHALLENGE), we need to detect which one appears on the website. The status code of the response tells them apart:

  • Captcha: 405 status code
  • Challenge: 202 status code

Here is an example:

$ch = curl_init($PAGE_URL);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
$response = curl_exec($ch);
$httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
curl_close($ch);

// Return the first capture group of $pattern in $html, or stop if the value is missing
function extractValue($pattern, $html) {
    if(!preg_match($pattern, $html, $matches)) {
        exit("Could not find a value matching $pattern\n");
    }
    return $matches[1];
}

$cookie = null; // Stays null when the page returns neither 202 nor 405

if($httpCode == 202) {
    // Challenge: only the challenge script URL is needed
    $awsChallengeJS = extractValue('/<script src="([^"]*token\.awswaf\.com[^"]*)"/', $response);
    $result = solveAwsChallenge($awsChallengeJS);
    $cookie = $result['solution']['cookie'];
} elseif($httpCode == 405) {
    // Captcha: the key, iv and context values embedded in the page are needed as well
    $awsChallengeJS = extractValue('/<script src="([^"]*token\.awswaf\.com[^"]*)"/', $response);
    $key = extractValue('/"key":"(.*?)"/', $response);
    $iv = extractValue('/"iv":"(.*?)"/', $response);
    $context = extractValue('/"context":"(.*?)"/', $response);
    $result = solveAwsCaptcha($awsChallengeJS, $key, $iv, $context);
    $cookie = $result['solution']['cookie'];
}

// Send the token returned by capsolver as the aws-waf-token cookie
$ch = curl_init($PAGE_URL);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
if($cookie !== null) {
    curl_setopt($ch, CURLOPT_COOKIE, "aws-waf-token=".$cookie);
}
$response = curl_exec($ch);
curl_close($ch);

echo $response;

This code handles both types, extracts the required values, and sends the token returned by capsolver as the aws-waf-token cookie.
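The extraction step can be tried offline against a sample response. The sketch below uses a made-up HTML snippet (real pages will differ) and a hypothetical helper, extractFirst, that returns null instead of failing when a value is missing.

```php
<?php
// Made-up response body for illustration; the host name and values are placeholders.
$sampleHtml = '<script src="https://abc123.token.awswaf.com/abc123/challenge.js"></script>'
    . '<script>var props = {"key":"sample-key","iv":"sample-iv","context":"sample-context"};</script>';

// Hypothetical helper: returns the first capture group of $pattern,
// or null when nothing matches.
function extractFirst(string $pattern, string $html): ?string {
    return preg_match($pattern, $html, $m) === 1 ? $m[1] : null;
}

$fields = [
    'awsChallengeJS' => extractFirst('/<script src="([^"]*token\.awswaf\.com[^"]*)"/', $sampleHtml),
    'key' => extractFirst('/"key":"(.*?)"/', $sampleHtml),
    'iv' => extractFirst('/"iv":"(.*?)"/', $sampleHtml),
    'context' => extractFirst('/"context":"(.*?)"/', $sampleHtml),
];

print_r($fields);
```

Checking each field for null before creating a task avoids sending an incomplete payload to the API.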

🔮 Full Code:

<?php
$PROXY = "http://username:password@host:port"; // Replace with your proxy
$PAGE_URL = "https://norway-meetup.aws.wslab.no/"; 
$CLIENT_KEY = "YourPayPerUsage";  // Replace with your CAPSOLVER API Key

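// Submit a task to CapSolver and return the decoded JSON response
function createTask($payload) {
    global $CLIENT_KEY;
    $ch = curl_init();
    echo("Creating task...\n");
    curl_setopt($ch, CURLOPT_URL, 'https://api.capsolver.com/createTask');
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_POST, true);
    // Disables TLS certificate verification; avoid this in production
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
    curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode(['clientKey' => $CLIENT_KEY, 'task' => $payload]));
    curl_setopt($ch, CURLOPT_HTTPHEADER, ['Content-Type: application/json']);
    $response = curl_exec($ch);
    curl_close($ch);
    $data = json_decode($response, true);
    // Stop early if the request failed or the API reported an error
    if(!is_array($data) || !empty($data['errorId'])) {
        throw new RuntimeException("createTask failed: " . ($data['errorDescription'] ?? 'no valid response'));
    }
    return $data;
}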

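// Poll CapSolver once per second until the task is solved or fails
function getTaskResult($taskId) {
    global $CLIENT_KEY;
    do {
        echo("Waiting for solution...\n");
        sleep(1);
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL, 'https://api.capsolver.com/getTaskResult');
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_POST, true);
        // Disables TLS certificate verification; avoid this in production
        curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
        curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode(['clientKey' => $CLIENT_KEY, 'taskId' => $taskId]));
        curl_setopt($ch, CURLOPT_HTTPHEADER, ['Content-Type: application/json']);
        $response = curl_exec($ch);
        curl_close($ch);
        $data = json_decode($response, true);
        // Without this check a failed task would keep the loop running forever
        if(!is_array($data) || !empty($data['errorId']) || ($data['status'] ?? '') == "failed") {
            throw new RuntimeException("getTaskResult failed: " . ($data['errorDescription'] ?? 'no valid response'));
        }
        if(($data['status'] ?? '') == "ready") {
            return $data;
        }
    } while(true);
}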

function solveAwsChallenge($awsChallengeJS) {
    global $PAGE_URL, $PROXY;
    $payload = [
        'type' => "AntiAwsWafTask",
        'websiteURL' => $PAGE_URL,
        'awsKey' => "",
        'awsIv' => "",
        'awsContext' => "",
        'awsChallengeJS' => $awsChallengeJS,
        'proxy' => $PROXY
    ];
    $taskData = createTask($payload);
    return getTaskResult($taskData['taskId']);
}

function solveAwsCaptcha($awsChallengeJS, $key, $iv, $context) {
    global $PAGE_URL, $PROXY;
    $payload = [
        'type' => "AntiAwsWafTask", // Same task type as the challenge, with key, iv and context filled in
        'websiteURL' => $PAGE_URL,
        'awsKey' => $key,
        'awsIv' => $iv,
        'awsContext' => $context,
        'awsChallengeJS' => $awsChallengeJS,
        'proxy' => $PROXY
    ];
    $taskData = createTask($payload);
    return getTaskResult($taskData['taskId']);
}

$ch = curl_init($PAGE_URL);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
$response = curl_exec($ch);
$httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
curl_close($ch);

// Return the first capture group of $pattern in $html, or stop if the value is missing
function extractValue($pattern, $html) {
    if(!preg_match($pattern, $html, $matches)) {
        exit("Could not find a value matching $pattern\n");
    }
    return $matches[1];
}

$cookie = null; // Stays null when the page returns neither 202 nor 405

if($httpCode == 202) {
    // Challenge: only the challenge script URL is needed
    $awsChallengeJS = extractValue('/<script src="([^"]*token\.awswaf\.com[^"]*)"/', $response);
    $result = solveAwsChallenge($awsChallengeJS);
    $cookie = $result['solution']['cookie'];
} elseif($httpCode == 405) {
    // Captcha: the key, iv and context values embedded in the page are needed as well
    $awsChallengeJS = extractValue('/<script src="([^"]*token\.awswaf\.com[^"]*)"/', $response);
    $key = extractValue('/"key":"(.*?)"/', $response);
    $iv = extractValue('/"iv":"(.*?)"/', $response);
    $context = extractValue('/"context":"(.*?)"/', $response);
    $result = solveAwsCaptcha($awsChallengeJS, $key, $iv, $context);
    $cookie = $result['solution']['cookie'];
}

// Send the token returned by capsolver as the aws-waf-token cookie
$ch = curl_init($PAGE_URL);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
if($cookie !== null) {
    curl_setopt($ch, CURLOPT_COOKIE, "aws-waf-token=".$cookie);
}
$response = curl_exec($ch);
curl_close($ch);

echo $response;

?>
