How to Use Axios for Web Scraping

Lucas Mitchell

Automation Engineer

23-Sep-2024

What is Axios?

Axios is a popular JavaScript library used for making HTTP requests from both the browser and Node.js. It simplifies making asynchronous HTTP requests and allows you to handle responses easily with promises.

Features:

  • Promise-based: Uses JavaScript promises, making it easier to manage asynchronous operations.
  • Browser and Node.js Support: Works seamlessly in both environments.
  • Automatic JSON Parsing: Automatically parses JSON responses.
  • Interceptors: Supports request and response interceptors for managing requests and handling responses globally.
  • Error Handling: Provides built-in mechanisms for handling errors.
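To make the error-handling feature concrete, here is a minimal sketch of how an error thrown by an Axios call can be told apart. The helper name describeAxiosError is hypothetical (it is not part of Axios); it relies only on the response, request, and message properties that Axios attaches to its errors.

```javascript
// Hypothetical helper: classify an error thrown by an Axios call.
function describeAxiosError(error) {
  if (error.response) {
    // The server replied, but with a status outside the accepted range (2xx by default).
    return `HTTP ${error.response.status}`;
  }
  if (error.request) {
    // The request was sent but no response arrived (timeout, DNS failure, ...).
    return `No response: ${error.message}`;
  }
  // Something failed before the request could be sent.
  return `Request setup failed: ${error.message}`;
}

// Usage (assumes axios is installed):
// const axios = require('axios');
// axios.get('https://httpbin.org/status/404')
//   .catch(error => console.error(describeAxiosError(error)));
```

Checking error.response first matters for scraping: a 403 or 429 usually means the site is blocking you, while a missing response points to a network or proxy problem.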

Prerequisites

Before using Axios, ensure you have:

  • Node.js installed for server-side usage.
  • npm or yarn to install packages.

Installation

You can install Axios using npm or yarn:

npm install axios

or

yarn add axios

Basic Example: Making a GET Request

Here’s how to perform a simple GET request using Axios:

const axios = require('axios');

axios.get('https://httpbin.org/get')
  .then(response => {
    console.log('Status Code:', response.status);
    console.log('Response Body:', response.data);
  })
  .catch(error => {
    console.error('Error:', error);
  });

Web Scraping Example: Fetching JSON Data from an API

Let’s fetch data from an API and print the results:

const axios = require('axios');

axios.get('https://jsonplaceholder.typicode.com/posts')
  .then(response => {
    const posts = response.data;
    posts.forEach(post => {
      console.log(`${post.title} — ${post.body}`);
    });
  })
  .catch(error => {
    console.error('Error:', error);
  });
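The same request can also be written with async/await, a query parameter, and a timeout. The sketch below is one way to do it; the function name fetchPostTitles and the userId filter are assumptions made for illustration, and the Axios object is passed in as client so the function is easy to try with a stub.

```javascript
// `client` is any object with an Axios-style get(url, config) method,
// for example the object returned by require('axios').
async function fetchPostTitles(client, userId) {
  const response = await client.get('https://jsonplaceholder.typicode.com/posts', {
    params: { userId }, // Axios serializes this into ?userId=...
    timeout: 10000,     // give up after 10 seconds
  });
  // Keep only the title of each post.
  return response.data.map(post => post.title);
}

// Usage (assumes axios is installed):
// const axios = require('axios');
// fetchPostTitles(axios, 1).then(titles => console.log(titles));
```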

Handling Captchas with CapSolver and Axios

In this section, we will integrate CapSolver with Axios to bypass captchas. CapSolver provides an API for solving captchas like ReCaptcha V3 and hCaptcha.

We will demonstrate solving ReCaptcha V3 with CapSolver and using the solution in a request.

Example: Solving ReCaptcha V3 with CapSolver and Axios

The example calls the CapSolver HTTP API directly, so Axios is the only package it needs:

npm install axios

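The script below creates a task through the createTask endpoint, polls getTaskResult every 5 seconds until the status is ready, and then posts the returned token to the target page: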

const axios = require('axios');
const CAPSOLVER_KEY = 'YourKey';
const PAGE_URL = 'https://antcpt.com/score_detector';
const PAGE_KEY = '6LcR_okUAAAAAPYrPe-HK_0RULO1aZM15ENyM-Mf';
const PAGE_ACTION = 'homepage';

async function createTask(url, key, pageAction) {
  try {
    const apiUrl = 'https://api.capsolver.com/createTask';
    const payload = {
      clientKey: CAPSOLVER_KEY,
      task: {
        type: 'ReCaptchaV3TaskProxyLess',
        websiteURL: url,
        websiteKey: key,
        pageAction: pageAction
      }
    };
    const headers = {
      'Content-Type': 'application/json',
    };
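    const response = await axios.post(apiUrl, payload, { headers });
    // A non-zero errorId means the task was not created (for example, an invalid key).
    if (response.data.errorId) {
      throw new Error(`CapSolver error: ${response.data.errorDescription}`);
    }
    return response.data.taskId;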

  } catch (error) {
    console.error('Error creating CAPTCHA task:', error);
    throw error;
  }
}

async function getTaskResult(taskId) {
  try {
    const apiUrl = 'https://api.capsolver.com/getTaskResult';
    const payload = {
      clientKey: CAPSOLVER_KEY,
      taskId: taskId,
    };
    const headers = {
      'Content-Type': 'application/json',
    };
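    let result;
    do {
      const response = await axios.post(apiUrl, payload, { headers });
      result = response.data;
      if (result.status === 'ready') {
        return result.solution;
      }
      // Stop polling if the API reports an error instead of looping forever.
      if (result.errorId || result.status === 'failed') {
        throw new Error(`CapSolver task failed: ${result.errorDescription || result.status}`);
      }
      await new Promise(resolve => setTimeout(resolve, 5000)); // wait 5 seconds before retrying
    } while (true);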

  } catch (error) {
    console.error('Error getting CAPTCHA result:', error);
    throw error;
  }
}

function setSessionHeaders() {
  return {
    'cache-control': 'max-age=0',
    'sec-ch-ua': '"Not/A)Brand";v="99", "Google Chrome";v="108", "Chromium";v="108"',
    'sec-ch-ua-mobile': '?0',
    'sec-ch-ua-platform': '"Windows"',
    'upgrade-insecure-requests': '1',
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36',
    'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7',
    'sec-fetch-site': 'same-origin',
    'sec-fetch-mode': 'navigate',
    'sec-fetch-user': '?1',
    'sec-fetch-dest': 'document',
    'accept-encoding': 'gzip, deflate',
    'accept-language': 'en,fr-FR;q=0.9,fr;q=0.8,en-US;q=0.7',
  };
}

async function main() {
  const headers = setSessionHeaders();
  console.log('Creating CAPTCHA task...');
  const taskId = await createTask(PAGE_URL, PAGE_KEY, PAGE_ACTION);
  console.log(`Task ID: ${taskId}`);

  console.log('Retrieving CAPTCHA result...');
  const solution = await getTaskResult(taskId);
  const token = solution.gRecaptchaResponse;
  console.log(`Token Solution: ${token}`);

  const res = await axios.post('https://antcpt.com/score_detector/verify.php', { 'g-recaptcha-response': token }, { headers });
  const response = res.data;
  console.log(`Score: ${response.score}`);
}

main().catch(err => {
  console.error(err);
});
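The polling loop in getTaskResult retries every 5 seconds with no upper bound on total time. Here is a minimal sketch of a bounded alternative. The helper pollUntil and its intervalMs and timeoutMs options are hypothetical names, not part of Axios or CapSolver.

```javascript
// Hypothetical helper: call `check` until it returns something other than
// undefined, or throw once the time budget is exhausted.
async function pollUntil(check, { intervalMs = 5000, timeoutMs = 120000 } = {}) {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    const value = await check();
    if (value !== undefined) return value;
    // Do not start another wait if it would end after the deadline.
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`Gave up after ${timeoutMs} ms`);
    }
    await new Promise(resolve => setTimeout(resolve, intervalMs));
  }
}

// Usage inside an async function (assumes axios, CAPSOLVER_KEY and taskId
// from the script above):
// const solution = await pollUntil(async () => {
//   const { data } = await axios.post('https://api.capsolver.com/getTaskResult',
//     { clientKey: CAPSOLVER_KEY, taskId });
//   return data.status === 'ready' ? data.solution : undefined;
// });
```

A deadline keeps a stuck task from hanging the scraper, which matters when many pages are processed in sequence.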

Handling Proxies with Axios

To route your requests through a proxy with Axios:

const axios = require('axios');

axios.get('https://httpbin.org/ip', {
  proxy: {
    host: 'proxyserver',
    port: 8080,
    auth: {
      username: 'username',
      password: 'password'
    }
  }
})
  .then(response => {
    console.log('Response Body:', response.data);
  })
  .catch(error => {
    console.error('Error:', error);
  });
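Proxy providers often hand out credentials as a single URL rather than separate fields. Here is a minimal sketch that converts such a URL into the proxy object shown above. The function name proxyFromUrl is hypothetical, and the default ports are an assumption for URLs that omit one.

```javascript
// Hypothetical helper: turn "http://user:pass@host:port" into an Axios proxy config.
function proxyFromUrl(proxyUrl) {
  const url = new URL(proxyUrl);
  const protocol = url.protocol.replace(':', ''); // "http:" -> "http"
  const config = {
    protocol,
    host: url.hostname,
    // URL.port is empty when the port is omitted or is the scheme default.
    port: Number(url.port) || (protocol === 'https' ? 443 : 80),
  };
  if (url.username) {
    config.auth = {
      username: decodeURIComponent(url.username),
      password: decodeURIComponent(url.password),
    };
  }
  return config;
}

// Usage (assumes axios is installed):
// const axios = require('axios');
// axios.get('https://httpbin.org/ip', {
//   proxy: proxyFromUrl('http://username:password@proxyserver:8080'),
// }).then(response => console.log(response.data));
```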

Handling Cookies with Axios

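In the browser, the withCredentials option tells Axios to include cookies on cross-site requests. In Node.js it has no effect, and Axios keeps no cookie jar, so you read the Set-Cookie response header yourself. The httpbin endpoint below answers with a redirect, so redirects are disabled to inspect the response that actually sets the cookie:

const axios = require('axios');

axios.get('https://httpbin.org/cookies/set?name=value', {
  maxRedirects: 0, // keep the redirect response, which carries Set-Cookie
  validateStatus: status => status >= 200 && status < 400, // accept the 3xx status
})
  .then(response => {
    console.log('Cookies:', response.headers['set-cookie']);
  })
  .catch(error => {
    console.error('Error:', error);
  });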
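To send a cookie back on a later request from Node.js, the Set-Cookie values have to be turned into a Cookie request header. Here is a minimal sketch. The helper name cookieHeaderFrom is hypothetical, and it deliberately ignores attributes such as Path, Domain, and Expires, which a full cookie jar would honor.

```javascript
// Hypothetical helper: build a Cookie header from an array of Set-Cookie values.
function cookieHeaderFrom(setCookieValues) {
  return (setCookieValues || [])
    .map(value => value.split(';')[0].trim()) // keep only "name=value"
    .filter(pair => pair.length > 0)
    .join('; ');
}

// Usage (assumes axios is installed and `firstResponse` came from an earlier request):
// const axios = require('axios');
// const cookie = cookieHeaderFrom(firstResponse.headers['set-cookie']);
// axios.get('https://httpbin.org/cookies', { headers: { Cookie: cookie } })
//   .then(response => console.log(response.data));
```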

Advanced Usage: Custom Headers and POST Requests

You can send custom headers and perform POST requests with Axios:

const axios = require('axios');

const headers = {
  'User-Agent': 'Mozilla/5.0 (compatible)',
  'Accept-Language': 'en-US,en;q=0.5',
};

const data = {
  username: 'testuser',
  password: 'testpass',
};

axios.post('https://httpbin.org/post', data, { headers })
  .then(response => {
    console.log('Response JSON:', response.data);
  })
  .catch(error => {
    console.error('Error:', error);
  });
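Axios sends a plain object as JSON, but many login forms expect a form-encoded body instead. Here is a minimal sketch of that variant. The helper name toFormBody is hypothetical, and whether a given site wants JSON or form encoding is something to check per site.

```javascript
// Hypothetical helper: convert a plain object into a form-encoded request body.
// Axios sends a URLSearchParams body as application/x-www-form-urlencoded.
function toFormBody(fields) {
  return new URLSearchParams(fields);
}

// Usage (assumes axios is installed):
// const axios = require('axios');
// axios.post('https://httpbin.org/post', toFormBody({ username: 'testuser', password: 'testpass' }))
//   .then(response => console.log(response.data.form));
```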

Bonus Code

Claim your Bonus Code for top captcha solutions at CapSolver: scrape. After redeeming it, you will get an extra 5% bonus after each recharge, unlimited times.

Conclusion

With Axios, you can easily manage HTTP requests in both Node.js and browser environments. By integrating it with CapSolver, you can bypass captchas such as ReCaptcha V3 and hCaptcha, allowing access to restricted content.
