How to Use Axios for Web Scraping
Lucas Mitchell
Automation Engineer
23-Sep-2024
What is Axios?
Axios is a popular JavaScript library used for making HTTP requests from both the browser and Node.js. It simplifies making asynchronous HTTP requests and allows you to handle responses easily with promises.
Features:
- Promise-based: Uses JavaScript promises, making it easier to manage asynchronous operations.
- Browser and Node.js Support: Works seamlessly in both environments.
- Automatic JSON Parsing: Automatically parses JSON responses.
- Interceptors: Supports request and response interceptors for managing requests and handling responses globally.
- Error Handling: Provides built-in mechanisms for handling errors.
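As a rough illustration of the interceptor and error-handling features listed above, the sketch below attaches one request interceptor and one response interceptor to a client. The helper names (`logRequest`, `describeAxiosError`, `createClient`) and the 10-second timeout are assumptions made for this example, not part of Axios; the Axios module is passed in as a parameter rather than imported.

```javascript
// Request interceptor: log each outgoing request, then pass the config on unchanged.
function logRequest(config) {
  console.log(`--> ${String(config.method).toUpperCase()} ${config.url}`);
  return config;
}

// Turn an Axios error into a short message. Axios errors fall into three cases:
// error.response is set when the server replied with an error status,
// error.request is set when no reply arrived, and otherwise the request
// could not be set up at all.
function describeAxiosError(error) {
  if (error.response) {
    return `HTTP ${error.response.status} returned by the server`;
  }
  if (error.request) {
    return 'No response received (network error or timeout)';
  }
  return `Request setup failed: ${error.message}`;
}

// Hypothetical factory: build a client with both interceptors attached.
// `axios` is the module itself, for example require('axios').
function createClient(axios) {
  const client = axios.create({ timeout: 10000 }); // assumed 10-second limit
  client.interceptors.request.use(logRequest);
  client.interceptors.response.use(
    response => response,
    error => {
      console.error(describeAxiosError(error));
      return Promise.reject(error); // keep the rejection so callers can still catch it
    }
  );
  return client;
}

// Usage (requires the axios package and network access):
// const client = createClient(require('axios'));
// client.get('https://httpbin.org/status/404').catch(() => {});
```

Keeping the handlers as plain functions means they can be reused across clients and checked without sending a request.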
Prerequisites
Before using Axios, ensure you have Node.js installed, along with npm or yarn to install packages.
Installation
You can install Axios using npm or yarn:
bash
npm install axios
or
bash
yarn add axios
Basic Example: Making a GET Request
Here’s how to perform a simple GET request using Axios:
javascript
const axios = require('axios');

axios.get('https://httpbin.org/get')
  .then(response => {
    console.log('Status Code:', response.status);
    console.log('Response Body:', response.data);
  })
  .catch(error => {
    console.error('Error:', error);
  });
Web Scraping Example: Fetching JSON Data from an API
Let’s fetch data from an API and print the results:
javascript
const axios = require('axios');

axios.get('https://jsonplaceholder.typicode.com/posts')
  .then(response => {
    const posts = response.data;
    posts.forEach(post => {
      console.log(`${post.title} - ${post.body}`);
    });
  })
  .catch(error => {
    console.error('Error:', error);
  });
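When only part of a collection is needed, the filter can usually go into the query string instead of being applied after download. The sketch below assumes the JSONPlaceholder `userId` filter and uses the Axios `params` option; the function name `fetchPostTitles` and the 10-second timeout are made up for this example, and the Axios client is passed in as an argument.

```javascript
// Fetch the posts written by one user and return only their titles.
// `client` is any Axios instance, for example require('axios').
async function fetchPostTitles(client, userId) {
  const response = await client.get('https://jsonplaceholder.typicode.com/posts', {
    params: { userId }, // Axios serializes this as ?userId=<value>
    timeout: 10000,     // assumed limit: give up after 10 seconds
  });
  return response.data.map(post => post.title);
}

// Usage (requires the axios package and network access):
// fetchPostTitles(require('axios'), 1).then(titles => console.log(titles));
```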
Handling Captchas with CapSolver and Axios
In this section, we integrate CapSolver with Axios to solve captchas. CapSolver exposes an HTTP API for captchas such as ReCaptcha V3: you create a task with createTask, poll getTaskResult until the task is ready, and then send the returned token along with your own request.
Example: Solving ReCaptcha V3 with CapSolver and Axios
First, install Axios. The example below calls the CapSolver HTTP API directly through Axios, so it does not import a separate client package:
bash
npm install axios
Now, here’s how to solve a ReCaptcha V3 and use the solution in your request:
javascript
const axios = require('axios');

const CAPSOLVER_KEY = 'YourKey';
const PAGE_URL = 'https://antcpt.com/score_detector';
const PAGE_KEY = '6LcR_okUAAAAAPYrPe-HK_0RULO1aZM15ENyM-Mf';
const PAGE_ACTION = 'homepage';

// Ask CapSolver to create a ReCaptcha V3 task and return its task ID
async function createTask(url, key, pageAction) {
  try {
    const apiUrl = 'https://api.capsolver.com/createTask';
    const payload = {
      clientKey: CAPSOLVER_KEY,
      task: {
        type: 'ReCaptchaV3TaskProxyLess',
        websiteURL: url,
        websiteKey: key,
        pageAction: pageAction
      }
    };
    const headers = {
      'Content-Type': 'application/json',
    };
    const response = await axios.post(apiUrl, payload, { headers });
    if (!response.data.taskId) {
      throw new Error(`No taskId in response: ${JSON.stringify(response.data)}`);
    }
    return response.data.taskId;
  } catch (error) {
    console.error('Error creating CAPTCHA task:', error);
    throw error;
  }
}

// Poll CapSolver until the task is ready, fails, or the attempt limit is reached
async function getTaskResult(taskId, maxAttempts = 24) {
  try {
    const apiUrl = 'https://api.capsolver.com/getTaskResult';
    const payload = {
      clientKey: CAPSOLVER_KEY,
      taskId: taskId,
    };
    const headers = {
      'Content-Type': 'application/json',
    };
    for (let attempt = 0; attempt < maxAttempts; attempt++) {
      const response = await axios.post(apiUrl, payload, { headers });
      const result = response.data;
      if (result.status === 'ready') {
        return result.solution;
      }
      if (result.status === 'failed' || result.errorId) {
        throw new Error(`CAPTCHA task failed: ${JSON.stringify(result)}`);
      }
      await new Promise(resolve => setTimeout(resolve, 5000)); // wait 5 seconds before retrying
    }
    throw new Error(`CAPTCHA task ${taskId} not ready after ${maxAttempts} attempts`);
  } catch (error) {
    console.error('Error getting CAPTCHA result:', error);
    throw error;
  }
}

// Browser-like headers; the client hints and the user-agent name the same Chrome version
function setSessionHeaders() {
  return {
    'cache-control': 'max-age=0',
    'sec-ch-ua': '"Not/A)Brand";v="99", "Google Chrome";v="107", "Chromium";v="107"',
    'sec-ch-ua-mobile': '?0',
    'sec-ch-ua-platform': '"Windows"',
    'upgrade-insecure-requests': '1',
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36',
    'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7',
    'sec-fetch-site': 'same-origin',
    'sec-fetch-mode': 'navigate',
    'sec-fetch-user': '?1',
    'sec-fetch-dest': 'document',
    'accept-encoding': 'gzip, deflate',
    'accept-language': 'en,fr-FR;q=0.9,fr;q=0.8,en-US;q=0.7',
  };
}

async function main() {
  const headers = setSessionHeaders();

  console.log('Creating CAPTCHA task...');
  const taskId = await createTask(PAGE_URL, PAGE_KEY, PAGE_ACTION);
  console.log(`Task ID: ${taskId}`);

  console.log('Retrieving CAPTCHA result...');
  const solution = await getTaskResult(taskId);
  const token = solution.gRecaptchaResponse;
  console.log(`Token Solution: ${token}`);

  // Submit the token to the demo page's verification endpoint and print the score
  const res = await axios.post('https://antcpt.com/score_detector/verify.php', { 'g-recaptcha-response': token }, { headers });
  const response = res.data;
  console.log(`Score: ${response.score}`);
}

main().catch(err => {
  console.error(err);
});
Handling Proxies with Axios
To route your requests through a proxy with Axios:
javascript
const axios = require('axios');

axios.get('https://httpbin.org/ip', {
  proxy: {
    host: 'proxyserver',
    port: 8080,
    auth: {
      username: 'username',
      password: 'password'
    }
  }
})
  .then(response => {
    console.log('Response Body:', response.data);
  })
  .catch(error => {
    console.error('Error:', error);
  });
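Proxy providers often hand out endpoints as single URLs rather than separate host, port, and credential fields. The sketch below converts such a URL into the object shape used by the Axios `proxy` option and picks proxies in round-robin order. The function names `toAxiosProxy` and `pickProxy` and the sample URLs are assumptions for illustration only.

```javascript
// Convert a proxy URL such as "http://username:password@proxyserver:8080"
// into the object shape that the Axios `proxy` option accepts.
function toAxiosProxy(proxyUrl) {
  const parsed = new URL(proxyUrl);
  const defaultPort = parsed.protocol === 'https:' ? 443 : 80;
  const proxy = {
    protocol: parsed.protocol.replace(':', ''), // "http:" -> "http"
    host: parsed.hostname,
    port: parsed.port ? Number(parsed.port) : defaultPort,
  };
  if (parsed.username) {
    // Credentials in a URL are percent-encoded, so decode them first
    proxy.auth = {
      username: decodeURIComponent(parsed.username),
      password: decodeURIComponent(parsed.password),
    };
  }
  return proxy;
}

// Pick a proxy for the n-th request, cycling through the list in order.
function pickProxy(proxyUrls, requestIndex) {
  return toAxiosProxy(proxyUrls[requestIndex % proxyUrls.length]);
}

// Usage (requires the axios package and a reachable proxy):
// const proxies = ['http://username:password@proxyserver:8080'];
// require('axios').get('https://httpbin.org/ip', { proxy: pickProxy(proxies, 0) });
```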
Handling Cookies with Axios
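In the browser, the withCredentials option tells Axios to include cookies on cross-site requests. In Node.js the option has no effect and there is no cookie jar, so you read the Set-Cookie response header yourself:
javascript
const axios = require('axios');

axios.get('https://httpbin.org/cookies/set?name=value', {
  // This endpoint sets the cookie on a 302 redirect. Following the redirect
  // would return the final response, which no longer carries Set-Cookie.
  maxRedirects: 0,
  validateStatus: status => status >= 200 && status < 400
})
  .then(response => {
    console.log('Cookies:', response.headers['set-cookie']);
  })
  .catch(error => {
    console.error('Error:', error);
  });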
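Outside the browser, cookies are not stored between requests, so a scraper has to carry them forward by hand. The sketch below turns the `set-cookie` array from one response into a `Cookie` header for the next request. The helper name `toCookieHeader` is hypothetical, and the function deliberately ignores expiry, domain, and path rules, which a real cookie jar would enforce.

```javascript
// Build a Cookie request header from the Set-Cookie headers of a previous response.
// Only the name=value pair of each cookie is kept; attributes such as Path,
// Expires, or HttpOnly are dropped.
function toCookieHeader(setCookieHeaders = []) {
  return setCookieHeaders
    .map(header => header.split(';')[0].trim())
    .filter(pair => pair.includes('='))
    .join('; ');
}

// Usage (requires the axios package and network access):
// const cookie = toCookieHeader(firstResponse.headers['set-cookie']);
// require('axios').get('https://httpbin.org/cookies', { headers: { Cookie: cookie } });
```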
Advanced Usage: Custom Headers and POST Requests
You can send custom headers and perform POST requests with Axios:
javascript
const axios = require('axios');

const headers = {
  'User-Agent': 'Mozilla/5.0 (compatible)',
  'Accept-Language': 'en-US,en;q=0.5',
};

const data = {
  username: 'testuser',
  password: 'testpass',
};

axios.post('https://httpbin.org/post', data, { headers })
  .then(response => {
    console.log('Response JSON:', response.data);
  })
  .catch(error => {
    console.error('Error:', error);
  });
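Axios serializes a plain object body as JSON, while many login and search forms expect URL-encoded fields instead. A minimal sketch of the form-encoded variant follows, assuming the same test fields as above; the helper names `toFormBody` and `submitForm` are made up for this example, and the Axios client is passed in as an argument.

```javascript
// Encode a plain object as application/x-www-form-urlencoded data.
function toFormBody(fields) {
  return new URLSearchParams(fields);
}

// Post the fields as a form. `client` is any Axios instance; when the body is a
// URLSearchParams object, Axios sets the form Content-Type header itself.
async function submitForm(client, url, fields) {
  const response = await client.post(url, toFormBody(fields));
  return response.data;
}

// Usage (requires the axios package and network access):
// submitForm(require('axios'), 'https://httpbin.org/post', {
//   username: 'testuser',
//   password: 'testpass',
// }).then(body => console.log(body.form)); // httpbin echoes form fields under "form"
```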
Conclusion
With Axios, you can easily manage HTTP requests in both Node.js and browser environments. By integrating it with CapSolver, you can solve captchas such as ReCaptcha V3, allowing access to restricted content.