How to Use node-fetch for Web Scraping

Blog

All

Blog

All

How to Use node-fetch for Web Scraping

Lucas Mitchell

Automation Engineer

27-Sep-2024

What is node-fetch?

node-fetch is a lightweight JavaScript library that brings the window.fetch API to Node.js. It is often used for making HTTP requests from a Node.js environment, providing a modern and flexible way to handle network operations asynchronously.

Features:

Promise-based: Uses JavaScript promises to manage asynchronous operations in a simple manner.
Node.js Support: Specifically designed for Node.js environments.
Stream Support: Supports streams, making it highly suitable for handling large data.
Small and Efficient: Minimalistic design, focusing on performance and compatibility with modern JavaScript features.

Prerequisites

Before using node-fetch, ensure you have:

Node.js installed.
npm or yarn to manage your packages.

Installation

To use node-fetch, you need to install it using npm or yarn:

bash Copy

npm install node-fetch

bash Copy

yarn add node-fetch

Basic Example: Making a GET Request

Here’s how to perform a simple GET request using node-fetch:

javascript Copy

const fetch = require('node-fetch');

fetch('https://httpbin.org/get')
  .then(response => response.json())
  .then(data => {
    console.log('Response Body:', data);
  })
  .catch(error => {
    console.error('Error:', error);
  });

Web Scraping Example: Fetching JSON Data from an API

Let’s fetch data from an API and log the results:

javascript Copy

const fetch = require('node-fetch');

fetch('https://jsonplaceholder.typicode.com/posts')
  .then(response => response.json())
  .then(posts => {
    posts.forEach(post => {
      console.log(`${post.title} — ${post.body}`);
    });
  })
  .catch(error => {
    console.error('Error:', error);
  });

Handling Captchas with CapSolver and `node-fetch`

In this section, we’ll integrate CapSolver with node-fetch to handle captchas. CapSolver provides APIs for solving captchas like ReCaptcha V3, enabling automation of tasks that require solving such captchas.

Example: Solving ReCaptcha V3 with CapSolver and `node-fetch`

First, install node-fetch and CapSolver:

bash Copy

npm install node-fetch
npm install capsolver

Now, here’s how to solve a ReCaptcha V3 and use the solution in your request:

javascript Copy

const fetch = require('node-fetch');
const CAPSOLVER_KEY = 'YourKey';
const PAGE_URL = 'https://antcpt.com/score_detector';
const PAGE_KEY = '6LcR_okUAAAAAPYrPe-HK_0RULO1aZM15ENyM-Mf';
const PAGE_ACTION = 'homepage';

async function createTask(url, key, pageAction) {
  try {
    const apiUrl = 'https://api.capsolver.com/createTask';
    const payload = {
      clientKey: CAPSOLVER_KEY,
      task: {
        type: 'ReCaptchaV3TaskProxyLess',
        websiteURL: url,
        websiteKey: key,
        pageAction: pageAction
      }
    };
    
    const response = await fetch(apiUrl, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(payload)
    });
    
    const data = await response.json();
    return data.taskId;

  } catch (error) {
    console.error('Error creating CAPTCHA task:', error);
    throw error;
  }
}

async function getTaskResult(taskId) {
  try {
    const apiUrl = 'https://api.capsolver.com/getTaskResult';
    const payload = {
      clientKey: CAPSOLVER_KEY,
      taskId: taskId,
    };

    let result;
    do {
      const response = await fetch(apiUrl, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify(payload)
      });

      result = await response.json();
      if (result.status === 'ready') {
        return result.solution;
      }

      await new Promise(resolve => setTimeout(resolve, 5000)); // wait 5 seconds
    } while (true);

  } catch (error) {
    console.error('Error fetching CAPTCHA result:', error);
    throw error;
  }
}

async function main() {
  console.log('Creating CAPTCHA task...');
  const taskId = await createTask(PAGE_URL, PAGE_KEY, PAGE_ACTION);
  console.log(`Task ID: ${taskId}`);

  console.log('Retrieving CAPTCHA result...');
  const solution = await getTaskResult(taskId);
  const token = solution.gRecaptchaResponse;
  console.log(`Token Solution: ${token}`);

  const res = await fetch('https://antcpt.com/score_detector/verify.php', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ 'g-recaptcha-response': token })
  });

  const response = await res.json();
  console.log(`Score: ${response.score}`);
}

main().catch(err => {
  console.error(err);
});

Handling Proxies with `node-fetch`

To route your requests through a proxy with node-fetch, you will need a proxy agent like https-proxy-agent. Here's how to implement it:

bash Copy

npm install https-proxy-agent

Example with a proxy:

javascript Copy

const fetch = require('node-fetch');
const HttpsProxyAgent = require('https-proxy-agent');

const proxyAgent = new HttpsProxyAgent('http://username:password@proxyserver:8080');

fetch('https://httpbin.org/ip', { agent: proxyAgent })
  .then(response => response.json())
  .then(data => {
    console.log('Response Body:', data);
  })
  .catch(error => {
    console.error('Error:', error);
  });

Handling Cookies with `node-fetch`

For cookie handling in node-fetch, you can use a library like fetch-cookie. Here's how to use it:

bash Copy

npm install fetch-cookie

Example:

javascript Copy

const fetch = require('node-fetch');
const fetchCookie = require('fetch-cookie');
const cookieFetch = fetchCookie(fetch);

cookieFetch('https://httpbin.org/cookies/set?name=value')
  .then(response => response.json())
  .then(data => {
    console.log('Cookies:', data);
  })
  .catch(error => {
    console.error('Error:', error);
  });

Advanced Usage: Custom Headers and POST Requests

You can customize headers and perform POST requests with node-fetch:

javascript Copy

const fetch = require('node-fetch');

const headers = {
  'User-Agent': 'Mozilla/5.0 (compatible)',
  'Accept-Language': 'en-US,en;q=0.5',
};

const data = {
  username: 'testuser',
  password: 'testpass',
};

fetch('https://httpbin.org/post', {
  method: 'POST',
  headers: headers,
  body: JSON.stringify(data),
})
  .then(response => response.json())
  .then(data => {
    console.log('Response JSON:', data);
  })
  .catch(error => {
    console.error('Error:', error);
  });

Bonus Code

Claim your Bonus Code for top captcha solutions at CapSolver: scrape. After redeeming it, you will get an extra 5% bonus after each recharge, unlimited times.

Conclusion

With node-fetch, you can effectively manage HTTP requests in Node.js. By integrating it with CapSolver, you can solve captchas such as ReCaptcha V3 and captcha, providing access to restricted content. Additionally, node-fetch offers customization through headers, proxy support, and cookie management, making it a versatile tool for web scraping and automation.

Compliance Disclaimer: The information provided on this blog is for informational purposes only. CapSolver is committed to compliance with all applicable laws and regulations. The use of the CapSolver network for illegal, fraudulent, or abusive activities is strictly prohibited and will be investigated. Our captcha-solving solutions enhance user experience while ensuring 100% compliance in helping solve captcha difficulties during public data crawling. We encourage responsible use of our services. For more information, please visit our Terms of Service and Privacy Policy.

AI-powered Image Recognition: The Basics and How to Solve it

Say goodbye to image CAPTCHA struggles – CapSolver Vision Engine solves them fast, smart, and hassle-free!

Lucas Mitchell

24-Apr-2025

Best User Agents for Web Scraping & How to Use Them

A guide to the best user agents for web scraping and their effective use to avoid detection. Explore the importance of user agents, types, and how to implement them for seamless and undetectable web scraping.

Ethan Collins

07-Mar-2025

What is a Captcha? Can Captcha Track You?

Ever wondered what a CAPTCHA is and why websites make you solve them? Learn how CAPTCHAs work, whether they track you, and why they’re crucial for web security. Plus, discover how to bypass CAPTCHAs effortlessly with CapSolver for web scraping and automation.

Lucas Mitchell

05-Mar-2025

Cloudflare TLS Fingerprinting: What It Is and How to Solve It

Learn about Cloudflare's use of TLS fingerprinting for security, how it detects and blocks bots, and explore effective methods to solve it for web scraping and automated browsing tasks.

Cloudflare

Lucas Mitchell

28-Feb-2025

Why do I keep getting asked to verify I'm not a robot?

Learn why Google prompts you to verify you're not a robot and explore solutions like using CapSolver’s API to solve CAPTCHA challenges efficiently.

Ethan Collins

27-Feb-2025

What is the best CAPTCHA solver in 2025

Discover the best CAPTCHA solver in 2025 with CapSolver, the ultimate tool for automated web scraping, CAPTCHA bypass, and data collection using advanced AI and machine learning. Enjoy bonus codes, seamless integration, and real-world examples to boost your scraping efficiency.

Aloísio Vítor

25-Feb-2025

How to Use node-fetch for Web Scraping

What is node-fetch?

Prerequisites

Installation

Basic Example: Making a GET Request

Web Scraping Example: Fetching JSON Data from an API

Handling Captchas with CapSolver and node-fetch

Example: Solving ReCaptcha V3 with CapSolver and node-fetch

Handling Proxies with node-fetch

Handling Cookies with node-fetch

Advanced Usage: Custom Headers and POST Requests

Bonus Code

Conclusion

More

AI-powered Image Recognition: The Basics and How to Solve it

Best User Agents for Web Scraping & How to Use Them

What is a Captcha? Can Captcha Track You?

Cloudflare TLS Fingerprinting: What It Is and How to Solve It

Why do I keep getting asked to verify I'm not a robot?

What is the best CAPTCHA solver in 2025

Handling Captchas with CapSolver and `node-fetch`

Example: Solving ReCaptcha V3 with CapSolver and `node-fetch`

Handling Proxies with `node-fetch`

Handling Cookies with `node-fetch`