Request
In web scraping and automation, a “request” is the instruction that tells a crawler or Actor which webpage to load and process.
Definition
A request is a directive to fetch a specific URL so that a scraping or automation tool can retrieve and examine the content at that address. In platforms like CapSolver, each request corresponds to a distinct URL you want an Actor to visit and, optionally, extract data from. Requests can be queued dynamically as your scraper discovers new links or navigates deeper into a site’s structure. They form the backbone of crawl workflows, controlling which pages are visited and in what order. Proper request management enables scalable, efficient scraping while handling pagination, link discovery, and prioritized crawling.
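To make the idea concrete, here is a minimal sketch in TypeScript of how a request and a request queue might be modeled. The `CrawlRequest` shape, the `RequestQueue` class, and the `label` field are illustrative assumptions, not any particular platform’s API.

```typescript
// Illustrative sketch: a request and an in-memory request queue.
// Names and shapes are assumptions, not a specific platform's API.
interface CrawlRequest {
  url: string;          // the page to fetch
  label?: string;       // optional tag routing the request to a handler
  retryCount?: number;  // how many times this request has been retried
}

class RequestQueue {
  private queue: CrawlRequest[] = [];
  private seen = new Set<string>(); // deduplicate by URL

  // Enqueue a request unless its URL was already added.
  add(request: CrawlRequest): boolean {
    if (this.seen.has(request.url)) return false;
    this.seen.add(request.url);
    this.queue.push(request);
    return true;
  }

  // Take the next request to process, or undefined when the queue is empty.
  next(): CrawlRequest | undefined {
    return this.queue.shift();
  }
}
```

The `seen` set is what prevents the same URL from being visited twice, which is the simplest defense against redundant crawling; production systems typically persist this state so a crawl can resume after a crash.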
Pros
- Provides explicit control over which URLs a scraper will visit.
- Enables dynamic exploration of sites via request queues (see the crawl-loop sketch after this list).
- Helps structure complex scraping workflows with prioritized navigation.
- Supports scalable data extraction by queuing new targets as they’re found.
- Integrates cleanly with automation frameworks and SDKs.
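A hedged sketch of the dynamic-exploration pattern described above: process one request, discover links, and feed them back into the queue. It reuses the `RequestQueue` from the earlier sketch; the seed URL and the `extractLinks` helper are hypothetical, and a real scraper would use a proper HTML parser rather than a regex.

```typescript
// Sketch of a crawl loop that discovers and enqueues new requests.
// extractLinks() is a hypothetical helper; real code would use an HTML parser.
const queue = new RequestQueue();
queue.add({ url: 'https://example.com/catalog' }); // hypothetical seed URL

function extractLinks(html: string): string[] {
  // Naive href extraction, for illustration only.
  return [...html.matchAll(/href="(https?:\/\/[^"]+)"/g)].map((m) => m[1]);
}

async function crawl(): Promise<void> {
  for (let req = queue.next(); req !== undefined; req = queue.next()) {
    const response = await fetch(req.url); // global fetch (Node 18+)
    const html = await response.text();
    for (const link of extractLinks(html)) {
      queue.add({ url: link }); // deduplication happens inside add()
    }
    // ...extract and store data from `html` here...
  }
}

crawl().catch(console.error);
```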
Cons
- Requires careful management to avoid redundant or infinite crawling loops.
- Poorly configured requests can overload target sites or trigger anti-bot defenses.
- Complex sites may need advanced logic to generate meaningful requests.
- Handling errors and retries adds development overhead (a common mitigation is sketched after this list).
- Unrestricted queuing can lead to high resource consumption.
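To illustrate how these risks are typically mitigated, here is a sketch of a fetch wrapper with capped retries, exponential backoff, and a politeness delay between hits. The constants are arbitrary example values, not recommendations; tune them per target site.

```typescript
// Sketch: bounded retries with exponential backoff and a politeness delay.
// MAX_RETRIES and the delay values are arbitrary example values.
const MAX_RETRIES = 3;
const POLITENESS_DELAY_MS = 1000; // minimum gap between requests to one site

const sleep = (ms: number) => new Promise<void>((r) => setTimeout(r, ms));

async function fetchWithRetry(url: string): Promise<string> {
  for (let attempt = 0; attempt <= MAX_RETRIES; attempt++) {
    try {
      await sleep(POLITENESS_DELAY_MS); // throttle to avoid overloading the site
      const response = await fetch(url);
      if (!response.ok) throw new Error(`HTTP ${response.status}`);
      return await response.text();
    } catch (err) {
      if (attempt === MAX_RETRIES) throw err; // give up after the retry cap
      await sleep(2 ** attempt * 500);        // exponential backoff
    }
  }
  throw new Error('unreachable');
}
```

Bounding retries and pacing requests addresses two cons at once: runaway resource consumption on your side, and overload or anti-bot triggers on the target’s side.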
Use Cases
- Crawling a product catalog by enqueuing each category and item page URL.
- Following pagination links on search results to collect all listings (see the sketch after this list).
- Feeding discovered URLs back into a scraper to expand a site map.
- Coordinating multiple Actors to process different segments of a large site.
- Extracting structured data from a set of predefined target pages.
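As an example of the pagination use case, here is a sketch that keeps requesting the “next” page until none remains. The seed URL and the `rel="next"` extraction are assumptions about the target site’s markup; real pages may expose pagination differently.

```typescript
// Sketch: follow pagination by requesting each "next" link until none remains.
// The seed URL and rel="next" markup are assumptions about the target site.
async function collectAllPages(startUrl: string): Promise<string[]> {
  const pages: string[] = [];
  let url: string | undefined = startUrl;

  while (url !== undefined) {
    const response = await fetch(url);
    const html = await response.text();
    pages.push(html);

    // Look for a <link rel="next" href="..."> or <a rel="next"> element.
    const match = html.match(/rel="next"[^>]*href="([^"]+)"/);
    url = match ? new URL(match[1], url).toString() : undefined;
  }
  return pages;
}

collectAllPages('https://example.com/search?q=widgets') // hypothetical URL
  .then((pages) => console.log(`fetched ${pages.length} pages`))
  .catch(console.error);
```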