Scrape
Use this feature when you need either the content of a single URL or structured fields extracted from that page.
What it does
- Fetches a page by URL
- Returns content as markdown, HTML, or screenshot output
- Can clean the page output for easier downstream use
- Can render JavaScript for sites that need it
- Can also return structured JSON when you switch from scrape to extract
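The options listed above can be sketched as a single request. This is a minimal illustration, not the documented API: the base URL, the bearer-token header, and the field names (format, cleaned, renderJs) are assumptions, so check the Scrape API reference for the real ones. The function only builds the request and does not send it.

```python
import json
import urllib.request

# Assumption: base URL and JSON field names are illustrative only.
API_BASE = "https://app.dumplingai.com/api/v1"
ALLOWED_FORMATS = {"markdown", "html", "screenshot"}


def build_scrape_request(url, api_key, output_format="markdown",
                         clean=True, render_js=False):
    """Build (but do not send) a POST request for a single-URL scrape."""
    if output_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported format: {output_format!r}")
    payload = {
        "url": url,                # page to fetch
        "format": output_format,   # markdown, html, or screenshot
        "cleaned": clean,          # hypothetical flag: clean the page output
        "renderJs": render_js,     # hypothetical flag: render JavaScript first
    }
    return urllib.request.Request(
        f"{API_BASE}/scrape",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_scrape_request("https://example.com/article", "YOUR_API_KEY")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

To actually send the request, pass it to urllib.request.urlopen once the endpoint and field names have been confirmed against the reference.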
Common use cases
- Pull article content for summarization
- Capture product pages for structured extraction
- Clean blog posts before sending them to an LLM
- Turn live web pages into inputs for automations
- Extract fields like product title, price, or availability as JSON
Why use it
This is the core web extraction workflow: start with raw page content, then move to structured extraction when you need JSON instead of text.
Scrape vs structured extraction
- Use scrape when you want the page content back as markdown, HTML, or screenshot output
- Use extract when you want DumplingAI to return specific fields that match a schema
- In practice, most users start with scrape to validate the page, then use extract for production workflows
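The choice between the two modes can be sketched as a small helper that picks an endpoint based on whether a schema is supplied. The payload field names ("format", "schema") and the JSON Schema style of the example are assumptions for illustration. The Extract reference defines the real request shape.

```python
from typing import Any, Optional


def choose_request(url: str,
                   schema: Optional[dict[str, Any]] = None
                   ) -> tuple[str, dict[str, Any]]:
    """Return (endpoint_name, payload) for a page.

    No schema: scrape, to get the page content back as markdown.
    Schema given: extract, to get specific fields back as JSON.
    """
    if schema is None:
        return "scrape", {"url": url, "format": "markdown"}
    return "extract", {"url": url, "schema": schema}


# Hypothetical schema for the product fields mentioned in the use cases.
product_schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "price": {"type": "number"},
        "availability": {"type": "string"},
    },
}

if __name__ == "__main__":
    # Step 1: validate the page with a plain scrape.
    print(choose_request("https://example.com/product/1"))
    # Step 2: switch to extract once the page is known to be usable.
    print(choose_request("https://example.com/product/1", product_schema))
```

Keeping the URL the same across both calls mirrors the workflow described above: confirm the page returns usable content first, then attach a schema for production runs.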
Related endpoints
- Scrape API: Parameters, request examples, and response format
- Web Scraping Tutorial: Learn how to use scraping in a larger workflow
- Extract: Pull structured fields from a page using AI
- Crawl: Capture multiple pages from a site instead of just one URL