
# Scrape

> Fetch structured data and HTML from a URL using DumplingAI's scraper.

## Description

This endpoint scrapes the page at a given URL and returns its content as Markdown, HTML, or a screenshot. The output can optionally be cleaned before it is returned.

## Endpoint

```
POST https://app.dumplingai.com/api/v1/scrape
```

## Headers

* **Content-Type:** `application/json`
* **Authorization:** Bearer `<API_KEY>` (required)

## Request Body

```json
{
  "url": "string", // Required. The URL to scrape.
  "format": "string", // Optional. Output format: "markdown", "html", or "screenshot". Default is "markdown".
  "cleaned": "boolean", // Optional. Whether to return cleaned, simplified content. Default is true.
  "renderJs": "boolean" // Optional. Whether to render JavaScript before scraping. Default is true.
}
```
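The field rules above can be checked on the client before a request is sent. The following is a minimal sketch, assuming the defaults listed in the OpenAPI schema on this page (`format` is `markdown`, `cleaned` and `renderJs` are `true`). `build_scrape_payload` is a hypothetical helper, not part of any DumplingAI SDK, and the http(s)-only check is an assumption rather than a documented rule.

```python
from urllib.parse import urlparse

# Values taken from the "format" enum documented on this page.
VALID_FORMATS = {"markdown", "html", "screenshot"}


def build_scrape_payload(url, format="markdown", cleaned=True, render_js=True):
    """Return a request body dict for POST /api/v1/scrape.

    Hypothetical helper: it only mirrors the documented field rules locally.
    """
    parsed = urlparse(url)
    # Assumption: only absolute http(s) URLs are meaningful targets.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"url must be an absolute http(s) URL, got {url!r}")
    if format not in VALID_FORMATS:
        raise ValueError(f"format must be one of {sorted(VALID_FORMATS)}")
    return {
        "url": url,
        "format": format,
        "cleaned": bool(cleaned),
        "renderJs": bool(render_js),
    }
```

Calling `build_scrape_payload("https://example.com", render_js=False)` yields a body that skips JavaScript rendering while keeping the other defaults.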

## Responses

### Success (200)

Returns the scraped data in the specified format.

```json
{
  "title": "string",
  "metadata": "object",
  "url": "string",
  "format": "string", // "markdown", "html", "screenshot"
  "cleaned": "boolean",
  "content": "string"
}
```

Response headers:

* **Content-Type:** `application/json`
* **X-RateLimit-Limit:** The rate limit for the user.
* **X-RateLimit-Remaining:** The number of requests the user has left.
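A success body with the fields shown above can be read into a small typed container. This is a sketch only: `ScrapeResult` and `parse_scrape_response` are hypothetical names, and missing fields are treated as optional because the schema on this page does not mark any response field as required.

```python
import json
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ScrapeResult:
    """Hypothetical container mirroring the documented 200 response fields."""
    title: str
    url: str
    content: str
    metadata: dict = field(default_factory=dict)
    format: Optional[str] = None
    cleaned: Optional[bool] = None


def parse_scrape_response(body: str) -> ScrapeResult:
    """Parse a JSON response body into a ScrapeResult."""
    data = json.loads(body)
    return ScrapeResult(
        title=data.get("title", ""),
        url=data.get("url", ""),
        content=data.get("content", ""),
        # Fall back to an empty dict when metadata is absent or null.
        metadata=data.get("metadata") or {},
        format=data.get("format"),
        cleaned=data.get("cleaned"),
    )
```

How `content` is encoded when `format` is `screenshot` is not described here, so the sketch leaves it as an opaque string.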

## Example Request

```bash
curl -X POST https://app.dumplingai.com/api/v1/scrape \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "url": "https://example.com",
    "format": "markdown",
    "cleaned": true,
    "renderJs": true
  }'
```
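The same request can be made from Python with only the standard library. This is a minimal sketch, not an official client: `build_scrape_request` and `scrape` are hypothetical names, and the 60-second timeout is an assumed value.

```python
import json
import urllib.request

SCRAPE_URL = "https://app.dumplingai.com/api/v1/scrape"


def build_scrape_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request for the scrape endpoint."""
    return urllib.request.Request(
        SCRAPE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def scrape(api_key: str, payload: dict, timeout: float = 60.0) -> dict:
    """Send the request and return the decoded JSON body.

    urlopen raises urllib.error.HTTPError for non-2xx responses.
    The timeout default is an assumption, not a documented limit.
    """
    req = build_scrape_request(api_key, payload)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Splitting request construction from sending keeps the header and body logic easy to check without making a network call.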

## Rate Limiting

Each response includes the `X-RateLimit-Limit` and `X-RateLimit-Remaining` headers, which report the user's rate limit and the number of requests remaining.
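A client can read these two headers to decide when to slow down. The sketch below assumes the header values are plain integers, which this page does not state explicitly; `read_rate_limit` and `should_pause` are hypothetical helpers.

```python
def read_rate_limit(headers):
    """Return (limit, remaining) as ints, or None where a header is absent or non-numeric."""
    # HTTP header names are case-insensitive, so normalise the keys first.
    lowered = {k.lower(): v for k, v in headers.items()}

    def as_int(name):
        try:
            return int(lowered.get(name))
        except (TypeError, ValueError):
            return None

    return as_int("x-ratelimit-limit"), as_int("x-ratelimit-remaining")


def should_pause(headers, reserve=1):
    """Return True when fewer than `reserve` requests remain."""
    _, remaining = read_rate_limit(headers)
    return remaining is not None and remaining < reserve
```

When the headers are missing, `should_pause` returns `False` rather than guessing, so callers still need to handle a rejected request.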

## Notes

* Each request to this endpoint costs 10 credits.
* If the page does not depend on JavaScript, set `renderJs` to `false` for faster results.


## OpenAPI

````yaml POST /api/v1/scrape
openapi: 3.0.3
info:
  title: DumplingAI API
  version: 1.0.0
  description: >
    REST API for DumplingAI's content intelligence and automation platform.

    All endpoints are grouped under `/api/v1`; most are secured via Bearer API
    keys unless an operation explicitly sets `security: []`.
servers:
  - url: https://app.dumplingai.com
    description: Production
security:
  - bearerAuth: []
tags:
  - name: YouTube
    description: Access metadata, search results, and transcripts from YouTube.
  - name: TikTok
    description: Retrieve TikTok profile, video, follower, and transcript data.
  - name: LinkedIn
    description: Programmatically fetch LinkedIn company and profile data.
  - name: Search
    description: Search-orientated endpoints spanning web, news, maps, and autocomplete.
  - name: Google
    description: Integrations with Google business listings and location data.
  - name: Scraping
    description: Webpage capture, crawling, and structured content extraction utilities.
  - name: Documents
    description: Document processing, conversion, and metadata utilities.
  - name: AI
    description: DumplingAI agent and knowledge base endpoints.
  - name: Developer Tools
    description: Utilities for executing sandboxed code via API.
paths:
  /api/v1/scrape:
    post:
      tags:
        - Scraping
      summary: Scrape webpage
      description: Fetch structured data and HTML from a URL using DumplingAI's scraper.
      operationId: scrapeWebpage
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/ScrapeRequest'
            examples:
              default:
                value:
                  url: https://example.com/article
                  format: markdown
                  cleaned: true
                  renderJs: true
      responses:
        '200':
          description: Scraped content returned.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ScrapeResponse'
        '400':
          description: Invalid request payload.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
        '401':
          description: Missing or invalid API key.
        '500':
          description: Unexpected server error.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
components:
  schemas:
    ScrapeRequest:
      type: object
      required:
        - url
      properties:
        url:
          type: string
          format: uri
          description: The URL to scrape
        format:
          type: string
          enum:
            - markdown
            - html
            - screenshot
          default: markdown
          description: Output format for the scraped content
        cleaned:
          type: boolean
          default: true
          description: Whether to return cleaned/simplified content
        renderJs:
          type: boolean
          default: true
          description: Whether to execute JavaScript on the page before scraping
        requestSource:
          $ref: '#/components/schemas/RequestSource'
      additionalProperties: false
    ScrapeResponse:
      type: object
      properties:
        title:
          type: string
          description: Page title
        url:
          type: string
          format: uri
          description: Final URL after redirects
        content:
          type: string
          description: Scraped content in requested format
        metadata:
          type: object
          additionalProperties: true
          description: Additional page metadata
      additionalProperties: true
    ErrorResponse:
      type: object
      properties:
        error:
          type: string
          description: Human-readable description of what went wrong.
      required:
        - error
    RequestSource:
      type: string
      description: Optional identifier describing where the API request originated.
      enum:
        - API
        - WEB
        - MAKE_DOT_COM
        - ZAPIER
        - N8N
        - PLAYGROUND
        - DEFAULT_AUTOMATION
        - AGENT_PREVIEW
        - AGENT_LIVE
        - AUTOPILOT
        - STUDIO
  securitySchemes:
    bearerAuth:
      type: http
      scheme: bearer
      bearerFormat: API Key

````
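The error responses in the specification above (400, 401, 500) can be turned into readable messages on the client. This is a sketch under stated assumptions: `describe_error` is a hypothetical helper, the labels are copied from the response descriptions in the spec, and the body is treated as optional because the 401 response declares no schema.

```python
import json


def describe_error(status: int, body: str = "") -> str:
    """Build a short message from a status code and an optional ErrorResponse body."""
    labels = {
        400: "Invalid request payload",
        401: "Missing or invalid API key",
        500: "Unexpected server error",
    }
    label = labels.get(status, f"Unexpected status {status}")
    detail = None
    try:
        parsed = json.loads(body) if body else None
        # ErrorResponse carries a single required "error" string.
        if isinstance(parsed, dict):
            detail = parsed.get("error")
    except json.JSONDecodeError:
        pass  # Non-JSON body: fall back to the label alone.
    return f"{status} {label}: {detail}" if detail else f"{status} {label}"
```

With `urllib`, the status and body would typically come from a caught `urllib.error.HTTPError` via its `code` attribute and `read()` method.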