> ## Documentation Index
> Fetch the complete documentation index at: https://docs.dumplingai.com/llms.txt
> Use this file to discover all available pages before exploring further.

# Crawl Website

> Crawl a site or sitemap and return captured pages with metadata.

## Description

This endpoint crawls a website and returns structured content from multiple pages.

## Endpoint

```
POST https://app.dumplingai.com/api/v1/crawl
```

## Headers

* **Content-Type:** `application/json`
* **Authorization:** Bearer `<API_KEY>` (required)

## Request Body

```json theme={null}
{
  "url": "string", // Required. The website URL to crawl
  "limit": "number", // Optional. Max pages to crawl (default: 5)
  "depth": "number", // Optional. Crawl depth (default: 2)
  "format": "string" // Optional. Output format: "markdown", "text", or "raw" (default: "markdown")
}
```

## Responses

### Success (200)

```json theme={null}
{
  "url": "string",
  "format": "string",
  "depth": "number",
  "limit": "number",
  "pages": "number",
  "results": [
    {
      "content": "string",
      "url": "string",
      "status": "number"
    }
  ],
  "creditUsage": "number"
}
```

## Example Request

```bash theme={null}
curl -X POST https://app.dumplingai.com/api/v1/crawl \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
  "url": "https://example.com",
  "limit": 10,
  "depth": 3,
  "format": "markdown"
}'
```

## Notes

* Uses 10 credits per crawled page
* Uses anti-bot measures and stealth crawling techniques
* Limit is the max number of pages to crawl
* Depth refers to the distance between the base URL path and sub paths

## Rate Limiting

Rate limit headers (`X-RateLimit-Limit` and `X-RateLimit-Remaining`) are included in the response.


## OpenAPI

````yaml POST /api/v1/crawl
openapi: 3.0.3
info:
  title: DumplingAI API
  version: 1.0.0
  description: >
    REST API for DumplingAI's content intelligence and automation platform.

    All endpoints are grouped under `/api/v1`; most are secured via Bearer API
    keys unless an operation explicitly sets `security: []`.
servers:
  - url: https://app.dumplingai.com
    description: Production
security:
  - bearerAuth: []
tags:
  - name: YouTube
    description: Access metadata, search results, and transcripts from YouTube.
  - name: TikTok
    description: Retrieve TikTok profile, video, follower, and transcript data.
  - name: LinkedIn
    description: Programmatically fetch LinkedIn company and profile data.
  - name: Search
    description: Search-orientated endpoints spanning web, news, maps, and autocomplete.
  - name: Google
    description: Integrations with Google business listings and location data.
  - name: Scraping
    description: Webpage capture, crawling, and structured content extraction utilities.
  - name: Documents
    description: Document processing, conversion, and metadata utilities.
  - name: AI
    description: DumplingAI agent and knowledge base endpoints.
  - name: Developer Tools
    description: Utilities for executing sandboxed code via API.
paths:
  /api/v1/crawl:
    post:
      tags:
        - Scraping
      summary: Crawl website
      description: Crawl a site or sitemap and return captured pages with metadata.
      operationId: crawlWebsite
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/CrawlRequest'
            examples:
              default:
                value:
                  url: https://example.com
                  limit: 5
                  depth: 2
      responses:
        '200':
          description: Crawl job accepted or results returned.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/CrawlResponse'
        '400':
          description: Invalid request payload.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
        '401':
          description: Missing or invalid API key.
        '500':
          description: Unexpected server error.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
components:
  schemas:
    CrawlRequest:
      type: object
      description: Parameters controlling a crawl job.
      required:
        - url
      properties:
        url:
          type: string
          format: uri
          description: Root URL to crawl.
        depth:
          type: integer
          minimum: 1
          default: 2
          description: Maximum crawl depth.
        limit:
          type: integer
          minimum: 1
          default: 5
          description: Maximum number of pages to fetch.
        format:
          type: string
          enum:
            - markdown
            - text
            - raw
          default: markdown
          description: Output format for the crawled pages.
        requestSource:
          type: string
          description: Optional request source identifier.
      additionalProperties: false
    CrawlResponse:
      type: object
      required:
        - url
        - format
        - depth
        - limit
        - pages
        - results
        - creditUsage
      properties:
        url:
          type: string
          format: uri
        format:
          type: string
          enum:
            - markdown
            - text
            - raw
        depth:
          type: integer
        limit:
          type: integer
        pages:
          type: integer
        results:
          type: array
          items:
            type: object
            additionalProperties: true
        creditUsage:
          type: integer
      additionalProperties: false
    ErrorResponse:
      type: object
      properties:
        error:
          type: string
          description: Human-readable description of what went wrong.
      required:
        - error
  securitySchemes:
    bearerAuth:
      type: http
      scheme: bearer
      bearerFormat: API Key

````