Description
This endpoint converts PDF or DOCX documents into plain text. It supports input via URL or base64-encoded file content.
Endpoint
Headers
- Content-Type: application/json
- Authorization: Bearer <API_KEY> (required)
Request Body
Responses
Success (200)
Returns the extracted text from the document.
- X-RateLimit-Limit: The rate limit for the user.
- X-RateLimit-Remaining: The remaining number of requests for the user.
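A client may want to check these two headers before sending more requests. The following is a minimal sketch, not part of the API itself: the helper name `read_rate_limit` is hypothetical, and it assumes the header values are plain integers, which the reference does not state.

```python
from typing import Mapping, Optional


def read_rate_limit(headers: Mapping[str, str]) -> dict:
    """Extract the two rate limit headers from a response header mapping."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}

    def as_int(name: str) -> Optional[int]:
        raw = lowered.get(name.lower())
        if raw is None:
            return None
        try:
            return int(raw)
        except ValueError:
            # Assumption: a non-numeric value is treated as missing.
            return None

    return {
        "limit": as_int("X-RateLimit-Limit"),
        "remaining": as_int("X-RateLimit-Remaining"),
    }
```

The function accepts any mapping of header names to values, so it can be fed the headers object of whichever HTTP client is in use.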
Bad Request (400)
Returned if the request is invalid or the file format is unsupported.
Internal Server Error (500)
Returned if an error occurs during document processing.
Example Request
Example Response
Notes
- Supported file formats: PDF and DOCX.
- Maximum file size may be limited (refer to your plan details).
- If using the URL method, ensure the file is publicly accessible.
- This endpoint uses 2 credits per request.
- The file type is automatically detected based on the file content.
- The “pages” field allows you to specify which pages to process:
  - Use comma-separated values or ranges (e.g., “1, 2-” or “1, 2, 3-7”).
  - The first page index is 1.
  - Prefix a number with “!” to count from the end (e.g., “!1” for the last page).
  - If not specified, all pages will be processed by default.
  - The value must be a string.
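To make the page syntax concrete, here is a client-side sketch that expands a page string into 1-based page numbers for a document of known length. It illustrates the rules listed above and is not the service's own parser. The function name `resolve_pages` is hypothetical, and two behaviors are assumptions: an open-ended range such as “2-” is taken to run through the last page, and the result is de-duplicated and sorted.

```python
from typing import List


def resolve_pages(spec: str, total_pages: int) -> List[int]:
    """Expand a page string such as "1, 3-5, !1" into sorted 1-based page numbers."""

    def resolve_one(token: str) -> int:
        # "!n" counts from the end, so "!1" is the last page.
        if token.startswith("!"):
            page = total_pages - int(token[1:]) + 1
        else:
            page = int(token)
        if not 1 <= page <= total_pages:
            raise ValueError(f"page {token!r} is out of range")
        return page

    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start, end = (piece.strip() for piece in part.split("-", 1))
            first = resolve_one(start)
            # Assumption: an open-ended range like "2-" runs to the last page.
            last = resolve_one(end) if end else total_pages
            if first > last:
                raise ValueError(f"range {part!r} is reversed")
            pages.update(range(first, last + 1))
        else:
            pages.add(resolve_one(part))
    return sorted(pages)
```

A helper like this can be useful for validating a page string locally before spending credits on a request.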
Rate Limiting
Rate limit headers (X-RateLimit-Limit and X-RateLimit-Remaining) are included in the response to indicate the user’s current rate limit status.
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
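As an illustration of the header format, the sketch below prepares (but does not send) an authenticated JSON request using Python's standard library. The function name `build_request` is hypothetical, the endpoint URL is left as a parameter because this page does not show it, and the POST method is an assumption.

```python
import json
import urllib.request


def build_request(endpoint_url: str, api_key: str, body: dict) -> urllib.request.Request:
    """Prepare a JSON request carrying the bearer token; nothing is sent here."""
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
    return urllib.request.Request(
        endpoint_url,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",  # Assumption: the HTTP method is not stated on this page.
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the call.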
Body
application/json
Indicates whether binary content is supplied via URL or base64-encoded string.
Available options: url, base64
Document URL or base64-encoded file content to convert to text.
Optional page ranges to extract (e.g., "1-3,5").
Optional identifier describing where the API request originated.
Available options: API, WEB, MAKE_DOT_COM, ZAPIER, N8N, PLAYGROUND, DEFAULT_AUTOMATION, AGENT_PREVIEW, AGENT_LIVE, AUTOPILOT, STUDIO
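The body fields described above could be assembled as in the following sketch. Only “pages” is named on this page; the other two field names (`input_type` and `content`) are placeholders chosen for illustration and should be replaced with the names from the actual schema. The helper name `build_body` is also hypothetical.

```python
import base64
from typing import Optional, Union

# Hypothetical field names: this page describes these fields but does not
# show their names, so check the API schema before relying on them.
INPUT_TYPE_FIELD = "input_type"
CONTENT_FIELD = "content"
PAGES_FIELD = "pages"  # This name appears in the Notes section.


def build_body(source: Union[str, bytes], pages: Optional[str] = None) -> dict:
    """Build a request body from a document URL (str) or raw file bytes."""
    if isinstance(source, bytes):
        # Raw file content is sent as a base64-encoded string.
        encoded = base64.b64encode(source).decode("ascii")
        body = {INPUT_TYPE_FIELD: "base64", CONTENT_FIELD: encoded}
    else:
        body = {INPUT_TYPE_FIELD: "url", CONTENT_FIELD: source}
    if pages is not None:
        # The page selection is passed as a string, e.g. "1-3,5".
        body[PAGES_FIELD] = pages
    return body
```

For example, `build_body(open("report.pdf", "rb").read(), pages="1-3,5")` would produce a base64 body limited to the listed pages, where `report.pdf` stands in for any local file.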
Response
Plain text representation returned.
Plain text extracted from the source document.