POST /api/v1/extract-audio
Extract audio metadata
curl --request POST \
  --url https://app.dumplingai.com/api/v1/extract-audio \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "inputMethod": "url",
  "audio": "https://example.com/podcast.mp3",
  "prompt": "Provide a summary and list the key action items discussed.",
  "jsonMode": false
}'
{
  "results": "<string>",
  "prompt": "<string>",
  "audioDuration": 123,
  "creditUsage": 123
}

Extract Audio API Documentation

Description

This endpoint extracts structured data from audio files based on a user-defined prompt. It supports input via URL or base64-encoded audio content and uses Large Language Models (LLMs) to interpret and extract relevant information from the audio.

Endpoint

POST https://app.dumplingai.com/api/v1/extract-audio

Headers

  • Content-Type: application/json
  • Authorization: Bearer <API_KEY> (required)

Request Body

{
  "inputMethod": "string", // Required. Either "url" or "base64".
  "audio": "string", // Required. URL or base64-encoded audio content.
  "prompt": "string", // Required. The prompt describing the data to extract.
  "jsonMode": boolean // Optional. Whether to return the result in JSON format. Default: false.
}
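The request body above can be assembled programmatically. The following is a minimal sketch using only the Python standard library; the helper name build_request is hypothetical, and it assumes that base64 input is sent as a plain base64 string with no data: URI prefix, which this page does not confirm. The function only builds the request object and does not send it.

```python
import base64
import json
import urllib.request

API_URL = "https://app.dumplingai.com/api/v1/extract-audio"


def build_request(api_key, audio, prompt, input_method="url", json_mode=False):
    """Build (but do not send) a POST request for the extract-audio endpoint.

    Hypothetical helper: `audio` is either a URL string or raw audio bytes.
    """
    if input_method not in ("url", "base64"):
        raise ValueError('inputMethod must be "url" or "base64"')
    if isinstance(audio, bytes):
        # Assumption: the API expects plain base64 text, without a data: prefix.
        audio = base64.b64encode(audio).decode("ascii")
    body = {
        "inputMethod": input_method,
        "audio": audio,
        "prompt": prompt,
        "jsonMode": json_mode,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send the request; keeping construction separate from sending makes the payload easy to inspect before any credits are spent.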

Responses

Success (200)

Returns the extracted data based on the provided prompt, along with additional information.
{
  "results": "string", // Extracted data based on the prompt
  "prompt": "string", // The original prompt used for extraction
  "audioDuration": number, // Duration of the audio in seconds
  "creditUsage": number // Total credits used for this request
}
Response headers:
  • Content-Type: application/json
  • X-RateLimit-Limit: The rate limit for the user.
  • X-RateLimit-Remaining: The remaining number of requests for the user.

Bad Request (400)

Returned if the request is invalid or the audio file exceeds size or duration limits.
{
  "error": "Error message describing the issue"
}

Unauthorized (401)

Returned if the API key is invalid or missing.
{
  "error": "Invalid or missing Authorization header"
}

Internal Server Error (500)

Returned if there’s an error during the audio extraction process.
{
  "error": "Failed to extract audio data: [error details]"
}
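One way a client might turn the responses above into a return value or an exception is sketched below. The names ExtractAudioError and parse_response are hypothetical. The sketch assumes that every error body carries an "error" field as shown above, and that with jsonMode set to true the results string contains JSON text; neither is guaranteed by this page, so the code falls back to the raw string when parsing fails.

```python
import json


class ExtractAudioError(Exception):
    """Hypothetical exception carrying the HTTP status and the API error message."""

    def __init__(self, status, message):
        super().__init__(f"HTTP {status}: {message}")
        self.status = status
        self.message = message


def parse_response(status, body_text, json_mode=False):
    """Return (results, audioDuration, creditUsage) or raise ExtractAudioError."""
    payload = json.loads(body_text)
    if status != 200:
        raise ExtractAudioError(status, payload.get("error", "unknown error"))
    results = payload["results"]
    if json_mode:
        # Assumption: in JSON mode, results is a JSON-encoded string.
        try:
            results = json.loads(results)
        except (TypeError, ValueError):
            pass  # keep the raw value if it is not valid JSON text
    return results, payload["audioDuration"], payload["creditUsage"]
```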

Example Request

curl -X POST https://app.dumplingai.com/api/v1/extract-audio \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
  "inputMethod": "url",
  "audio": "https://example.com/sample-audio.mp3",
  "prompt": "Summarize the main points discussed in this audio.",
  "jsonMode": false
}'

Notes

  • The maximum file size for an audio file is 100MB.
  • The maximum audio duration is 9.5 hours (34,200 seconds).
  • Supported audio formats: wav, mp3, aiff, aac, ogg, flac
  • Credit usage:
    • Base cost: 10 credits
    • Additional 2 credits per minute of audio duration (rounded up)
  • The total credit usage is returned in the response as creditUsage.
  • If using the URL method, ensure the audio file is publicly accessible.
  • The jsonMode parameter determines whether the output is formatted as JSON (true) or plain text (false).
  • The endpoint uses the Gemini 1.5 Flash model for audio analysis and data extraction.
  • Temporary files are created during processing and are deleted after use.
  • You can get a list of supported audio formats by calling:
GET /api/v1/extract-audio
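
The credit formula in the notes above can be estimated before sending a request. This is a rough sketch: the function name estimate_credits is hypothetical, and it assumes that "rounded up" applies to the number of minutes, so that 61 seconds is billed as 2 minutes. The creditUsage value in the response remains the authoritative figure.

```python
import math

BASE_CREDITS = 10        # base cost per request, from the notes above
CREDITS_PER_MINUTE = 2   # additional cost per minute of audio


def estimate_credits(duration_seconds):
    """Estimate credit usage, assuming partial minutes are rounded up."""
    minutes = math.ceil(duration_seconds / 60)
    return BASE_CREDITS + CREDITS_PER_MINUTE * minutes
```

Under that assumption, a 61-second clip is estimated at 10 + 2 × 2 = 14 credits, and audio at the maximum duration of 34,200 seconds (570 minutes) at 10 + 2 × 570 = 1,150 credits.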

Rate Limiting

Rate limit headers (X-RateLimit-Limit and X-RateLimit-Remaining) are included in the response to indicate the user’s current rate limit status.
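
A client could read these headers to decide whether to pause before the next call. The sketch below uses a hypothetical helper, rate_limit_status, and assumes that both header values are integers; the page does not state the format, so non-numeric or missing values are returned as None.

```python
def rate_limit_status(headers):
    """Return (limit, remaining) from a mapping of response headers.

    Header names are matched case-insensitively. Missing or non-integer
    values become None (the integer format is an assumption).
    """
    lowered = {name.lower(): value for name, value in headers.items()}

    def as_int(name):
        try:
            return int(lowered[name])
        except (KeyError, TypeError, ValueError):
            return None

    return as_int("x-ratelimit-limit"), as_int("x-ratelimit-remaining")
```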

Error Handling

  • If a required parameter (such as audio or prompt) is missing, a 400 Bad Request error is returned.
  • If the audio file size exceeds 100MB, a 400 Bad Request error is returned.
  • If the audio duration exceeds 9.5 hours, a 400 Bad Request error is returned.
  • If there’s an error during extraction, a 500 Internal Server Error is returned with details about the failure.
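
Some of the 400 cases above can be caught on the client before uploading. The following sketch is illustrative only: preflight_errors is a hypothetical helper, it assumes that 100MB means 100 × 1024 × 1024 bytes (the page does not say whether the limit is binary or decimal), and it judges the format from the file extension alone, which may differ from how the server detects it.

```python
import os

MAX_FILE_BYTES = 100 * 1024 * 1024   # assumption: 1MB = 1024 * 1024 bytes
MAX_DURATION_SECONDS = 34_200        # 9.5 hours, from the notes above
SUPPORTED_FORMATS = {"wav", "mp3", "aiff", "aac", "ogg", "flac"}


def preflight_errors(filename, size_bytes, duration_seconds=None):
    """Return a list of problems likely to trigger a 400 response."""
    errors = []
    extension = os.path.splitext(filename)[1].lstrip(".").lower()
    if extension not in SUPPORTED_FORMATS:
        errors.append(f"unsupported format: {extension or 'none'}")
    if size_bytes > MAX_FILE_BYTES:
        errors.append("file exceeds 100MB")
    if duration_seconds is not None and duration_seconds > MAX_DURATION_SECONDS:
        errors.append("duration exceeds 9.5 hours")
    return errors
```

An empty list means none of these local checks failed; the server may still reject the request for other reasons.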

Security and Privacy

  • Uploaded audio files are temporarily stored and then deleted after processing.
  • Audio metadata (including duration) is checked using a separate Python service before processing.

Authorizations

Authorization (string, header, required)

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

application/json

inputMethod (enum<string>, required)

Indicates whether the audio is supplied as a URL or as a base64-encoded string.

Available options: url, base64

audio (string, required)

Audio URL or base64-encoded audio content to analyze.

prompt (string, required)

Instructions describing the insights to extract from the audio.

jsonMode (boolean, default: false)

When true, requests the model to respond with JSON-formatted output.

requestSource (enum<string>)

Optional identifier describing where the API request originated.

Available options: API, WEB, MAKE_DOT_COM, ZAPIER, N8N, PLAYGROUND, DEFAULT_AUTOMATION, AGENT_PREVIEW, AGENT_LIVE, AUTOPILOT, STUDIO

Response

Audio extraction results returned.

results (string, required)

Model output returned from the extraction prompt.

prompt (string, required)

The original prompt used for extraction.

audioDuration (number, required)

Duration of the processed audio in seconds.

creditUsage (integer, required)

Credits consumed while processing the request.
