POST /api/v1/extract-audio
Extract audio metadata
curl --request POST \
  --url https://app.dumplingai.com/api/v1/extract-audio \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "inputMethod": "url",
  "audio": "https://example.com/podcast.mp3",
  "prompt": "Provide a summary and list the key action items discussed.",
  "jsonMode": false
}'
{
  "results": "<string>",
  "prompt": "<string>",
  "audioDuration": 123,
  "creditUsage": 123
}

Extract Audio API Documentation

Description

This endpoint extracts structured data from audio files based on a user-defined prompt. It supports input via URL or base64-encoded audio content and uses Large Language Models (LLMs) to interpret and extract relevant information from the audio.

Endpoint

POST https://app.dumplingai.com/api/v1/extract-audio

Headers

  • Content-Type: application/json
  • Authorization: Bearer <API_KEY> (required)

Request Body

{
  "inputMethod": "string", // Required. Either "url" or "base64".
  "audio": "string", // Required. URL or base64-encoded audio content.
  "prompt": "string", // Required. The prompt describing the data to extract.
  "jsonMode": boolean // Optional. Whether to return the result in JSON format. Default: false.
}
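As a minimal sketch of the base64 input method, the helper below builds a request body from raw audio bytes. The function name `build_extract_audio_payload` is hypothetical, and the sketch assumes the `audio` field takes plain base64 text with no data-URI prefix, which this page does not state.

```python
import base64
import json


def build_extract_audio_payload(audio_bytes: bytes, prompt: str, json_mode: bool = False) -> dict:
    """Build a request body that uses the "base64" input method."""
    if not audio_bytes:
        raise ValueError("audio is required")
    if not prompt:
        raise ValueError("prompt is required")
    return {
        "inputMethod": "base64",
        # Assumption: plain base64 text, without a "data:audio/...;base64," prefix.
        "audio": base64.b64encode(audio_bytes).decode("ascii"),
        "prompt": prompt,
        "jsonMode": json_mode,
    }


# Placeholder bytes stand in for the contents of a real audio file.
payload = build_extract_audio_payload(b"placeholder audio bytes", "List the speakers.")
request_body = json.dumps(payload)
```

In practice the bytes would come from reading a file in binary mode, for example `open(path, "rb").read()`.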

Responses

Success (200)

Returns the extracted data based on the provided prompt, along with the original prompt, the audio duration, and the credits used.
{
  "results": "string", // Extracted data based on the prompt
  "prompt": "string", // The original prompt used for extraction
  "audioDuration": number, // Duration of the audio in seconds
  "creditUsage": number // Total credits used for this request
}
Response headers:
  • Content-Type: application/json
  • X-RateLimit-Limit: The request limit that applies to the user.
  • X-RateLimit-Remaining: The number of requests the user has remaining.

Bad Request (400)

Returned if the request is invalid or the audio file exceeds size or duration limits.
{
  "error": "Error message describing the issue"
}

Unauthorized (401)

Returned if the API key is invalid or missing.
{
  "error": "Invalid or missing Authorization header"
}

Internal Server Error (500)

Returned if there’s an error during the audio extraction process.
{
  "error": "Failed to extract audio data: [error details]"
}
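The response shapes above could be handled on the client roughly as follows. This is a sketch, not an official client: `ExtractAudioError` and `parse_extract_audio_response` are hypothetical names, and the sketch assumes that with jsonMode set to true the `results` string contains JSON text that can be decoded.

```python
import json


class ExtractAudioError(Exception):
    """Raised for any non-200 response; keeps the status and the "error" text."""

    def __init__(self, status: int, message: str):
        super().__init__(f"{status}: {message}")
        self.status = status
        self.message = message


def parse_extract_audio_response(status: int, body_text: str, json_mode: bool = False):
    """Return (results, audioDuration, creditUsage) or raise ExtractAudioError."""
    body = json.loads(body_text)
    if status != 200:
        # The 400, 401, and 500 bodies all carry a single "error" field.
        raise ExtractAudioError(status, body.get("error", "unknown error"))
    results = body["results"]
    if json_mode:
        # Assumption: with jsonMode true, "results" is a JSON-encoded string.
        results = json.loads(results)
    return results, body["audioDuration"], body["creditUsage"]
```

A caller can branch on `error.status` to separate problems with the request (400, 401) from server-side failures (500).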

Example Request

curl -X POST https://app.dumplingai.com/api/v1/extract-audio \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
  "inputMethod": "url",
  "audio": "https://example.com/sample-audio.mp3",
  "prompt": "Summarize the main points discussed in this audio.",
  "jsonMode": false
}'
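The same request can be assembled with Python's standard library. This is a minimal sketch: `build_request` is a hypothetical helper name, the API key is a placeholder, and the request is only constructed here, not sent.

```python
import json
import urllib.request

API_URL = "https://app.dumplingai.com/api/v1/extract-audio"


def build_request(api_key: str, audio_url: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST request using the "url" input method."""
    payload = {
        "inputMethod": "url",
        "audio": audio_url,
        "prompt": prompt,
        "jsonMode": False,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = build_request(
    "YOUR_API_KEY",
    "https://example.com/sample-audio.mp3",
    "Summarize the main points discussed in this audio.",
)
# To send it, pass the object to urllib.request.urlopen(request).
```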

Notes

  • The maximum file size for an audio file is 100 MB.
  • The maximum audio duration is 9.5 hours (34,200 seconds).
  • Supported audio formats: wav, mp3, aiff, aac, ogg, flac.
  • Credit usage:
    • Base cost: 100 credits
    • Additional 20 credits per minute of audio duration (rounded up)
  • The total credit usage is returned in the response as creditUsage.
  • If using the URL method, ensure the audio file is publicly accessible.
  • The jsonMode parameter determines whether the output is formatted as JSON (true) or plain text (false).
  • The endpoint uses the Gemini 1.5 Flash model for audio analysis and data extraction.
  • Temporary files are created during processing and are deleted after use.
  • You can get a list of supported audio formats by calling:
GET /api/v1/extract-audio
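The credit rule in the notes above can be worked through with a short calculation. This sketch assumes that "rounded up" applies to the number of minutes, so a partial minute is billed as a full one; `estimate_credits` is a hypothetical helper, and the `creditUsage` value in the response remains the authoritative figure.

```python
import math

BASE_CREDITS = 100        # base cost per request
CREDITS_PER_MINUTE = 20   # additional cost per minute of audio


def estimate_credits(duration_seconds: float) -> int:
    """Estimate credits for a request, rounding the duration up to whole minutes."""
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    billed_minutes = math.ceil(duration_seconds / 60)
    return BASE_CREDITS + CREDITS_PER_MINUTE * billed_minutes


print(estimate_credits(90))      # 140: 100 base + 2 billed minutes
print(estimate_credits(34_200))  # 11500: 100 base + 570 billed minutes
```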

Rate Limiting

Rate limit headers (X-RateLimit-Limit and X-RateLimit-Remaining) are included in the response to indicate the user’s current rate limit status.
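A client might read these headers as sketched below. The helper name `read_rate_limit` is hypothetical, and the sketch assumes both header values are integer strings, which this page does not specify.

```python
from typing import Mapping, Optional, Tuple


def read_rate_limit(headers: Mapping[str, str]) -> Tuple[Optional[int], Optional[int]]:
    """Return (limit, remaining) from response headers, or None where a header is absent."""
    # HTTP header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    limit = lowered.get("x-ratelimit-limit")
    remaining = lowered.get("x-ratelimit-remaining")
    return (
        int(limit) if limit is not None else None,
        int(remaining) if remaining is not None else None,
    )


limit, remaining = read_rate_limit({"X-RateLimit-Limit": "60", "X-RateLimit-Remaining": "59"})
```

When `remaining` reaches zero, pausing before the next call is a reasonable precaution, though the reset interval is not documented here.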

Error Handling

  • If a required parameter (audio or prompt) is missing, a 400 Bad Request error is returned.
  • If the audio file size exceeds 100 MB, a 400 Bad Request error is returned.
  • If the audio duration exceeds 9.5 hours (34,200 seconds), a 400 Bad Request error is returned.
  • If there’s an error during extraction, a 500 Internal Server Error is returned with details about the failure.
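Some of these 400 responses can be avoided by checking a file locally before sending it. The sketch below is illustrative only: `preflight_problems` is a hypothetical helper, the size limit is assumed to mean 100,000,000 bytes (the stricter reading of "100MB"), and the format check relies on the file extension, which may differ from what the server actually inspects.

```python
from typing import List, Optional

MAX_BYTES = 100_000_000   # assumption: 100 MB interpreted as 10**8 bytes
MAX_SECONDS = 34_200      # 9.5 hours
SUPPORTED_FORMATS = {"wav", "mp3", "aiff", "aac", "ogg", "flac"}


def preflight_problems(filename: str, size_bytes: int,
                       duration_seconds: Optional[float] = None) -> List[str]:
    """Return human-readable reasons the request would likely be rejected."""
    problems = []
    extension = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    if extension not in SUPPORTED_FORMATS:
        problems.append(f"unsupported format: {extension or 'none'}")
    if size_bytes > MAX_BYTES:
        problems.append("file exceeds the 100 MB size limit")
    if duration_seconds is not None and duration_seconds > MAX_SECONDS:
        problems.append("audio exceeds the 9.5 hour duration limit")
    return problems
```

An empty list means none of the documented limits were tripped; the server may still reject the request for other reasons.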

Security and Privacy

  • Uploaded audio files are temporarily stored and then deleted after processing.
  • Audio metadata (including duration) is checked using a separate Python service before processing.

Authorizations

  • Authorization (string, header, required)

Body

application/json

  • inputMethod (enum<string>, required). Available options: url, base64
  • audio (string, required)
  • prompt (string, required)
  • jsonMode (boolean, default: false)
  • requestSource (enum<string>). Available options: API, WEB, MAKE_DOT_COM, ZAPIER, N8N, PLAYGROUND, DEFAULT_AUTOMATION, AGENT_PREVIEW, AGENT_LIVE, AUTOPILOT, STUDIO

Response

  • results (string, required)
  • prompt (string, required)
  • audioDuration (number, required)
  • creditUsage (integer, required)