Extract Audio API Documentation
Description
This endpoint extracts structured data from audio files based on a user-defined prompt. It supports input via URL or base64-encoded audio content and uses Large Language Models (LLMs) to interpret and extract relevant information from the audio.
Endpoint
POST /api/v1/extract-audio
Headers
- Content-Type: application/json
- Authorization: Bearer <API_KEY> (required)
- Request-Source: string (optional)
Request Body
{
"inputMethod": "string", // Required. Either "url" or "base64".
"audio": "string", // Required. URL or base64-encoded audio content.
"prompt": "string", // Required. The prompt describing the data to extract.
"jsonMode": boolean // Optional. Whether to return the result in JSON format. Default: false.
}
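The request body above can be assembled programmatically. The following is a minimal sketch, not an official client: the function name `build_payload` and its parameters are hypothetical, and it assumes the `base64` input method expects plain base64 text with no `data:` URI prefix, which the documentation does not state.

```python
import base64


def build_payload(prompt, audio_url=None, audio_bytes=None, json_mode=False):
    """Build the JSON body for POST /api/v1/extract-audio (hypothetical helper).

    Exactly one of audio_url or audio_bytes must be given.
    """
    if (audio_url is None) == (audio_bytes is None):
        raise ValueError("provide exactly one of audio_url or audio_bytes")

    if audio_url is not None:
        input_method, audio = "url", audio_url
    else:
        # Assumption: the API accepts raw base64 without a data-URI prefix.
        input_method = "base64"
        audio = base64.b64encode(audio_bytes).decode("ascii")

    return {
        "inputMethod": input_method,
        "audio": audio,
        "prompt": prompt,
        "jsonMode": json_mode,
    }
```

For a local file, the bytes could be read with `open(path, "rb").read()` and passed as `audio_bytes`.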
Responses
Success (200)
Returns the extracted data based on the provided prompt, along with the original prompt, the audio duration, and the credits used.
{
"results": "string", // Extracted data based on the prompt
"prompt": "string", // The original prompt used for extraction
"audioDuration": number, // Duration of the audio in seconds
"creditUsage": number // Total credits used for this request
}
Response headers:
- Content-Type: application/json
- X-RateLimit-Limit: The rate limit for the user.
- X-RateLimit-Remaining: The remaining number of requests for the user.
Bad Request (400)
Returned if the request is invalid or the audio file exceeds size or duration limits.
{
"error": "Error message describing the issue"
}
Unauthorized (401)
Returned if the API key is invalid or missing.
{
"error": "Invalid or missing Authorization header"
}
Internal Server Error (500)
Returned if there’s an error during the audio extraction process.
{
"error": "Failed to extract audio data: [error details]"
}
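One way a client might map the responses above onto return values and exceptions is sketched below. The names `ExtractAudioError` and `parse_response` are hypothetical. The sketch assumes the status code and the decoded JSON body are already in hand, and that every non-200 body carries an `error` field as shown.

```python
class ExtractAudioError(Exception):
    """Raised for any non-200 response (hypothetical exception type)."""

    def __init__(self, status, message):
        super().__init__(f"{status}: {message}")
        self.status = status
        self.message = message


def parse_response(status, body):
    """Return (results, creditUsage) on success, raise ExtractAudioError otherwise."""
    if status == 200:
        return body["results"], body["creditUsage"]

    # 400, 401, and 500 responses are documented to carry an "error" string.
    message = "unknown error"
    if isinstance(body, dict):
        message = body.get("error", message)
    raise ExtractAudioError(status, message)
```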
Example Request
curl -X POST https://app.dumplingai.com/api/v1/extract-audio \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Request-Source: API" \
-d '{
"inputMethod": "url",
"audio": "https://example.com/sample-audio.mp3",
"prompt": "Summarize the main points discussed in this audio.",
"jsonMode": false
}'
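The same request can be sketched in Python using only the standard library. The helper names `make_request` and `send` are hypothetical, and the 60-second timeout is an arbitrary choice rather than a documented value; long audio files may need more.

```python
import json
import urllib.request

API_URL = "https://app.dumplingai.com/api/v1/extract-audio"


def make_request(api_key, payload):
    """Build (but do not send) the POST request shown in the curl example."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
            "Request-Source": "API",
        },
    )


def send(request, timeout=60):
    """Send the request and decode the JSON body.

    urllib raises urllib.error.HTTPError for 4xx/5xx responses, so the
    400, 401, and 500 cases surface as exceptions here.
    """
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


payload = {
    "inputMethod": "url",
    "audio": "https://example.com/sample-audio.mp3",
    "prompt": "Summarize the main points discussed in this audio.",
    "jsonMode": False,
}
request = make_request("YOUR_API_KEY", payload)
# result = send(request)  # uncomment to perform the call with a real key
```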
Notes
- The maximum file size for an audio file is 100MB.
- The maximum audio duration is 9.5 hours (34,200 seconds).
- Supported audio formats: wav, mp3, aiff, aac, ogg, flac
- Credit usage:
  - Base cost: 10 credits
  - Additional 2 credits per minute of audio duration (rounded up)
  - The total credit usage is returned in the response as creditUsage.
- If using the URL method, ensure the audio file is publicly accessible.
- The jsonMode parameter determines whether the output is formatted as JSON (true) or plain text (false).
- The endpoint uses the Gemini 1.5 Flash model for audio analysis and data extraction.
- Temporary files are created during processing and are deleted after use.
- You can get a list of supported audio formats by calling:
GET /api/v1/extract-audio
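The credit rule in the notes above can be estimated ahead of time. This sketch assumes that "rounded up" means the duration is rounded up to whole minutes before the per-minute rate is applied; the authoritative figure is always the `creditUsage` value in the response. The function name `estimate_credits` is hypothetical.

```python
import math

BASE_CREDITS = 10        # base cost per request, per the notes
CREDITS_PER_MINUTE = 2   # additional cost per started minute of audio


def estimate_credits(duration_seconds):
    """Estimate credits for one request, assuming minutes are rounded up."""
    minutes = math.ceil(duration_seconds / 60)
    return BASE_CREDITS + CREDITS_PER_MINUTE * minutes
```

Under that assumption, a 61-second clip counts as 2 minutes and costs 10 + 2 × 2 = 14 credits, while a file at the 34,200-second limit costs 10 + 2 × 570 = 1,150 credits.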
Rate Limiting
Rate limit headers (X-RateLimit-Limit and X-RateLimit-Remaining) are included in the response to indicate the user’s current rate limit status.
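A client could read these headers as sketched below. The helper name `rate_limit_status` is hypothetical, and the sketch assumes both header values are plain integers, which the documentation does not specify.

```python
def rate_limit_status(headers):
    """Return (limit, remaining) from a mapping of response headers.

    Header names are matched case-insensitively; a missing header yields None.
    Assumption: the values are integer strings.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    limit = lowered.get("x-ratelimit-limit")
    remaining = lowered.get("x-ratelimit-remaining")
    return (
        int(limit) if limit is not None else None,
        int(remaining) if remaining is not None else None,
    )
```

With `urllib`, the mapping could be obtained from `dict(resp.headers.items())` on the response object.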
Error Handling
- If the required parameters (audio or prompt) are missing, a 400 Bad Request error is returned.
- If the audio file size exceeds 100MB, a 400 Bad Request error is returned.
- If the audio duration exceeds 9.5 hours, a 400 Bad Request error is returned.
- If there’s an error during extraction, a 500 Internal Server Error is returned with details about the failure.
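Some of the 400 cases above can be caught on the client before any request is sent. The following is a rough sketch under stated assumptions: the name `preflight` is hypothetical, "100MB" is read as 100,000,000 bytes (the server may count differently), and the caller supplies the file size and duration if known.

```python
MAX_BYTES = 100 * 1000 * 1000  # assumption: "100MB" as decimal megabytes
MAX_SECONDS = 34_200           # 9.5 hours, per the notes


def preflight(payload, size_bytes=None, duration_seconds=None):
    """Return a list of problems likely to produce a 400 Bad Request."""
    problems = []
    for field in ("audio", "prompt"):
        if not payload.get(field):
            problems.append(f"missing required parameter: {field}")
    if size_bytes is not None and size_bytes > MAX_BYTES:
        problems.append("audio file exceeds 100MB")
    if duration_seconds is not None and duration_seconds > MAX_SECONDS:
        problems.append("audio duration exceeds 9.5 hours")
    return problems
```

An empty list means none of the checked conditions were hit; it does not guarantee that the server will accept the request.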
Security and Privacy
- Uploaded audio files are temporarily stored and then deleted after processing.
- Audio metadata (including duration) is checked using a separate Python service before processing.