Description

This endpoint queries a knowledge base using vector similarity search. It returns the text chunks from the knowledge base that are most similar to the provided query.

Endpoint

POST /api/v1/knowledge-bases/query

Headers

  • Content-Type: application/json
  • Authorization: Bearer <API_KEY> (required for API access)

Request Body

{
  "knowledgeBaseId": "string", // Required. The ID of the knowledge base to query.
  "query": "string", // Required. The search query.
  "resultCount": "number" // Optional. Number of results to return (1 to 25). Default is 5.
}

Responses

Success (200)

Returns an array of similar chunks from the knowledge base.

[
  {
    "id": "string",
    "content": "string",
    "resource_name": "string",
    "similarity": "number"
  }
]

The response also includes the following rate limit headers:

  • X-RateLimit-Limit: The request limit for the user.
  • X-RateLimit-Remaining: The number of requests the user has remaining.
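
A minimal sketch of how a client might turn the success body into typed objects. The Chunk class and the parse_chunks function are names invented for this illustration, and the sample payload values are made up; only the four field names come from the response schema above.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """One result item; field names mirror the documented response schema."""
    id: str
    content: str
    resource_name: str
    similarity: float


def parse_chunks(body: str) -> list[Chunk]:
    """Parse a 200 response body (a JSON array) into Chunk objects."""
    return [
        Chunk(
            id=item["id"],
            content=item["content"],
            resource_name=item["resource_name"],
            similarity=float(item["similarity"]),
        )
        for item in json.loads(body)
    ]


# Illustrative payload only; these values are not real API output.
sample = '[{"id": "c1", "content": "text", "resource_name": "guide.pdf", "similarity": 0.87}]'
chunks = parse_chunks(sample)
print(chunks[0].resource_name, chunks[0].similarity)
```

A missing field raises KeyError here, which makes schema drift visible early; a more lenient client could use item.get(...) instead.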

Bad Request (400)

Returned if required fields are missing or if the query exceeds 4000 characters.

{
  "error": "Error message describing the issue"
}

Unauthorized (401)

Returned if the API key is invalid or missing.

{
  "error": "Invalid or missing Authorization header"
}

Internal Server Error (500)

Returned if an error occurs while the query is being processed.

{
  "error": "Failed to query knowledge base"
}
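
The error responses above share one shape, a JSON object with an "error" string, so a client can handle them in one place. The sketch below is one possible approach, not part of the API: QueryError and raise_for_error are hypothetical names, and treating 500 as retryable is an assumption of this example rather than documented behavior.

```python
import json


class QueryError(Exception):
    """Hypothetical exception type carrying the HTTP status and error message."""

    def __init__(self, status: int, message: str):
        super().__init__(f"{status}: {message}")
        self.status = status
        self.message = message
        # Assumption: only server-side failures are worth retrying.
        self.retryable = status >= 500


def raise_for_error(status: int, body: str) -> None:
    """Raise QueryError for any non-200 status, using the "error" field if present."""
    if status == 200:
        return
    try:
        message = json.loads(body).get("error", "")
    except (ValueError, AttributeError):
        # Body was not JSON, or was JSON but not an object.
        message = ""
    raise QueryError(status, message or "Unknown error")


try:
    raise_for_error(401, '{"error": "Invalid or missing Authorization header"}')
except QueryError as exc:
    print(exc.status, exc.message, exc.retryable)
```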

Example Request

curl -X POST https://your-api-domain.com/api/v1/knowledge-bases/query \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
  "knowledgeBaseId": "kb123456",
  "query": "What are the best practices for machine learning?",
  "resultCount": 3
}'
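
For readers who prefer Python, the same request can be sketched with the standard library. The build_request helper is a name made up for this example, the domain is the same placeholder used in the curl command, and the path is the one given in the Endpoint section. The request is only constructed here; the commented lines show how it could be sent once a real domain and key are available.

```python
import json
import urllib.request

BASE_URL = "https://your-api-domain.com"  # placeholder domain, as in the curl example


def build_request(
    api_key: str,
    knowledge_base_id: str,
    query: str,
    result_count: int = 5,
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the query endpoint."""
    payload = {
        "knowledgeBaseId": knowledge_base_id,
        "query": query,
        "resultCount": result_count,
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/api/v1/knowledge-bases/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = build_request(
    "YOUR_API_KEY",
    "kb123456",
    "What are the best practices for machine learning?",
    3,
)

# To send it (requires a real domain and API key):
# with urllib.request.urlopen(request) as response:
#     results = json.loads(response.read())
```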

Notes

  • This endpoint uses 1 credit per request.
  • The query is limited to 4000 characters.
  • The resultCount parameter sets the number of similar chunks returned: default 5, minimum 1, maximum 25.
  • The similarity search uses vector embeddings to find the most relevant chunks of text.
  • Usage is recorded for each query, including the knowledge base ID and the number of credits used.
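
Because each request uses a credit, it may be worth checking the limits above on the client before sending. The following is a sketch under assumptions: validate_query is a hypothetical helper, the character limit is measured with Python's len (the server may count differently), and the notes do not say whether an out-of-range resultCount is rejected or clamped by the server.

```python
MAX_QUERY_CHARS = 4000
DEFAULT_RESULT_COUNT = 5
MIN_RESULT_COUNT = 1
MAX_RESULT_COUNT = 25


def validate_query(query: str, result_count: int = DEFAULT_RESULT_COUNT) -> None:
    """Raise ValueError if the inputs fall outside the documented limits."""
    if not query:
        raise ValueError("query is required")
    if len(query) > MAX_QUERY_CHARS:
        raise ValueError(
            f"query has {len(query)} characters; the limit is {MAX_QUERY_CHARS}"
        )
    if not MIN_RESULT_COUNT <= result_count <= MAX_RESULT_COUNT:
        raise ValueError(
            f"resultCount must be between {MIN_RESULT_COUNT} and {MAX_RESULT_COUNT}"
        )


validate_query("What are the best practices for machine learning?", 3)  # passes silently
```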

Rate Limiting

Rate limit headers (X-RateLimit-Limit and X-RateLimit-Remaining) are included in the response to indicate the user’s current rate limit status.
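
A client can read these two headers to decide whether to keep sending requests. The sketch below assumes both values are plain integers and that a missing or malformed header should be treated as unknown; rate_limit_status is a name invented for this example, and the time window the limit applies to is not specified here.

```python
from typing import Mapping, Optional


def rate_limit_status(headers: Mapping[str, str]) -> tuple[Optional[int], Optional[int]]:
    """Return (limit, remaining) from response headers, or None where unavailable."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}

    def to_int(name: str) -> Optional[int]:
        value = lowered.get(name)
        if value is None:
            return None
        try:
            return int(value)
        except ValueError:
            return None

    return to_int("x-ratelimit-limit"), to_int("x-ratelimit-remaining")


# Illustrative header values only.
limit, remaining = rate_limit_status(
    {"X-RateLimit-Limit": "100", "X-RateLimit-Remaining": "0"}
)
if remaining == 0:
    print("Rate limit exhausted; wait before sending the next query.")
```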