Endpoint for requesting text-to-embedding inference.
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
Request body (application/json): text-to-embedding conversion parameters.
Input text(s) to generate embeddings for. Can be a single string or an array of strings (max 2048 items). Each input is limited to 8192 tokens, and the total request is limited to 300k tokens.
"This is a sample text for embedding generation."
The embedding model to use. Available models, along with each model's slug, specific limits and features, and LoRA availability, can be retrieved via the GET /api/v1/client/models endpoint.
"Bge_M3_FP16"
If true, the embedding result is returned directly in the response instead of only a download URL. Optional parameter.
false
Optional HTTPS URL to receive webhook notifications for job status changes (processing, completed, failed). Must be HTTPS. Max 2048 characters. See Webhook Documentation for payload structure and authentication details.
2048"https://your-server.com/webhooks/deapi"
ID of the inference request.
Information returned by the success endpoint.