Embeddings
POST /v1/embeddings — vector embeddings from text
Generate vector embeddings from text. OpenAI-compatible request and response.
Endpoint:
POST /v1/embeddings

Auth: Bearer token or `x-api-key`. Requires scope `inference.embeddings`.
Example
Python

```python
from openai import OpenAI

client = OpenAI(
    api_key="sk-mel-<YOUR_API_KEY>",
    base_url="https://api.melious.ai/v1",
)

response = client.embeddings.create(
    model="bge-m3",
    input=["A Hanseatic city is a city that...", "Hamburg is a port city..."],
)
print(len(response.data), "vectors,", len(response.data[0].embedding), "dims")
```

TypeScript

```typescript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "sk-mel-<YOUR_API_KEY>",
  baseURL: "https://api.melious.ai/v1",
});

const response = await client.embeddings.create({
  model: "bge-m3",
  input: ["A Hanseatic city is a city that...", "Hamburg is a port city..."],
});
console.log(response.data.length, "vectors,", response.data[0].embedding.length, "dims");
```

curl

```shell
curl https://api.melious.ai/v1/embeddings \
  -H "Authorization: Bearer sk-mel-<YOUR_API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "bge-m3",
    "input": ["A Hanseatic city is a city that...", "Hamburg is a port city..."]
  }'
```

Request
| Parameter | Type | Default | Description |
|---|---|---|---|
| `model` | string | — | Model ID, optionally with a flavor suffix. See Routing. |
| `input` | string \| array | — | Text to embed. Arrays run as a single batched call. |
| `encoding_format` | string | `"float"` | `"float"` for `number[]`, `"base64"` for base64-encoded float32. |
| `dimensions` | integer | model default | Request a specific dimensionality (model must support truncation). |
| `user` | string | none | End-user identifier. |
| `preset` | string | none | `"quality"` biases routing toward higher-quality providers. |
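Because an array `input` runs as a single batched call, a large corpus is usually embedded in fixed-size chunks. A minimal sketch; the batch size of 64 is an illustrative choice, not a documented limit:

```python
def batched(texts: list[str], size: int = 64):
    """Yield fixed-size chunks of a list of texts."""
    for i in range(0, len(texts), size):
        yield texts[i : i + size]

# Usage with the client from the example above (network call, shown commented):
#   vectors = []
#   for chunk in batched(corpus):
#       resp = client.embeddings.create(model="bge-m3", input=chunk)
#       vectors.extend(d.embedding for d in resp.data)

print([len(c) for c in batched(["doc"] * 130)])  # → [64, 64, 2]
```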
Embeddings default to price routing: latency differences between providers are usually invisible for embeddings, while price differences are not. Override with the `:speed` flavor or `preset: "quality"` if that is wrong for your workload.
Which models support embeddings? Filter at melious.ai/hub by type, or call `GET /v1/models?include_meta=true` and check `_meta.type == "embeddings"`.
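That filter is easy to script. A sketch of the client-side filtering, with the live request shown commented out (it assumes an HTTP client such as `requests` and a `MELIOUS_API_KEY` environment variable — both illustrative choices):

```python
# Sketch: narrow GET /v1/models?include_meta=true output down to embedding models.
def embedding_model_ids(models: list[dict]) -> list[str]:
    """Return ids of models whose _meta.type is 'embeddings'."""
    return [m["id"] for m in models if m.get("_meta", {}).get("type") == "embeddings"]

# Live call (requires an HTTP client and an API key; shown commented):
#   import os, requests
#   resp = requests.get(
#       "https://api.melious.ai/v1/models",
#       params={"include_meta": "true"},
#       headers={"Authorization": f"Bearer {os.environ['MELIOUS_API_KEY']}"},
#   )
#   print(embedding_model_ids(resp.json()["data"]))

# Shape of the filter on a stub payload:
stub = [
    {"id": "bge-m3", "_meta": {"type": "embeddings"}},
    {"id": "some-chat-model", "_meta": {"type": "chat"}},
]
print(embedding_model_ids(stub))  # → ['bge-m3']
```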
Response
```json
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [0.012, -0.034, ...]
    },
    { "object": "embedding", "index": 1, "embedding": [...] }
  ],
  "model": "bge-m3",
  "usage": { "prompt_tokens": 24, "total_tokens": 24 },
  "environment_impact": { "energy_kwh": 0.00001, "carbon_g_co2": 0.004, "water_liters": 0.00002, "renewable_percent": 92, "pue": 1.15, "provider_id": "ionos", "location": "DE" },
  "billing_cost": { "energy": "0.0000", "credits": "0.0", "paid_with": "energy" }
}
```

data[].embedding
Array of floats when `encoding_format: "float"`. Base64 of little-endian float32 bytes when `"base64"` — decode with `numpy.frombuffer(base64.b64decode(s), dtype="<f4")` or equivalent.
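A round-trip sketch of that decode. The base64 string here is constructed locally from synthetic bytes, standing in for a real `data[].embedding` returned with `encoding_format: "base64"`:

```python
import base64
import numpy as np

# Synthetic stand-in for an API-returned base64 embedding.
original = np.array([0.012, -0.034, 0.56], dtype="<f4")
b64 = base64.b64encode(original.tobytes()).decode()

# The decode recommended above: little-endian float32 bytes.
vec = np.frombuffer(base64.b64decode(b64), dtype="<f4")
print(np.allclose(vec, original))  # → True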
Embeddings from this endpoint are generally not normalized to unit length. For cosine similarity, normalize on your side:
```python
import numpy as np

v = np.array(response.data[0].embedding)
v /= np.linalg.norm(v)
```

usage.total_tokens

Same as `prompt_tokens` — there are no output tokens on embeddings. Kept for OpenAI shape compatibility.
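The normalization shown above is usually in service of cosine similarity between two embeddings. A self-contained sketch:

```python
import numpy as np

def cosine_similarity(a, b) -> float:
    """Cosine similarity; normalizes both vectors first, since embeddings
    from this endpoint are generally not unit-length."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(round(cosine_similarity([1.0, 0.0], [1.0, 1.0]), 4))  # → 0.7071
```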
Errors
- `VALIDATION_4002` — `model` or `input` missing.
- `VALIDATION_4005` — empty string or array.
- `INFERENCE_3001` — unknown model.
- `INFERENCE_3207` — any single input exceeds the model's max context.
- `AUTH_1015` — missing `inference.embeddings` scope.
Related
Two-stage retrieval with Rerank • Models for capability discovery • Routing for flavors.