Melious
API Reference

Embeddings

POST /v1/embeddings — vector embeddings from text

Generate vector embeddings from text. OpenAI-compatible request and response.

Endpoint:

POST /v1/embeddings

Auth: Bearer token or x-api-key. Requires scope inference.embeddings.

Example

Python:

from openai import OpenAI

client = OpenAI(
    api_key="sk-mel-<YOUR_API_KEY>",
    base_url="https://api.melious.ai/v1",
)

response = client.embeddings.create(
    model="bge-m3",
    input=["A Hanseatic city is a city that...", "Hamburg is a port city..."],
)
print(len(response.data), "vectors,", len(response.data[0].embedding), "dims")

TypeScript:

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "sk-mel-<YOUR_API_KEY>",
  baseURL: "https://api.melious.ai/v1",
});

const response = await client.embeddings.create({
  model: "bge-m3",
  input: ["A Hanseatic city is a city that...", "Hamburg is a port city..."],
});
console.log(response.data.length, "vectors,", response.data[0].embedding.length, "dims");

curl:

curl https://api.melious.ai/v1/embeddings \
  -H "Authorization: Bearer sk-mel-<YOUR_API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "bge-m3",
    "input": ["A Hanseatic city is a city that...", "Hamburg is a port city..."]
  }'

Request

Parameters:

  • model (string) — Model ID, optionally with a flavor suffix. See Routing.
  • input (string | array) — Text to embed. Arrays run as a single batched call.
  • encoding_format (string, default "float") — "float" for number[], "base64" for base64-encoded float32.
  • dimensions (integer, default: model default) — Request a specific dimensionality (model must support truncation).
  • user (string, default none) — End-user identifier.
  • preset (string, default none) — "quality" biases routing toward higher-quality providers.

Embeddings default to price routing — latency differences between providers on embeddings are usually invisible and price differences aren't. Override with :speed or preset: "quality" if that's wrong for your workload.
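The two overrides live in different places: the flavor suffix rides on the model field itself, while preset is a top-level request field. A minimal sketch of the two request bodies (payloads only, nothing is sent; with the OpenAI SDK, a non-standard field like preset would typically need to be passed via extra_body):

```python
import json

# Flavor suffix: the override is part of the model ID.
speed_payload = {
    "model": "bge-m3:speed",  # ":speed" biases routing toward latency
    "input": ["Hamburg is a port city..."],
}

# Preset: a separate top-level field, not part of the model ID.
quality_payload = {
    "model": "bge-m3",
    "preset": "quality",  # bias routing toward higher-quality providers
    "input": ["Hamburg is a port city..."],
}

body = json.dumps(quality_payload)
```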

Which models support embeddings? Filter at melious.ai/hub by type, or call GET /v1/models?include_meta=true and check _meta.type == "embeddings".
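Assuming GET /v1/models?include_meta=true returns the OpenAI-style list shape with a _meta object per entry, the filter is a one-liner. A sketch (the sample payload below is illustrative, not real hub data):

```python
def embedding_model_ids(models_payload: dict) -> list[str]:
    """Return IDs of models whose _meta.type is "embeddings"."""
    return [
        m["id"]
        for m in models_payload.get("data", [])
        if m.get("_meta", {}).get("type") == "embeddings"
    ]

# Illustrative sample in the OpenAI /v1/models list shape (not real hub data).
sample = {
    "object": "list",
    "data": [
        {"id": "bge-m3", "object": "model", "_meta": {"type": "embeddings"}},
        {"id": "some-chat-model", "object": "model", "_meta": {"type": "chat"}},
    ],
}

print(embedding_model_ids(sample))  # -> ['bge-m3']
```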

Response

{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [0.012, -0.034, ...]
    },
    { "object": "embedding", "index": 1, "embedding": [...] }
  ],
  "model": "bge-m3",
  "usage": { "prompt_tokens": 24, "total_tokens": 24 },
  "environment_impact": { "energy_kwh": 0.00001, "carbon_g_co2": 0.004, "water_liters": 0.00002, "renewable_percent": 92, "pue": 1.15, "provider_id": "ionos", "location": "DE" },
  "billing_cost": { "energy": "0.0000", "credits": "0.0", "paid_with": "energy" }
}
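environment_impact and billing_cost are Melious extensions beyond the OpenAI response shape, so a typed OpenAI SDK response object may not surface them as attributes; reading the raw JSON body is a safe fallback. A sketch using the (abridged) sample response above:

```python
import json

# Abridged copy of the sample response body shown above.
raw = """
{
  "object": "list",
  "data": [{"object": "embedding", "index": 0, "embedding": [0.012, -0.034]}],
  "model": "bge-m3",
  "usage": {"prompt_tokens": 24, "total_tokens": 24},
  "environment_impact": {"energy_kwh": 0.00001, "carbon_g_co2": 0.004,
                         "water_liters": 0.00002, "renewable_percent": 92,
                         "pue": 1.15, "provider_id": "ionos", "location": "DE"},
  "billing_cost": {"energy": "0.0000", "credits": "0.0", "paid_with": "energy"}
}
"""
body = json.loads(raw)

# .get() keeps this robust if the extension fields are ever absent.
impact = body.get("environment_impact", {})
print(impact.get("carbon_g_co2"), "g CO2 at", impact.get("renewable_percent"), "% renewable")
```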

data[].embedding

Array of floats when encoding_format: "float". Base64 of little-endian float32 bytes when "base64" — decode with numpy.frombuffer(base64.b64decode(s), dtype="<f4") or equivalent.
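The decode recipe above, as a local round-trip check (the base64 string here is synthesized on the spot, standing in for a response with encoding_format: "base64"):

```python
import base64

import numpy as np

# Stand-in for an API embedding: little-endian float32 bytes, base64-encoded,
# exactly the wire format used when encoding_format is "base64".
original = np.array([0.012, -0.034, 0.5], dtype="<f4")
s = base64.b64encode(original.tobytes()).decode("ascii")

# Decode as described above: base64 -> bytes -> little-endian float32 array.
decoded = np.frombuffer(base64.b64decode(s), dtype="<f4")

assert np.array_equal(decoded, original)
```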

Embeddings from this endpoint are generally not normalized to unit length. For cosine similarity, normalize on your side:

import numpy as np

v = np.array(response.data[0].embedding)
v /= np.linalg.norm(v)
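Once every vector is unit length, cosine similarity reduces to a dot product. A follow-on sketch for a whole batch (toy 3-dimensional rows stand in for real embeddings, one row per input):

```python
import numpy as np

# One row per input text; toy 3-dim vectors standing in for real embeddings.
M = np.array([[1.0, 2.0, 2.0],
              [2.0, 0.0, 0.0]])

# Normalize each row to unit length, then cosine similarity is just M @ M.T.
M /= np.linalg.norm(M, axis=1, keepdims=True)
cosine = M @ M.T

print(cosine)  # diagonal is 1.0; off-diagonal entries are inter-text similarities
```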

usage.total_tokens

Same as prompt_tokens — there are no output tokens on embeddings. Kept for OpenAI shape compatibility.

Errors

  • VALIDATION_4002 — model or input missing.
  • VALIDATION_4005 — empty string or array.
  • INFERENCE_3001 — unknown model.
  • INFERENCE_3207 — any single input exceeds the model's max context.
  • AUTH_1015 — missing inference.embeddings scope.
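None of these codes indicate a transient fault, so retrying the same request verbatim won't help; the useful client-side distinction is "fix the request body" versus "fix the model ID or key configuration". A small classifier sketch over the codes above (the grouping is an assumption about sensible client behavior, not part of the API contract):

```python
# Grouping is an assumption about sensible client behavior, not API contract.
FIX_REQUEST = {"VALIDATION_4002", "VALIDATION_4005", "INFERENCE_3207"}  # bad body: fill or shrink the input
FIX_CONFIG = {"INFERENCE_3001", "AUTH_1015"}  # wrong model ID or missing key scope

def classify(code: str) -> str:
    """Map a documented embeddings error code to a remediation bucket."""
    if code in FIX_REQUEST:
        return "fix-request"
    if code in FIX_CONFIG:
        return "fix-config"
    return "unknown"

print(classify("INFERENCE_3207"))  # -> fix-request
```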

Two-stage retrieval with Rerank • Models for capability discovery • Routing for flavors.
