Melious
API Reference

Chat completions

POST /v1/chat/completions — OpenAI-compatible chat endpoint

Generate a chat response. OpenAI-compatible in shape, plus Melious-specific routing and environment fields.

Endpoint:

POST /v1/chat/completions

Auth: Bearer token or x-api-key. Requires scope inference.chat.
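As a minimal sketch of the two authentication styles, the helper below builds request headers for either scheme. The function name `build_auth_headers` is hypothetical, and sending the raw key as the `x-api-key` value is an assumption; only the header names and the Bearer form come from this page.

```python
def build_auth_headers(api_key: str, use_x_api_key: bool = False) -> dict:
    """Return HTTP headers for a request (hypothetical helper)."""
    headers = {"Content-Type": "application/json"}
    if use_x_api_key:
        # Assumption: the x-api-key header carries the raw key value.
        headers["x-api-key"] = api_key
    else:
        headers["Authorization"] = f"Bearer {api_key}"
    return headers


print(build_auth_headers("sk-mel-<YOUR_API_KEY>"))
```

Either way the key still needs the inference.chat scope.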

Example

Python:

from openai import OpenAI

client = OpenAI(
    api_key="sk-mel-<YOUR_API_KEY>",
    base_url="https://api.melious.ai/v1",
)

response = client.chat.completions.create(
    model="glm-4.7",
    messages=[
        {"role": "system", "content": "You are concise."},
        {"role": "user", "content": "Name three Hanseatic cities."},
    ],
)
print(response.choices[0].message.content)

JavaScript:

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "sk-mel-<YOUR_API_KEY>",
  baseURL: "https://api.melious.ai/v1",
});

const response = await client.chat.completions.create({
  model: "glm-4.7",
  messages: [
    { role: "system", content: "You are concise." },
    { role: "user", content: "Name three Hanseatic cities." },
  ],
});
console.log(response.choices[0].message.content);

curl:

curl https://api.melious.ai/v1/chat/completions \
  -H "Authorization: Bearer sk-mel-<YOUR_API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "glm-4.7",
    "messages": [
      {"role": "system", "content": "You are concise."},
      {"role": "user", "content": "Name three Hanseatic cities."}
    ]
  }'

Request

Core parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| model | string | | Model ID, optionally with a flavor suffix like :eco. See Routing. |
| messages | array | | The conversation, oldest first. Each entry has a role and content. |
| max_tokens | integer | model max | Caps the completion length. |
| temperature | number | model default | Sampling temperature, [0, 2]. Lower is closer to deterministic. Set this or top_p, not both. |
| top_p | number | 1 | Nucleus sampling cutoff, [0, 1]. |
| top_k | integer | unset | Restrict to top-K tokens (provider-specific). |
| min_p | number | unset | Minimum-probability tail cut (provider-specific). |
| frequency_penalty | number | 0 | Penalize repeated tokens, [-2, 2]. |
| presence_penalty | number | 0 | Penalize tokens already present, [-2, 2]. |
| stop | string or array | null | Stop sequences. |
| seed | integer | none | Deterministic sampling (best-effort; not all providers honor it). |
| user | string | none | End-user identifier for abuse monitoring. |
| n | integer | 1 | Number of completions, [1, 10]. |
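The ranges listed under Core parameters can be checked client-side before a request is sent. The sketch below is illustrative only: `build_chat_payload` is a hypothetical helper, and the server remains the authority on what it accepts.

```python
def build_chat_payload(model, messages, **params):
    """Assemble a request body, checking the documented ranges first."""
    ranges = {
        "temperature": (0, 2),
        "top_p": (0, 1),
        "frequency_penalty": (-2, 2),
        "presence_penalty": (-2, 2),
        "n": (1, 10),
    }
    # The parameter table advises setting temperature or top_p, not both.
    if "temperature" in params and "top_p" in params:
        raise ValueError("set temperature or top_p, not both")
    for name, (low, high) in ranges.items():
        if name in params and not low <= params[name] <= high:
            raise ValueError(f"{name} must be in [{low}, {high}]")
    return {"model": model, "messages": messages, **params}


payload = build_chat_payload(
    "glm-4.7",
    [{"role": "user", "content": "Name three Hanseatic cities."}],
    temperature=0.2,
    max_tokens=64,
)
print(payload)
```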

Streaming

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| stream | boolean | false | Enable SSE streaming. |
| stream_options | object | null | E.g. {"include_usage": true} to get the final usage on the last chunk. |

See Streaming for the full shape.
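To give a rough idea of what consuming a stream involves, the sketch below reassembles text from SSE lines. It assumes the OpenAI-style wire format (`data: <json>` lines, text under `choices[].delta.content`, a `data: [DONE]` terminator, and usage on a final chunk); this page does not spell that format out, so treat every field name here as an assumption and check the Streaming page. The sample lines are made up.

```python
import json


def collect_stream(lines):
    """Join streamed text deltas and capture usage if a chunk carries it."""
    text_parts, usage = [], None
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":  # assumed end-of-stream marker
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            delta = choice.get("delta", {})
            if delta.get("content"):
                text_parts.append(delta["content"])
        if chunk.get("usage"):
            usage = chunk["usage"]
    return "".join(text_parts), usage


# Hypothetical stream, as it might look with stream_options.include_usage set.
sample = [
    'data: {"choices": [{"index": 0, "delta": {"content": "Ham"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "burg"}}]}',
    'data: {"choices": [], "usage": {"prompt_tokens": 12, "completion_tokens": 2, "total_tokens": 14}}',
    "data: [DONE]",
]
text, usage = collect_stream(sample)
print(text, usage)
```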

Tools

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| tools | array | null | Function definitions the model may call. |
| tool_choice | string or object | "auto" | "auto", "none", "required", or {"type": "function", "function": {"name": "..."}}. |

Tool flow and examples: Tool calling.
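A small sketch of how the two fields might be filled in. The layout of a tool definition (`type`, `function.name`, `function.description`, `function.parameters` as JSON Schema) is assumed to follow the OpenAI convention, since this page only shows the forced `tool_choice` form. `get_weather` and both helper names are hypothetical.

```python
def function_tool(name, description, parameters):
    """Wrap a JSON Schema in the assumed OpenAI-style tool envelope."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": parameters,
        },
    }


def force_tool(name):
    """Build the object form of tool_choice that pins one function."""
    return {"type": "function", "function": {"name": name}}


tools = [
    function_tool(
        "get_weather",  # hypothetical tool
        "Look up the current temperature for a city.",
        {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    )
]
tool_choice = force_tool("get_weather")
print(tools, tool_choice)
```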

Structured output

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| response_format | object | null | {"type": "json_object"} or {"type": "json_schema", "json_schema": {...}}. |

Walkthrough: Structured outputs.
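As a hedged illustration of the `json_schema` variant, the sketch below builds a `response_format` value and parses a reply. The inner `name` and `schema` keys are assumed from the OpenAI convention, because the page elides that object; the helper name and the sample reply are made up.

```python
import json


def json_schema_format(name, schema):
    """Build a response_format value (inner keys are an assumption)."""
    return {"type": "json_schema", "json_schema": {"name": name, "schema": schema}}


city_schema = {
    "type": "object",
    "properties": {"cities": {"type": "array", "items": {"type": "string"}}},
    "required": ["cities"],
}
response_format = json_schema_format("cities", city_schema)

# Hypothetical message content returned under that schema.
content = '{"cities": ["Hamburg", "Lübeck", "Bremen"]}'
cities = json.loads(content)["cities"]
print(cities)
```

The message content arrives as a string, so it still has to be parsed with `json.loads` on the client.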

Log probabilities

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| logprobs | boolean | false | Return log probabilities. Off by default: turning it on roughly doubles the payload, so we only do that when you ask. |
| top_logprobs | integer | unset | Number of top alternatives per position, [0, 20]. |
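Log probabilities are natural logs, so `exp` turns them back into probabilities. The sketch below assumes the OpenAI-style layout for `choices[].logprobs` (a `content` list of objects with `token` and `logprob`); this page does not document that shape, and the sample values are invented.

```python
import math


def token_probabilities(logprobs):
    """Convert each token's log probability into a plain probability."""
    return [(item["token"], math.exp(item["logprob"])) for item in logprobs["content"]]


# Hypothetical sample in the assumed layout.
sample_logprobs = {
    "content": [
        {"token": "Hamburg", "logprob": math.log(0.9), "top_logprobs": []},
        {"token": ",", "logprob": math.log(0.5), "top_logprobs": []},
    ]
}
for token, prob in token_probabilities(sample_logprobs):
    print(f"{token!r}: {prob:.2f}")
```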

Reasoning

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| reasoning_effort | string | model default | "low", "medium", "high" for reasoning models. Ignored by non-reasoning models. |

Melious-specific

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| preset | string | none | "reasoning" biases routing for reasoning models; "non_reasoning" biases toward speed. Overridden by a :flavor suffix on model. |
| request_id | string | auto | Client-provided correlation ID. If omitted, we generate one. |
| blueprint_id | UUID | none | Load a vault-stored blueprint (requires X-Vault-Key header). |
| blueprint_config | object | none | Inline blueprint configuration. Takes precedence over blueprint_id. |
| variables | object | none | Override blueprint variables. |
| skill_ids | array | none | Load vault skills by ID into the request. Requires X-Vault-Key. |
| skill_configs | array | none | Inline skill configurations. |

Blueprints and skills are part of the vault-encrypted composition system — most API callers don't need them. They exist so Studio's runtime can round-trip through the same endpoint.
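For the two fields most callers are likely to touch, `preset` and `request_id`, a minimal sketch of adding them to a request body follows. The helper name is hypothetical, and using a UUID for `request_id` is a choice made here; the page only requires a string.

```python
import uuid


def with_melious_fields(payload, preset=None, request_id=None):
    """Return a copy of payload with preset and request_id added."""
    body = dict(payload)
    if preset is not None:
        if preset not in ("reasoning", "non_reasoning"):
            raise ValueError("preset must be 'reasoning' or 'non_reasoning'")
        body["preset"] = preset
    # Generating the ID client-side lets it be logged before the call is made.
    body["request_id"] = request_id or str(uuid.uuid4())
    return body


body = with_melious_fields(
    {"model": "glm-4.7", "messages": [{"role": "user", "content": "Hi"}]},
    preset="non_reasoning",
)
print(body["request_id"])
```

With the OpenAI Python SDK, non-standard fields like these are usually passed through the `extra_body` argument of `create` rather than as keyword arguments.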

messages

Each entry is an object:

  • role: "system", "user", "assistant", or "tool".
  • content: a string, or an array of content parts (for vision).

Vision content parts:

{
  "role": "user",
  "content": [
    {"type": "text", "text": "What's in this image?"},
    {"type": "image_url", "image_url": {"url": "https://example.com/cat.jpg"}}
  ]
}

image_url.url accepts public URLs or base64 data URIs (data:image/jpeg;base64,...). URLs are fetched and re-encoded server-side before inference, so the provider never sees your URL. The model must support vision; check _meta.capabilities.vision on GET /v1/models/{id}?include_meta=true.
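For local images, the data URI form can be built with the standard library. This is a minimal sketch; `image_part_from_bytes` is a hypothetical helper, and the three bytes used below are just the JPEG magic number standing in for a real file.

```python
import base64


def image_part_from_bytes(data: bytes, mime_type: str = "image/jpeg") -> dict:
    """Encode raw image bytes as an image_url content part with a data URI."""
    encoded = base64.b64encode(data).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
    }


part = image_part_from_bytes(b"\xff\xd8\xff")  # placeholder bytes, not a full image
message = {
    "role": "user",
    "content": [{"type": "text", "text": "What's in this image?"}, part],
}
print(part["image_url"]["url"])
```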

Tool messages:

{
  "role": "tool",
  "tool_call_id": "call_abc",
  "content": "{\"temperature\": 22}"
}
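The sketch below shows one way tool messages like the one above could be produced from an assistant's tool calls. It assumes each entry in `tool_calls` follows the OpenAI layout (`id`, `function.name`, and `function.arguments` as a JSON string), which this page does not show. `get_temperature` and `run_tool_calls` are hypothetical names.

```python
import json


def run_tool_calls(tool_calls, registry):
    """Execute each requested tool and wrap the result as a tool message."""
    messages = []
    for call in tool_calls:
        func = registry[call["function"]["name"]]
        args = json.loads(call["function"]["arguments"])
        result = func(**args)
        messages.append(
            {
                "role": "tool",
                "tool_call_id": call["id"],
                "content": json.dumps(result),  # content is sent as a string
            }
        )
    return messages


def get_temperature(city):
    """Hypothetical tool returning a fixed reading."""
    return {"temperature": 22}


calls = [
    {
        "id": "call_abc",
        "type": "function",
        "function": {"name": "get_temperature", "arguments": '{"city": "Hamburg"}'},
    }
]
tool_messages = run_tool_calls(calls, {"get_temperature": get_temperature})
print(tool_messages)
```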

Response

{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "created": 1699999999,
  "model": "glm-4.7",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hamburg, Lübeck, Bremen.",
        "tool_calls": null
      },
      "finish_reason": "stop",
      "logprobs": null
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 7,
    "total_tokens": 19
  },
  "system_fingerprint": "...",
  "environment_impact": { "energy_kwh": 0.00015, "carbon_g_co2": 0.06, "water_liters": 0.0002, "renewable_percent": 85, "pue": 1.18, "provider_id": "ovhcloud", "location": "FR" },
  "billing_cost": { "energy": "0.0008", "credits": "0.0", "paid_with": "energy" }
}

choices[].finish_reason

  • stop: model stopped naturally or hit a stop sequence.
  • length: max_tokens hit.
  • tool_calls: model wants to call a tool.
  • content_filter: blocked by provider safety filter.
  • blocked: blocked by a Melious blueprint railguard.
  • blueprint_override: blueprint replied directly, no model call.
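A client will usually branch on these values. The mapping below is one possible sketch; the action labels ("done", "truncated", and so on) are invented for illustration and are not part of the API.

```python
def next_action(finish_reason):
    """Map a finish_reason to a hypothetical client-side action label."""
    actions = {
        "stop": "done",
        "length": "truncated",         # consider raising max_tokens
        "tool_calls": "run_tools",     # execute tools, then send tool messages
        "content_filter": "refused",
        "blocked": "refused",
        "blueprint_override": "done",  # reply came from the blueprint itself
    }
    return actions.get(finish_reason, "unknown")


response = {"choices": [{"index": 0, "finish_reason": "length"}]}
print(next_action(response["choices"][0]["finish_reason"]))
```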

environment_impact

See Environmental impact for field definitions.

billing_cost

See Pricing. Decimals are strings to avoid float drift.
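Because the amounts arrive as strings, they can be parsed straight into `decimal.Decimal` without passing through a float. A minimal sketch, with `parse_billing` as a hypothetical helper and the values taken from the sample response:

```python
from decimal import Decimal


def parse_billing(billing_cost):
    """Parse the string amounts of billing_cost into exact decimals."""
    return {
        "energy": Decimal(billing_cost["energy"]),
        "credits": Decimal(billing_cost["credits"]),
        "paid_with": billing_cost["paid_with"],
    }


cost = parse_billing({"energy": "0.0008", "credits": "0.0", "paid_with": "energy"})
total_energy = cost["energy"] * 3  # e.g. three identical requests
print(total_energy)
```

Summing many small amounts this way stays exact, which is the point of shipping them as strings.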

Errors

Every error returns the standard {"error": {"code", "message", "details"}} shape. Common codes on this endpoint:

  • VALIDATION_4002: messages or model missing.
  • INFERENCE_3001: unknown model ID.
  • INFERENCE_3201: asked for vision on a non-vision model.
  • INFERENCE_3202: passed tools to a model that doesn't support them.
  • INFERENCE_3207: input exceeds the model's context window.
  • INFERENCE_3208: rejected by the provider's content filter.
  • INFERENCE_3103: all providers failed (transient; retry).
  • BILLING_2001 / BILLING_2003: out of energy / credits.
  • AUTH_1015: key is missing inference.chat scope.

Full list and retry guidance: Errors.
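A sketch of reading the error shape and deciding whether to retry. Only INFERENCE_3103 is described as transient on this page, so it is the only code treated as retryable here; the Errors page may list others. The helper name and sample bodies are hypothetical.

```python
RETRYABLE_CODES = {"INFERENCE_3103"}  # the only code marked transient on this page


def classify_error(body):
    """Return (code, message, should_retry) from a standard error body."""
    error = body["error"]
    code = error["code"]
    return code, error.get("message", ""), code in RETRYABLE_CODES


# Hypothetical error bodies in the documented shape.
transient = {"error": {"code": "INFERENCE_3103", "message": "All providers failed", "details": {}}}
permanent = {"error": {"code": "INFERENCE_3001", "message": "Unknown model", "details": {}}}

print(classify_error(transient))
print(classify_error(permanent))
```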

See also: Streaming, Tool calling, Vision, Structured outputs, Routing.
