
Chat Completions

OpenAI-compatible chat completions with vision, function calling, and streaming


Generate AI responses using the OpenAI-compatible chat completions endpoint. Supports text, vision, function calling, structured output, and streaming.

POST /v1/chat/completions

Quick Example

Python:

from openai import OpenAI

client = OpenAI(
    api_key="sk-mel-your-api-key-here",
    base_url="https://api.melious.ai/v1"
)

response = client.chat.completions.create(
    model="gpt-oss-120b",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"}
    ]
)

print(response.choices[0].message.content)

TypeScript:

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'sk-mel-your-api-key-here',
  baseURL: 'https://api.melious.ai/v1'
});

const response = await client.chat.completions.create({
  model: 'gpt-oss-120b',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'What is the capital of France?' }
  ]
});

console.log(response.choices[0].message.content);

cURL:

curl https://api.melious.ai/v1/chat/completions \
  -H "Authorization: Bearer sk-mel-your-api-key-here" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-oss-120b",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'

Request Parameters

Required Parameters

| Parameter | Type | Description |
|-----------|------|-------------|
| model | string | Model ID (e.g., gpt-oss-120b, qwen3-235b-a22b-instruct) |
| messages | array | Array of message objects |

Optional Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| temperature | number | 1.0 | Sampling temperature (0-2) |
| max_tokens | integer | Model max | Maximum tokens to generate |
| top_p | number | 1.0 | Nucleus sampling parameter |
| frequency_penalty | number | 0 | Frequency penalty (-2 to 2) |
| presence_penalty | number | 0 | Presence penalty (-2 to 2) |
| stop | string/array | null | Stop sequences |
| stream | boolean | false | Enable streaming |
| n | integer | 1 | Number of completions |
| seed | integer | null | Random seed for reproducibility |
| user | string | null | End-user identifier |
| tools | array | null | Available function tools |
| tool_choice | string/object | auto | Tool selection mode |
| response_format | object | null | Structured output format |
| logprobs | boolean | false | Return log probabilities |
| top_logprobs | integer | null | Number of top logprobs |

Melious Extensions

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| preset | string | balanced | Routing preset (speed/price/quality/environment) |
| filters | object | null | Provider constraints |

Message Format

Message Roles

| Role | Description |
|------|-------------|
| system | System instructions (first message) |
| user | User input |
| assistant | AI responses (for conversation history) |
| tool | Tool/function call results |

Text Message

{
  "role": "user",
  "content": "Hello, how are you?"
}

Multi-turn Conversation

{
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is 2+2?"},
    {"role": "assistant", "content": "2+2 equals 4."},
    {"role": "user", "content": "And 3+3?"}
  ]
}
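
To continue a conversation, append the assistant's reply and the next user turn to the same list before the follow-up request. A minimal sketch, reusing the client from the Quick Example:

messages = [{"role": "system", "content": "You are a helpful assistant."}]

for user_input in ["What is 2+2?", "And 3+3?"]:
    messages.append({"role": "user", "content": user_input})
    response = client.chat.completions.create(
        model="gpt-oss-120b",
        messages=messages
    )
    reply = response.choices[0].message.content
    print(reply)
    # Feed the answer back in so the next turn has full context.
    messages.append({"role": "assistant", "content": reply})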

Vision (Image Input)

Send images for visual understanding with vision-capable models.

Supported Models

Vision-capable models that can process images:

  • mistral-small-3.2-24b-instruct (Mistral)

Image URL

{
  "role": "user",
  "content": [
    {"type": "text", "text": "What's in this image?"},
    {
      "type": "image_url",
      "image_url": {"url": "https://example.com/image.jpg"}
    }
  ]
}

Base64 Image

{
  "role": "user",
  "content": [
    {"type": "text", "text": "Describe this image."},
    {
      "type": "image_url",
      "image_url": {"url": "data:image/jpeg;base64,/9j/4AAQ..."}
    }
  ]
}
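
For local files, build the data URL yourself. A minimal sketch (the file path is illustrative):

import base64

# Read a local image and wrap it in an image_url content part.
with open("photo.jpg", "rb") as f:
    b64 = base64.b64encode(f.read()).decode("utf-8")

image_part = {
    "type": "image_url",
    "image_url": {"url": f"data:image/jpeg;base64,{b64}"}
}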

Multiple Images

{
  "role": "user",
  "content": [
    {"type": "text", "text": "Compare these two images."},
    {"type": "image_url", "image_url": {"url": "https://example.com/image1.jpg"}},
    {"type": "image_url", "image_url": {"url": "https://example.com/image2.jpg"}}
  ]
}

Python Example

response = client.chat.completions.create(
    model="mistral-small-3.2-24b-instruct",  # Vision-capable model
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/image.jpg"}
                }
            ]
        }
    ],
    max_tokens=300
)

Function Calling

Let models call your functions to perform actions or retrieve data.

Define Tools

{
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get current weather for a location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {
              "type": "string",
              "description": "City name, e.g., San Francisco, CA"
            },
            "unit": {
              "type": "string",
              "enum": ["celsius", "fahrenheit"],
              "description": "Temperature unit"
            }
          },
          "required": ["location"]
        }
      }
    }
  ]
}

Using Tool Slugs

Instead of defining tools inline, pass tool slugs from your collection:

{
  "model": "auto",
  "messages": [{"role": "user", "content": "Search for AI news"}],
  "tools": ["web-search", "scrape-url"]
}

Get available tools with GET /v1/tools/openai. Add tools to your collection first via POST /v1/user/tools/{slug}.
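
Since tool slugs are a Melious extension, one straightforward way to send them from Python is a plain HTTP request. A sketch (assumes the requests package and that both slugs are already in your collection):

import requests

resp = requests.post(
    "https://api.melious.ai/v1/chat/completions",
    headers={"Authorization": "Bearer sk-mel-your-api-key-here"},
    json={
        "model": "auto",
        "messages": [{"role": "user", "content": "Search for AI news"}],
        "tools": ["web-search", "scrape-url"]
    }
)
print(resp.json()["choices"][0]["message"])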

Mixed formats: Combine slugs with full definitions:

{
  "tools": [
    "web-search",
    {"type": "function", "function": {"name": "custom_tool", "description": "...", "parameters": {}}}
  ]
}

Tool Choice

| Value | Description |
|-------|-------------|
| "auto" | Model decides whether to call tools |
| "none" | Never call tools |
| "required" | Must call at least one tool |
| {"type": "function", "function": {"name": "get_weather"}} | Force a specific function |

Complete Example

Python:

import json
from openai import OpenAI

client = OpenAI(
    api_key="sk-mel-your-api-key-here",
    base_url="https://api.melious.ai/v1"
)

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string"}
                },
                "required": ["location"]
            }
        }
    }
]

# Step 1: Initial request
response = client.chat.completions.create(
    model="gpt-oss-120b",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools
)

# Step 2: Check for tool calls
message = response.choices[0].message
if message.tool_calls:
    # Execute your function
    tool_call = message.tool_calls[0]
    args = json.loads(tool_call.function.arguments)

    # Your function implementation
    weather_result = {"temperature": 22, "condition": "sunny"}

    # Step 3: Send result back
    response = client.chat.completions.create(
        model="gpt-oss-120b",
        messages=[
            {"role": "user", "content": "What's the weather in Paris?"},
            message,
            {
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": json.dumps(weather_result)
            }
        ],
        tools=tools
    )

print(response.choices[0].message.content)

TypeScript:

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'sk-mel-your-api-key-here',
  baseURL: 'https://api.melious.ai/v1'
});

const tools = [
  {
    type: 'function',
    function: {
      name: 'get_weather',
      description: 'Get current weather',
      parameters: {
        type: 'object',
        properties: {
          location: { type: 'string' }
        },
        required: ['location']
      }
    }
  }
];

// Step 1: Initial request
let response = await client.chat.completions.create({
  model: 'gpt-oss-120b',
  messages: [{ role: 'user', content: "What's the weather in Paris?" }],
  tools
});

// Step 2: Check for tool calls
const message = response.choices[0].message;
if (message.tool_calls) {
  const toolCall = message.tool_calls[0];
  const args = JSON.parse(toolCall.function.arguments);

  // Your function implementation
  const weatherResult = { temperature: 22, condition: 'sunny' };

  // Step 3: Send result back
  response = await client.chat.completions.create({
    model: 'gpt-oss-120b',
    messages: [
      { role: 'user', content: "What's the weather in Paris?" },
      message,
      {
        role: 'tool',
        tool_call_id: toolCall.id,
        content: JSON.stringify(weatherResult)
      }
    ],
    tools
  });
}

console.log(response.choices[0].message.content);

Structured Output

Get responses in a specific JSON format.

JSON Object Mode

{
  "model": "gpt-oss-120b",
  "messages": [...],
  "response_format": {"type": "json_object"}
}

When using JSON mode, instruct the model in your prompt to produce JSON output.
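
A minimal sketch of JSON object mode, with the prompt explicitly requesting JSON:

import json

response = client.chat.completions.create(
    model="gpt-oss-120b",
    messages=[
        # JSON mode still needs the prompt itself to ask for JSON.
        {"role": "user", "content": "Return a JSON object with keys 'city' and 'country' for Paris."}
    ],
    response_format={"type": "json_object"}
)
data = json.loads(response.choices[0].message.content)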

JSON Schema Mode

{
  "model": "gpt-oss-120b",
  "messages": [
    {"role": "user", "content": "Extract the person's name and age."}
  ],
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "person_info",
      "strict": true,
      "schema": {
        "type": "object",
        "properties": {
          "name": {"type": "string"},
          "age": {"type": "integer"}
        },
        "required": ["name", "age"]
      }
    }
  }
}

Python Example

response = client.chat.completions.create(
    model="gpt-oss-120b",
    messages=[
        {"role": "user", "content": "List 3 famous scientists with their fields."}
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "scientists",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "scientists": {
                        "type": "array",
                        "items": {
                            "type": "object",
                            "properties": {
                                "name": {"type": "string"},
                                "field": {"type": "string"}
                            },
                            "required": ["name", "field"]
                        }
                    }
                },
                "required": ["scientists"]
            }
        }
    }
)

import json
data = json.loads(response.choices[0].message.content)

Automatic JSON Repair

Melious Feature: When using json_object or json_schema modes, malformed JSON responses are automatically repaired.

Melious automatically fixes common JSON issues from LLM outputs:

| Issue | Example | Auto-Fixed |
|-------|---------|------------|
| Markdown wrapping | ```json {...}``` | {...} |
| Missing brackets | {"key": "value" | {"key": "value"} |
| Trailing commas | {"a": 1, "b": 2,} | {"a": 1, "b": 2} |
| Single quotes | {'key': 'value'} | {"key": "value"} |
| Unquoted keys | {key: "value"} | {"key": "value"} |
| Comments | {"key": "value" // comment} | {"key": "value"} |

This means you get valid JSON even when models occasionally produce malformed output; no manual parsing or error handling is required.


Streaming

Get responses in real-time as they're generated.

Python:

stream = client.chat.completions.create(
    model="gpt-oss-120b",
    messages=[{"role": "user", "content": "Write a short story."}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

TypeScript:

const stream = await client.chat.completions.create({
  model: 'gpt-oss-120b',
  messages: [{ role: 'user', content: 'Write a short story.' }],
  stream: true
});

for await (const chunk of stream) {
  const content = chunk.choices[0]?.delta?.content;
  if (content) process.stdout.write(content);
}

cURL:

curl https://api.melious.ai/v1/chat/completions \
  -H "Authorization: Bearer sk-mel-your-api-key-here" \
  -H "Content-Type: application/json" \
  -N \
  -d '{
    "model": "gpt-oss-120b",
    "messages": [{"role": "user", "content": "Write a short story."}],
    "stream": true
  }'

See Streaming for more details.


Response Format

Non-Streaming Response

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1699999999,
  "model": "gpt-oss-120b",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris."
      },
      "finish_reason": "stop",
      "logprobs": null
    }
  ],
  "usage": {
    "prompt_tokens": 15,
    "completion_tokens": 8,
    "total_tokens": 23
  },
  "environment_impact": {
    "carbon_g_co2": 0.06,
    "water_liters": 0.0002,
    "energy_kwh": 0.00015,
    "renewable_percent": 85,
    "pue": 1.18,
    "provider_id": "nebius",
    "location": "NL"
  }
}
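
The usage block is standard OpenAI token accounting; environment_impact is a Melious extension. A hedged sketch of reading both with the Python SDK (assumption: the SDK exposes unknown response fields as extra attributes):

print(response.usage.total_tokens)  # standard usage accounting

# environment_impact is a Melious extension field (assumption: the SDK
# passes this extra field through on the parsed response object).
impact = getattr(response, "environment_impact", None)
if impact:
    print(impact["carbon_g_co2"], impact["provider_id"])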

Finish Reasons

| Reason | Description |
|--------|-------------|
| stop | Natural end or stop sequence |
| length | Hit max_tokens limit |
| tool_calls | Model wants to call tools |
| content_filter | Content filtered |
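
For example, you can check for truncation or pending tool calls before using the reply; a minimal sketch:

choice = response.choices[0]
if choice.finish_reason == "length":
    # Reply was cut off at max_tokens: raise the limit or ask to continue.
    pass
elif choice.finish_reason == "tool_calls":
    # Execute the requested tools (see Function Calling above).
    pass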

Routing Presets

Optimize routing based on your priorities:

{
  "model": "gpt-oss-120b",
  "messages": [...],
  "preset": "environment"
}

| Preset | Optimizes For |
|--------|---------------|
| balanced | All metrics (default) |
| speed | Lowest latency |
| price | Lowest cost |
| quality | Highest quality |
| environment | Lowest carbon footprint |
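
Because preset is a Melious extension, the OpenAI Python SDK doesn't accept it as a named argument; passing it via extra_body works. A sketch:

response = client.chat.completions.create(
    model="gpt-oss-120b",
    messages=[{"role": "user", "content": "Hello!"}],
    # preset is not a standard OpenAI parameter, so send it in extra_body.
    extra_body={"preset": "environment"}
)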

Provider Filters

Restrict to specific regions or constraints:

{
  "model": "gpt-oss-120b",
  "messages": [...],
  "filters": {
    "countries": ["NL", "FR", "DE"],
    "max_carbon_intensity": 200,
    "max_input_cost": 1.0,
    "min_speed_tps": 500
  }
}

| Filter | Type | Description |
|--------|------|-------------|
| countries | string[] | Allowed country codes (e.g., ["NL", "FR", "DE"]) |
| max_carbon_intensity | number | Max g CO2/kWh grid intensity |
| max_input_cost | number | Max EUR per million input tokens |
| min_speed_tps | number | Min tokens per second |
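
Filters can be sent the same way; a sketch:

response = client.chat.completions.create(
    model="gpt-oss-120b",
    messages=[{"role": "user", "content": "Hello!"}],
    # filters is a Melious extension, so send it in extra_body.
    extra_body={
        "filters": {
            "countries": ["NL", "FR", "DE"],
            "max_carbon_intensity": 200
        }
    }
)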

Error Handling

Common Errors

| Error Code | Description | Solution |
|------------|-------------|----------|
| AUTH_INVALID_API_KEY | Invalid API key | Check your API key |
| BILLING_INSUFFICIENT_ENERGY | Not enough energy | Top up or wait for regeneration |
| VALIDATION_REQUIRED_FIELD | Missing field | Add required model or messages |
| INFERENCE_PROVIDER_ERROR | Provider failed | Retry or use a different preset |

Error Response

{
  "error": {
    "message": "Model field is required",
    "type": "invalid_request_error",
    "code": "VALIDATION_REQUIRED_FIELD"
  }
}
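
With the OpenAI Python SDK these arrive as exceptions, so a simple pattern is to treat auth failures as fatal and retry provider errors with backoff; a minimal sketch:

import time
import openai

for attempt in range(3):
    try:
        response = client.chat.completions.create(
            model="gpt-oss-120b",
            messages=[{"role": "user", "content": "Hello!"}]
        )
        break
    except openai.AuthenticationError:
        raise  # AUTH_INVALID_API_KEY: fix the key, retrying won't help
    except openai.APIStatusError:
        time.sleep(2 ** attempt)  # e.g. INFERENCE_PROVIDER_ERROR: back off, retry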

Best Practices

Set Temperature Appropriately

  • 0.0-0.3: Factual, deterministic responses
  • 0.7-1.0: Creative, varied responses
  • 1.0-2.0: Very creative, may be incoherent
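
For example, a low temperature suits factual Q&A and extraction:

response = client.chat.completions.create(
    model="gpt-oss-120b",
    messages=[{"role": "user", "content": "What year did the Berlin Wall fall?"}],
    temperature=0.2  # low temperature for factual, deterministic output
)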

Use System Messages

Guide model behavior with clear system instructions:

messages=[
    {"role": "system", "content": "You are a helpful coding assistant. Always include code examples."},
    {"role": "user", "content": "..."}
]

Set max_tokens

Prevent unexpectedly long responses and control costs:

response = client.chat.completions.create(
    model="gpt-oss-120b",
    messages=[...],
    max_tokens=500
)

Enable Streaming for Long Responses

Improve perceived latency with streaming:

stream = client.chat.completions.create(
    model="gpt-oss-120b",
    messages=[...],
    stream=True
)
