Chat Completions
OpenAI-compatible chat completions with vision, function calling, and streaming
Generate AI responses using the OpenAI-compatible chat completions endpoint. Supports text, vision, function calling, structured output, and streaming.
POST /v1/chat/completions
Quick Example
from openai import OpenAI
client = OpenAI(
api_key="sk-mel-your-api-key-here",
base_url="https://api.melious.ai/v1"
)
response = client.chat.completions.create(
model="gpt-oss-120b",
messages=[
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "What is the capital of France?"}
]
)
print(response.choices[0].message.content)
import OpenAI from 'openai';
const client = new OpenAI({
apiKey: 'sk-mel-your-api-key-here',
baseURL: 'https://api.melious.ai/v1'
});
const response = await client.chat.completions.create({
model: 'gpt-oss-120b',
messages: [
{ role: 'system', content: 'You are a helpful assistant.' },
{ role: 'user', content: 'What is the capital of France?' }
]
});
console.log(response.choices[0].message.content);
curl https://api.melious.ai/v1/chat/completions \
-H "Authorization: Bearer sk-mel-your-api-key-here" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-oss-120b",
"messages": [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "What is the capital of France?"}
]
}'
Request Parameters
Required Parameters
| Parameter | Type | Description |
|---|---|---|
| `model` | string | Model ID (e.g., `gpt-oss-120b`, `qwen3-235b-a22b-instruct`) |
| `messages` | array | Array of message objects |
Optional Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `temperature` | number | 1.0 | Sampling temperature (0-2) |
| `max_tokens` | integer | Model max | Maximum tokens to generate |
| `top_p` | number | 1.0 | Nucleus sampling parameter |
| `frequency_penalty` | number | 0 | Frequency penalty (-2 to 2) |
| `presence_penalty` | number | 0 | Presence penalty (-2 to 2) |
| `stop` | string/array | null | Stop sequences |
| `stream` | boolean | false | Enable streaming |
| `n` | integer | 1 | Number of completions |
| `seed` | integer | null | Random seed for reproducibility |
| `user` | string | null | End-user identifier |
| `tools` | array | null | Available function tools |
| `tool_choice` | string/object | auto | Tool selection mode |
| `response_format` | object | null | Structured output format |
| `logprobs` | boolean | false | Return log probabilities |
| `top_logprobs` | integer | null | Number of top logprobs to return |
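As a quick sketch, several of these optional parameters can be combined in a single request (`client` is the configured client from the Quick Example; the specific values here are illustrative):

```python
# Illustrative combination of optional parameters (values are examples only)
response = client.chat.completions.create(
    model="gpt-oss-120b",
    messages=[{"role": "user", "content": "List three prime numbers."}],
    temperature=0.2,   # low temperature for a factual task
    max_tokens=100,    # cap output length
    stop=["\n\n"],     # stop at the first blank line
    seed=42,           # best-effort reproducibility
)
```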
Melious Extensions
| Parameter | Type | Default | Description |
|---|---|---|---|
| `preset` | string | balanced | Routing preset (`balanced`/`speed`/`price`/`quality`/`environment`) |
| `filters` | object | null | Provider constraints |
Message Format
Message Roles
| Role | Description |
|---|---|
| `system` | System instructions (first message) |
| `user` | User input |
| `assistant` | AI responses (for conversation history) |
| `tool` | Tool/function call results |
Text Message
{
"role": "user",
"content": "Hello, how are you?"
}
Multi-turn Conversation
{
"messages": [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "What is 2+2?"},
{"role": "assistant", "content": "2+2 equals 4."},
{"role": "user", "content": "And 3+3?"}
]
}
Vision (Image Input)
Send images for visual understanding with vision-capable models.
Supported Models
Vision-capable models that can process images:
- mistral-small-3.2-24b-instruct (Mistral)
Image URL
{
"role": "user",
"content": [
{"type": "text", "text": "What's in this image?"},
{
"type": "image_url",
"image_url": {"url": "https://example.com/image.jpg"}
}
]
}
Base64 Image
{
"role": "user",
"content": [
{"type": "text", "text": "Describe this image."},
{
"type": "image_url",
"image_url": {"url": "data:image/jpeg;base64,/9j/4AAQ..."}
}
]
}
Multiple Images
{
"role": "user",
"content": [
{"type": "text", "text": "Compare these two images."},
{"type": "image_url", "image_url": {"url": "https://example.com/image1.jpg"}},
{"type": "image_url", "image_url": {"url": "https://example.com/image2.jpg"}}
]
}
Python Example
response = client.chat.completions.create(
model="mistral-small-3.2-24b-instruct", # Vision-capable model
messages=[
{
"role": "user",
"content": [
{"type": "text", "text": "What's in this image?"},
{
"type": "image_url",
"image_url": {"url": "https://example.com/image.jpg"}
}
]
}
],
max_tokens=300
)
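To send a local image instead of a URL, you can build the base64 data URI yourself. A minimal sketch (the file path `photo.jpg` is a placeholder):

```python
import base64

# Encode a local file as a data URI (photo.jpg is a placeholder path)
with open("photo.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="mistral-small-3.2-24b-instruct",  # Vision-capable model
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image."},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"},
                },
            ],
        }
    ],
    max_tokens=300,
)
```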
Function Calling
Let models call your functions to perform actions or retrieve data.
Define Tools
{
"tools": [
{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get current weather for a location",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "City name, e.g., San Francisco, CA"
},
"unit": {
"type": "string",
"enum": ["celsius", "fahrenheit"],
"description": "Temperature unit"
}
},
"required": ["location"]
}
}
}
]
}
Using Tool Slugs
Instead of defining tools inline, pass tool slugs from your collection:
{
"model": "auto",
"messages": [{"role": "user", "content": "Search for AI news"}],
"tools": ["web-search", "scrape-url"]
}
Get available tools with GET /v1/tools/openai. Add tools to your collection first via POST /v1/user/tools/{slug}.
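A sketch of that setup flow over plain HTTP, using the two endpoints above (the response shape of the listing endpoint isn't documented here, so treat the parsing as an assumption):

```python
import requests

headers = {"Authorization": "Bearer sk-mel-your-api-key-here"}

# List tools available in OpenAI-compatible format
available = requests.get(
    "https://api.melious.ai/v1/tools/openai", headers=headers
).json()

# Add the web-search tool to your collection before referencing its slug
requests.post(
    "https://api.melious.ai/v1/user/tools/web-search", headers=headers
).raise_for_status()
```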
Mixed formats: Combine slugs with full definitions:
{
"tools": [
"web-search",
{"type": "function", "function": {"name": "custom_tool", "description": "...", "parameters": {}}}
]
}
Tool Choice
| Value | Description |
|---|---|
"auto" | Model decides whether to call tools |
"none" | Never call tools |
"required" | Must call at least one tool |
{"type": "function", "function": {"name": "get_weather"}} | Force specific function |
Complete Example
import json
from openai import OpenAI
client = OpenAI(
api_key="sk-mel-your-api-key-here",
base_url="https://api.melious.ai/v1"
)
tools = [
{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get current weather",
"parameters": {
"type": "object",
"properties": {
"location": {"type": "string"}
},
"required": ["location"]
}
}
}
]
# Step 1: Initial request
response = client.chat.completions.create(
model="gpt-oss-120b",
messages=[{"role": "user", "content": "What's the weather in Paris?"}],
tools=tools
)
# Step 2: Check for tool calls
message = response.choices[0].message
if message.tool_calls:
# Execute your function
tool_call = message.tool_calls[0]
args = json.loads(tool_call.function.arguments)
# Your function implementation
weather_result = {"temperature": 22, "condition": "sunny"}
# Step 3: Send result back
response = client.chat.completions.create(
model="gpt-oss-120b",
messages=[
{"role": "user", "content": "What's the weather in Paris?"},
message,
{
"role": "tool",
"tool_call_id": tool_call.id,
"content": json.dumps(weather_result)
}
],
tools=tools
)
print(response.choices[0].message.content)
import OpenAI from 'openai';
const client = new OpenAI({
apiKey: 'sk-mel-your-api-key-here',
baseURL: 'https://api.melious.ai/v1'
});
const tools = [
{
type: 'function',
function: {
name: 'get_weather',
description: 'Get current weather',
parameters: {
type: 'object',
properties: {
location: { type: 'string' }
},
required: ['location']
}
}
}
];
// Step 1: Initial request
let response = await client.chat.completions.create({
model: 'gpt-oss-120b',
messages: [{ role: 'user', content: "What's the weather in Paris?" }],
tools
});
// Step 2: Check for tool calls
const message = response.choices[0].message;
if (message.tool_calls) {
const toolCall = message.tool_calls[0];
const args = JSON.parse(toolCall.function.arguments);
// Your function implementation
const weatherResult = { temperature: 22, condition: 'sunny' };
// Step 3: Send result back
response = await client.chat.completions.create({
model: 'gpt-oss-120b',
messages: [
{ role: 'user', content: "What's the weather in Paris?" },
message,
{
role: 'tool',
tool_call_id: toolCall.id,
content: JSON.stringify(weatherResult)
}
],
tools
});
}
console.log(response.choices[0].message.content);
Structured Output
Get responses in a specific JSON format.
JSON Object Mode
{
"model": "gpt-oss-120b",
"messages": [...],
"response_format": {"type": "json_object"}
}
When using JSON mode, instruct the model in your prompt to produce JSON output.
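A minimal sketch in Python, with the JSON instruction placed in the system message:

```python
import json

response = client.chat.completions.create(
    model="gpt-oss-120b",
    messages=[
        {
            "role": "system",
            "content": "Reply with a JSON object with a single 'answer' field.",
        },
        {"role": "user", "content": "What is the capital of France?"},
    ],
    response_format={"type": "json_object"},
)
data = json.loads(response.choices[0].message.content)
```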
JSON Schema Mode
{
"model": "gpt-oss-120b",
"messages": [
{"role": "user", "content": "Extract the person's name and age."}
],
"response_format": {
"type": "json_schema",
"json_schema": {
"name": "person_info",
"strict": true,
"schema": {
"type": "object",
"properties": {
"name": {"type": "string"},
"age": {"type": "integer"}
},
"required": ["name", "age"]
}
}
}
}
Python Example
response = client.chat.completions.create(
model="gpt-oss-120b",
messages=[
{"role": "user", "content": "List 3 famous scientists with their fields."}
],
response_format={
"type": "json_schema",
"json_schema": {
"name": "scientists",
"strict": True,
"schema": {
"type": "object",
"properties": {
"scientists": {
"type": "array",
"items": {
"type": "object",
"properties": {
"name": {"type": "string"},
"field": {"type": "string"}
},
"required": ["name", "field"]
}
}
},
"required": ["scientists"]
}
}
}
)
import json
data = json.loads(response.choices[0].message.content)
Automatic JSON Repair
Melious Feature: When using json_object or json_schema modes, malformed JSON responses are automatically repaired.
Melious automatically fixes common JSON issues from LLM outputs:
| Issue | Example | Auto-Fixed |
|---|---|---|
| Markdown wrapping | ```json {...}``` | {...} |
| Missing brackets | {"key": "value" | {"key": "value"} |
| Trailing commas | {"a": 1, "b": 2,} | {"a": 1, "b": 2} |
| Single quotes | {'key': 'value'} | {"key": "value"} |
| Unquoted keys | {key: "value"} | {"key": "value"} |
| Comments | {"key": "value" // comment} | {"key": "value"} |
This means you get valid JSON even when models occasionally produce malformed output, with no manual repair or extra error handling required.
Streaming
Get responses in real-time as they're generated.
stream = client.chat.completions.create(
model="gpt-oss-120b",
messages=[{"role": "user", "content": "Write a short story."}],
stream=True
)
for chunk in stream:
if chunk.choices[0].delta.content:
print(chunk.choices[0].delta.content, end="", flush=True)
const stream = await client.chat.completions.create({
model: 'gpt-oss-120b',
messages: [{ role: 'user', content: 'Write a short story.' }],
stream: true
});
for await (const chunk of stream) {
const content = chunk.choices[0]?.delta?.content;
if (content) process.stdout.write(content);
}
curl https://api.melious.ai/v1/chat/completions \
-H "Authorization: Bearer sk-mel-your-api-key-here" \
-H "Content-Type: application/json" \
-N \
-d '{
"model": "gpt-oss-120b",
"messages": [{"role": "user", "content": "Write a short story."}],
"stream": true
}'
See Streaming for more details.
Response Format
Non-Streaming Response
{
"id": "chatcmpl-abc123",
"object": "chat.completion",
"created": 1699999999,
"model": "gpt-oss-120b",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "The capital of France is Paris."
},
"finish_reason": "stop",
"logprobs": null
}
],
"usage": {
"prompt_tokens": 15,
"completion_tokens": 8,
"total_tokens": 23
},
"environment_impact": {
"carbon_g_co2": 0.06,
"water_liters": 0.0002,
"energy_kwh": 0.00015,
"renewable_percent": 85,
"pue": 1.18,
"provider_id": "nebius",
"location": "NL"
}
}
Finish Reasons
| Reason | Description |
|---|---|
| `stop` | Natural end or stop sequence |
| `length` | Hit the `max_tokens` limit |
| `tool_calls` | Model wants to call tools |
| `content_filter` | Content filtered |
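In practice you can branch on the finish reason, for example to detect truncation (`handle_tool_calls` is a hypothetical helper):

```python
choice = response.choices[0]
if choice.finish_reason == "length":
    # Output was truncated at max_tokens; consider raising the limit
    print("Warning: response truncated")
elif choice.finish_reason == "tool_calls":
    # Execute the requested tools and send the results back
    handle_tool_calls(choice.message.tool_calls)  # hypothetical helper
```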
Routing Presets
Optimize routing based on your priorities:
{
"model": "gpt-oss-120b",
"messages": [...],
"preset": "environment"
}
| Preset | Optimizes For |
|---|---|
| `balanced` | All metrics (default) |
| `speed` | Lowest latency |
| `price` | Lowest cost |
| `quality` | Highest quality |
| `environment` | Lowest carbon footprint |
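Because `preset` is a Melious extension, the OpenAI SDKs don't expose it as a named argument; with the Python SDK you can pass it through `extra_body` (a sketch):

```python
response = client.chat.completions.create(
    model="gpt-oss-120b",
    messages=[{"role": "user", "content": "Hello!"}],
    extra_body={"preset": "environment"},  # Melious routing preset
)
```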
Provider Filters
Restrict routing to specific regions or apply other provider constraints:
{
"model": "gpt-oss-120b",
"messages": [...],
"filters": {
"countries": ["NL", "FR", "DE"],
"max_carbon_intensity": 200,
"max_input_cost": 1.0,
"min_speed_tps": 500
}
}
| Filter | Type | Description |
|---|---|---|
| `countries` | string[] | Allowed country codes (e.g., `["NL", "FR", "DE"]`) |
| `max_carbon_intensity` | number | Max g CO2/kWh grid intensity |
| `max_input_cost` | number | Max EUR per million input tokens |
| `min_speed_tps` | number | Min tokens per second |
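Like `preset`, `filters` is a Melious extension and can be passed via `extra_body` with the OpenAI Python SDK (a sketch):

```python
response = client.chat.completions.create(
    model="gpt-oss-120b",
    messages=[{"role": "user", "content": "Hello!"}],
    extra_body={
        "filters": {
            "countries": ["NL", "FR", "DE"],  # EU-only routing
            "max_carbon_intensity": 200,      # g CO2/kWh
        }
    },
)
```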
Error Handling
Common Errors
| Error Code | Description | Solution |
|---|---|---|
| `AUTH_INVALID_API_KEY` | Invalid API key | Check your API key |
| `BILLING_INSUFFICIENT_ENERGY` | Not enough energy | Top up or wait for regeneration |
| `VALIDATION_REQUIRED_FIELD` | Missing required field | Add the required `model` or `messages` field |
| `INFERENCE_PROVIDER_ERROR` | Provider failed | Retry or use a different preset |
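A sketch of handling these with the OpenAI Python SDK's standard exception classes (the mapping of Melious error codes onto specific exception types is an assumption):

```python
import openai

try:
    response = client.chat.completions.create(
        model="gpt-oss-120b",
        messages=[{"role": "user", "content": "Hello!"}],
    )
except openai.AuthenticationError:
    # Likely AUTH_INVALID_API_KEY: check the key and base_url
    raise
except openai.APIStatusError as e:
    # Other API errors; the body follows the error format shown below
    print(e.status_code, e.message)
```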
Error Response
{
"error": {
"message": "Model field is required",
"type": "invalid_request_error",
"code": "VALIDATION_REQUIRED_FIELD"
}
}
Best Practices
Set Temperature Appropriately
- 0.0-0.3: Factual, deterministic responses
- 0.7-1.0: Creative, varied responses
- 1.0-2.0: Very creative, may be incoherent
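For example, a factual query is best served with a low temperature:

```python
response = client.chat.completions.create(
    model="gpt-oss-120b",
    messages=[{"role": "user", "content": "What year did Apollo 11 land on the Moon?"}],
    temperature=0.2,  # factual, near-deterministic output
)
```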
Use System Messages
Guide model behavior with clear system instructions:
messages=[
{"role": "system", "content": "You are a helpful coding assistant. Always include code examples."},
{"role": "user", "content": "..."}
]
Set max_tokens
Prevent unexpectedly long responses and control costs:
response = client.chat.completions.create(
model="gpt-oss-120b",
messages=[...],
max_tokens=500
)
Enable Streaming for Long Responses
Improve perceived latency with streaming:
stream = client.chat.completions.create(
model="gpt-oss-120b",
messages=[...],
stream=True
)
See Also
- Streaming - Real-time streaming details
- Models - Available models and capabilities
- Embeddings - Vector embeddings
- Tools API - Execute tools with AI