
From Anthropic

Point Claude Code or the Anthropic SDK at Melious with two environment variables

Melious implements Anthropic's Messages API (POST /v1/messages) alongside the OpenAI API. If you're using Claude Code, the Anthropic SDK, or anything that speaks that shape, you don't need a new client — just two environment variables.

Claude Code

Claude Code reads ANTHROPIC_API_KEY and ANTHROPIC_BASE_URL. Point both at Melious:

export ANTHROPIC_API_KEY=sk-mel-<YOUR_API_KEY>
export ANTHROPIC_BASE_URL=https://api.melious.ai
claude

That's the whole setup. Your sessions now route through Melious, which runs the inference on open-weight models and sends back Anthropic-shaped responses. No code changes in Claude Code itself.

The base URL is https://api.melious.ai — not .../v1. Anthropic's SDK appends its own version prefix, so you give it the root.
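To see why the trailing /v1 must be omitted, here is the URL joining in miniature (an illustrative sketch of the SDK-side behavior described above, not Melious or SDK source):

```python
# The Anthropic SDK appends its own version prefix to whatever base URL you give it.
base_url = "https://api.melious.ai"
sdk_path = "/v1/messages"  # added by the SDK, not by you

final_url = base_url.rstrip("/") + sdk_path
print(final_url)  # https://api.melious.ai/v1/messages

# If you had set base_url to "https://api.melious.ai/v1", the request
# would target /v1/v1/messages and never reach the endpoint.
```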

Anthropic SDK

Same idea, different caller:

from anthropic import Anthropic

client = Anthropic(
    api_key="sk-mel-<YOUR_API_KEY>",
    base_url="https://api.melious.ai",
)

response = client.messages.create(
    model="claude-sonnet-4",
    max_tokens=256,
    messages=[{"role": "user", "content": "Name three Hanseatic cities."}],
)
print(response.content[0].text)

And in TypeScript:
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({
  apiKey: "sk-mel-<YOUR_API_KEY>",
  baseURL: "https://api.melious.ai",
});

const response = await client.messages.create({
  model: "claude-sonnet-4",
  max_tokens: 256,
  messages: [{ role: "user", content: "Name three Hanseatic cities." }],
});
console.log(response.content[0].text);

How model names are mapped

We don't run Claude. When you send a claude-* model ID, we map it to an open-weight model before the inference runs. The Anthropic SDK hardcodes these names, and Claude Code ships with them — so the mapping is what lets those clients work unmodified.

The defaults:

You send                                        We run
anything containing opus (case-insensitive)     glm-4.7
anything containing sonnet                      deepseek-v3.2
anything containing haiku                       gpt-oss-120b
anything else                                   the string as-is

These are defaults — an admin can override them per-instance via config keys anthropic.model_mapping.opus, anthropic.model_mapping.sonnet, and anthropic.model_mapping.haiku.
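The lookup can be sketched in a few lines (a hypothetical reconstruction of the default behavior described above, not Melious source code):

```python
# Default substring mapping, checked case-insensitively.
DEFAULT_MAPPING = {
    "opus": "glm-4.7",
    "sonnet": "deepseek-v3.2",
    "haiku": "gpt-oss-120b",
}

def map_model(requested: str) -> str:
    """Return the open-weight model that runs for a requested model ID."""
    lowered = requested.lower()
    for needle, target in DEFAULT_MAPPING.items():
        if needle in lowered:
            return target
    return requested  # anything else runs as-is

map_model("claude-sonnet-4")  # deepseek-v3.2
map_model("glm-4.7")          # glm-4.7, untouched
```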

The response carries the original name back: if you sent claude-sonnet-4, the response's model field is claude-sonnet-4, so Claude Code's internal assumptions don't break. The model that actually ran appears in usage tracking on the Melious side, not in the wire response.

To run a specific open-weight model directly, send its ID:

response = client.messages.create(
    model="glm-4.7",   # no mapping — runs exactly this model
    max_tokens=256,
    messages=[...],
)

What's supported

Everything Anthropic SDK clients rely on for normal chat work:

  • System prompts — system as string or content blocks
  • Multi-turn messages — the usual role: "user" | "assistant" alternation
  • Tool use — Anthropic-shape tools + tool_choice, with tool_use / tool_result blocks in the response
  • Streaming — SSE with Anthropic event types (message_start, content_block_delta, message_stop)
  • Vision — image content blocks with base64 or URL sources (model has to support it)
  • Stop sequences, temperature, top_p, top_k, max_tokens — passed through
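For tool use, the request shape is Anthropic's. A minimal payload looks like this (the tool name and schema here are invented for illustration):

```python
# An Anthropic-shape tool definition: name, description, JSON Schema input.
weather_tool = {
    "name": "get_weather",
    "description": "Look up current weather for a city.",
    "input_schema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

request = {
    "model": "claude-sonnet-4",
    "max_tokens": 256,
    "tools": [weather_tool],
    "tool_choice": {"type": "auto"},  # let the model decide whether to call it
    "messages": [{"role": "user", "content": "Weather in Lübeck?"}],
}
```

If the model decides to call the tool, the response contains a tool_use block; you answer with a tool_result block in the next user turn, as with Anthropic directly.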

Token counting is also implemented:

counted = client.messages.count_tokens(
    model="claude-sonnet-4",
    messages=[{"role": "user", "content": "Hello"}],
)
print(counted.input_tokens)

See the Messages and Count tokens reference pages for full field details.

What's not supported

Honest list:

  • Anthropic prompt caching (cache_control) is not exposed. Open-weight providers don't surface the same shape of cache handle, and we haven't built a translation. If you need this, tell us.
  • Extended thinking blocks pass through as regular text for non-reasoning models. Reasoning models (e.g. deepseek-r1-0528, kimi-k2-thinking) produce thinking as part of their output, but the exact Anthropic thinking block shape isn't preserved.
  • Fine-grained tool streaming events — we emit content_block_delta for text and input_json_delta for tool arguments, matching the SDK's parser; less common events may be absent.
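For orientation, the event sequence a plain streamed text response produces looks like this (a sketch assuming a single text content block; keep-alive events such as ping are omitted):

```python
# Anthropic-style SSE event types, in emission order for one text block.
events = [
    "message_start",
    "content_block_start",
    "content_block_delta",  # repeated once per text chunk
    "content_block_stop",
    "message_delta",        # final usage / stop_reason
    "message_stop",
]
```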

The intent is: if Claude Code and the stock Anthropic SDK work, we call it supported. If you're doing something exotic and something's missing, open an issue.

When it breaks

  • anthropic-version header missing → 400. The Anthropic SDK sets this automatically; curl callers need to set it manually. Any recent version string (e.g. 2023-06-01) works.
  • Requests hang on the first token — check that ANTHROPIC_BASE_URL has no trailing /v1. The SDK adds it.
  • Tool loops behave differently than with Claude — this is a model-capability thing, not a protocol thing. Stronger models (glm-4.7, kimi-k2-thinking) handle longer tool sequences better than smaller ones.
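If you're calling with curl or raw HTTP rather than an SDK, these are the headers to set (a sketch; the SDK sets all of these itself, and 2023-06-01 is just one accepted version string):

```python
# Headers a raw HTTP caller must supply on POST /v1/messages.
headers = {
    "x-api-key": "sk-mel-<YOUR_API_KEY>",     # same key as ANTHROPIC_API_KEY
    "anthropic-version": "2023-06-01",        # required; omitting it yields a 400
    "content-type": "application/json",
}
```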

Error codes and retry guidance: Errors.
