Rerank

Given a query and a list of documents, returns the documents reordered by relevance to the query. The request and response shapes are Cohere-compatible.

Endpoint:

POST /v1/rerank

Auth: Bearer token or `x-api-key` header. Requires the `inference.rerank` scope.

Example

Python:

import httpx

response = httpx.post(
    "https://api.melious.ai/v1/rerank",
    headers={"Authorization": "Bearer sk-mel-<YOUR_API_KEY>"},
    json={
        "model": "bge-reranker-v2-m3",
        "query": "Which Hanseatic cities are still relevant today?",
        "documents": [
            "Hamburg is Germany's second-largest city and largest port.",
            "Berlin is the capital but was never a Hanseatic city.",
            "Lübeck was the Hansa's de facto capital and still hosts its archives.",
        ],
        "top_n": 2,
    },
).json()

for hit in response["results"]:
    print(hit["index"], hit["relevance_score"])

JavaScript:

const response = await fetch("https://api.melious.ai/v1/rerank", {
  method: "POST",
  headers: {
    "Authorization": "Bearer sk-mel-<YOUR_API_KEY>",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "bge-reranker-v2-m3",
    query: "Which Hanseatic cities are still relevant today?",
    documents: [
      "Hamburg is Germany's second-largest city and largest port.",
      "Berlin is the capital but was never a Hanseatic city.",
      "Lübeck was the Hansa's de facto capital and still hosts its archives.",
    ],
    top_n: 2,
  }),
}).then((r) => r.json());

for (const hit of response.results) {
  console.log(hit.index, hit.relevance_score);
}

curl:

curl https://api.melious.ai/v1/rerank \
  -H "Authorization: Bearer sk-mel-<YOUR_API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "bge-reranker-v2-m3",
    "query": "Which Hanseatic cities are still relevant today?",
    "documents": [
      "Hamburg is Germany'"'"'s second-largest city and largest port.",
      "Berlin is the capital but was never a Hanseatic city.",
      "Lübeck was the Hansa'"'"'s de facto capital and still hosts its archives."
    ],
    "top_n": 2
  }'

Request

Parameter	Type	Default	Description
`model`	string	—	Rerank model ID. Filter at melious.ai/hub for rerankers.
`query`	string	—	The query to rank documents against.
`documents`	array	—	Documents as strings, or as `{"text": "..."}` objects.
`top_n`	integer	all	Return only the top N documents.
`return_documents`	boolean	`true`	Include the document text in each result. Set to `false` to get only indices and scores.
`max_chunks_per_doc`	integer	provider default	Chunk long documents before ranking.
`user`	string	none	End-user identifier.

Two-stage retrieval is the common pattern: run an embeddings-based vector search to get a candidate set of roughly 50–200 documents, then rerank to pick the top 3–10 for the actual LLM call. The rerank model reads the query and each document in full, so it is slower per item than vector search; keep the candidate list small.
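The two-stage pattern can be sketched as a small helper that trims vector-search hits into a rerank request body. This is a minimal sketch, not an official client: `build_rerank_payload`, the `(text, similarity)` tuple shape, and the `max_candidates` default of 100 are assumptions for illustration. Only the JSON field names come from the request table.

```python
from typing import Any


def build_rerank_payload(
    query: str,
    hits: list[tuple[str, float]],
    model: str = "bge-reranker-v2-m3",
    max_candidates: int = 100,  # assumed cap, picked from the 50-200 range
    top_n: int = 5,
) -> dict[str, Any]:
    """Turn stage-one vector-search hits into a POST /v1/rerank body.

    `hits` is a hypothetical list of (text, similarity) pairs from your own
    vector store; a higher similarity means a closer match.
    """
    # Keep only the best candidates so the slower rerank pass stays cheap.
    best = sorted(hits, key=lambda hit: hit[1], reverse=True)[:max_candidates]
    return {
        "model": model,
        "query": query,
        "documents": [text for text, _ in best],
        "top_n": min(top_n, len(best)),
    }


hits = [
    ("Berlin is the capital but was never a Hanseatic city.", 0.41),
    ("Hamburg is Germany's second-largest city and largest port.", 0.78),
    ("Lübeck was the Hansa's de facto capital and still hosts its archives.", 0.83),
]
payload = build_rerank_payload(
    "Which Hanseatic cities are still relevant today?", hits, max_candidates=2
)
print(payload["documents"])
print(payload["top_n"])
```

Note that each `index` in the response refers to a position in `payload["documents"]` (the trimmed, re-sorted list), not in the original `hits` list.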

Response

{
  "id": "rerank-...",
  "results": [
    { "index": 2, "relevance_score": 0.94, "document": { "text": "Lübeck was..." } },
    { "index": 0, "relevance_score": 0.71, "document": { "text": "Hamburg is..." } }
  ],
  "meta": { "billed_units": { "search_units": 1 } },
  "usage": { "prompt_tokens": 42, "total_tokens": 42 },
  "environment_impact": { "energy_kwh": 0.00003, "carbon_g_co2": 0.01, "water_liters": 0.00004, "renewable_percent": 88, "pue": 1.16, "provider_id": "scaleway", "location": "FR" },
  "billing_cost": { "energy": "0.0001", "credits": "0.0", "paid_with": "energy" }
}

`results[]`

`index`: zero-based position in the original `documents` list.
`relevance_score`: a value in [0, 1]. Scores are provider-dependent, so don't compare them across models.
`document`: present only when `return_documents` is `true`.
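Because `document` is only present when `return_documents` is true, code that reads results often falls back to the submitted list via `index`. The helper below is one hypothetical way to do that: `resolve_results` is not part of the API, and the sample response is hand-written in the shape shown in the Response section.

```python
from typing import Any, Union

Document = Union[str, dict[str, str]]


def _text(doc: Document) -> str:
    """Documents may be plain strings or {"text": "..."} objects."""
    return doc if isinstance(doc, str) else doc["text"]


def resolve_results(
    documents: list[Document], response: dict[str, Any]
) -> list[tuple[int, float, str]]:
    """Return (index, relevance_score, text) per result, in response order."""
    resolved = []
    for hit in response["results"]:
        if "document" in hit:
            text = hit["document"]["text"]
        else:
            # No document echoed back: look the text up by its original position.
            text = _text(documents[hit["index"]])
        resolved.append((hit["index"], hit["relevance_score"], text))
    return resolved


documents = [
    "Hamburg is Germany's second-largest city and largest port.",
    {"text": "Berlin is the capital but was never a Hanseatic city."},
    "Lübeck was the Hansa's de facto capital and still hosts its archives.",
]
# Hand-written sample, as if return_documents had been false.
sample = {
    "results": [
        {"index": 2, "relevance_score": 0.94},
        {"index": 0, "relevance_score": 0.71},
    ]
}
for index, score, text in resolve_results(documents, sample):
    print(index, score, text)
```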

Errors

`VALIDATION_4002`: `query` or `documents` is missing.
`INFERENCE_3001`: unknown model.
`AUTH_1015`: the API key lacks the `inference.rerank` scope.
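A client can map these codes to actionable hints. The sketch below assumes you have already extracted the code string from a failed response; where the code sits in the error body is not specified in this section, so that step is left out. `ERROR_HINTS` and `hint_for` are hypothetical names.

```python
# Hints paraphrase the three error codes listed in the Errors section.
ERROR_HINTS = {
    "VALIDATION_4002": "Send both `query` and `documents` in the request body.",
    "INFERENCE_3001": "Check the `model` ID against the available rerankers.",
    "AUTH_1015": "Use an API key that has the `inference.rerank` scope.",
}


def hint_for(code: str) -> str:
    """Return a short remediation hint for a documented rerank error code."""
    return ERROR_HINTS.get(code, f"Unrecognized error code: {code}")


print(hint_for("AUTH_1015"))
```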

Related

Pre-ranking with Embeddings.
Routing, if you want to bias toward lower-cost providers for bulk rerank jobs.
