Rerank

POST /v1/rerank — reorder documents by relevance to a query

Given a query and a list of documents, the endpoint returns the documents reordered by relevance to the query. The request and response shapes are Cohere-compatible.

Endpoint:

POST /v1/rerank

Auth: Bearer token or x-api-key header. Requires the inference.rerank scope.

Example

Python:

import httpx

response = httpx.post(
    "https://api.melious.ai/v1/rerank",
    headers={"Authorization": "Bearer sk-mel-<YOUR_API_KEY>"},
    json={
        "model": "bge-reranker-v2-m3",
        "query": "Which Hanseatic cities are still relevant today?",
        "documents": [
            "Hamburg is Germany's second-largest city and largest port.",
            "Berlin is the capital but was never a Hanseatic city.",
            "Lübeck was the Hansa's de facto capital and still hosts its archives.",
        ],
        "top_n": 2,
    },
).json()

for hit in response["results"]:
    print(hit["index"], hit["relevance_score"])

JavaScript:

const response = await fetch("https://api.melious.ai/v1/rerank", {
  method: "POST",
  headers: {
    "Authorization": "Bearer sk-mel-<YOUR_API_KEY>",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "bge-reranker-v2-m3",
    query: "Which Hanseatic cities are still relevant today?",
    documents: [
      "Hamburg is Germany's second-largest city and largest port.",
      "Berlin is the capital but was never a Hanseatic city.",
      "Lübeck was the Hansa's de facto capital and still hosts its archives.",
    ],
    top_n: 2,
  }),
}).then((r) => r.json());

for (const hit of response.results) {
  console.log(hit.index, hit.relevance_score);
}

curl:

curl https://api.melious.ai/v1/rerank \
  -H "Authorization: Bearer sk-mel-<YOUR_API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "bge-reranker-v2-m3",
    "query": "Which Hanseatic cities are still relevant today?",
    "documents": [
      "Hamburg is Germany'"'"'s second-largest city and largest port.",
      "Berlin is the capital but was never a Hanseatic city.",
      "Lübeck was the Hansa'"'"'s de facto capital and still hosts its archives."
    ],
    "top_n": 2
  }'

Request

| Parameter | Type | Default | Description |
|---|---|---|---|
| model | string | | Rerank model ID. Filter at melious.ai/hub for rerankers. |
| query | string | | The query to rank documents against. |
| documents | array | | Documents as strings, or as {"text": "..."} objects. |
| top_n | integer | all | Return only the top N documents. |
| return_documents | boolean | true | Include the document text in each result. Set to false to get only indices and scores. |
| max_chunks_per_doc | integer | provider default | Chunk long documents before ranking. |
| user | string | none | End-user identifier. |
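
The parameters above can be assembled and checked locally before sending, which avoids a round trip for a request the API would reject anyway. The following is a minimal sketch, not part of any Melious SDK: `build_rerank_payload` is a hypothetical helper, and the local checks only mirror what the parameter table and the Errors section state.

```python
from typing import Any, Optional


def build_rerank_payload(
    model: str,
    query: str,
    documents: list,
    top_n: Optional[int] = None,
    return_documents: bool = True,
) -> dict:
    """Build a JSON body for POST /v1/rerank (hypothetical helper)."""
    if not query:
        raise ValueError("query is required")
    if not documents:
        raise ValueError("documents must be a non-empty list")
    for doc in documents:
        # Documents may be plain strings or {"text": "..."} objects.
        is_text_object = isinstance(doc, dict) and isinstance(doc.get("text"), str)
        if not (isinstance(doc, str) or is_text_object):
            raise ValueError(f"unsupported document: {doc!r}")

    payload: dict[str, Any] = {"model": model, "query": query, "documents": list(documents)}
    # Only send optional fields when they differ from the documented defaults.
    if top_n is not None:
        payload["top_n"] = top_n
    if not return_documents:
        payload["return_documents"] = False
    return payload
```

The returned dict can be passed as the `json=` argument in the httpx example above.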

Two-stage retrieval is the common pattern: run an embeddings-based vector search to get a candidate set of roughly 50–200 documents, then rerank to pick the top 3–10 for the actual LLM call. The rerank model reads the query and each document in full, so it is slower per item than vector search; keep the candidate list small.
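
The two-stage pattern might be sketched as follows. This is an illustration under stated assumptions rather than a reference implementation: the query and document vectors are assumed to be precomputed by whatever embedding model you use, `cosine`, `call_rerank`, and `two_stage_search` are hypothetical names, and the brute-force similarity scan stands in for a real vector index.

```python
import json
import math
import urllib.request

API_URL = "https://api.melious.ai/v1/rerank"


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def call_rerank(query, documents, top_n, api_key, model="bge-reranker-v2-m3"):
    """POST to /v1/rerank and return the parsed JSON response."""
    body = json.dumps({
        "model": model,
        "query": query,
        "documents": documents,
        "top_n": top_n,
        "return_documents": False,  # indices and scores are enough here
    }).encode("utf-8")
    request = urllib.request.Request(
        API_URL,
        data=body,
        headers={"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def two_stage_search(query, query_vec, docs, doc_vecs, rerank, k=100, top_n=5):
    """Return (corpus_index, relevance_score) pairs for the best top_n documents."""
    # Stage 1: cheap vector search narrows the corpus to at most k candidates.
    by_similarity = sorted(
        range(len(docs)), key=lambda i: cosine(query_vec, doc_vecs[i]), reverse=True
    )
    candidate_ids = by_similarity[:k]
    candidates = [docs[i] for i in candidate_ids]
    # Stage 2: the reranker scores the query against each remaining candidate.
    response = rerank(query, candidates, top_n)
    # results[].index points into the candidate list, so map back to the corpus.
    return [(candidate_ids[hit["index"]], hit["relevance_score"]) for hit in response["results"]]
```

Passing the rerank call in as a parameter keeps the pipeline testable offline; in real use it could be `lambda q, d, n: call_rerank(q, d, n, api_key="sk-mel-<YOUR_API_KEY>")`.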

Response

{
  "id": "rerank-...",
  "results": [
    { "index": 2, "relevance_score": 0.94, "document": { "text": "Lübeck was..." } },
    { "index": 0, "relevance_score": 0.71, "document": { "text": "Hamburg is..." } }
  ],
  "meta": { "billed_units": { "search_units": 1 } },
  "usage": { "prompt_tokens": 42, "total_tokens": 42 },
  "environment_impact": { "energy_kwh": 0.00003, "carbon_g_co2": 0.01, "water_liters": 0.00004, "renewable_percent": 88, "pue": 1.16, "provider_id": "scaleway", "location": "FR" },
  "billing_cost": { "energy": "0.0001", "credits": "0.0", "paid_with": "energy" }
}

results[]

  • index: position in the original documents list.
  • relevance_score: a value in [0, 1]. Scores are provider-dependent, so don't compare them across models.
  • document: present only when return_documents is true.
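
Because document is omitted when return_documents is false, client code usually needs to recover the text from index. A small sketch of that lookup, assuming the response shape shown above; `result_texts` is a hypothetical helper name.

```python
def result_texts(documents, response):
    """Pair each result's text with its relevance score, in ranked order."""
    ranked = []
    for hit in response["results"]:
        if "document" in hit:
            # return_documents was true: the text is included in the result.
            text = hit["document"]["text"]
        else:
            # Otherwise look the text up by position in the submitted list.
            source = documents[hit["index"]]
            text = source["text"] if isinstance(source, dict) else source
        ranked.append((text, hit["relevance_score"]))
    return ranked
```

Here `documents` must be the same list, in the same order, that was sent in the request.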

Errors

  • VALIDATION_4002: query or documents missing.
  • INFERENCE_3001: unknown model.
  • AUTH_1015: missing inference.rerank scope.

  • Pre-ranking with Embeddings.
  • Routing, if you want to bias toward lower-cost providers for bulk rerank jobs.
