Rerank
POST /v1/rerank — reorder documents by relevance to a query
Given a query and a list of documents, returns the documents reordered by relevance to the query. The request and response shapes are Cohere-compatible.
Endpoint: `POST /v1/rerank`
Auth: Bearer token or `x-api-key`. Requires scope `inference.rerank`.
Example
Python

import httpx

response = httpx.post(
    "https://api.melious.ai/v1/rerank",
    headers={"Authorization": "Bearer sk-mel-<YOUR_API_KEY>"},
    json={
        "model": "bge-reranker-v2-m3",
        "query": "Which Hanseatic cities are still relevant today?",
        "documents": [
            "Hamburg is Germany's second-largest city and largest port.",
            "Berlin is the capital but was never a Hanseatic city.",
            "Lübeck was the Hansa's de facto capital and still hosts its archives.",
        ],
        "top_n": 2,
    },
).json()

for hit in response["results"]:
    print(hit["index"], hit["relevance_score"])

JavaScript

const response = await fetch("https://api.melious.ai/v1/rerank", {
  method: "POST",
  headers: {
    "Authorization": "Bearer sk-mel-<YOUR_API_KEY>",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "bge-reranker-v2-m3",
    query: "Which Hanseatic cities are still relevant today?",
    documents: [
      "Hamburg is Germany's second-largest city and largest port.",
      "Berlin is the capital but was never a Hanseatic city.",
      "Lübeck was the Hansa's de facto capital and still hosts its archives.",
    ],
    top_n: 2,
  }),
}).then((r) => r.json());

for (const hit of response.results) {
  console.log(hit.index, hit.relevance_score);
}

curl

curl https://api.melious.ai/v1/rerank \
  -H "Authorization: Bearer sk-mel-<YOUR_API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "bge-reranker-v2-m3",
    "query": "Which Hanseatic cities are still relevant today?",
    "documents": [
      "Hamburg is Germany'"'"'s second-largest city and largest port.",
      "Berlin is the capital but was never a Hanseatic city.",
      "Lübeck was the Hansa'"'"'s de facto capital and still hosts its archives."
    ],
    "top_n": 2
  }'

Request
| Parameter | Type | Default | Description |
|---|---|---|---|
| `model` | string | - | Rerank model ID. Filter at melious.ai/hub for rerankers. |
| `query` | string | - | The query to rank documents against. |
| `documents` | array | - | Documents as strings, or as `{"text": "..."}` objects. |
| `top_n` | integer | all | Return only the top N documents. |
| `return_documents` | boolean | true | Include the document text in each result. Set to `false` to get only indices and scores. |
| `max_chunks_per_doc` | integer | provider default | Chunk long documents before ranking. |
| `user` | string | none | End-user identifier. |
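As a rough illustration of how the parameters in the table fit together, the sketch below assembles a request body and leaves out any optional field that was not set, so the server-side defaults apply. The helper name `build_rerank_payload` is hypothetical, and the client-side checks (treating an empty `documents` list as missing, rejecting items that are neither strings nor `{"text": "..."}` objects) are assumptions rather than documented server behavior.

```python
from typing import Optional, Sequence, Union

Document = Union[str, dict]


def build_rerank_payload(
    model: str,
    query: str,
    documents: Sequence[Document],
    top_n: Optional[int] = None,
    return_documents: Optional[bool] = None,
    max_chunks_per_doc: Optional[int] = None,
    user: Optional[str] = None,
) -> dict:
    """Build a JSON body for POST /v1/rerank (hypothetical helper)."""
    # Assumption: fail early on the cases the API reports as VALIDATION_4002.
    if not query:
        raise ValueError("query is required")
    if not documents:
        raise ValueError("documents is required and must not be empty")

    for i, doc in enumerate(documents):
        # Documents may be plain strings or {"text": "..."} objects.
        if isinstance(doc, str):
            continue
        if isinstance(doc, dict) and isinstance(doc.get("text"), str):
            continue
        raise TypeError(f"documents[{i}] must be a string or a {{'text': str}} object")

    payload = {"model": model, "query": query, "documents": list(documents)}
    optional = {
        "top_n": top_n,
        "return_documents": return_documents,
        "max_chunks_per_doc": max_chunks_per_doc,
        "user": user,
    }
    # Only send fields the caller set; unset fields fall back to the defaults.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload
```

The returned dictionary can be passed as the `json=` argument of the `httpx.post` call shown in the Example section.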
Two-stage retrieval is the common pattern: embeddings-based vector search to get a candidate set of ~50–200, then rerank to pick the top 3–10 for the actual LLM call. The rerank model reads the query and each document fully, so it's slower per item — keep the candidate list small.
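The two-stage pattern could be sketched as follows. This is only an outline under stated assumptions: `vector_search` and `rerank` are hypothetical callables supplied by the caller (the first wrapping whatever embeddings index is in use, the second wrapping a call to `POST /v1/rerank` and returning the parsed JSON body), and the default sizes are picked from the ranges mentioned above.

```python
from typing import Callable, List, Sequence, Tuple

# (query, corpus, k) -> indices into corpus of the k nearest documents
VectorSearch = Callable[[str, Sequence[str], int], List[int]]
# (query, documents, top_n) -> parsed JSON body of a /v1/rerank response
Rerank = Callable[[str, Sequence[str], int], dict]


def two_stage_retrieve(
    query: str,
    corpus: Sequence[str],
    vector_search: VectorSearch,
    rerank: Rerank,
    candidates: int = 100,
    top_n: int = 5,
) -> List[Tuple[int, float]]:
    """Return (corpus_index, relevance_score) pairs for the best documents."""
    # Stage 1: cheap, recall-oriented vector search over the whole corpus.
    candidate_ids = vector_search(query, corpus, candidates)
    if not candidate_ids:
        return []
    candidate_docs = [corpus[i] for i in candidate_ids]

    # Stage 2: slower, precision-oriented rerank over the small candidate set.
    response = rerank(query, candidate_docs, min(top_n, len(candidate_docs)))

    # Each result's "index" refers to candidate_docs, so map it back
    # to the position in the original corpus.
    return [
        (candidate_ids[hit["index"]], hit["relevance_score"])
        for hit in response["results"]
    ]
```

Keeping both stages as injected callables means the candidate-set size can be tuned without touching the rerank call, and the function can be exercised offline with stubs.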
Response
{
  "id": "rerank-...",
  "results": [
    { "index": 2, "relevance_score": 0.94, "document": { "text": "Lübeck was..." } },
    { "index": 0, "relevance_score": 0.71, "document": { "text": "Hamburg is..." } }
  ],
  "meta": { "billed_units": { "search_units": 1 } },
  "usage": { "prompt_tokens": 42, "total_tokens": 42 },
  "environment_impact": { "energy_kwh": 0.00003, "carbon_g_co2": 0.01, "water_liters": 0.00004, "renewable_percent": 88, "pue": 1.16, "provider_id": "scaleway", "location": "FR" },
  "billing_cost": { "energy": "0.0001", "credits": "0.0", "paid_with": "energy" }
}

results[]
- `index`: position in the original `documents` list.
- `relevance_score`: in the range `[0, 1]`. Provider-dependent, so don't compare scores across models.
- `document`: present only when `return_documents: true`.
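Because each result carries an `index` into the submitted list, the ranked text can be recovered on the client even when `return_documents` is `false`. A minimal sketch, assuming the response has already been parsed into a dictionary; the helper name `resolve_hits` is hypothetical:

```python
from typing import List, Sequence, Tuple, Union

Document = Union[str, dict]


def resolve_hits(
    documents: Sequence[Document], response: dict
) -> List[Tuple[int, float, str]]:
    """Return (index, relevance_score, text) for each result, in ranked order."""
    hits = []
    for result in response["results"]:
        index = result["index"]
        returned = result.get("document")
        if returned is not None:
            # return_documents was true: the text is included in the result.
            text = returned["text"]
        else:
            # return_documents was false: look the text up in the request list,
            # which may hold plain strings or {"text": "..."} objects.
            original = documents[index]
            text = original if isinstance(original, str) else original["text"]
        hits.append((index, result["relevance_score"], text))
    return hits
```

The scores are returned untouched; since they are provider-dependent, any threshold applied to them should be chosen per model.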
Errors
- `VALIDATION_4002`: `query` or `documents` missing.
- `INFERENCE_3001`: unknown model.
- `AUTH_1015`: missing `inference.rerank` scope.
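One way a client might turn these codes into actionable messages is a simple lookup, sketched below. The function and table names are hypothetical, and the sketch deliberately takes the code as a plain string: how the code is carried in the error response body is not specified in this section, so extracting it is left to the caller.

```python
# Hints paraphrase the error list above; the wording is illustrative.
RERANK_ERROR_HINTS = {
    "VALIDATION_4002": "Request is missing 'query' or 'documents'.",
    "INFERENCE_3001": "Unknown model; check the rerank model ID.",
    "AUTH_1015": "API key lacks the 'inference.rerank' scope.",
}


def describe_rerank_error(code: str) -> str:
    """Map a rerank error code to a short hint (hypothetical helper)."""
    return RERANK_ERROR_HINTS.get(code, f"Unrecognized error code: {code}")
```

All three documented errors point at the request or the credentials, so retrying the same call unchanged is unlikely to help.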
Related
- Pre-ranking with Embeddings.
- Routing, if you want to bias toward lower-cost providers for bulk rerank jobs.