Document conversion

POST /v1/documents/convert — turn PDFs, Office files, and images into Markdown, with OCR and streaming progress

Convert PDFs, Office documents, and images into clean Markdown — text extraction where the file has text, OCR where it doesn't. Useful as the front half of a RAG pipeline: convert first, then feed the Markdown to a chat model.

Endpoint:

POST /v1/documents/convert

Auth: a browser session, or an API key with scope kit.documents. Content-Type: multipart/form-data. The file field is named data. Max upload: 50 MB per request.

Example

Python:

import httpx

with open("report.pdf", "rb") as f:
    response = httpx.post(
        "https://api.melious.ai/v1/documents/convert",
        headers={"Authorization": "Bearer sk-mel-<YOUR_API_KEY>"},
        files={"data": ("report.pdf", f, "application/pdf")},
    ).json()

print(response["results"][0]["pages"][0]["content"])

JavaScript (Node.js):

import fs from "node:fs";

const form = new FormData();
form.append("data", new Blob([fs.readFileSync("report.pdf")]), "report.pdf");

const response = await fetch("https://api.melious.ai/v1/documents/convert", {
  method: "POST",
  headers: { Authorization: "Bearer sk-mel-<YOUR_API_KEY>" },
  body: form,
}).then((r) => r.json());

console.log(response.results[0].pages[0].content);

curl:

curl https://api.melious.ai/v1/documents/convert \
  -H "Authorization: Bearer sk-mel-<YOUR_API_KEY>" \
  -F "data=@report.pdf"

Auth and pricing

Two ways to authenticate, with different billing:

  • Browser session — free, but rate-limited to a daily page count that depends on your plan tier. When you hit it, you get KIT_6220 with the limit and reset time in details.
  • API key (scope kit.documents) — charged €0.50 per 1,000 pages (€0.0005/page), no daily cap. Create a scoped key at melious.ai/account/api/keys.

Exact per-plan daily limits are on melious.ai/pricing.
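As a quick sanity check on the API-key rate above, the arithmetic can be sketched as follows. The function name is hypothetical, and the sketch assumes a flat per-page charge with no rounding rules or minimum fees, which this page does not specify.

```python
from decimal import Decimal

# Rate quoted above for API-key requests: EUR 0.50 per 1,000 pages.
PRICE_PER_PAGE_EUR = Decimal("0.0005")


def estimate_cost_eur(page_count: int) -> Decimal:
    """Return the estimated API-key charge in euros for `page_count` pages.

    Assumption: a flat per-page rate; real invoices may round differently.
    """
    if page_count < 0:
        raise ValueError("page_count must be non-negative")
    return PRICE_PER_PAGE_EUR * page_count


# Example: estimate the charge for a 12,000-page archive.
print(estimate_cost_eur(12_000))
```

Using Decimal rather than float keeps the per-page rate exact, which matters once page counts get large.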

Request

The body is a multipart form. Append one or more files under the data field; behavior is tuned with query parameters.

| Query param | Type | Default | Description |
|---|---|---|---|
| stream | boolean | false | Stream per-page progress over SSE instead of returning one JSON blob. |
| force_ocr | boolean | false | Run OCR even when the file has extractable text. Useful for scanned PDFs that were saved with a text layer. |
| fast | boolean | false | Skip the heavier vision and OCR passes for speed, at some quality cost on complex layouts. |
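One way to assemble the request URL from these flags is sketched below. The helper name is hypothetical, and it assumes booleans are sent as lowercase "true", matching the ?stream=true form used elsewhere on this page; how the API treats other spellings is not documented here.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.melious.ai/v1/documents/convert"


def build_convert_url(stream: bool = False, force_ocr: bool = False, fast: bool = False) -> str:
    """Build the endpoint URL, adding only the flags that differ from their default (false)."""
    flags = {"stream": stream, "force_ocr": force_ocr, "fast": fast}
    # Assumption: enabled flags are encoded as the literal string "true".
    query = urlencode({name: "true" for name, enabled in flags.items() if enabled})
    return f"{BASE_URL}?{query}" if query else BASE_URL


print(build_convert_url(force_ocr=True))
```

Omitting flags that are left at their default keeps the URL short and avoids depending on how the server parses "false".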

Supported formats

| Category | Formats |
|---|---|
| PDF | .pdf (text extraction, OCR fallback for scanned pages) |
| Microsoft Office | .docx, .doc, .xlsx, .xls, .pptx, .ppt |
| OpenDocument | .odt, .ods, .odp |
| Spreadsheets / text | .csv, .txt, .md, .html, .xml, .rtf |
| Images (OCR) | .png, .jpg, .gif, .webp, .bmp, .tiff, .heic, .avif |
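A client-side pre-check can catch obvious rejections before uploading. The sketch below is only a convenience, not the server's validation: the function name is hypothetical, the extension set is copied from the table above (so variants such as .jpeg are treated as unsupported), and "50 MB" is assumed to mean 50,000,000 bytes, the stricter of the two common readings.

```python
from pathlib import Path

# Assumption: 50 MB is read as 50 * 10^6 bytes; the exact threshold is not stated.
MAX_UPLOAD_BYTES = 50 * 1000 * 1000

# Extensions exactly as listed in the supported-formats table.
SUPPORTED_EXTENSIONS = {
    ".pdf",
    ".docx", ".doc", ".xlsx", ".xls", ".pptx", ".ppt",
    ".odt", ".ods", ".odp",
    ".csv", ".txt", ".md", ".html", ".xml", ".rtf",
    ".png", ".jpg", ".gif", ".webp", ".bmp", ".tiff", ".heic", ".avif",
}


def precheck(filename: str, size_bytes: int) -> list[str]:
    """Return a list of human-readable problems; an empty list means the file looks uploadable."""
    problems = []
    suffix = Path(filename).suffix.lower()
    if suffix not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {suffix or '(none)'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        problems.append(f"file is {size_bytes} bytes, over the {MAX_UPLOAD_BYTES}-byte limit")
    return problems


print(precheck("notes.mp4", 1024))
```

The server remains the source of truth; KIT_6201 and KIT_6202 can still come back for files that pass this check.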

Multiple files

Append data more than once to convert several files in one request:

curl https://api.melious.ai/v1/documents/convert \
  -H "Authorization: Bearer sk-mel-<YOUR_API_KEY>" \
  -F "data=@report.pdf" \
  -F "data=@data.xlsx" \
  -F "data=@scan.png"

Each file is converted independently, so one bad file doesn't fail the others. Check the status field of each entry in results.
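The same batch can be sent from Python by repeating the data field. The sketch below builds the field list that httpx accepts for its files argument; the helper name is hypothetical, and content types are guessed from the filename with the standard library rather than taken from anything this API requires.

```python
import mimetypes
from pathlib import Path


def build_file_fields(paths):
    """Return a list of ("data", (filename, bytes, content_type)) tuples, one per file."""
    fields = []
    for path in map(Path, paths):
        # Fall back to a generic type when the extension is not recognized.
        content_type = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
        fields.append(("data", (path.name, path.read_bytes(), content_type)))
    return fields


# Usage (not executed here; needs httpx, real files, and a real key):
# import httpx
# response = httpx.post(
#     "https://api.melious.ai/v1/documents/convert",
#     headers={"Authorization": "Bearer sk-mel-<YOUR_API_KEY>"},
#     files=build_file_fields(["report.pdf", "data.xlsx", "scan.png"]),
# ).json()
```

Reading each file into memory keeps the helper simple; for uploads near the size limit, passing open file handles instead of bytes may be preferable.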

Response

{
  "results": [
    {
      "filename": "report.pdf",
      "pages": [
        {
          "page_number": 1,
          "content": "# Introduction\n\nThis is the first page...",
          "metadata": { "word_count": 250, "table_count": 0, "image_count": 1 }
        }
      ],
      "total_pages": 1,
      "file_type": "application/pdf",
      "status": "success",
      "processing_time_ms": 1234
    }
  ],
  "total": 1,
  "successful": 1,
  "failed": 0,
  "processing_time_ms": 1234
}

| Field | Type | Description |
|---|---|---|
| results[].filename | string | Original filename. |
| results[].pages[].page_number | integer | 1-indexed page number. |
| results[].pages[].content | string | The page as Markdown. |
| results[].pages[].metadata | object | Per-page counts (word_count, table_count, image_count). Optional. |
| results[].total_pages | integer | Page count for the document. |
| results[].status | string | success or error. On error, results[].error holds the message. |
| total / successful / failed | integer | Counts across all submitted files. |
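To hand the output to a chat model, the per-page Markdown usually needs joining into one string per document. A minimal sketch, using only the fields in the table above; the function name is hypothetical, and keying by filename assumes filenames are unique within a batch.

```python
def split_results(response: dict):
    """Return (documents, failures) from a decoded convert response.

    documents maps filename -> full Markdown; failures maps filename -> error message.
    """
    documents = {}
    failures = {}
    for result in response.get("results", []):
        name = result["filename"]
        if result.get("status") == "success":
            # Sort defensively so pages are joined in reading order.
            pages = sorted(result.get("pages", []), key=lambda page: page["page_number"])
            documents[name] = "\n\n".join(page["content"] for page in pages)
        else:
            failures[name] = result.get("error", "unknown error")
    return documents, failures


sample = {
    "results": [
        {"filename": "report.pdf", "status": "success",
         "pages": [{"page_number": 1, "content": "# Introduction"}]},
        {"filename": "scan.png", "status": "error", "error": "OCR failed"},
    ]
}
documents, failures = split_results(sample)
print(sorted(documents), sorted(failures))
```

Separating failures up front makes it easy to retry only the files that did not convert.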

Streaming progress

For large files or batches, set ?stream=true to receive Server-Sent Events as each page lands instead of waiting for the whole job:

curl -N "https://api.melious.ai/v1/documents/convert?stream=true" \
  -H "Authorization: Bearer sk-mel-<YOUR_API_KEY>" \
  -F "data=@large_report.pdf"

Each event arrives as a data: line carrying a JSON object. Its type field identifies the event:

| type | Meaning |
|---|---|
| queued | File accepted, waiting for a worker. |
| processing | Conversion started (is_ocr flags OCR runs). |
| complete | One file finished. Carries the same result object as the non-streaming shape. |
| error | One file failed (error message, code). |
| stream_end | All files done, with total / successful / failed. |

The stream terminates with a literal data: [DONE].

import httpx
import json

with open("large_report.pdf", "rb") as f:
    with httpx.stream(
        "POST",
        "https://api.melious.ai/v1/documents/convert?stream=true",
        headers={"Authorization": "Bearer sk-mel-<YOUR_API_KEY>"},
        files={"data": ("large_report.pdf", f, "application/pdf")},
        timeout=300.0,
    ) as response:
        for line in response.iter_lines():
            if not line.startswith("data: "):
                continue
            payload = line[6:]
            if payload == "[DONE]":
                break
            event = json.loads(payload)
            if event["type"] == "complete":
                print(f"converted {event['result']['filename']}")
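The loop above only reacts to complete events. A sketch that handles the other event types is shown below, written against plain lines so it can be exercised without a network call. The function names are hypothetical, and the payload keys for error and stream_end events (error, total, successful, failed) are assumptions inferred from the event table rather than a documented schema.

```python
import json


def iter_events(lines):
    """Yield decoded event dicts from SSE lines, stopping at the [DONE] sentinel."""
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank separator lines and any non-data fields
        payload = line[len("data: "):]
        if payload == "[DONE]":
            return
        yield json.loads(payload)


def summarize(lines):
    """Collect converted filenames, error messages, and the final totals from a stream."""
    converted, failed, totals = [], [], None
    for event in iter_events(lines):
        kind = event.get("type")
        if kind == "complete":
            converted.append(event["result"]["filename"])
        elif kind == "error":
            failed.append(event.get("error"))  # assumed key name
        elif kind == "stream_end":
            totals = {key: event.get(key) for key in ("total", "successful", "failed")}
        # queued and processing events are progress-only and are ignored here
    return converted, failed, totals
```

With httpx, the same functions can be fed directly from response.iter_lines() inside the streaming context shown above.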

Errors

| Code | Status | Meaning |
|---|---|---|
| KIT_6200 | 429 | Processing queue at capacity. Retry shortly. |
| KIT_6201 | 413 | File exceeds the 50 MB limit. |
| KIT_6202 | 415 | Unsupported file format. |
| KIT_6203 | 500 | OCR failed on a scanned page. |
| KIT_6204 | 408 | Conversion timed out. |
| KIT_6205 | 413 | Too many files in one batch. |
| KIT_6220 | 429 | Session daily page limit reached (session auth only). |
| AUTH_1015 | 403 | API key is missing the kit.documents scope. |
| BILLING_2001 | 402 | Insufficient balance (API key auth). |
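Only some of these codes are worth retrying automatically. The sketch below is one possible policy, not guidance from the API: it treats KIT_6200 as retryable because the table says to retry shortly, treats KIT_6204 as retryable as an assumption, and uses exponential backoff with full jitter. All names and the delay parameters are hypothetical.

```python
import random

# KIT_6200: queue at capacity ("retry shortly").
# KIT_6204: timeout; retrying is an assumption, a smaller batch may be the better fix.
RETRYABLE_CODES = {"KIT_6200", "KIT_6204"}


def should_retry(code: str) -> bool:
    """Return True when an automatic retry is plausible for this error code."""
    return code in RETRYABLE_CODES


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Return a randomized delay in seconds for a zero-based retry attempt (full jitter)."""
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))


for attempt in range(3):
    print(f"attempt {attempt}: wait up to {min(30.0, 2 ** attempt):.0f}s")
```

KIT_6220 is deliberately excluded even though it is also a 429: the daily limit only resets at the time given in details, so short retries would not help.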

A session rate-limit error carries the specifics:

{
  "error": {
    "code": "KIT_6220",
    "message": "Daily document limit reached. Resets at midnight UTC.",
    "details": { "used_pages": 500, "daily_limit": 500, "reset_at": "2026-05-23T00:00:00Z" }
  }
}
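A client can use reset_at to decide how long to pause before trying again. A minimal sketch, assuming the error body has exactly the shape shown above; the function name is hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional


def seconds_until_reset(body: dict, now: Optional[datetime] = None) -> Optional[float]:
    """Return seconds until the daily limit resets, or None if this is not a KIT_6220 error."""
    error = body.get("error", {})
    if error.get("code") != "KIT_6220":
        return None
    reset_at = error.get("details", {}).get("reset_at")
    if reset_at is None:
        return None
    # datetime.fromisoformat() only accepts a trailing "Z" on Python 3.11+, so normalize it.
    reset = datetime.fromisoformat(reset_at.replace("Z", "+00:00"))
    now = now or datetime.now(timezone.utc)
    return max(0.0, (reset - now).total_seconds())


body = {
    "error": {
        "code": "KIT_6220",
        "message": "Daily document limit reached. Resets at midnight UTC.",
        "details": {"used_pages": 500, "daily_limit": 500, "reset_at": "2026-05-23T00:00:00Z"},
    }
}
print(seconds_until_reset(body))
```

The optional now parameter exists only to make the function deterministic in tests; in normal use it defaults to the current UTC time.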

See also:

  • Tools, for the rest of the tool surface
  • Vision, when you need a model to reason about layout rather than extract text
  • Chat completions, for feeding the Markdown to a model
