Document conversion
POST /v1/documents/convert — turn PDFs, Office files, and images into Markdown, with OCR and streaming progress
Convert PDFs, Office documents, and images into clean Markdown — text extraction where the file has text, OCR where it doesn't. Useful as the front half of a RAG pipeline: convert first, then feed the Markdown to a chat model.
Endpoint: POST /v1/documents/convert
Auth: a browser session, or an API key with scope kit.documents.
Content-Type: multipart/form-data. The file field is named data.
Max upload: 50 MB per request.
Example
```python
import httpx

# Upload the file under the multipart field name "data".
with open("report.pdf", "rb") as f:
    response = httpx.post(
        "https://api.melious.ai/v1/documents/convert",
        headers={"Authorization": "Bearer sk-mel-<YOUR_API_KEY>"},
        files={"data": ("report.pdf", f, "application/pdf")},
    ).json()

# Markdown for the first page of the first file.
print(response["results"][0]["pages"][0]["content"])
```
```javascript
import fs from "node:fs";

// Upload the file under the multipart field name "data".
const form = new FormData();
form.append("data", new Blob([fs.readFileSync("report.pdf")]), "report.pdf");

const response = await fetch("https://api.melious.ai/v1/documents/convert", {
  method: "POST",
  headers: { Authorization: "Bearer sk-mel-<YOUR_API_KEY>" },
  body: form,
}).then((r) => r.json());

// Markdown for the first page of the first file.
console.log(response.results[0].pages[0].content);
```
```shell
curl https://api.melious.ai/v1/documents/convert \
  -H "Authorization: Bearer sk-mel-<YOUR_API_KEY>" \
  -F "data=@report.pdf"
```
Auth and pricing
Two ways to authenticate, with different billing:
- Browser session: free, but rate-limited to a daily page count that depends on your plan tier. When you hit the limit, you get KIT_6220 with the limit and reset time in details.
- API key (scope kit.documents): charged €0.50 per 1,000 pages (€0.0005/page), no daily cap. Create a scoped key at melious.ai/account/api/keys.
Exact per-plan daily limits are on melious.ai/pricing.
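As a quick check on the API-key rate above, the arithmetic can be sketched as follows. The helper name estimate_cost_eur is illustrative, and the sketch assumes billing is strictly linear per page with no minimum charge or rounding, which the pricing text does not confirm.

```python
from decimal import Decimal

# Rate taken from the pricing text above: EUR 0.50 per 1,000 pages.
PRICE_PER_PAGE_EUR = Decimal("0.50") / Decimal("1000")


def estimate_cost_eur(pages: int) -> Decimal:
    """Estimate the API-key charge for a page count (assumes linear billing)."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return PRICE_PER_PAGE_EUR * pages


if __name__ == "__main__":
    # Hypothetical workload: a 12,000-page archive.
    print(estimate_cost_eur(12_000))
```

Decimal is used instead of float so that small per-page amounts do not pick up binary rounding noise when summed.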
Request
The body is a multipart form. Append one or more files under the data field; behavior is tuned with query parameters.
| Query param | Type | Default | Description |
|---|---|---|---|
| stream | boolean | false | Stream per-page progress over SSE instead of returning one JSON blob. |
| force_ocr | boolean | false | Run OCR even when the file has extractable text. For scanned PDFs saved as text. |
| fast | boolean | false | Skip the heavier vision and OCR passes for speed, at some quality cost on complex layouts. |
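One way to assemble the query string from these flags is sketched below. The helper build_convert_url is hypothetical. It assumes that flags left at their default can simply be omitted, and that the lowercase value true (shown on this page only for stream) is accepted for all three parameters.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.melious.ai/v1/documents/convert"


def build_convert_url(stream: bool = False, force_ocr: bool = False, fast: bool = False) -> str:
    """Return the endpoint URL with only the non-default flags set."""
    flags = {"stream": stream, "force_ocr": force_ocr, "fast": fast}
    # Assumption: booleans are sent as lowercase "true", as in ?stream=true.
    params = {name: "true" for name, enabled in flags.items() if enabled}
    return f"{BASE_URL}?{urlencode(params)}" if params else BASE_URL


if __name__ == "__main__":
    print(build_convert_url(force_ocr=True))
```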
Supported formats
| Category | Formats |
|---|---|
| PDF | .pdf (text extraction, OCR fallback for scanned pages) |
| Microsoft Office | .docx, .doc, .xlsx, .xls, .pptx, .ppt |
| OpenDocument | .odt, .ods, .odp |
| Spreadsheets / text | .csv, .txt, .md, .html, .xml, .rtf |
| Images (OCR) | .png, .jpg, .gif, .webp, .bmp, .tiff, .heic, .avif |
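A local pre-flight check can catch files that would likely be rejected before any bytes are uploaded. The sketch below is illustrative only: preflight is a hypothetical helper, the extension set is copied from the table above, and the byte value of "50 MB" is an assumption because the page does not say whether it means 50 × 1024 × 1024 or 50,000,000 bytes. The limit is stated per request, so a batch would need its sizes summed.

```python
from pathlib import Path

# Extensions copied from the supported-formats table above.
SUPPORTED_EXTENSIONS = {
    ".pdf",
    ".docx", ".doc", ".xlsx", ".xls", ".pptx", ".ppt",
    ".odt", ".ods", ".odp",
    ".csv", ".txt", ".md", ".html", ".xml", ".rtf",
    ".png", ".jpg", ".gif", ".webp", ".bmp", ".tiff", ".heic", ".avif",
}

# Assumption: "50 MB" is read as 50 * 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 50 * 1024 * 1024


def preflight(path: str) -> list[str]:
    """Return problems found locally; an empty list means the file looks uploadable."""
    problems = []
    p = Path(path)
    if p.suffix.lower() not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {p.suffix or '(none)'}")
    # The size check only runs when the file exists on disk.
    if p.exists() and p.stat().st_size > MAX_UPLOAD_BYTES:
        problems.append("file larger than 50 MB")
    return problems
```

The server remains the authority: a file that passes this check can still come back with KIT_6201 or KIT_6202.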
Multiple files
Append data more than once to convert several files in one request:
```shell
curl https://api.melious.ai/v1/documents/convert \
  -H "Authorization: Bearer sk-mel-<YOUR_API_KEY>" \
  -F "data=@report.pdf" \
  -F "data=@data.xlsx" \
  -F "data=@scan.png"
```
Each file is converted independently; one bad file doesn't fail the others. Check status per result.
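The same batch upload can be sketched in Python by passing httpx a list of parts that all share the field name data. The helper names build_file_parts and convert_many are hypothetical, the content types are guessed locally from the file extension, and the 300-second timeout is borrowed from the streaming example on this page rather than a documented requirement.

```python
import mimetypes
from pathlib import Path


def build_file_parts(paths):
    """Build one ("data", (filename, bytes, content_type)) part per file."""
    parts = []
    for path in map(Path, paths):
        # Fall back to a generic type when the extension is not recognised locally.
        content_type = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
        parts.append(("data", (path.name, path.read_bytes(), content_type)))
    return parts


def convert_many(paths, api_key):
    """POST all files in one request and return the parsed JSON body."""
    import httpx  # third-party; imported here so build_file_parts works without it

    response = httpx.post(
        "https://api.melious.ai/v1/documents/convert",
        headers={"Authorization": f"Bearer {api_key}"},
        files=build_file_parts(paths),
        timeout=300.0,  # assumption: generous timeout for multi-file jobs
    )
    return response.json()
```

Reading each file fully into memory keeps the sketch short; for uploads near the size limit, passing open file handles instead of bytes would avoid holding every file in memory at once.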
Response
```json
{
  "results": [
    {
      "filename": "report.pdf",
      "pages": [
        {
          "page_number": 1,
          "content": "# Introduction\n\nThis is the first page...",
          "metadata": { "word_count": 250, "table_count": 0, "image_count": 1 }
        }
      ],
      "total_pages": 1,
      "file_type": "application/pdf",
      "status": "success",
      "processing_time_ms": 1234
    }
  ],
  "total": 1,
  "successful": 1,
  "failed": 0,
  "processing_time_ms": 1234
}
```
| Field | Type | Description |
|---|---|---|
| results[].filename | string | Original filename. |
| results[].pages[].page_number | integer | 1-indexed page number. |
| results[].pages[].content | string | The page as Markdown. |
| results[].pages[].metadata | object | Per-page counts (word_count, table_count, image_count). Optional. |
| results[].total_pages | integer | Page count for the document. |
| results[].status | string | success or error. On error, results[].error holds the message. |
| total / successful / failed | integer | Counts across all submitted files. |
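Given the fields above, a response can be split into converted documents and failures with a small helper. This is a minimal sketch: collect_markdown is a hypothetical name, joining pages with a blank line is a choice made here rather than documented behavior, and results are keyed by filename, so two uploads with the same name would overwrite each other.

```python
def collect_markdown(response: dict) -> tuple[dict, dict]:
    """Split a convert response into {filename: markdown} and {filename: error message}."""
    converted, failed = {}, {}
    for result in response.get("results", []):
        name = result.get("filename", "<unknown>")
        if result.get("status") == "success":
            # Pages are 1-indexed; sort defensively in case they arrive out of order.
            pages = sorted(result.get("pages", []), key=lambda p: p["page_number"])
            converted[name] = "\n\n".join(p["content"] for p in pages)
        else:
            # On error, the message lives in results[].error.
            failed[name] = result.get("error", "unknown error")
    return converted, failed
```

The converted Markdown can then be chunked or passed to a chat model, while the failed mapping shows which files need a retry or a different format.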
Streaming progress
For large files or batches, set ?stream=true to receive Server-Sent Events as each page lands instead of waiting for the whole job:
```shell
curl -N "https://api.melious.ai/v1/documents/convert?stream=true" \
  -H "Authorization: Bearer sk-mel-<YOUR_API_KEY>" \
  -F "data=@large_report.pdf"
```
Events arrive as data: lines. The type field tells you which kind of event each one is:
| type | Meaning |
|---|---|
| queued | File accepted, waiting for a worker. |
| processing | Conversion started (is_ocr flags OCR runs). |
| complete | One file finished; carries the same result object as the non-streaming shape. |
| error | One file failed (error message, code). |
| stream_end | All files done, with total / successful / failed. |
The stream terminates with a literal data: [DONE].
```python
import json

import httpx

with open("large_report.pdf", "rb") as f:
    with httpx.stream(
        "POST",
        "https://api.melious.ai/v1/documents/convert?stream=true",
        headers={"Authorization": "Bearer sk-mel-<YOUR_API_KEY>"},
        files={"data": ("large_report.pdf", f, "application/pdf")},
        timeout=300.0,
    ) as response:
        for line in response.iter_lines():
            # Skip blank separator lines and anything that is not an SSE data line.
            if not line.startswith("data: "):
                continue
            payload = line[6:]
            # The literal [DONE] sentinel marks the end of the stream.
            if payload == "[DONE]":
                break
            event = json.loads(payload)
            if event["type"] == "complete":
                print(f"converted {event['result']['filename']}")
```
Errors
| Code | Status | Meaning |
|---|---|---|
| KIT_6200 | 429 | Processing queue at capacity; retry shortly. |
| KIT_6201 | 413 | File exceeds the 50 MB limit. |
| KIT_6202 | 415 | Unsupported file format. |
| KIT_6203 | 500 | OCR failed on a scanned page. |
| KIT_6204 | 408 | Conversion timed out. |
| KIT_6205 | 413 | Too many files in one batch. |
| KIT_6220 | 429 | Session daily page limit reached (session auth only). |
| AUTH_1015 | 403 | API key is missing the kit.documents scope. |
| BILLING_2001 | 402 | Insufficient balance (API key auth). |
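A client might use the table above to decide which failures to retry. The sketch below is one possible policy, not documented behavior: treating KIT_6204 (timeout) as retryable alongside KIT_6200 is an assumption, KIT_6220 is excluded because it only clears at the daily reset, and the attempt budget and backoff constants are illustrative.

```python
import random

# Assumption: only these codes are treated as transient.
RETRYABLE_CODES = {"KIT_6200", "KIT_6204"}


def should_retry(code: str, attempt: int, max_attempts: int = 5) -> bool:
    """Retry transient errors until the attempt budget is used up."""
    return code in RETRYABLE_CODES and attempt < max_attempts


def backoff_seconds(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Exponential backoff with full jitter: uniform in [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))
```

A caller would sleep for backoff_seconds(attempt) between tries and give up, surfacing the error code, once should_retry returns False.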
A session rate-limit error carries the specifics:
```json
{
  "error": {
    "code": "KIT_6220",
    "message": "Daily document limit reached. Resets at midnight UTC.",
    "details": { "used_pages": 500, "daily_limit": 500, "reset_at": "2026-05-23T00:00:00Z" }
  }
}
```
Related
- Tools: the rest of the tool surface.
- Vision: when you need a model to reason about layout rather than extract text.
- Chat completions: for feeding the Markdown to a model.