Parse options
This page describes the request parameters for the parser-style document parsing endpoint. The examples below cover only the file upload flow and use the current public parameter names.
Request parameters
| Parameter | Location | Type | Required | Default | Description |
|---|---|---|---|---|---|
| file | form | file | Yes | - | Input document file |
| image_type | query | string | No | url | How images are embedded in Markdown: url or base64 |
| content_filter | query | string | No | all | Keep only selected content block types |
| options_json | form | JSON string | No | Built-in defaults | Parser configuration merged with the server defaults |
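As a rough sketch of how these four parameters split between the query string and the multipart form, the following Python snippet assembles both parts without sending anything. The endpoint URL and the helper name `build_parse_request` are placeholders for illustration, not part of the documented API.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint; substitute the real parse URL for your deployment.
PARSE_URL = "https://api.example.com/parse"


def build_parse_request(file_path, image_type="url", content_filter="all", options=None):
    """Return (url, form_fields) for a parse request; nothing is sent."""
    # image_type and content_filter travel in the query string.
    query = urlencode({"image_type": image_type, "content_filter": content_filter})
    url = f"{PARSE_URL}?{query}"

    # file and options_json travel in the multipart form body.
    form_fields = {"file": file_path}
    if options is not None:
        # options_json is a JSON *string*, not a nested form structure.
        form_fields["options_json"] = json.dumps(options)
    return url, form_fields


url, form = build_parse_request(
    "report.pdf", content_filter="table", options={"mergeTables": True}
)
print(url)
print(form)
```

The returned pieces can be handed to any HTTP client that supports multipart uploads, with the file opened in binary mode.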
image_type
image_type controls how image content is represented in the Markdown result:
| Value | Meaning |
|---|---|
| url | Embed image content as accessible URLs |
| base64 | Embed image content inline as Base64 |
Use url for most frontend and knowledge-base integrations. Choose base64 when you need a fully self-contained Markdown artifact.
content_filter
content_filter narrows the result to selected block types. Common patterns:
| Value | Meaning |
|---|---|
| all | Return all content blocks |
| text | Keep only text-related content |
| table | Keep only table-related content |
| image | Keep only image-related content |
If your workflow only needs one category, filtering at request time is usually simpler than post-filtering in downstream code.
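A small client-side check can catch typos in the two query parameters before the upload happens. This sketch assumes the values listed in the tables on this page are the only ones you intend to send (the server may accept others), and the function name `query_params` is illustrative.

```python
# Values taken from the image_type and content_filter tables on this page.
IMAGE_TYPES = {"url", "base64"}
CONTENT_FILTERS = {"all", "text", "table", "image"}


def query_params(image_type="url", content_filter="all"):
    """Build the query parameters, rejecting values not listed on this page."""
    if image_type not in IMAGE_TYPES:
        raise ValueError(f"unexpected image_type: {image_type!r}")
    if content_filter not in CONTENT_FILTERS:
        raise ValueError(f"unexpected content_filter: {content_filter!r}")
    return {"image_type": image_type, "content_filter": content_filter}


print(query_params(content_filter="table"))
```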
options_json
options_json is a JSON string that controls parsing behaviour. Typical options include:
- generating a document tree / catalog
- merging related table fragments
- re-levelling title hierarchy
- ignoring headers, footers, footnotes, and similar auxiliary content
Example:

    {
      "applyDocumentTree": true,
      "mergeTables": true,
      "relevelTitles": true,
      "ignore_labels": [
        "number",
        "footnote",
        "header",
        "header_image",
        "footer",
        "footer_image",
        "aside_text"
      ]
    }
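Because options_json is sent as a string, it is usually easier to build the options as a native structure and serialize it once. A minimal Python sketch, using only option names from the example above (the variable names are illustrative):

```python
import json

options = {
    "applyDocumentTree": True,
    "mergeTables": True,
    "relevelTitles": True,
    "ignore_labels": ["header", "footer", "footnote"],
}

# Compact separators keep the form field short; Python booleans are
# written as JSON true/false.
options_json = json.dumps(options, separators=(",", ":"))
print(options_json)

# Round-trip to confirm the string is valid JSON before sending it.
assert json.loads(options_json) == options
```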
ignore_labels
ignore_labels is typically passed inside options_json to suppress auxiliary block types in the parse output. The supported labels are:
- number
- footnote
- header
- header_image
- footer
- footer_image
- aside_text
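The label list lends itself to a client-side check before serialization. This sketch assumes the seven labels above are the complete set and uses a hypothetical helper name, `ignore_labels_option`. It also distinguishes between omitting the key and sending an empty array, since the two are different requests.

```python
import json

SUPPORTED_LABELS = (
    "number", "footnote", "header", "header_image",
    "footer", "footer_image", "aside_text",
)


def ignore_labels_option(labels=None):
    """Serialize an options_json string carrying ignore_labels.

    None -> the key is omitted, so server-side defaults apply.
    []   -> an explicit empty array is sent.
    """
    if labels is None:
        return json.dumps({})
    unknown = [label for label in labels if label not in SUPPORTED_LABELS]
    if unknown:
        raise ValueError(f"unsupported labels: {unknown}")
    return json.dumps({"ignore_labels": list(labels)}, separators=(",", ":"))


print(ignore_labels_option(["header", "footer"]))
print(ignore_labels_option([]))
```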
To keep all supported auxiliary content, pass an empty array explicitly:

    --form 'options_json={"ignore_labels":[]}'

Recommendations
- For real-time previews, prefer image_type=url to keep payloads smaller.
- For search, extraction, or RAG workflows, use content_filter=text or content_filter=table to reduce downstream processing.
- For layout-heavy documents, combine document-tree and table-merge options, then inspect the output with Response overview.
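One way to apply these recommendations consistently is to keep them as named presets in client code. The sketch below is only an illustration: the preset names and the `request_parts` helper are made up here, and the option names are taken from the options_json example on this page.

```python
import json

# Illustrative presets derived from the recommendations above.
PRESETS = {
    "preview": {"query": {"image_type": "url"}, "options": None},
    "rag_text": {"query": {"content_filter": "text"}, "options": None},
    "layout_heavy": {
        "query": {},
        "options": {"applyDocumentTree": True, "mergeTables": True},
    },
}


def request_parts(preset):
    """Return (query_params, form_fields) for a named preset."""
    spec = PRESETS[preset]
    form = {}
    if spec["options"] is not None:
        # options_json is sent as a JSON string in the form body.
        form["options_json"] = json.dumps(spec["options"])
    return dict(spec["query"]), form


print(request_parts("layout_heavy"))
```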