ComPDF

Extract schema

extract_fields is the key input for document extraction. It defines which fields and table headers you want to extract, plus the prompts associated with them.

Base structure

Both modes (vision and layout) use the same single-schema shape:

json
{
  "name": "Invoice",
  "keys": {
    "Title": { "prompt": "Invoice title", "mapping": null },
    "Date":  { "prompt": "Invoice date", "mapping": null }
  },
  "tableHeaders": {
    "LineItems": {
      "Item":   { "prompt": "Item name",  "mapping": null },
      "Amount": { "prompt": "Item total", "mapping": null }
    }
  }
}

  • Custom field extraction: fill in keys / tableHeaders to extract a predefined schema.

You can iterate on a schema in the Online Tools site and use its "Export Schema" button to copy the JSON. Paste it directly into extract_fields for both vision and layout modes.
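
As a rough illustration of handling an exported schema in code, the sketch below parses the JSON text and re-serializes it compactly. The `exported` text is a hypothetical stand-in for what "Export Schema" copies, and whether extract_fields is sent as a JSON string or as a nested object is an assumption this page does not settle.

```python
import json

# Hypothetical: schema text as copied from the "Export Schema" button.
exported = """
{
  "name": "Invoice",
  "keys": {
    "Title": { "prompt": "Invoice title", "mapping": null }
  },
  "tableHeaders": {}
}
"""

# Parse once so malformed JSON fails early, before any request is made.
schema = json.loads(exported)

# Compact serialization, in case the request expects extract_fields
# as a JSON string (assumption, not confirmed by this page).
extract_fields = json.dumps(schema, separators=(",", ":"))
print(extract_fields)
```

Parsing before sending is only a convenience: it catches copy-paste damage locally and turns JSON `null` into Python `None` and back without changing the schema.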

Field reference

| Field | Type | Meaning |
| --- | --- | --- |
| name | string | Schema name for identification |
| keys | object | Scalar key-value fields |
| tableHeaders | object | Table field definitions grouped by table name |
| prompt | string | Instructional prompt for the model; can be null |
| mapping | string | Optional mapping metadata for your downstream system; can be null |
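
The field reference can be turned into a small pre-flight check. This is a minimal sketch rather than an official validator: `validate_schema` is a hypothetical helper, and treating a missing prompt or mapping the same as null is an assumption.

```python
from typing import Any


def validate_schema(schema: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the shape matches the field reference."""
    problems: list[str] = []
    if not isinstance(schema.get("name"), str):
        problems.append("name must be a string")

    def check_fields(fields: Any, path: str) -> None:
        # Each entry should look like {"prompt": str | None, "mapping": str | None}.
        if not isinstance(fields, dict):
            problems.append(f"{path} must be an object")
            return
        for field_name, spec in fields.items():
            if not isinstance(spec, dict):
                problems.append(f"{path}.{field_name} must be an object")
                continue
            for attr in ("prompt", "mapping"):
                value = spec.get(attr)  # a missing attribute is treated as null (assumption)
                if value is not None and not isinstance(value, str):
                    problems.append(f"{path}.{field_name}.{attr} must be a string or null")

    check_fields(schema.get("keys"), "keys")

    tables = schema.get("tableHeaders")
    if not isinstance(tables, dict):
        problems.append("tableHeaders must be an object")
    else:
        # tableHeaders groups column definitions by table name.
        for table_name, columns in tables.items():
            check_fields(columns, f"tableHeaders.{table_name}")
    return problems
```

Calling `validate_schema(schema)` before submitting a request gives readable messages such as `keys.Title.prompt must be a string or null` instead of a server-side rejection.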

When to use keys

Populate keys when you already know which scalar fields you need, for example:

  • invoice number
  • issue date
  • consignee
  • contract ID
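
A keys object for the fields listed above might be assembled as follows. The field names (`InvoiceNumber`, `IssueDate`, and so on), the prompt wording, and the `make_keys` helper are illustrative assumptions, not names the service requires.

```python
import json


def make_keys(prompts: dict[str, str]) -> dict[str, dict]:
    """Turn {field name: prompt} into the keys shape, leaving mapping as null."""
    return {name: {"prompt": prompt, "mapping": None} for name, prompt in prompts.items()}


schema = {
    "name": "Invoice",
    "keys": make_keys({
        "InvoiceNumber": "Invoice number",  # hypothetical field names and prompts
        "IssueDate": "Issue date",
        "Consignee": "Consignee",
        "ContractID": "Contract ID",
    }),
    "tableHeaders": {},
}

print(json.dumps(schema, indent=2))
```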

When to use tableHeaders

If the target document contains line-item tables, fee tables, or other repeated tabular structures, define them in tableHeaders so the extracted rows come back in a consistent, predictable structure.
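
To make the grouping by table name concrete, here is a hedged sketch that adds tables to a schema one at a time. `add_table` is a hypothetical helper, and the `Fees` table with its columns is invented for illustration; only the `LineItems` example comes from the base structure shown earlier.

```python
import copy
import json


def add_table(schema: dict, table_name: str, columns: dict[str, str]) -> dict:
    """Return a copy of schema with one more table under tableHeaders."""
    updated = copy.deepcopy(schema)  # leave the caller's schema untouched
    updated.setdefault("tableHeaders", {})[table_name] = {
        column: {"prompt": prompt, "mapping": None}
        for column, prompt in columns.items()
    }
    return updated


base = {"name": "Invoice", "keys": {}, "tableHeaders": {}}
with_items = add_table(base, "LineItems", {"Item": "Item name", "Amount": "Item total"})
# Hypothetical second table for a fee breakdown.
with_fees = add_table(with_items, "Fees", {"FeeType": "Type of fee", "FeeAmount": "Fee amount"})

print(json.dumps(with_fees, indent=2))
```

Returning a copy keeps each intermediate schema usable on its own, which can help when comparing extraction results with and without a given table.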

json
{
  "name": "auto",
  "keys": {},
  "tableHeaders": {}
}

Recommendations

Next, see Response structure for how extracted values and grounding data are returned.