A parse returns the document two ways: a ready-to-use markdown string, and a content_list of typed blocks in reading order for when you need structure or position. Pick whichever fits.

The shape, recapped

{
  "markdown": "# Quarterly Report\n\n...",
  "page_count": 1,
  "content_list": [
    {
      "type": "text",
      "text": "Quarterly Report",
      "text_level": 1,
      "page_idx": 0,
      "bbox": [80, 60, 620, 110]
    }
  ]
}
content_list is a flat list across the whole document; each block carries a type, type-specific content fields, a page_idx (which page it’s on, 0-based), and a bbox (position). See the API reference for the full field list.
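Because the list is flat, any per-page view has to be rebuilt from page_idx. The following is a minimal sketch of that grouping, assuming the parse result is already loaded into a Python dict named data; the sample values and the blocks_by_page helper are made up for illustration and are not part of the API.

```python
from collections import defaultdict

# Hypothetical sample standing in for a real parse result.
data = {
    "markdown": "# Quarterly Report\n\n...",
    "page_count": 2,
    "content_list": [
        {"type": "text", "text": "Quarterly Report", "text_level": 1,
         "page_idx": 0, "bbox": [80, 60, 620, 110]},
        {"type": "text", "text": "Revenue grew.",
         "page_idx": 0, "bbox": [80, 130, 900, 180]},
        {"type": "table", "table_body": "<table></table>",
         "page_idx": 1, "bbox": [80, 60, 900, 400]},
    ],
}


def blocks_by_page(content_list):
    """Group blocks by page_idx, keeping reading order within each page."""
    pages = defaultdict(list)
    for block in content_list:
        pages[block["page_idx"]].append(block)
    return dict(pages)


pages = blocks_by_page(data["content_list"])
for page_idx in sorted(pages):
    print(page_idx, [b["type"] for b in pages[page_idx]])
```

Since content_list is already in reading order, appending blocks as they are encountered preserves that order inside each page.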

Just use the Markdown

For most uses — feeding an LLM, indexing for search, displaying the document — you don’t need the blocks at all. The markdown field is the whole document already assembled, with headings, inline tables and equations, and image links:
# data is the parsed response, already loaded as a dict
markdown = data["markdown"]
Reach for content_list only when you need to filter by block type or work with positions.

Filter blocks by type

Because every block is typed, pulling out one kind of content is a filter. For example, collect every table (table content comes through as HTML in table_body):
tables = [
    block["table_body"]
    for block in data["content_list"]
    if block["type"] == "table"
]
The field you read depends on the type: text blocks (and titles, headers, footers) use text; equation uses text (LaTeX); image/chart use img_path and content; list uses list_items; code uses code_body. Headings are text blocks that also carry a text_level (1–4), so you can pull just the titles:
titles = [b["text"] for b in data["content_list"] if b.get("text_level")]
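If you handle several block types in one pass, the per-type field rules can live in a single lookup table. This is only a sketch: the field names come from the description above, the type strings are assumed to match the names used there, and the CONTENT_FIELDS mapping and block_content helper are hypothetical. Unknown types (for example titles, headers, and footers, whose exact type strings are not listed here) fall back to text.

```python
# Field names per block type, as described above. The type strings are assumed
# to match the names in the prose; the mapping itself is an illustrative helper.
CONTENT_FIELDS = {
    "text": ("text",),
    "equation": ("text",),            # LaTeX source
    "table": ("table_body",),         # HTML
    "image": ("img_path", "content"),
    "chart": ("img_path", "content"),
    "list": ("list_items",),
    "code": ("code_body",),
}


def block_content(block):
    """Return the type-specific fields that are present on a block."""
    # Fall back to "text" for types not listed (titles, headers, footers).
    fields = CONTENT_FIELDS.get(block["type"], ("text",))
    return {name: block[name] for name in fields if name in block}


# Made-up blocks for illustration.
print(block_content({"type": "code", "code_body": "print(1)", "page_idx": 0}))
print(block_content({"type": "header", "text": "Acme Corp", "page_idx": 0}))
```

Check the API reference for the authoritative list of types and fields before relying on a table like this.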

Use bounding boxes

Each bbox is [x_min, y_min, x_max, y_max], scaled to 0–1000 of the page (not pixels), with the origin at the top-left. Because the coordinates are relative to the page rather than to any rendering, they work at any resolution: divide by 1000 and multiply by your rendered page’s pixel size, for example to crop a figure:
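# page_width and page_height are the pixel dimensions of your rendered page
x_min, y_min, x_max, y_max = block["bbox"]
left, top = x_min / 1000 * page_width, y_min / 1000 * page_height
right, bottom = x_max / 1000 * page_width, y_max / 1000 * page_height
# With a Pillow image of the page, for example:
# crop = image.crop((left, top, right, bottom))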
Coordinates are relative to the block’s own page, so use block["page_idx"] to pick the right rendered page before applying them.
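When pages are rendered at different sizes, the page lookup and the scaling can be combined. The sketch below assumes you keep the rendered pixel size of each page in a list indexed by page_idx; page_sizes, the sample blocks, and the bbox_to_pixels helper are hypothetical names used only for illustration.

```python
def bbox_to_pixels(bbox, page_width, page_height):
    """Scale a 0-1000 bbox to integer pixels as (left, top, right, bottom)."""
    x_min, y_min, x_max, y_max = bbox
    return (
        round(x_min / 1000 * page_width),
        round(y_min / 1000 * page_height),
        round(x_max / 1000 * page_width),
        round(y_max / 1000 * page_height),
    )


# Hypothetical rendered page sizes in pixels (width, height), indexed by page_idx.
page_sizes = [(1240, 1754), (1754, 1240)]

# Made-up blocks for illustration.
blocks = [
    {"type": "text", "page_idx": 0, "bbox": [80, 60, 620, 110]},
    {"type": "image", "page_idx": 1, "bbox": [100, 200, 500, 800]},
]

for block in blocks:
    # Pick the size of the page this block belongs to before scaling.
    width, height = page_sizes[block["page_idx"]]
    print(block["page_idx"], bbox_to_pixels(block["bbox"], width, height))
```

Rounding to integers is a choice made here for pixel-exact cropping; keep the floats if your rendering library accepts them.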