Document APIs

PDF, OCR & document processing

Document processing APIs for PDF generation, OCR, text extraction, and document conversion.

PDF Generation

Generate PDFs from HTML or templates. Create invoices, reports, certificates, and more.

POST /api/v1/documents/pdf/generate
{ "html": "<h1>Invoice #1234</h1>...",
  "options": { "format": "A4",
    "margin": "20mm" } }

// Response
{ "pdf_url": "https://cdn.../doc.pdf",
  "pages": 2,
  "size_bytes": 45230 }

Tags: HTML to PDF, Templates, Custom Styling
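
A minimal sketch of preparing this request from Python with only the standard library. Only the path and the JSON fields come from the example above; the host `https://api.example.com`, the `Authorization: Bearer` header, and the `API_KEY` placeholder are assumptions for illustration.

```python
import json
import urllib.request

# Assumptions: the host and the bearer-token auth scheme are placeholders,
# not documented on this page.
BASE_URL = "https://api.example.com"
API_KEY = "YOUR_API_KEY"


def build_pdf_request(html: str, page_format: str = "A4",
                      margin: str = "20mm") -> urllib.request.Request:
    """Build (but do not send) a POST request for the PDF generation endpoint."""
    payload = {"html": html, "options": {"format": page_format, "margin": margin}}
    return urllib.request.Request(
        url=f"{BASE_URL}/api/v1/documents/pdf/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",  # assumed auth scheme
        },
        method="POST",
    )


request = build_pdf_request("<h1>Invoice #1234</h1>")
# To send it: urllib.request.urlopen(request), then json.load() the response body.
```

Building the request separately from sending it makes the payload easy to inspect before any network call is made.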

OCR

Extract text from images and scanned documents. Supports 100+ languages with high accuracy, and each response includes a confidence score.

POST /api/v1/documents/ocr
{ "image_url": "https://.../scan.jpg",
  "language": "en" }

// Response
{ "text": "Invoice Date: 01/15/2025\nAmount Due: $1,250.00...",
  "confidence": 0.96,
  "blocks": [...] }

Tags: 100+ Languages, Handwriting, Layout Detection
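
One way to consume the OCR response is to accept the text only when `confidence` clears a threshold. The sketch below assumes the response has already been parsed into a dict; the helper name `accept_ocr_text` and the 0.9 threshold are illustrative choices, and `blocks` is left untouched because its structure is not shown here.

```python
def accept_ocr_text(response: dict, min_confidence: float = 0.9) -> list[str]:
    """Return the recognized text as non-empty lines, or raise if confidence is too low."""
    confidence = response["confidence"]
    if confidence < min_confidence:
        raise ValueError(
            f"OCR confidence {confidence:.2f} is below {min_confidence:.2f}"
        )
    return [line.strip() for line in response["text"].splitlines() if line.strip()]


# Sample shaped like the response above (text shortened, "blocks" left empty).
sample = {
    "text": "Invoice Date: 01/15/2025\nAmount Due: $1,250.00",
    "confidence": 0.96,
    "blocks": [],
}
lines = accept_ocr_text(sample)
```

Low-confidence results raise instead of returning partial text, so a caller can send those scans to manual review.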

Document Conversion

Convert between document formats, including DOCX, PDF, HTML, TXT, and RTF.

POST /api/v1/documents/convert
{ "file_url": "https://.../report.docx",
  "output_format": "pdf",
  "options": { "preserve_links": true } }

// Response
{ "converted_url": "https://.../report.pdf",
  "original_format": "docx",
  "pages": 12 }

Tags: DOCX, PDF, HTML
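
The conversion response can be mapped onto a small typed record so that missing fields fail early. This is a sketch only: `ConversionResult` is a hypothetical name, and the sample URL is a placeholder because the URLs in the example above are elided.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ConversionResult:
    """Typed view of the three fields shown in the conversion response."""
    converted_url: str
    original_format: str
    pages: int

    @classmethod
    def from_json(cls, body: str) -> "ConversionResult":
        data = json.loads(body)
        # A missing key raises KeyError here rather than surfacing later.
        return cls(
            converted_url=data["converted_url"],
            original_format=data["original_format"],
            pages=int(data["pages"]),
        )


# Placeholder response body; the host is an assumption.
body = (
    '{"converted_url": "https://cdn.example.com/report.pdf", '
    '"original_format": "docx", "pages": 12}'
)
result = ConversionResult.from_json(body)
```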

PDF Merge/Split

Merge multiple PDFs into one or split a PDF into separate files by page ranges.

POST /api/v1/documents/pdf/merge
{ "files": [
    "https://.../doc1.pdf",
    "https://.../doc2.pdf"
  ] }

// Response
{ "merged_url": "https://.../merged.pdf",
  "total_pages": 24,
  "size_bytes": 102400 }

Tags: Merge, Split, Page Ranges
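
A small sketch of building the merge body and sanity-checking the response on the client side. The helper names, the two-file minimum, and the size limit are assumptions; that files are merged in the order listed is also assumed rather than stated here. The sample URLs are placeholders.

```python
def build_merge_payload(file_urls: list[str]) -> dict:
    """Build the JSON body for the merge endpoint, keeping the given order."""
    urls = [url.strip() for url in file_urls if url.strip()]
    if len(urls) < 2:
        raise ValueError("merging needs at least two PDF URLs")
    return {"files": urls}


def merge_looks_valid(response: dict, max_bytes: int) -> bool:
    """Check that the merged file has pages and stays under a caller-chosen size."""
    return response["total_pages"] > 0 and response["size_bytes"] <= max_bytes


payload = build_merge_payload([
    "https://files.example.com/doc1.pdf",  # placeholder URLs
    "https://files.example.com/doc2.pdf",
])
ok = merge_looks_valid({"total_pages": 24, "size_bytes": 102400},
                       max_bytes=5_000_000)
```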

Table Extraction

Extract tables from PDFs and images. Get structured data in JSON, CSV, or Excel format.

POST /api/v1/documents/tables/extract
{ "file_url": "https://.../report.pdf",
  "output_format": "json" }

// Response
{ "tables": [
  { "page": 1, "rows": 15, "cols": 4,
    "data": [["Item", "Qty", "Price"],
      ["Widget A", "10", "$50"]...] }
] }

Tags: PDF Tables, Image Tables, CSV Export
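
When `output_format` is `"json"`, each table's `data` field is a list of rows, so it can be written to CSV locally with Python's `csv` module. The sketch below assumes the first row holds the column headers, as in the example above; `table_to_csv` is an illustrative helper name.

```python
import csv
import io


def table_to_csv(table: dict) -> str:
    """Serialize one entry of the "tables" array to CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(table["data"])  # header row first, then data rows
    return buffer.getvalue()


# Sample shaped like the response above, trimmed to the rows that are shown.
sample = {
    "tables": [
        {
            "page": 1,
            "data": [["Item", "Qty", "Price"], ["Widget A", "10", "$50"]],
        }
    ]
}
csv_text = table_to_csv(sample["tables"][0])
```

The `csv` writer handles quoting, so cells that contain commas or quotes are escaped correctly.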

Document Classification

Classify documents by type: invoices, receipts, contracts, IDs, and custom categories.

POST /api/v1/documents/classify
{ "file_url": "https://.../document.pdf" }

// Response
{ "document_type": "invoice",
  "confidence": 0.94,
  "subtypes": ["commercial_invoice"],
  "detected_fields": ["vendor", "amount",
    "date", "invoice_number"] }

Tags: Auto-Classify, Custom Types, Field Detection
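
A classification result is typically used to route a document. The sketch below sends a document to a per-type queue only when `confidence` is high enough and the expected fields were detected. The threshold, the `REQUIRED_FIELDS` policy, and the queue names are all hypothetical.

```python
# Hypothetical policy: fields that must be detected before automatic processing.
REQUIRED_FIELDS = {
    "invoice": {"vendor", "amount", "invoice_number"},
}


def route_document(response: dict, min_confidence: float = 0.9) -> str:
    """Return a queue name for the document, or "manual_review" if unsure."""
    if response["confidence"] < min_confidence:
        return "manual_review"
    doc_type = response["document_type"]
    detected = set(response.get("detected_fields", []))
    missing = REQUIRED_FIELDS.get(doc_type, set()) - detected
    return "manual_review" if missing else f"{doc_type}_queue"


sample = {
    "document_type": "invoice",
    "confidence": 0.94,
    "subtypes": ["commercial_invoice"],
    "detected_fields": ["vendor", "amount", "date", "invoice_number"],
}
queue = route_document(sample)
```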

Ready to process documents at scale?

Get your API key and start using Document APIs in minutes.