PDF, OCR & document processing
Document processing APIs for PDF generation, OCR, text extraction, and document conversion.
Generate PDFs from HTML or templates. Create invoices, reports, certificates, and more.
POST /api/v1/documents/pdf/generate
{ "html": "<h1>Invoice #1234</h1>...",
"options": { "format": "A4",
"margin": "20mm" } }
// Response
{ "pdf_url": "https://cdn.../doc.pdf",
"pages": 2,
"size_bytes": 45230 }
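A minimal Python sketch of building this request with the standard library. The host `https://api.example.com` and the `Authorization: Bearer` header are assumptions, since the page shows only the path and not the base URL or authentication scheme; `build_pdf_request` is a hypothetical helper name.

```python
import json
import urllib.request

# Hypothetical host: the page shows only the path, not the base URL.
BASE_URL = "https://api.example.com"


def build_pdf_request(html, api_key, page_format="A4", margin="20mm"):
    """Build (but do not send) a POST request for the PDF generation endpoint."""
    payload = {"html": html, "options": {"format": page_format, "margin": margin}}
    return urllib.request.Request(
        BASE_URL + "/api/v1/documents/pdf/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumed auth scheme; the source does not say how keys are sent.
            "Authorization": "Bearer " + api_key,
        },
        method="POST",
    )


request = build_pdf_request("<h1>Invoice #1234</h1>", api_key="YOUR_API_KEY")
# Sending it would be: urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes the payload easy to inspect before any network call is made.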
Extract text from images and scanned documents. Supports 100+ languages and returns a confidence score with each result.
POST /api/v1/documents/ocr
{ "image_url": "https://.../scan.jpg",
"language": "en" }
// Response
{ "text": "Invoice Date: 01/15/2025\nAmount Due: $1,250.00...",
"confidence": 0.96,
"blocks": [...] }
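One way to use the `confidence` field is to flag low-scoring results for a human check. The sketch below assumes the response has already been parsed into a dict; the `0.9` threshold and the `needs_review` helper are illustrative choices, not part of the API. The `blocks` field is left out because its structure is not shown.

```python
def needs_review(ocr_response, min_confidence=0.9):
    """Return True when the OCR confidence falls below the chosen threshold."""
    return ocr_response["confidence"] < min_confidence


# Sample response from this section, with the elided parts left out.
sample = {
    "text": "Invoice Date: 01/15/2025\nAmount Due: $1,250.00",
    "confidence": 0.96,
}

# The text field uses "\n" between lines, so splitlines() recovers them.
lines = sample["text"].splitlines()
flagged = needs_review(sample)
```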
Convert between document formats. DOCX, PDF, HTML, TXT, RTF, and more.
POST /api/v1/documents/convert
{ "file_url": "https://.../report.docx",
"output_format": "pdf",
"options": { "preserve_links": true } }
// Response
{ "converted_url": "https://.../report.pdf",
"original_format": "docx",
"pages": 12 }
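A small sketch of validating the target format on the client before building the request body. The format identifiers are assumed to be the lowercase names listed in the description; only `"docx"` and `"pdf"` actually appear in the sample. `build_convert_payload` and the `example.com` URL are hypothetical.

```python
# Assumed identifiers: only "docx" and "pdf" appear in the sample.
KNOWN_FORMATS = {"docx", "pdf", "html", "txt", "rtf"}


def build_convert_payload(file_url, output_format, preserve_links=True):
    """Return the JSON body for a conversion request as a dict."""
    fmt = output_format.lower()
    if fmt not in KNOWN_FORMATS:
        raise ValueError(f"unsupported output format: {output_format!r}")
    return {
        "file_url": file_url,
        "output_format": fmt,
        "options": {"preserve_links": preserve_links},
    }


payload = build_convert_payload("https://example.com/report.docx", "PDF")
```

Because the service advertises further formats beyond those listed, the allow-list here is a client-side choice and would need extending for any additional format you rely on.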
Merge multiple PDFs into one or split a PDF into separate files by page ranges.
POST /api/v1/documents/pdf/merge
{ "files": [
"https://.../doc1.pdf",
"https://.../doc2.pdf"
] }
// Response
{ "merged_url": "https://.../merged.pdf",
"total_pages": 24,
"size_bytes": 102400 }
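The split request format is not shown on this page, so the sketch below covers only a client-side step that is likely to be useful either way: checking page ranges against a document's page count before submitting them. The `"1-3,5"` range syntax and the `parse_page_ranges` helper are assumptions for illustration.

```python
def parse_page_ranges(spec, total_pages):
    """Expand a spec such as "1-3,5" into a list of (start, end) page pairs."""
    ranges = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        # "5" has no dash, so end_text is empty and the range is one page.
        start_text, _, end_text = part.partition("-")
        start = int(start_text)
        end = int(end_text) if end_text else start
        if not 1 <= start <= end <= total_pages:
            raise ValueError(f"page range out of bounds: {part!r}")
        ranges.append((start, end))
    return ranges


# 24 matches total_pages in the merge response above.
ranges = parse_page_ranges("1-3, 5, 10-24", total_pages=24)
```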
Extract tables from PDFs and images. Get structured data in JSON, CSV, or Excel format.
POST /api/v1/documents/tables/extract
{ "file_url": "https://.../report.pdf",
"output_format": "json" }
// Response
{ "tables": [
{ "page": 1, "rows": 15, "cols": 3,
"data": [["Item", "Qty", "Price"],
["Widget A", "10", "$50"]...] }
] }
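With `"output_format": "json"`, each table arrives as a list of rows. A short sketch of turning those rows into records, assuming the first row is the header (as it is in the sample); `table_to_records` is a hypothetical helper.

```python
def table_to_records(table):
    """Turn a table's row lists into dicts keyed by the first (header) row."""
    header, *rows = table["data"]
    return [dict(zip(header, row)) for row in rows]


# The two rows visible in the sample response; the elided rows are left out.
sample_table = {
    "page": 1,
    "data": [["Item", "Qty", "Price"], ["Widget A", "10", "$50"]],
}

records = table_to_records(sample_table)
# records == [{"Item": "Widget A", "Qty": "10", "Price": "$50"}]
```

Cell values stay as strings, as in the sample; converting `"10"` or `"$50"` to numbers is left to the caller.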
Classify documents by type. Invoices, receipts, contracts, IDs, and custom categories.
POST /api/v1/documents/classify
{ "file_url": "https://.../document.pdf" }
// Response
{ "document_type": "invoice",
"confidence": 0.94,
"subtypes": ["commercial_invoice"],
"detected_fields": ["vendor", "amount",
"date", "invoice_number"] }
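A classification result can drive simple routing. In this sketch the queue names, the `0.9` threshold, and the `"receipt"` and `"contract"` type labels are assumptions; only `"invoice"` appears in the sample response.

```python
# Hypothetical mapping from document type to a downstream queue.
ROUTES = {
    "invoice": "accounts_payable",
    "receipt": "expenses",
    "contract": "legal",
}


def route_document(result, min_confidence=0.9):
    """Pick a queue for a classified document, falling back to manual review."""
    if result["confidence"] < min_confidence:
        return "manual_review"
    return ROUTES.get(result["document_type"], "manual_review")


sample = {
    "document_type": "invoice",
    "confidence": 0.94,
    "subtypes": ["commercial_invoice"],
    "detected_fields": ["vendor", "amount", "date", "invoice_number"],
}

queue = route_document(sample)
```

Unknown types and low-confidence results both fall through to manual review, so a new category never gets routed silently.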
Get your API key and start using Document APIs in minutes.