Question 1

Is this just AI OCR? What's different?

Accepted Answer

Traditional OCR returns characters; document agents return decisions. The agent runs structured extraction (vision + schema), applies your business rules (PO matching tolerance, vendor whitelist, duplicate detection), routes by confidence tier (auto-post / review / reject), and writes to your ERP with the correct mappings. OCR is a 1990s technology; this is what it should have grown up to be.
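To make "decisions, not characters" concrete, here is a minimal sketch of the rule and routing step. All type names, field names and thresholds are hypothetical and chosen for illustration. Treating a duplicate or a very low confidence score as a reject is an assumption, not something the answer above specifies.

```typescript
// Hypothetical shapes for illustration; not taken from any real library or client build.
type Route = "auto-post" | "review" | "reject";

interface ExtractedInvoice {
  vendorId: string;
  invoiceNumber: string;
  total: number;
  poTotal: number | null; // total of the matched purchase order, if one was found
  confidence: number;     // overall extraction confidence, 0..1
}

interface Rules {
  vendorWhitelist: string[];
  postedInvoiceKeys: string[]; // "vendorId:invoiceNumber" of invoices already posted
  poTolerance: number;         // allowed relative deviation from the PO total, e.g. 0.02
  autoPostMin: number;         // confidence at or above this may auto-post
  reviewMin: number;           // confidence below this is rejected outright (assumption)
}

interface Decision {
  route: Route;
  reasons: string[];
}

function decide(inv: ExtractedInvoice, rules: Rules): Decision {
  // Hard stops first: duplicates and unusable extractions never reach the ERP.
  if (rules.postedInvoiceKeys.indexOf(`${inv.vendorId}:${inv.invoiceNumber}`) !== -1) {
    return { route: "reject", reasons: ["duplicate invoice"] };
  }
  if (inv.confidence < rules.reviewMin) {
    return { route: "reject", reasons: ["confidence below review threshold"] };
  }

  // Soft failures are collected so the reviewer sees every reason at once.
  const reasons: string[] = [];
  if (rules.vendorWhitelist.indexOf(inv.vendorId) === -1) {
    reasons.push("vendor not on whitelist");
  }
  if (inv.poTotal === null) {
    reasons.push("no matching purchase order");
  } else if (Math.abs(inv.total - inv.poTotal) > rules.poTolerance * inv.poTotal) {
    reasons.push("total outside PO tolerance");
  }
  if (inv.confidence < rules.autoPostMin) {
    reasons.push("confidence below auto-post threshold");
  }

  return { route: reasons.length === 0 ? "auto-post" : "review", reasons };
}
```

Only a document that passes every rule and clears the auto-post threshold is written to the ERP without a human in the loop; everything else carries its reasons into the review queue.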

Question 2

How accurate is it in practice?

Accepted Answer

On well-tuned schemas with clean inputs, we routinely see 85–92% auto-post rates with human review on the rest. That number depends heavily on the variety of document layouts and your tolerance for false positives. We measure accuracy honestly — recall, precision, per-field error rates — and tune until the numbers justify deploying.
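As an illustration of what per-field measurement can look like, the sketch below scores one field against a hand-labelled set. The function and type names are hypothetical. It assumes one common convention: a wrong value counts against both precision and recall, and a missing value counts against recall only.

```typescript
// One record per document: field name -> value, or null when the field is absent.
type FieldValues = Record<string, string | null>;

interface FieldMetrics {
  precision: number; // of the values the agent produced, how many were right
  recall: number;    // of the values that exist in the ground truth, how many were found
  errorRate: number; // share of documents where the agent's value differs from the truth
}

// Returns 0 when the denominator is 0 (a convention chosen for this sketch).
function ratio(numerator: number, denominator: number): number {
  return denominator === 0 ? 0 : numerator / denominator;
}

function fieldMetrics(extracted: FieldValues[], truth: FieldValues[], field: string): FieldMetrics {
  let truePositives = 0;
  let falsePositives = 0;
  let falseNegatives = 0;
  let mismatches = 0;

  for (let i = 0; i < truth.length; i++) {
    const got = extracted[i]?.[field] ?? null;
    const want = truth[i][field] ?? null;

    if (got !== want) mismatches++;
    if (got !== null && got === want) {
      truePositives++;
    } else {
      if (got !== null) falsePositives++;  // produced a value, but the wrong one
      if (want !== null) falseNegatives++; // a real value was missed or got wrong
    }
  }

  return {
    precision: ratio(truePositives, truePositives + falsePositives),
    recall: ratio(truePositives, truePositives + falseNegatives),
    errorRate: ratio(mismatches, truth.length),
  };
}
```

Running this per field (total, invoice number, VAT ID and so on) rather than per document shows which fields hold the auto-post rate down.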

Question 3

What document types can you process?

Accepted Answer

Invoices (PDF, scanned, image), receipts (often phone-camera quality), purchase orders, delivery notes, contracts (clause extraction), insurance claim forms, KYC / identity documents, tax forms, expense reports, packing lists. If the document has a definable schema, we can build for it. Handwritten documents are harder but possible with the latest vision models.

Question 4

What if a document type is brand new and the agent has never seen it?

Accepted Answer

It routes to a human review queue with a structured 'unknown layout' tag. The reviewer either categorises it (adds it to the known types) or processes it manually. We log every unknown layout so we can decide if it's worth adding to the schema. Most clients see fewer unknown layouts after the first month.
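One way to turn that log into a decision is a simple tally. The sketch below is an assumption about how such a log might be shaped: the event fields, the grouping by sender, and the `minCount` cut-off are all hypothetical.

```typescript
// Hypothetical log entry written whenever a document lands in review as an unknown layout.
interface UnknownLayoutEvent {
  documentId: string;
  sender: string; // e.g. the supplier's email domain; the grouping key is an assumption
  tag: "unknown layout";
}

// Returns senders whose unknown layouts appeared at least `minCount` times,
// most frequent first, so the team can decide which layouts to add next.
function layoutsWorthAdding(log: UnknownLayoutEvent[], minCount: number): string[] {
  const counts: Record<string, number> = {};
  for (const event of log) {
    counts[event.sender] = (counts[event.sender] || 0) + 1;
  }
  return Object.keys(counts)
    .filter((sender) => counts[sender] >= minCount)
    .sort((a, b) => counts[b] - counts[a] || a.localeCompare(b));
}
```

A one-off layout stays a manual job; a sender that shows up repeatedly is a candidate for a known type.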

Question 5

How do you handle vendor / supplier variations in invoice layouts?

Accepted Answer

Two approaches that we combine. (1) A general extraction prompt with a strict Zod schema that handles 80–90% of layouts via the LLM's general capability. (2) Per-vendor templates for high-volume vendors where we know the layout — these are faster and cheaper. We escalate to a human reviewer only when both fail.
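The combination can be sketched as a fallback chain. To keep the example dependency-free it uses a small hand-written validator in place of the Zod schema mentioned above; the `Invoice` fields, the `Extractor` type and the template-first ordering are assumptions made for illustration, and the extractors stand in for real model or template calls.

```typescript
interface Invoice {
  vendorName: string;
  invoiceNumber: string;
  total: number;
  currency: string; // three-letter code such as "EUR"
}

interface Validation {
  invoice: Invoice | null;
  issues: string[];
}

// Hand-rolled stand-in for a strict schema check (a Zod schema would play this role).
function validateInvoice(raw: unknown): Validation {
  if (typeof raw !== "object" || raw === null) {
    return { invoice: null, issues: ["not an object"] };
  }
  const r = raw as Record<string, unknown>;
  const vendorName = r.vendorName;
  const invoiceNumber = r.invoiceNumber;
  const total = r.total;
  const currency = r.currency;
  const issues: string[] = [];

  if (typeof vendorName !== "string" || vendorName === "") issues.push("vendorName: expected non-empty string");
  if (typeof invoiceNumber !== "string" || invoiceNumber === "") issues.push("invoiceNumber: expected non-empty string");
  if (typeof total !== "number" || !isFinite(total) || total < 0) issues.push("total: expected non-negative number");
  if (typeof currency !== "string" || !/^[A-Z]{3}$/.test(currency)) issues.push("currency: expected three-letter code");

  if (issues.length > 0) return { invoice: null, issues };
  return { invoice: { vendorName, invoiceNumber, total, currency } as Invoice, issues: [] };
}

// Stand-in for a per-vendor template or a general LLM extraction call.
type Extractor = (documentText: string) => unknown;

interface Outcome {
  via: "template" | "general" | "human-review";
  invoice: Invoice | null;
  issues: string[]; // why earlier attempts failed; shown to the reviewer on escalation
}

function extractWithFallback(documentText: string, vendorTemplate: Extractor | null, general: Extractor): Outcome {
  const issues: string[] = [];

  // Try the cheaper per-vendor template first when one exists (ordering is an assumption).
  if (vendorTemplate !== null) {
    const fromTemplate = validateInvoice(vendorTemplate(documentText));
    if (fromTemplate.invoice !== null) return { via: "template", invoice: fromTemplate.invoice, issues };
    for (const issue of fromTemplate.issues) issues.push(`template: ${issue}`);
  }

  const fromGeneral = validateInvoice(general(documentText));
  if (fromGeneral.invoice !== null) return { via: "general", invoice: fromGeneral.invoice, issues };
  for (const issue of fromGeneral.issues) issues.push(`general: ${issue}`);

  // Both approaches failed validation: escalate with structured reasons.
  return { via: "human-review", invoice: null, issues };
}
```

The point of the strict schema is that a failed extraction is detected at this boundary and arrives in the review queue with named issues, rather than reaching the ERP as a malformed record.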

Question 6

Does it integrate with NetSuite / SAP / Dynamics / Xero / Sage?

Accepted Answer

Yes. We have shipped integrations with NetSuite (SuiteTalk SOAP and REST), Microsoft Dynamics 365 / Business Central, Xero, Sage Intacct, and several mid-market ERPs. Where the ERP supports webhooks we use them; otherwise we run a polling sync with idempotency. Custom field mapping is part of the build.
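The idempotency part of a polling sync can be sketched as follows. This is a minimal in-memory illustration: `InMemoryErp`, `PendingDoc` and the key format are hypothetical, and a real integration would call the ERP's own API and persist the keys.

```typescript
interface PendingDoc {
  vendorId: string;
  invoiceNumber: string;
  amount: number;
}

// Hypothetical stand-in for an ERP client; it remembers which keys were already posted.
class InMemoryErp {
  private posted: Record<string, PendingDoc> = {};

  post(key: string, doc: PendingDoc): "created" | "skipped" {
    if (Object.prototype.hasOwnProperty.call(this.posted, key)) return "skipped";
    this.posted[key] = doc;
    return "created";
  }

  count(): number {
    return Object.keys(this.posted).length;
  }
}

// The key is derived from the document's identity, so a retry produces the same key.
function idempotencyKey(doc: PendingDoc): string {
  return `${doc.vendorId}:${doc.invoiceNumber}`;
}

// One pass of the polling loop: safe to repeat after a crash or an overlapping run.
function syncOnce(pending: PendingDoc[], erp: InMemoryErp): { created: number; skipped: number } {
  const result = { created: 0, skipped: 0 };
  for (const doc of pending) {
    const outcome = erp.post(idempotencyKey(doc), doc);
    result[outcome]++;
  }
  return result;
}
```

Because the second pass over the same documents only skips, a poll that overlaps the previous one or restarts midway cannot double-post an invoice.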

Question 7

What's the typical cost reduction vs manual processing?

Accepted Answer

Per-document cost typically drops from €1.50–€4.00 (loaded human cost for keying + checking) to €0.05–€0.20 (model + infra + a fraction of reviewer time). At 1,000 invoices per month that's roughly €1,450–€3,800/month in savings (comparing low end with low end and high end with high end), usually with payback within 6 months on a typical build.
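The savings arithmetic is easy to rerun for your own volume. The sketch below is a rough check that pairs the low ends and the high ends of the two per-document ranges quoted above; the function and field names are hypothetical, and amounts are held in euro cents to keep the arithmetic exact.

```typescript
// Per-document cost range in euro cents.
interface CostRange {
  lowCents: number;
  highCents: number;
}

// Monthly savings in euros, pairing low with low and high with high.
function monthlySavingsEuros(
  invoicesPerMonth: number,
  manual: CostRange,
  automated: CostRange
): { low: number; high: number } {
  return {
    low: (invoicesPerMonth * (manual.lowCents - automated.lowCents)) / 100,
    high: (invoicesPerMonth * (manual.highCents - automated.highCents)) / 100,
  };
}

const manualCost: CostRange = { lowCents: 150, highCents: 400 };  // €1.50–€4.00
const automatedCost: CostRange = { lowCents: 5, highCents: 20 };  // €0.05–€0.20

const savings = monthlySavingsEuros(1000, manualCost, automatedCost);
console.log(savings); // { low: 1450, high: 3800 }
```

Changing the first argument shows how strongly the result depends on monthly volume.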

Question 8

How long until live?

Accepted Answer

Single document type (invoices, for example) with one ERP integration: 4–6 weeks. Multiple document types or multi-ERP: 8–14 weeks. We always run in shadow mode first — the agent extracts in parallel with the human team for 1–2 weeks so we can compare outputs and tune before cutover.
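A shadow-mode comparison can be as simple as a field-by-field diff between what the agent extracted and what the team keyed. The sketch below is illustrative only: the names are hypothetical, and it treats the human entry as the reference, which is a simplification because keyed data contains errors too.

```typescript
type Fields = Record<string, string>;

interface Mismatch {
  documentId: string;
  field: string;
  agent: string | null; // null when the agent produced no value for the field
  human: string;
}

interface ShadowReport {
  comparedFields: number;
  agreementRate: number; // share of compared fields where agent and human agree
  mismatches: Mismatch[];
}

// Both inputs map documentId -> extracted fields for the same batch of documents.
function compareShadowRun(agent: Record<string, Fields>, human: Record<string, Fields>): ShadowReport {
  const mismatches: Mismatch[] = [];
  let comparedFields = 0;

  for (const documentId of Object.keys(human)) {
    const agentFields: Fields = agent[documentId] || {};
    for (const field of Object.keys(human[documentId])) {
      comparedFields++;
      const humanValue = human[documentId][field];
      const agentValue = field in agentFields ? agentFields[field] : null;
      if (agentValue !== humanValue) {
        mismatches.push({ documentId, field, agent: agentValue, human: humanValue });
      }
    }
  }

  const agreementRate = comparedFields === 0 ? 1 : (comparedFields - mismatches.length) / comparedFields;
  return { comparedFields, agreementRate, mismatches };
}
```

The mismatch list is the useful output: each entry is either an agent error to tune away or a keying error the team did not know about.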

Layer	Without it	With it
Vision extraction	Manual keying, ~3 min/doc	LLM extraction, ~5 sec/doc at ~€0.02/doc
Schema validation	Garbage in, garbage to ERP	Invalid → review queue with structured reason
Business rules	Errors discovered downstream	Errors caught at the boundary
Confidence routing	All-or-nothing automation	Reviewer only sees the 10–15% that need attention
Audit trail	Compliance pain	Full provenance per posted record

Metric	Before	After (week 8)
Human time per invoice	4–6 min	30 sec (reviewer only)
% invoices keyed manually	100%	13%
Cycle time (receipt → posted)	2–4 days	4 hours
Visible error rate	3.2% (estimated)	0.8%
Cost per invoice	~€3.80	~€0.18

Layer	Default
Ingestion	Email (Resend webhook) / SFTP / portal upload
Storage	Firestore + Cloud Storage
Vision extraction	Claude Sonnet 4.6; Gemini 2.0 Flash for cost-sensitive workloads
Schema validation	Zod (TypeScript) end-to-end
Business rules	TypeScript on Cloud Functions
Reviewer UI	Next.js 16 + shadcn/ui + Firebase Auth
ERP integration	Custom — depends on your ERP (we have shipped NetSuite, Dynamics 365, Xero, Sage)
Observability	Langfuse + Sentry + custom dashboard
Cost attribution	Per-document trace

Engagement	Scope	Investment
Discovery + schema spec	1 week	€4,000–7,000
Single doc-type pipeline (e.g. invoices)	4–6 weeks	€25,000–45,000
Multi-doc pipeline	8–14 weeks	€60,000–120,000
Multi-ERP / multi-tenant	12–20 weeks	€100,000–200,000
Retainer (operations, evals, new schemas)	Monthly	from €2,000/month

AI Document Processing

What document processing actually is

When it pays off

The pipeline

Where the wins come from

Real numbers from a real engagement

How we build it

Phase 1 — Schema definition (1 week)

Phase 2 — Prototype (2 weeks)

Phase 3 — Production build (2–4 weeks)

Phase 4 — Phased rollout (1–2 weeks)

Phase 5 — Iterate (ongoing)

The stack

Pricing — the honest version

What we won't do

Frequently asked questions

Related work

Document Processing Agent

Document Intake Agent

How AI invoice processing actually works (and where it breaks)

AI agents for accounts payable: a deployment guide

RAG done right: the patterns that survive production

Ready to scope AI document processing?