PLATFORM · OCR RECEIPT PROCESSING
REME's OCR engine extracts 40+ structured fields from any receipt: faded thermal paper, crumpled taxi invoices, multi-language receipts from hawker centre stalls. 11 languages. 8 tax jurisdictions. A per-company model that learns your vendors, categories, and patterns over time.
MAXWELL FOOD CENTRE
Stall 31 · 1 Kadayanallur St, Singapore 069184
GST Reg: M90364899X
14 Apr 2026 12:34 PM
EXTRACTED IN 1.2s
Vendor demos use clean receipts on white backgrounds in good lighting. Real expense workflows don't. They use thermal paper that faded after a week in a wallet, mobile photos taken at angles, multi-language merchants in Bangkok or Dubai, and crumpled invoices from a taxi driver who barely speaks English. The "98% accuracy" claim that's true in the demo drops to 70% in production — which means your finance team rebuilds 30% of every batch by hand.
29% of T&E managers
still process expenses manually in 2026, not because they want to, but because OCR engines fail on the receipts they actually receive. The gap between "lab accuracy" and "production accuracy" is what forces the manual work.
Source: Skift / Navan 2026 State of Corporate Travel and Expense Report.
Faded thermal paper. Crumpled wrinkled receipts. Photos taken at angles or in low light. Reflections, shadows, blur. Most OCR engines need clean inputs to hit advertised accuracy.
A hawker centre stall in Singapore. A souk vendor in Dubai. A bodega in São Paulo. Every region has its own receipt format. Engines trained on US/EU receipts misread totals as taxes, dates as line items, line items as merchants.
A receipt in Mandarin, Arabic, Thai, Bahasa Indonesia, or Hindi. Many "multi-language" OCR engines fall back to character-level recognition without understanding receipt structure — extracting text but not data.
REME's OCR engine doesn't just read text. It runs a five-stage pipeline that handles every step from raw photo to structured, validated data ready for posting.
De-skew the image, correct lighting, sharpen text, separate receipt from background. Even bad photos become readable.
Computer vision model extracts every text element on the receipt — merchant, dates, amounts, line items, tax codes.
LLM reads the recognized text in context — distinguishes total from subtotal, tax from tip, merchant from address.
40+ structured fields output. Each field cross-checked against known patterns, merchant databases, and your company's history.
Every correction your team makes feeds back into your company-specific model. Accuracy gets better every week.
End-to-end pipeline latency: typically 1.0–2.5 seconds per receipt. The employee sees the extracted data appear in WhatsApp before they put their phone down.
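As a rough illustration of how the stages compose, here is a minimal Python sketch. The `Receipt` type, the function names, and the stub logic are all hypothetical stand-ins, not REME's implementation: plain text stands in for the photo, and simple string handling stands in for the vision model and the LLM. The fifth stage (learning from corrections) is left out because it happens after human review.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass
class Receipt:
    """Hypothetical container passed from stage to stage."""
    raw_text: str                                   # stands in for the photo
    fields: dict = field(default_factory=dict)
    issues: list = field(default_factory=list)


def preprocess(r: Receipt) -> Receipt:
    # Stand-in for de-skew / lighting / sharpening: tidy whitespace only.
    lines = (line.strip() for line in r.raw_text.splitlines())
    r.raw_text = "\n".join(line for line in lines if line)
    return r


def recognize(r: Receipt) -> Receipt:
    # Stand-in for the vision model: split "Label: value" lines into pairs.
    r.fields["tokens"] = [
        line.split(":", 1) for line in r.raw_text.splitlines() if ":" in line
    ]
    return r


def understand(r: Receipt) -> Receipt:
    # Stand-in for the LLM step: map recognised labels to named fields.
    for label, value in r.fields.pop("tokens"):
        r.fields[label.strip().lower()] = value.strip()
    return r


def validate(r: Receipt) -> Receipt:
    # Cross-check one known pattern: subtotal + tax should equal total.
    subtotal, tax, total = (
        Decimal(r.fields[k]) for k in ("subtotal", "tax", "total")
    )
    if subtotal + tax != total:
        r.issues.append("total does not equal subtotal + tax")
    return r


STAGES = [preprocess, recognize, understand, validate]


def run_pipeline(raw_text: str) -> Receipt:
    receipt = Receipt(raw_text)
    for stage in STAGES:
        receipt = stage(receipt)
    return receipt


result = run_pipeline("  Subtotal: 7.00 \n Tax: 0.63\n\n Total: 7.63 ")
print(result.fields, result.issues)
```

Keeping each stage as a function with the same signature makes it easy to time, swap, or test stages one at a time.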
Most OCR engines extract a handful of "header" fields — merchant, total, date — and call it a day. REME extracts 40+ structured fields including line items, multi-jurisdiction tax codes, payment methods, and merchant categorizations. The fields you'd care about for proper accounting are all there.
"Self-learning AI" is in every vendor's marketing deck. What it actually means matters. REME maintains a per-company model that captures your specific patterns — your top vendors, your category mappings, your typical tax breakdowns, your receipt formats. Every correction your team makes during the first weeks feeds back into the model. By month three, your accuracy is materially higher than it was on day one.
Accuracy curve from a 200-employee REME customer with multi-currency operations across SG, IN, AU, UK. First-month corrections fed the per-company model; corrections dropped 80% by month three.
What the model learns
Your top vendors
After 50 submissions from "Maxwell Food Centre," the model knows the merchant name, typical amounts, and category — even if a receipt is partially unreadable.
Your category mappings
When your team consistently codes "Grab" trips to "Travel — Local Transport," the model learns the mapping. Future Grab receipts auto-code correctly.
Your tax patterns
GST for India, VAT for the UK, GST (administered by IRAS) for Singapore. The model learns which jurisdictions your team operates in and applies the right tax engine.
Your receipt formats
If your company gets lots of e-invoices from a specific vendor, the model recognizes that vendor's PDF layout. Hawker centre handwritten receipts? Same — once seen, recognized.
Your team's corrections
When a finance manager corrects a category or amount, that specific correction pattern reinforces the model. Same correction never needed twice.
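One simple way to picture learning from corrections is a per-company tally of which category reviewers assign to each vendor. The sketch below is an assumption-laden illustration, not REME's model: the `CorrectionMemory` class, the `min_votes` threshold, and the category names are all hypothetical.

```python
from collections import Counter, defaultdict


class CorrectionMemory:
    """Hypothetical per-company store of reviewer category corrections."""

    def __init__(self, min_votes: int = 3):
        # Assumed rule: only auto-code once a category has enough votes.
        self.min_votes = min_votes
        self._votes = defaultdict(Counter)

    def record(self, vendor: str, category: str) -> None:
        """Store one reviewer decision for a vendor."""
        self._votes[vendor.strip().lower()][category] += 1

    def suggest(self, vendor: str):
        """Return the most frequent category, or None if evidence is thin."""
        votes = self._votes.get(vendor.strip().lower())
        if not votes:
            return None
        category, count = votes.most_common(1)[0]
        return category if count >= self.min_votes else None


memory = CorrectionMemory()
for _ in range(3):
    memory.record("Grab", "Local Transport")
memory.record("grab", "Meals")  # a one-off outlier does not win

print(memory.suggest("GRAB"))
print(memory.suggest("Maxwell Food Centre"))
```

A production model would weigh far more signals (amounts, tax patterns, layouts), but the core idea is the same: repeated, consistent corrections turn into defaults.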
"99.9% OCR accuracy" is a statement about clean, well-lit POS receipts — which are the easiest case. Real business receipt mixes include faded thermal paper, crumpled invoices, mobile shots in poor lighting, and handwritten bar tabs. Here's how REME actually performs across receipt types, alongside honest industry benchmarks.
Industry averages from public OCR benchmarks (ICDAR, FUNSD datasets, vendor-disclosed numbers). REME production numbers from internal customer deployments — actual performance varies by receipt mix and deployment maturity. The self-learning model typically improves accuracy 15–25% over the first 90 days as it adapts to your company's specific vendors and receipt formats.
Most OCR engines list "multi-language support" without specifying what that means. REME natively reads receipts in 11 languages and applies the correct tax engine for each jurisdiction. Singapore's IRAS rules differ from India's GST. UAE FTA differs from UK HMRC. Mid-market companies operating across these regions need an OCR engine that understands the difference.
More languages added quarterly based on customer geographic distribution.
Multi-component breakdowns supported (e.g., CGST + SGST or IGST for India).
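To make the multi-component idea concrete, the following Python sketch splits a taxable amount into named components and checks that they add up to the tax printed on the receipt. The function names, the 2.5% rates, and the amounts are illustrative assumptions only; actual rates depend on the goods, the jurisdiction, and the date.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def split_tax(taxable: Decimal, rates: dict[str, Decimal]) -> dict[str, Decimal]:
    """Compute each tax component on a taxable amount, rounded to 2 places."""
    return {
        name: (taxable * rate).quantize(CENT, rounding=ROUND_HALF_UP)
        for name, rate in rates.items()
    }


def components_match(
    components: dict[str, Decimal],
    stated_tax: Decimal,
    tolerance: Decimal = CENT,
) -> bool:
    """True if the components sum to the printed tax within a rounding tolerance."""
    total = sum(components.values(), Decimal("0"))
    return abs(total - stated_tax) <= tolerance


# Illustrative rates, not a statement about any specific receipt.
rates = {"CGST": Decimal("0.025"), "SGST": Decimal("0.025")}
breakdown = split_tax(Decimal("400.00"), rates)

print(breakdown)
print(components_match(breakdown, Decimal("20.00")))
```

Using `Decimal` rather than floats avoids binary rounding surprises when amounts are compared to the cent.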
Receipt extraction is step zero. Without accurate OCR, fraud detection has nothing to compare, policy enforcement can't apply rules, and approval routing doesn't know who to route to. OCR is the foundation that makes the rest possible.
99.9% accuracy across 40+ fields. The data foundation.
5 rule categories check the extracted data against your policies in 200ms.
7 AI agents check the extracted data for duplicates, fakes, anomalies.
Tiered routing decides who approves based on the extracted amount, category, and project.
Every layer above OCR depends on the data OCR extracts. 99.9% accuracy isn't a vanity metric; it's the foundation that makes the rest of the platform reliable.
Extracted, structured data posts in real time to your accounting and ERP system. No middleware, no manual export-import, no data quality cleanup at month-end.
2-way real-time sync · 40+ field mapping · Multi-currency · Multi-entity support
OCR FAQ
Most modern OCR engines claim 95–99% accuracy on header fields (merchant, total, date), but those numbers come from clean test receipts. REME's 99.9% accuracy holds across production receipts including faded thermal paper, mobile photos, multi-language vendors, and crumpled receipts from real travel scenarios. The model is also self-learning: accuracy starts at ~96% on day one and reaches 99.9% over the first 90 days as the model learns your company's specific vendors, categories, and tax patterns.
REME maintains a per-company OCR model that captures your specific patterns. After ~50 submissions from a specific vendor, the model knows that vendor's typical amounts, categories, and tax breakdowns — even if a future receipt is partially unreadable. After 200+ corrections by your finance team, the model learns your specific category mappings and stops making the same mistakes. By month three, your model is materially better than it was on day one. This is different from generic LLM-based OCR that doesn't retain company-specific patterns.
40+ structured fields across four categories: (1) Transaction fields — total, subtotal, currency, tip, discount, payment method, date, time, receipt ID. (2) Merchant fields — name, address, tax registration, category, country, phone, format type. (3) Tax fields — GST, VAT, sales tax, service charge, tax registration validation, recoverable amount, multi-component breakdowns. (4) Line-item fields — description, quantity, unit price, item-level discount, item-level tax, category, total per line. Most legacy OCR engines extract 5–10 header fields; REME extracts the 40+ fields you'd actually want for proper accounting.
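The four categories map naturally onto a nested record. The sketch below shows one possible shape in Python; the class and attribute names are hypothetical, only a subset of the listed fields is included, and the line item and amounts are made up for illustration. It is not REME's actual output schema.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import Optional


@dataclass
class Transaction:
    total: Decimal
    currency: str
    subtotal: Optional[Decimal] = None
    tip: Optional[Decimal] = None
    payment_method: Optional[str] = None
    date: Optional[str] = None          # ISO 8601 string in this sketch


@dataclass
class Merchant:
    name: str
    address: Optional[str] = None
    tax_registration: Optional[str] = None
    country: Optional[str] = None


@dataclass
class Tax:
    components: dict = field(default_factory=dict)   # e.g. {"GST": Decimal("0.63")}
    service_charge: Optional[Decimal] = None


@dataclass
class LineItem:
    description: str
    quantity: Decimal
    unit_price: Decimal

    @property
    def line_total(self) -> Decimal:
        return self.quantity * self.unit_price


@dataclass
class ExtractedReceipt:
    transaction: Transaction
    merchant: Merchant
    tax: Tax
    line_items: list = field(default_factory=list)


# Merchant details come from the sample receipt on this page;
# the line item and the subtotal/tax split are illustrative.
receipt = ExtractedReceipt(
    transaction=Transaction(
        total=Decimal("7.63"), currency="SGD",
        subtotal=Decimal("7.00"), date="2026-04-14",
    ),
    merchant=Merchant(
        name="MAXWELL FOOD CENTRE",
        address="Stall 31 · 1 Kadayanallur St, Singapore 069184",
        tax_registration="M90364899X", country="SG",
    ),
    tax=Tax(components={"GST": Decimal("0.63")}),
    line_items=[LineItem("Item A", Decimal("1"), Decimal("7.00"))],
)
print(receipt.line_items[0].line_total)
```

Grouping fields this way keeps header data, tax data, and line items separately addressable when mapping to an accounting system.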
11 languages natively: English, Mandarin (Simplified + Traditional), Bahasa Indonesia, Bahasa Melayu, Hindi, Tamil, Thai, Vietnamese, Arabic, French, German. More languages added quarterly based on customer geographic distribution. Native support means the OCR engine understands receipt structure in that language — not just character recognition. A Mandarin-language receipt extracts merchant, amount, and tax breakdowns correctly; it doesn't just produce a text dump.
8 native tax engines: IRAS (Singapore), GST (India + Australia), VAT (UK + EU member states), IRS (US), HMRC (UK supplementary), FTA (UAE + GCC). Multi-component breakdowns supported (e.g., India's GST, which splits into CGST + SGST on intra-state sales or IGST on inter-state sales). The engine identifies the jurisdiction from the receipt, typically from the merchant's tax registration number or country of issue, and applies the correct tax rules automatically. For multi-entity companies operating across jurisdictions, this is one of the strongest reasons teams pick REME over geographically concentrated tools.
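A minimal sketch of jurisdiction detection might try the tax registration number first and fall back to the country of issue. Everything here is an assumption for illustration: the `detect_jurisdiction` function is hypothetical, and the patterns are deliberately simplified (the Singapore one is only shaped like the GST Reg number on the sample receipt on this page). Real registration formats have more variants and checksum rules.

```python
import re
from typing import Optional

# Simplified, illustrative patterns only. Not a complete or authoritative list.
PATTERNS = [
    # 15-character Indian GSTIN-style identifier
    ("IN", re.compile(r"^\d{2}[A-Z]{5}\d{4}[A-Z][0-9A-Z]Z[0-9A-Z]$")),
    # UK VAT-style identifier: GB followed by nine digits
    ("GB", re.compile(r"^GB\d{9}$")),
    # Shaped like the sample receipt's GST Reg number (M + 8 digits + letter)
    ("SG", re.compile(r"^M\d{8}[A-Z]$")),
]


def detect_jurisdiction(
    tax_reg: Optional[str],
    country_of_issue: Optional[str] = None,
) -> Optional[str]:
    """Return a country code from the registration number, else the fallback."""
    if tax_reg:
        normalized = re.sub(r"[\s\-]", "", tax_reg).upper()
        for code, pattern in PATTERNS:
            if pattern.match(normalized):
                return code
    return country_of_issue


print(detect_jurisdiction("M90364899X"))
print(detect_jurisdiction("unreadable", country_of_issue="AE"))
```

Trying the registration number first reflects the order described above; the country of issue only decides when the number is missing or unrecognised.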
1.0–2.5 seconds end-to-end on the typical receipt. The five-stage pipeline (pre-process, recognize, understand structure, extract & validate, learn) runs sequentially, but each stage is optimized for sub-second performance. The employee sees the extracted data appear in their WhatsApp conversation before they put their phone down. The submitter can correct any field with a follow-up message ('the date is March 14, not 4') and the model learns from that correction.
REME's pipeline assigns a confidence score to every extracted field (0–100%). If confidence falls below the configured threshold (default: 95%), the field is flagged for human review — either by the submitter ('we extracted SGD 7.63 — please confirm') or by the finance team during approval. Low-confidence fields don't silently flow into your accounting system as bad data. The threshold is configurable per company; some teams require 99%+ confidence on tax fields specifically because GST/VAT recovery depends on accurate amounts.
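The threshold logic described here can be sketched in a few lines of Python. The 95% default and the stricter per-field setting for tax come from the text above; the `FieldResult` type, the `split_for_review` function, the field names, and the sample confidence values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FieldResult:
    name: str
    value: str
    confidence: float   # 0-100


DEFAULT_THRESHOLD = 95.0


def split_for_review(
    fields: list,
    default_threshold: float = DEFAULT_THRESHOLD,
    overrides: Optional[dict] = None,
):
    """Return (accepted, flagged); flagged fields need human confirmation."""
    overrides = overrides or {}
    accepted, flagged = [], []
    for f in fields:
        threshold = overrides.get(f.name, default_threshold)
        # A field is flagged when its confidence falls below its threshold.
        (accepted if f.confidence >= threshold else flagged).append(f)
    return accepted, flagged


extracted = [
    FieldResult("merchant", "MAXWELL FOOD CENTRE", 99.2),
    FieldResult("total", "SGD 7.63", 91.0),
    FieldResult("gst_amount", "0.63", 97.5),
]

# Hypothetical company setting: require 99%+ on the tax amount.
accepted, flagged = split_for_review(extracted, overrides={"gst_amount": 99.0})

for f in flagged:
    print(f"we extracted {f.value} for {f.name}, please confirm")
```

Per-field overrides let a team stay lenient on low-risk fields while holding tax amounts to a stricter bar.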
OCR is step one in a four-layer pipeline. Once OCR extracts the structured data, it flows immediately to: (1) Policy enforcement, where 5 rule categories check the extracted data against your company's policies in 200 milliseconds. (2) Fraud detection, where 7 AI agents check for duplicates, fake receipts, merchant fabrications, and anomalous patterns. (3) Approval routing, where the engine routes the claim to the right approver based on amount, category, and project. Every layer depends on accurate OCR — which is why 99.9% accuracy isn't a vanity metric but the foundation that makes the rest of the platform reliable.
Bad OCR is the silent tax on every expense management tool. Stop paying it.
40+ fields extracted. 11 languages. 8 tax jurisdictions. Self-learning per company. The OCR engine that's actually built for the receipts your team submits, not the receipts in vendor demos. Backed by our adoption guarantee: if your team doesn't hit 80% adoption in 30 days, we waive the next 60 days of paid usage.
The Adoption Guarantee
If your team doesn't hit 80%+ adoption within 30 days of rollout, we waive the next 60 days of paid usage. WhatsApp-based submission delivers 90%+ adoption in week one — we put our pricing where our promise is.