Receipts Scanner
URL: https://receipts.heezy.info
Internal: http://192.168.1.15:30850
k8s: heezy namespace, deployment receipts
Source: heezy-containers/dockerfiles/receipts/
How It Works
- Upload photo of receipt via web UI
- EXIF rotation correction applied immediately
- AWS Textract runs OCR in background (~30–45s)
- Extracted: merchant, total, date, line items
- Ollama (llama3.2:3b) categorizes each line item
- Results appear in UI, editable
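The categorization step above can be sketched roughly as follows. This is a minimal illustration, not the app's actual code: the category list, the prompt wording, and both function names are assumptions. It only builds the JSON body for Ollama's `/api/generate` endpoint and parses a non-streaming reply, so no network call is made here.

```python
import json

# Hypothetical category list; the real set used by the app is not documented here.
CATEGORIES = ["groceries", "household", "dining", "fuel", "other"]


def build_categorize_request(item_description: str, model: str = "llama3.2:3b") -> dict:
    """Build a non-streaming request body for Ollama's /api/generate endpoint."""
    prompt = (
        "Classify this receipt line item into exactly one of: "
        + ", ".join(CATEGORIES)
        + f".\nItem: {item_description}\nAnswer with the category name only."
    )
    return {"model": model, "prompt": prompt, "stream": False}


def parse_category(response_body: str) -> str:
    """Pull the category out of a non-streaming /api/generate reply.

    Falls back to "other" when the model answers with anything unexpected.
    """
    reply = json.loads(response_body).get("response", "")
    answer = reply.strip().lower().rstrip(".")
    return answer if answer in CATEGORIES else "other"
```

Falling back to "other" keeps a chatty or malformed model reply from writing an unknown category; since results are editable in the UI, a wrong guess can still be corrected by hand.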
Database
Writes to receipts + receipt_items tables in heezy Postgres.
Key columns on receipts:
- merchant — raw extracted merchant name
- merchant_normalized — lowercased + alias-mapped
- date — raw OCR date string (legacy, mixed formats)
- date_ts — parsed timestamptz (canonical, used by dashboard)
- category — overall receipt category
- payment_type — cash / credit / debit / check / other (set manually via edit form)
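As a rough sketch of how merchant_normalized and date_ts might be derived from the raw columns: the alias table, the list of date formats, the UTC default, and the function names below are all assumptions for illustration, not the app's actual values.

```python
from datetime import datetime, timezone

# Hypothetical alias table; the real mapping lives in the app, not in this doc.
MERCHANT_ALIASES = {
    "wal-mart": "walmart",
    "walmart supercenter": "walmart",
}

# Assumed set of formats seen in OCR output, tried in order.
DATE_FORMATS = ["%Y-%m-%d", "%m/%d/%Y", "%m/%d/%y"]


def normalize_merchant(raw: str) -> str:
    """Lowercase, collapse whitespace, then apply the alias map."""
    key = " ".join(raw.lower().split())
    return MERCHANT_ALIASES.get(key, key)


def parse_receipt_date(raw: str, tz=timezone.utc):
    """Try each known format; return a timezone-aware datetime or None."""
    text = raw.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).replace(tzinfo=tz)
        except ValueError:
            continue
    return None  # unparseable: leave date_ts NULL and keep the raw string
```

Returning None for unparseable strings matches the split between the two columns: the raw date stays available for manual correction while date_ts only ever holds values that parsed cleanly.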
OCR Pipeline
- Primary: AWS Textract (handles rotation natively, structured output)
- Fallback: Tesseract (if no AWS creds)
- Textract IAM user: receipts-textract in terraform-heezy
- Creds via ExternalSecret → OpenBao production/heezy/receipts/aws-credentials
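The primary/fallback choice could look something like the sketch below. It assumes the ExternalSecret surfaces the credentials to the pod as the standard AWS SDK environment variables (AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY); the function name and the returned labels are hypothetical.

```python
import os


def choose_ocr_engine(env=None) -> str:
    """Return "textract" when AWS credentials are present, else "tesseract".

    Assumes credentials arrive as the standard AWS SDK environment variables;
    the actual app may detect them differently.
    """
    env = os.environ if env is None else env
    has_creds = bool(env.get("AWS_ACCESS_KEY_ID")) and bool(env.get("AWS_SECRET_ACCESS_KEY"))
    return "textract" if has_creds else "tesseract"
```

Passing the environment in as a parameter makes the fallback easy to exercise locally without touching real credentials, for example `choose_ocr_engine({})` to force the Tesseract path.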
Storage
- Images: NFS PVC (192.168.1.200:/mnt/Arr1/SMB/receipts/)
- DB: Postgres on big-boi
Reprocessing
Forces OCR re-run on a stuck or failed receipt.