Statement Parser

Status: Live and confirmed working
Host: big-boi (192.168.1.21)
Parser scripts: /opt/statement-parser/parse.py and ingest.py
Ingest log: /var/log/statement-ingest.log
Ansible role: ansible-heezy/roles/statement-parser/

Ingest Pipeline

The ingest pipeline watches an NFS drop directory and processes new bank statements automatically:

  1. Drop a bank statement PDF into /nfs/heezy/ingest/raw/statements/new/ on big-boi
  2. Cron runs every 5 minutes: ingest.py picks up new PDFs
  3. parse.py extracts transactions using pdfplumber
  4. Transactions get auto-populated with:
    • category - auto-classified by a rule-based keyword matcher
    • merchant_normalized - standardized and alias-mapped
  5. Data written to bank_statements + bank_transactions in heezy Postgres
  6. Processed file moved to processed/, errors to error/
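
The pickup-and-move part of the steps above can be sketched roughly as follows. This is a minimal illustration, not the actual ingest.py: the function name sweep, the parse callback, and the assumption that processed/ and error/ sit next to new/ are all hypothetical.

```python
import shutil
from pathlib import Path
from typing import Callable


def sweep(base: Path, parse: Callable[[Path], list]) -> dict:
    """Process every PDF in base/new, then move it to processed/ or error/.

    Assumption: processed/ and error/ are siblings of new/ under `base`.
    `parse` is a hypothetical hook standing in for parse.py.
    """
    results = {"processed": [], "error": []}
    for pdf in sorted((base / "new").glob("*.pdf")):
        try:
            parse(pdf)
            outcome = "processed"
        except Exception:
            # Any parse failure routes the file to error/ for inspection.
            outcome = "error"
        dest_dir = base / outcome
        dest_dir.mkdir(parents=True, exist_ok=True)
        shutil.move(str(pdf), str(dest_dir / pdf.name))
        results[outcome].append(pdf.name)
    return results
```

Moving the file only after parsing finishes means a PDF that fails mid-run stays visible in error/ instead of being retried on every cron tick.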

Supported Banks

  • Bank of America: Checking account + CC detection
  • Capital One 360: Checking + Savings

Statement ID Format

{bank}{type}{last4}_{YYYY-MM-DD} - e.g. boachk1897_2026-01-27
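
As an illustration of the format, the sketch below builds and splits an ID. The helper names make_statement_id and split_statement_id are hypothetical, and the set of valid bank and type codes is not documented here, so the splitter returns the combined bank+type prefix rather than guessing where one ends.

```python
import re
from datetime import date


def make_statement_id(bank: str, acct_type: str, last4: str, statement_date: date) -> str:
    """Build an ID of the form {bank}{type}{last4}_{YYYY-MM-DD}."""
    if not re.fullmatch(r"[0-9]{4}", last4):
        raise ValueError("last4 must be exactly four digits")
    return f"{bank}{acct_type}{last4}_{statement_date.isoformat()}"


def split_statement_id(statement_id: str) -> tuple:
    """Split an ID into (bank+type prefix, last4, date)."""
    prefix, _, iso = statement_id.rpartition("_")
    return prefix[:-4], prefix[-4:], date.fromisoformat(iso)


print(make_statement_id("boa", "chk", "1897", date(2026, 1, 27)))
# prints boachk1897_2026-01-27
```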

Current Data

  • 7 statements: BoA Jan-May 2026, Cap1 Q1 2026
  • 134 transactions
  • Missing: June 2026 BoA, all credit card statements (Phase 2 pending)

Upload Workflow

  1. Download statement PDF from your bank portal
  2. Transfer to big-boi:
    scp statement.pdf mcp-admin@192.168.1.21:/nfs/heezy/ingest/raw/statements/new/
    
  3. Parser picks it up within 5 minutes. Check status:
    tail -f /var/log/statement-ingest.log
    

Why Not Gmail Auto-Fetch?

Bank emails (BoA, Capital One) are notification-only - they don't include downloadable statements. You must manually download the PDF from the bank portal.

File Permissions

The NFS drop directory is accessible to the ingest process. No manual chown is needed when you scp files into the drop folder.

Categorization

Transactions are automatically classified by a rule-based keyword matcher on insert. No manual action is required.
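
A rule-based keyword matcher of this kind can be sketched as below. The rules, category names, and the "uncategorized" fallback are invented for illustration; the real rule set lives in the parser and may be structured differently.

```python
# Hypothetical rules: (category, keywords). First matching rule wins.
RULES = [
    ("groceries", ("safeway", "trader joe")),
    ("dining", ("starbucks", "doordash")),
    ("utilities", ("comcast", "water dept")),
]


def categorize(description: str, rules=RULES, default: str = "uncategorized") -> str:
    """Return the category of the first rule with a keyword in the description."""
    text = description.lower()
    for category, keywords in rules:
        if any(keyword in text for keyword in keywords):
            return category
    return default


print(categorize("STARBUCKS STORE 1234 SEATTLE WA"))  # prints dining
print(categorize("ACH TRANSFER 0001"))                # prints uncategorized
```

Because the first match wins, rule order matters: more specific keywords should come before broader ones.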

To recategorize all transactions (e.g., after rule updates):

node /home/node/.openclaw/workspace/scripts/categorize-bank-transactions.js