Statement Parser
Status: Live and confirmed working
Host: big-boi (192.168.1.21)
Parser scripts: /opt/statement-parser/parse.py and ingest.py
Ingest log: /var/log/statement-ingest.log
Ansible role: ansible-heezy/roles/statement-parser/
Ingest Pipeline
The NFS ingest pipeline automatically monitors and processes bank statements:
- Drop a bank statement PDF to /nfs/heezy/ingest/raw/statements/new/ on big-boi
- Cron runs every 5 minutes: ingest.py picks up new PDFs
- parse.py extracts transactions using pdfplumber
- Transactions get auto-populated with:
  - category - auto-classified by rule-based keyword matcher
  - merchant_normalized - standardized and alias-mapped
- Data written to bank_statements + bank_transactions in heezy Postgres
- Processed file moved to processed/, errors to error/
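The pick-up, parse, and move steps above can be sketched roughly as follows. This is a minimal illustration, not the actual ingest.py: the function name `ingest_once`, the injected `parse_fn` callable, and the assumption that processed/ and error/ sit next to new/ are all hypothetical, and the Postgres write is left out.

```python
import shutil
from pathlib import Path
from typing import Callable, Dict, List


def ingest_once(base: Path, parse_fn: Callable[[Path], List[dict]]) -> Dict[str, int]:
    """Process every PDF in base/new once and return outcome counts.

    Hypothetical layout: base/new, base/processed, base/error.
    """
    new_dir = base / "new"
    processed_dir = base / "processed"
    error_dir = base / "error"
    for directory in (new_dir, processed_dir, error_dir):
        directory.mkdir(parents=True, exist_ok=True)

    counts = {"processed": 0, "error": 0}
    for pdf in sorted(new_dir.glob("*.pdf")):
        try:
            # The real pipeline would parse the PDF and write rows to Postgres here.
            parse_fn(pdf)
            target_dir = processed_dir
            counts["processed"] += 1
        except Exception:
            # Any parse failure routes the file to error/ for later inspection.
            target_dir = error_dir
            counts["error"] += 1
        shutil.move(str(pdf), str(target_dir / pdf.name))
    return counts
```

Passing the parser in as a callable keeps the file-handling logic testable without pdfplumber or a database connection.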
Supported Banks
- Bank of America: Checking account + CC detection
- Capital One 360: Checking + Savings
Statement ID Format
{bank}{type}{last4}_{YYYY-MM-DD} - e.g. boachk1897_2026-01-27
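As an illustration of the format, here is a small sketch that builds and validates an ID. The helper name `make_statement_id` and the regular expression are assumptions, and the codes `boa` and `chk` are inferred from the example above rather than taken from the parser itself.

```python
import re
from datetime import date

# Hypothetical validator: a lowercase prefix ({bank}{type}), 4 digits, "_", ISO date.
STATEMENT_ID_RE = re.compile(
    r"^(?P<prefix>[a-z0-9]+?)(?P<last4>\d{4})_(?P<date>\d{4}-\d{2}-\d{2})$"
)


def make_statement_id(bank: str, acct_type: str, last4: str, statement_date: date) -> str:
    """Build an ID shaped like {bank}{type}{last4}_{YYYY-MM-DD}."""
    if not re.fullmatch(r"\d{4}", last4):
        raise ValueError("last4 must be exactly four digits")
    return f"{bank}{acct_type}{last4}_{statement_date.isoformat()}"


statement_id = make_statement_id("boa", "chk", "1897", date(2026, 1, 27))
print(statement_id)  # boachk1897_2026-01-27
```

The regex deliberately does not split the prefix into bank and type, since that split depends on the set of codes the real parser uses.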
Current Data
- 7 statements: BoA Jan-May 2026, Cap1 Q1 2026
- 134 transactions
- Missing: June 2026 BoA, all credit card statements (Phase 2 pending)
Upload Workflow
Manual Method (Recommended)
- Download statement PDF from your bank portal
- Transfer to big-boi:
- Parser picks it up within 5 minutes. Check status:
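One way to script the transfer and status check is sketched below. It assumes scp access to big-boi and reuses the drop directory and ingest log paths given earlier in this page; the helper names `build_scp_command` and `tail` are hypothetical, and the log's line format is not assumed.

```python
from collections import deque
from typing import List, Optional

DROP_DIR = "/nfs/heezy/ingest/raw/statements/new/"
INGEST_LOG = "/var/log/statement-ingest.log"


def build_scp_command(local_pdf: str, host: str = "big-boi",
                      user: Optional[str] = None) -> List[str]:
    """Return the scp argument list that copies a PDF into the drop directory."""
    target = f"{user}@{host}" if user else host
    return ["scp", local_pdf, f"{target}:{DROP_DIR}"]


def tail(path: str, n: int = 20) -> List[str]:
    """Return the last n lines of a text file, e.g. the ingest log."""
    with open(path, encoding="utf-8", errors="replace") as fh:
        return [line.rstrip("\n") for line in deque(fh, maxlen=n)]
```

Running the command list with `subprocess.run(cmd, check=True)` from your workstation, then calling `tail(INGEST_LOG)` on big-boi a few minutes later, would mirror the two steps above.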
Why Not Gmail Auto-Fetch?
Bank emails (BoA, Capital One) are notification-only - they don't include downloadable statements. You must manually download the PDF from the bank portal.
File permissions
The NFS drop directory is accessible to the ingest process, so no manual chown is needed when you scp files into the drop folder.
Categorization
Transactions are automatically classified by a rule-based keyword matcher on insert. No manual action is required.
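A rule-based keyword matcher of this kind could look something like the sketch below. The rule table, the category names, and the `categorize` function are invented for illustration and are not the rules the parser actually ships with.

```python
from typing import Dict, List

# Hypothetical rules: category -> lowercase keywords to look for in the description.
RULES: Dict[str, List[str]] = {
    "groceries": ["kroger", "safeway", "whole foods"],
    "fuel": ["chevron", "exxon"],
    "subscriptions": ["netflix", "spotify"],
}


def categorize(description: str, rules: Dict[str, List[str]] = RULES,
               default: str = "uncategorized") -> str:
    """Return the first category whose keyword appears in the description."""
    text = description.lower()
    for category, keywords in rules.items():
        if any(keyword in text for keyword in keywords):
            return category
    return default


print(categorize("KROGER #123 ATLANTA GA"))  # groceries
```

In this sketch the first matching rule wins, so rule order matters when a description contains keywords from more than one category.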
To recategorize all transactions (e.g., after rule updates):