Getting Started

Computare is an open-source personal finance platform that extracts transactions from Canadian bank statements, categorizes them with AI, and stores everything in a structured PostgreSQL database you control.

$ git clone https://github.com/Andrew-Girgis/Computare

$ cd Computare

$ computare init

Requires Python 3.11+ and a Supabase project (or self-hosted PostgreSQL). API keys for AI categorization are optional.

Supported Imports

Computare currently supports the following Canadian financial institutions and statement formats.

Scotiabank

PDF statements (chequing, credit card, iTRADE) · 2018 – present

Position-based word extraction with pdfplumber. Falls back to Claude AI vision when confidence is below threshold.
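A minimal sketch of the position-based idea, assuming word dictionaries shaped like the output of pdfplumber's extract_words() (keys text, x0, top). The column boundaries, row tolerance, confidence formula, and 0.8 threshold are illustrative assumptions, not Computare's actual values.

```python
from collections import defaultdict

# Hypothetical column boundaries (in PDF points) for a statement table.
COLUMNS = {"date": (0, 100), "description": (100, 400), "amount": (400, 600)}
ROW_TOLERANCE = 3            # words within 3 pt of a row's first word share that row
CONFIDENCE_THRESHOLD = 0.8   # assumed cut-off for the AI vision fallback


def group_rows(words):
    """Cluster words into rows by vertical position, top to bottom."""
    rows = []
    for word in sorted(words, key=lambda w: (w["top"], w["x0"])):
        if rows and abs(word["top"] - rows[-1][0]["top"]) <= ROW_TOLERANCE:
            rows[-1].append(word)
        else:
            rows.append([word])
    return rows


def row_to_record(row):
    """Assign each word to a column by its left edge (x0)."""
    cells = defaultdict(list)
    for word in sorted(row, key=lambda w: w["x0"]):
        for name, (left, right) in COLUMNS.items():
            if left <= word["x0"] < right:
                cells[name].append(word["text"])
                break
    return {name: " ".join(cells[name]) for name in COLUMNS}


def confidence(records):
    """Share of rows in which every expected column was filled."""
    if not records:
        return 0.0
    complete = sum(1 for r in records if all(r.values()))
    return complete / len(records)


words = [
    {"text": "Jan", "x0": 10, "top": 50.0},
    {"text": "05", "x0": 35, "top": 50.4},
    {"text": "COFFEE", "x0": 110, "top": 50.2},
    {"text": "SHOP", "x0": 160, "top": 50.2},
    {"text": "4.75", "x0": 420, "top": 50.1},
    {"text": "Jan", "x0": 10, "top": 65.0},
    {"text": "06", "x0": 35, "top": 65.0},
    {"text": "GROCERY", "x0": 110, "top": 65.3},
]
records = [row_to_record(row) for row in group_rows(words)]
# The second row has no amount, so this sample page would go to the fallback.
needs_fallback = confidence(records) < CONFIDENCE_THRESHOLD
```

In this sketch a page is handed to the fallback as a whole; a per-row decision would work the same way with the check moved inside the loop.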

Wealthsimple

CSV exports (TFSA, Spending, Credit Card, Crypto) · 2021 – present

Direct CSV parsing. Clean structured data with no ambiguity.
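Direct CSV parsing might look like the sketch below. The column names (date, description, amount) and the sample rows are hypothetical; a real export may use different headers.

```python
import csv
import io
from datetime import date
from decimal import Decimal

# Hypothetical export; real column names may differ.
SAMPLE_CSV = """date,description,amount
2024-03-01,COFFEE SHOP,-4.75
2024-03-02,PAYROLL DEPOSIT,1500.00
"""


def parse_transactions(text):
    """Parse CSV text into typed transaction dictionaries."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {
            "date": date.fromisoformat(row["date"]),
            "description": row["description"].strip(),
            # Decimal avoids binary floating-point rounding on currency.
            "amount": Decimal(row["amount"]),
        }
        for row in reader
    ]


transactions = parse_transactions(SAMPLE_CSV)
```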

American Express

Year-end CSV summaries · 2024 – present

Direct CSV parsing of annual statement summaries.

Categorization Pipeline

Transactions are categorized through a 3-tier pipeline. After the initial run, most transactions resolve from cache at zero cost.

Tier 1: Description Rules

Pattern matching on transaction descriptions. Handles known formats like "INTERAC E-TRANSFER" or "MB-Transferto". Zero cost, instant.
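As a rough illustration, description rules can be expressed as an ordered list of regular expressions. The two patterns come from the examples above; the "Transfers" category name and the match_rule helper are assumptions made for this sketch.

```python
import re

# Illustrative rules; the category name is an assumption.
DESCRIPTION_RULES = [
    (re.compile(r"^INTERAC E-TRANSFER", re.IGNORECASE), "Transfers"),
    (re.compile(r"^MB-Transfer", re.IGNORECASE), "Transfers"),
]


def match_rule(description):
    """Return the category of the first matching rule, or None on a miss."""
    text = description.strip()
    for pattern, category in DESCRIPTION_RULES:
        if pattern.search(text):
            return category
    return None
```

A None result means the description falls through to the next tier.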

Tier 2: Merchant Cache

Lookup against a growing database of known merchants and their default categories. Zero cost after first encounter.
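A minimal sketch of a merchant cache, using an in-memory dictionary in place of the database table. The normalization scheme (uppercase, strip store numbers and punctuation) and the MerchantCache class are assumptions for illustration only.

```python
import re


def normalize_merchant(description):
    """Reduce a raw description to a stable cache key (assumed scheme)."""
    key = description.upper()
    key = re.sub(r"[#*]?\d+", " ", key)   # drop store numbers and reference digits
    key = re.sub(r"[^A-Z& ]", " ", key)   # drop remaining punctuation
    return " ".join(key.split())


class MerchantCache:
    """In-memory stand-in for a merchant cache table."""

    def __init__(self):
        self._categories = {}

    def get(self, description):
        return self._categories.get(normalize_merchant(description))

    def put(self, description, category):
        self._categories[normalize_merchant(description)] = category


cache = MerchantCache()
cache.put("TIM HORTONS #1234", "Food & Dining")
hit = cache.get("Tim Hortons #0987")   # same merchant, different store number
miss = cache.get("UNKNOWN VENDOR")     # None: not seen before
```

Normalizing before lookup is what lets one cached entry cover every store of the same merchant.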

Tier 3: LLM (GPT-4o-mini)

Only fires when rules and cache both miss. Sends only the transaction description — no amounts, no account numbers, no personal info. Covers roughly 15% of new transactions.
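The tier ordering can be sketched as a single dispatch function. Everything here is hypothetical: the categorize signature, the prefix-based rules, the plain dictionary used as a cache, and fake_llm, which stands in for a real model call so the example runs offline. Writing the model's answer back to the cache is also an assumption, inferred from the zero-cost-after-first-encounter behaviour described above.

```python
def categorize(description, rules, cache, llm):
    """Try each tier in order and report which one answered."""
    key = description.upper()
    for prefix, category in rules.items():   # Tier 1: description rules
        if key.startswith(prefix):
            return category, "rules"
    if key in cache:                         # Tier 2: merchant cache
        return cache[key], "cache"
    if llm is None:                          # Tier 3 disabled
        return None, "uncategorized"
    category = llm(description)              # Tier 3: receives the description only
    cache[key] = category                    # the next lookup becomes a cache hit
    return category, "llm"


def fake_llm(description):
    """Stand-in for a real model call."""
    return "Food & Dining"


rules = {"INTERAC E-TRANSFER": "Transfers"}
cache = {}
first = categorize("CORNER BAKERY", rules, cache, fake_llm)    # answered by the LLM
second = categorize("CORNER BAKERY", rules, cache, fake_llm)   # answered by the cache
```

Passing None as the llm argument models a setup with Tier 3 turned off: unmatched transactions simply stay uncategorized.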

Transactions are assigned to one of 13 top-level categories (33 subcategories in total), including Food & Dining, Retail & Shopping, Transportation, and Bills & Utilities.

Self-Hosting

Computare is designed to be self-hosted. You control the database, the extraction pipeline, and all of your financial data.

Database: Supabase (managed PostgreSQL) or self-hosted PostgreSQL. All tables have Row Level Security enabled.
Extraction: Python package runs locally. PDF parsing uses pdfplumber. AI fallback uses Claude API (optional).
API: FastAPI REST API for categorization, merchant normalization, and health checks. Runs alongside your database.
Frontend: Next.js App Router with Supabase auth. Self-host on Vercel, Cloudflare, or any Node.js runtime.

Privacy & Data Flow

Data ownership is non-negotiable. Here is exactly what happens to your data.

Always local

  • Your bank statements and CSV files
  • Transaction data in your database
  • Category rules & merchant cache
  • Dashboard & visualizations

Sent to LLM

  • Transaction descriptions only
  • No amounts or account numbers
  • Only for cache misses (~15%)
  • Optional — can disable Tier 3
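One way to enforce the description-only rule is to build the outbound payload from an explicit allow-list of one field. This is a sketch under stated assumptions: the build_llm_payload helper, the transaction field names, and the JSON shape are all hypothetical.

```python
import json


def build_llm_payload(transactions, cached_descriptions):
    """Collect unique uncached descriptions; every other field is dropped."""
    seen = set(cached_descriptions)
    descriptions = []
    for txn in transactions:
        text = txn["description"]
        if text not in seen:
            seen.add(text)
            descriptions.append(text)
    return json.dumps({"descriptions": descriptions})


# Hypothetical records; only the "description" field ever leaves this function.
transactions = [
    {"description": "CORNER BAKERY", "amount": "-6.50", "account": "1234"},
    {"description": "CITY TRANSIT", "amount": "-3.25", "account": "1234"},
    {"description": "CORNER BAKERY", "amount": "-7.10", "account": "1234"},
]
payload = build_llm_payload(transactions, cached_descriptions={"CITY TRANSIT"})
```

Because the payload is assembled field by field rather than by deleting sensitive keys, a new column added to the transaction record cannot leak by accident.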

Never shared

  • Bank credentials
  • Full transaction amounts
  • Personally identifiable info
  • Anything you don't upload

Architecture

DATA SOURCES              EXTRACTION
 Scotiabank PDFs ───┐      ┌─ pdfplumber (local)
 Wealthsimple CSVs ─┼──────┤
 Amex CSVs ─────────┘      └─ Claude AI (fallback)
                                  │
                                  ▼
                         CATEGORIZATION
                    ┌─ Tier 1: Description rules (free)
                    ├─ Tier 2: Merchant cache (free)
                    └─ Tier 3: GPT-4o-mini (batched)
                                  │
                                  ▼
                          Supabase (PostgreSQL)
                    ┌──────────────────────────┐
                    │ institutions  accounts   │
                    │ transactions  categories │
                    │ subscriptions receipts   │
                    │ merchant_cache           │
                    │ 9 materialized views     │
                    └──────────────────────────┘