A personal job-search command center: it aggregates listings from Gmail alerts and SERP APIs, proactively scrapes career pages from a curated company watchlist (5-platform ATS coverage: Greenhouse, Lever, Ashby, SmartRecruiters, and Workday, plus a tier-4 AI navigator for custom sites), scores everything with a cascade-routed AI pipeline (free local and free cloud providers, with Anthropic as a paid fallback), and tracks application state. Single-user, runs on localhost.
- Single-tier ordinal scoring through a multi-provider cascade. Every job runs through one scoring tier with a six-axis ordinal rubric. The cascade tries free providers first (Ollama local → Groq → Cerebras → Gemini) and falls through to Anthropic only when all free options are exhausted or rate-limited. The Phase 33 shootout selected `qwen2.5:14b` (Ollama) as the production primary; typical monthly cost is ~$0. Classification (apply | consider | skip | reject) is derived in Python from the numeric sub-scores, never emitted by the LLM, which prevents classification drift across model swaps.
- Schema-versioned SQLite migrations. 48 idempotent migrations applied via `PRAGMA user_version`. Migration 41 introduces a backup-recency preflight that refuses destructive schema changes without a recent userdata snapshot (override via `GSD_BACKUP_CONFIRMED=1` for alternate backup schemes).
- Background scheduler with cross-process safety. APScheduler 3.x with a pidfile + psutil liveness check: survives Flask reloads, single-instance enforced. Auto-starts a local Ollama service for the nightly agentic-backfill tier.
- HTMX-only frontend. No JS framework, no bundler, no build step. Inline expansion, partial fragments, server-driven UI. 36 Jinja2 templates, Tailwind via CDN, SortableJS for the kanban.
- ATS coverage across 5 platforms with a tier-4 AI navigator. Greenhouse, Lever, Ashby, SmartRecruiters, and Workday have explicit scanners; the AI navigator caches Playwright recipes (16 active) for the long-tail of custom-built career sites (iCIMS, Phenom, UKG, bespoke).
- Eval harness with paired MAE + BCa bootstrap 95% CIs for prompt-variant A/B testing across the full provider matrix (Ollama-local, Groq, Cerebras, Gemini, Anthropic).
- Cost-gated execution. Configurable monthly budget cap; the cost-gate returns a bool and lets callers decide whether to fail-open or raise β the orchestrator and the scheduler choose differently and that's intentional.
- 2163 tests (unit + integration + Playwright e2e) green on the CI matrix (Ubuntu + Windows × Python 3.13).
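The cascade order and the Python-derived classification described above can be sketched as follows. The provider order comes from the README; the function names, client interface, and score thresholds are hypothetical illustrations, not the project's actual code:

```python
# Sketch of free-first cascade routing plus Python-side classification.
# Provider order matches the README; everything else here is illustrative.

PROVIDERS = ["ollama", "groq", "cerebras", "gemini", "anthropic"]  # paid fallback last


def score_job(job: dict, clients: dict) -> dict:
    """Try each provider in order; return six-axis ordinal sub-scores."""
    for name in PROVIDERS:
        client = clients.get(name)
        if client is None or client.rate_limited():
            continue  # exhausted or rate-limited: fall through to the next provider
        return client.score(job)  # e.g. {"fit": 4, "seniority": 3, ...}
    raise RuntimeError("all providers exhausted")


def classify(sub_scores: dict) -> str:
    """Derive the label in Python, never from the LLM, so it cannot
    drift across model swaps. Thresholds here are made up for the sketch."""
    total = sum(sub_scores.values())
    if total >= 24:
        return "apply"
    if total >= 18:
        return "consider"
    if total >= 12:
        return "skip"
    return "reject"
```

Because `classify` is deterministic Python, swapping `qwen2.5:14b` for any other model changes the sub-scores it sees but never the label boundaries.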
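The `PRAGMA user_version` migration pattern from the feature list looks roughly like this; the DDL strings are placeholders (the project ships 48 real steps), but the version-gating mechanism is the standard SQLite idiom:

```python
import sqlite3

# Ordered migrations, gated by PRAGMA user_version so re-running is a no-op.
# The DDL here is illustrative, not the project's actual schema.
MIGRATIONS = [
    "CREATE TABLE IF NOT EXISTS jobs (id INTEGER PRIMARY KEY, title TEXT)",
    "ALTER TABLE jobs ADD COLUMN score INTEGER",
]


def migrate(conn: sqlite3.Connection) -> int:
    """Apply any migrations newer than the stored schema version."""
    version = conn.execute("PRAGMA user_version").fetchone()[0]
    for i, ddl in enumerate(MIGRATIONS[version:], start=version + 1):
        conn.execute(ddl)
        conn.execute(f"PRAGMA user_version = {i}")  # record progress per step
        conn.commit()
    return conn.execute("PRAGMA user_version").fetchone()[0]
```

A fresh database starts at `user_version = 0`, so all steps run; on every later call the slice `MIGRATIONS[version:]` is empty and the function returns immediately, which is what makes the scheme idempotent.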
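The eval-harness idea (paired MAE compared across prompt variants, with a bootstrap confidence interval) can be sketched with a plain percentile bootstrap. This is a deliberate simplification: the real harness uses BCa intervals, which additionally correct for bias and skew.

```python
import random


def paired_mae_diff_ci(truth, pred_a, pred_b, n_boot=2000, alpha=0.05, seed=0):
    """95% percentile-bootstrap CI for MAE(A) - MAE(B) on the same jobs.

    Simplified stand-in for the BCa intervals the harness actually uses.
    """
    rng = random.Random(seed)
    pairs = list(zip(truth, pred_a, pred_b))
    diffs = []
    for _ in range(n_boot):
        sample = [rng.choice(pairs) for _ in pairs]  # resample jobs with replacement
        mae_a = sum(abs(t - a) for t, a, _ in sample) / len(sample)
        mae_b = sum(abs(t - b) for t, _, b in sample) / len(sample)
        diffs.append(mae_a - mae_b)
    diffs.sort()
    lo = diffs[int((alpha / 2) * n_boot)]
    hi = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi
```

Resampling *pairs* (rather than each variant independently) is what makes the comparison paired: both variants are always evaluated on the same bootstrap sample of jobs.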
```powershell
git clone https://github.com/Senkichi/job-cannon.git
cd job-cannon
uv sync --extra dev --extra eval

# First run only; the Test-Path guards skip the copy if config.yaml or .env already exist:
if (-not (Test-Path config.yaml)) { Copy-Item config.example.yaml config.yaml }
if (-not (Test-Path .env)) { Copy-Item .env.example .env }

# Add ANTHROPIC_API_KEY to .env (https://console.anthropic.com/settings/keys)
uv run job-cannon
# Open http://localhost:5000
```

For Gmail OAuth setup and full configuration reference, see docs/SETUP.md.
```mermaid
flowchart LR
    Gmail[Gmail Alerts<br/>LinkedIn / Glassdoor / ZipRecruiter] --> Parser
    SerpAPI --> Parser
    JSearch --> Parser
    Thordata --> Parser
    DataForSEO --> Parser
    ATS[ATS Scanners<br/>Greenhouse / Lever / Ashby / Workday / SmartRecruiters] --> Parser
    Parser[Source Parsers<br/>+ Normalize] --> DB[(SQLite + WAL)]
    DB --> Score[Cascade Scoring<br/>six-axis ordinal rubric]
    Score -->|tries in order| Cascade{{Ollama qwen2.5:14b<br/>→ Groq → Cerebras<br/>→ Gemini → Anthropic}}
    Cascade --> Classify[Python-derived<br/>classification]
    Classify --> Dashboard[Flask + HTMX<br/>localhost:5000]
    DB --> Pipeline[Application<br/>Pipeline Tracker]
    Pipeline --> Dashboard
```
For deeper subsystem detail, see docs/architecture/.
| Layer | Tooling |
|---|---|
| Runtime | Python 3.13, Flask 3.1, APScheduler 3.x |
| Storage | SQLite (WAL mode); raw SQL, no ORM |
| Frontend | Jinja2 + jinja2-fragments, HTMX 2.x, Tailwind (CDN), SortableJS |
| AI | Multi-provider cascade: Ollama (qwen2.5:14b primary) → Groq → Cerebras → Gemini → Anthropic SDK (paid fallback) |
| Sources | Gmail API v1 (OAuth), SerpAPI, JSearch, Thordata, DataForSEO |
| Tooling | uv (canonical), ruff, pre-commit, gitleaks, commitizen, pytest |
| CI | GitHub Actions (Ubuntu + Windows matrix), Codecov upload |
```
job_finder/
|-- web/        # Flask app (11 blueprints, scheduler, AI clients, ATS)
|-- parsers/    # Email parsers (LinkedIn, Glassdoor, ZipRecruiter, Indeed stub)
|-- sources/    # Data sources (Gmail, SerpAPI, JSearch, Thordata, DataForSEO)
|-- scoring/    # Single-tier ordinal scoring + six-axis rubric helpers
|-- eval/       # Eval harness + bootstrap CIs
|-- models.py   # Job dataclass with dedup_key
|-- config.py   # YAML config loader + path discovery
|-- __main__.py # `uv run job-cannon` entry point
`-- db/         # SQLite persistence (raw SQL, no ORM); package since S7d (2026-05-06)
tests/          # 2163 tests, unit + integration + e2e
docs/
|-- SETUP.md        # Gmail OAuth, config reference, troubleshooting
`-- architecture/   # Subsystem deep-dives
```
The 11 blueprints: `admin`, `batch_scoring`, `companies`, `costs`, `dashboard`, `detections`, `jobs`, `pipeline`, `profile`, `settings`, `sync`.
The cascade tries free providers first, so typical monthly AI cost is ~$0. Anthropic only enters the picture as a paid fallback when every free provider in the chain is exhausted or rate-limited.
| Provider | Cost | When |
|---|---|---|
| Ollama (qwen2.5:14b local) | $0 | Primary; runs locally |
| Groq / Cerebras / Gemini free tiers | $0 | Each gated by per-day request limits |
| Anthropic (paid fallback) | ~$0.05β0.15 per job scored | Only when all free providers exhausted |
A configurable budget cap (default $25/mo, set in `config.yaml` under `scoring.monthly_budget_usd`) limits the Anthropic fallback if it ever activates. The app stops paid scoring when the cap is reached and resumes the next month.
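The cost-gate contract from the feature list (a bool, with each caller deciding whether to fail open or raise) might look like the sketch below. The class and function names are hypothetical, and which component fails open versus raises is an assumption for illustration; the README only says the two choose differently on purpose:

```python
class CostGate:
    """Answers 'can we afford this call?' and nothing more.

    Returning a bool (instead of raising internally) pushes the
    fail-open vs. fail-loud policy decision out to each caller.
    """

    def __init__(self, monthly_budget_usd: float):
        self.monthly_budget_usd = monthly_budget_usd
        self.spent_usd = 0.0  # the real app would read accrued spend from SQLite

    def allows(self, estimated_cost_usd: float) -> bool:
        return self.spent_usd + estimated_cost_usd <= self.monthly_budget_usd


# One caller fails open: skip the paid provider, keep scoring on free tiers.
def orchestrator_step(gate: CostGate) -> str:
    return "anthropic" if gate.allows(0.10) else "free-tier-only"


# The other raises, so an unattended nightly run surfaces the budget breach.
def scheduler_step(gate: CostGate) -> None:
    if not gate.allows(0.10):
        raise RuntimeError("monthly AI budget exhausted")
```

The asymmetry is the point: an interactive flow degrades gracefully, while a background job fails loudly rather than silently producing unscored batches.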
Optional SERP sources: SerpAPI, JSearch, Thordata, and DataForSEO are all opt-in. Each has its own pricing tier; see `config.example.yaml` for details.
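Putting the two knobs above together, the relevant fragment of `config.yaml` might look roughly like this. It is illustrative only: `scoring.monthly_budget_usd` is named in this README, but the `sources` key names are assumptions; `config.example.yaml` is authoritative.

```yaml
scoring:
  monthly_budget_usd: 25   # paid Anthropic fallback stops at this cap, resumes next month

sources:                   # optional SERP sources, all opt-in (key names assumed)
  serpapi:
    enabled: false
  jsearch:
    enabled: false
  thordata:
    enabled: false
  dataforseo:
    enabled: false
```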
- Developed on Windows 11, tested with Python 3.13.
- macOS / Linux supported (no Windows-only code paths). The repo's `.githooks/` are bash; on Windows use Git Bash.
- SQLite ships with Python: no separate database install.
- No Docker, no cloud services, no deployment required.
```powershell
uv run --active pytest -q --tb=short        # full suite
uv run --active pytest -m "not e2e"         # skip Playwright e2e tier
uv run --active pytest tests/test_db.py -v  # one file
```

Tests use temp SQLite databases and a mocked Anthropic client; no API keys needed for unit / integration. The e2e tier requires `uv run --active playwright install chromium` once.
- Setup guide: Gmail OAuth, config, troubleshooting
- Architecture deep-dive: entry points, scoring, migration strategy, scheduler, concerns
- Contributing: development workflow, commit style, scope check
- Security policy: threat model, reporting

GNU AGPL v3.0 or later; see LICENSE.