This file provides guidance to coding agents (Claude Code, Cursor, etc.) when working in this repository.
```bash
uv sync                                  # install deps (creates .venv)
uv run pytest                            # all tests (uses testcontainers → needs Docker/OrbStack)
uv run pytest --cov                      # with coverage gate (fail_under = 85, branch coverage)
uv run pytest tests/test_parsers.py      # one file
uv run pytest -k test_revert             # one test by keyword
uv run ruff check . && uv run ruff format --check .   # lint + format check
uv run ruff format .                     # auto-format
uv run pyright                           # type-check (strict mode for src/)
RIPTIDE_DB_URL=... uv run alembic upgrade head        # apply migrations
RIPTIDE_DB_URL=... uv run alembic downgrade base      # tear down
podman-compose up                        # local dev: Postgres + migrations + app on :8000
```

If `docker ps` fails, ask the user to start OrbStack.
- **Append-only ingestion.** Every webhook handler does `INSERT … ON CONFLICT (delivery_id) DO NOTHING`. Never `UPDATE` or `DELETE` event rows. Webhook retries must be idempotent. `delivery_id` is the dedup key for each source. (Insert sketch after this list.)
- **Raw payload always stored.** `payload JSONB` keeps the full request body even if fields are extracted into typed columns. Don't drop fields you don't currently use.
- **`riptide.json` is config, not data.** `openshift/collector/riptide.json` is the in-repo sample; it declares teams (name + `group_email`) and org-wide automation rules. Edits go through PRs. The running pod hot-reloads via mtime in `RiptideConfigStore.maybe_reload()` (hot-reload sketch after this list). Do not propose moving the config into Postgres.
- **Per-team bearer keys live in a separate file.** In production it is mounted from the `riptide-collector-team-keys` Secret (never committed); the in-repo `openshift/collector/team-keys.json` is a dev sample with deterministic test hashes (raw dev bearers documented in `compose.yaml`). Stored as sha256 hashes; `TeamKeysStore` hot-reloads it the same way as the config. The bearer is the team identity — every webhook is tagged with `team = caller_team`.
- **No `service` column.** No `service_id` on the wire. Per-source aggregations group by `repo_full_name` / `pipeline_name` / `app_name` / `repo`; cross-source aggregations join on `commit_sha`; org-wide rollups group by `team`. Identifiers are lowercased at ingest (`commit_sha`, `revision`, `repo_full_name`, `branch_name`, `repo`) so SQL joins are case-stable. Do not propose adding a unified `service` column or `service_id` field — it served only single-pane labelling and was dropped.
- **`automation` is org-wide.** Bot definitions live at the config root, not per team.
- **Metrics are computed on read, not at ingest.** Don't add aggregation tables or scheduled rollup jobs in v1. Schema additions should preserve raw events; new metrics are SQL queries against existing rows or future materialized views.
- **Commit SHA is the universal join key.** All three sources record `commit_sha` / `revision`. Lead-time joins between `bitbucket_events`, `pipeline_events`, and `argocd_events` happen on this column. (Example join after this list.)
- **`change_type` lives on Bitbucket events only.** Don't denormalise it onto pipeline / Argo rows; join via `commit_sha` at read time.
- **CI events are source-tagged, not source-routed.** All pipeline events from any CI (Jenkins, Tekton, …) land in the single `pipeline_events` table via `POST /webhooks/pipeline`, distinguished by the `source` column. Do not add per-CI tables or endpoints. The dedup key is `source#pipeline_name#run_id#phase`.
- **Noergler events carry finops + reviewer-precision only.** The `noergler_events` table is `event_type`-discriminated (`completed` | `feedback`) and is fed by `POST /webhooks/noergler` from optional noergler instances. Do not re-emit PR lifecycle from noergler — `bitbucket_events` already covers open / merged / declined. Dedup keys: `completed#<run_id>` and `feedback#<finding_id>#<verdict>`.
- **Senders verify reachability + bearer at startup via `GET /auth/ping`.** Authenticated endpoint returning `{"status":"ok","team":"<caller_team>"}`. Use this from any sender (noergler, future ones) to fail fast on a wrong token. Don't reuse `/health` (unauth liveness) or `/ready` (unauth readiness) for this — those answer different questions. (Startup-check sketch after this list.)
- **`modified_at` has a Postgres trigger (`riptide_set_modified_at`), not just SQLAlchemy `onupdate`.** Raw-SQL updates also bump it. Keep the trigger when changing migrations.
- **Database is external.** `riptide-collector` does NOT manage Postgres. Do not add a Postgres Deployment to `openshift/`.
- **Pyright strict for `src/`, standard for `tests/` and `migrations/`.** New code under `src/` must satisfy strict mode — no `Any` leaks; narrow `Optional`s with `isinstance` or helpers like `_as_dict()` in `routers/bitbucket.py`.
- **Layering.** Routers do HTTP + auth + dispatch only. Source-specific payload extraction lives in `parsers_<source>.py` (e.g. `parsers_bitbucket.py`) as pure functions returning a typed `*EventDraft` dataclass — no HTTP, no DB, no config. The router computes config-derived fields (e.g. `automation_source`) and persists. Don't put extraction logic in routers; don't duplicate JSON-shape coercion (`_as_dict`/`_as_list`-style helpers belong with the extractor that uses them). (Parser sketch after this list.)
- **Single Python package, `riptide_collector`** (flat top-level, not a namespace package). Future suite components (e.g. `riptide-api`, `riptide-dashboard`) get their own top-level package, e.g. `riptide_dashboard` — leave architectural room for them.
- **Webhook routers are factories that return an `APIRouter`.** Bitbucket needs the config for automation detection (`make_router(config, session_factory, auth_dep)`); Pipeline, ArgoCD, and Noergler don't, so they take just `(session_factory, auth_dep)`. They're wired up in `src/riptide_collector/main.py::create_app`. Add the config only when a router actually needs `automation` rules or team metadata. (Factory sketch after this list.)
- **Pydantic schemas: strict for `/webhooks/pipeline` and `/webhooks/argocd`** (we own the contract — invalid payloads must 422); permissive raw-dict parsing for Bitbucket (its payload shapes vary; we best-effort extract).
- **Use `_as_dict()`/`_as_list()` helpers in `routers/bitbucket.py`** to coerce arbitrary JSON shapes — pyright strict won't accept chained `.get()` on `Optional[dict]`. (Helper sketch after this list.)
- **Tests use real Postgres via testcontainers, never SQLite.** The `client` fixture in `tests/conftest.py` depends on `session_factory`, which truncates tables per test.
- **`.pre-commit-config.yaml` runs ruff + pyright + uv-lock-check**; expect CI to enforce the same.
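The sketches below illustrate the invariants above; any name not mentioned in this file is an assumption. First, the append-only dedup insert as a minimal SQLAlchemy 2.x sketch — the table definition here is illustrative, not the real schema:

```python
from sqlalchemy import Column, MetaData, Table, Text
from sqlalchemy.dialects.postgresql import JSONB, insert
from sqlalchemy.orm import Session

metadata = MetaData()

# Illustrative table; the real models live elsewhere in the codebase.
events = Table(
    "bitbucket_events",
    metadata,
    Column("delivery_id", Text, primary_key=True),
    Column("payload", JSONB, nullable=False),
)

def ingest(session: Session, delivery_id: str, payload: dict[str, object]) -> None:
    # A retried webhook delivery hits the same delivery_id and becomes a no-op.
    stmt = (
        insert(events)
        .values(delivery_id=delivery_id, payload=payload)
        .on_conflict_do_nothing(index_elements=["delivery_id"])
    )
    session.execute(stmt)
```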
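The mtime hot-reload used by `RiptideConfigStore` and `TeamKeysStore` plausibly reduces to this pattern (a sketch; the real classes' fields and parsing are not shown here):

```python
import json
from pathlib import Path

class HotReloadingStore:
    """Mtime-based hot reload of a JSON file (sketch, not the real store classes)."""

    def __init__(self, path: str) -> None:
        self._path = Path(path)
        self._mtime = 0.0
        self._data: dict[str, object] = {}

    def maybe_reload(self) -> dict[str, object]:
        # Cheap stat() per call; re-parse only when the file actually changed.
        mtime = self._path.stat().st_mtime
        if mtime != self._mtime:
            self._data = json.loads(self._path.read_text())
            self._mtime = mtime
        return self._data
```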
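A read-time lead-time join on the commit SHA could look like the sketch below; the `created_at` timestamp column and the `revision` join key on the Argo side are assumptions:

```python
from sqlalchemy import text

# Hypothetical lead-time query: commit seen by Bitbucket → synced by Argo CD.
# Lowercasing at ingest keeps this join case-stable.
LEAD_TIME_SQL = text(
    """
    SELECT b.commit_sha,
           MIN(a.created_at) - MIN(b.created_at) AS lead_time
    FROM bitbucket_events b
    JOIN argocd_events a ON a.revision = b.commit_sha
    GROUP BY b.commit_sha
    """
)
```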
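A sender-side startup check against `GET /auth/ping`, sketched with httpx (the client library is an assumption; any HTTP client works):

```python
import httpx

def verify_collector(base_url: str, bearer: str) -> str:
    """Fail fast at sender startup: a wrong URL or token raises immediately."""
    resp = httpx.get(
        f"{base_url}/auth/ping",
        headers={"Authorization": f"Bearer {bearer}"},
        timeout=5.0,
    )
    resp.raise_for_status()  # 401/403 means wrong token; connect errors mean wrong URL
    # Response shape per the bullet above: {"status": "ok", "team": "<caller_team>"}
    return resp.json()["team"]
```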
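The layering rule in miniature: a pure extractor returning a typed draft. The field names and `ExampleEventDraft` class are hypothetical, not the real `*EventDraft` contract:

```python
# parsers_example.py: pure extraction, no HTTP, no DB, no config.
from dataclasses import dataclass

@dataclass(frozen=True)
class ExampleEventDraft:
    """Illustrative draft; the real *EventDraft dataclasses carry each source's fields."""
    delivery_id: str
    commit_sha: str
    payload: dict[str, object]

def parse_example(payload: dict[str, object], delivery_id: str) -> ExampleEventDraft:
    # Pure function: same payload in, same draft out; the router persists it.
    commit = payload.get("commit")
    sha = commit.get("sha") if isinstance(commit, dict) else None
    return ExampleEventDraft(
        delivery_id=delivery_id,
        commit_sha=str(sha or "").lower(),  # identifiers are lowercased at ingest
        payload=payload,
    )
```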
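A router-factory sketch consistent with the signatures above; the handler body is elided and the dependency types are loose stand-ins:

```python
from collections.abc import Callable
from fastapi import APIRouter, Depends, FastAPI

def make_pipeline_router(
    session_factory: Callable[[], object],
    auth_dep: Callable[..., str],
) -> APIRouter:
    """Factories keep each router's dependencies explicit (sketch)."""
    router = APIRouter()

    @router.post("/webhooks/pipeline")
    async def pipeline_webhook(team: str = Depends(auth_dep)) -> dict[str, str]:
        # Strict Pydantic validation, parsers_* extraction, and persistence elided.
        return {"status": "ok", "team": team}

    return router

def create_app(
    config: object,
    session_factory: Callable[[], object],
    auth_dep: Callable[..., str],
) -> FastAPI:
    app = FastAPI()
    # Only Bitbucket's factory (not shown) also takes the config, for automation detection.
    app.include_router(make_pipeline_router(session_factory, auth_dep))
    return app
```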
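And a sketch of what `_as_dict()` / `_as_list()` likely look like, consistent with their described purpose (not a copy of `routers/bitbucket.py`):

```python
from typing import cast

def _as_dict(value: object) -> dict[str, object]:
    # Collapse "maybe a dict" into "always a dict" so chained lookups type-check.
    return cast(dict[str, object], value) if isinstance(value, dict) else {}

def _as_list(value: object) -> list[object]:
    return cast(list[object], value) if isinstance(value, list) else []

# Safe descent into an arbitrary Bitbucket-style payload under pyright strict.
payload: dict[str, object] = {"pullrequest": {"source": {"commit": {"hash": "ABC123"}}}}
commit = _as_dict(_as_dict(_as_dict(payload.get("pullrequest")).get("source")).get("commit"))
sha = str(commit.get("hash", "")).lower()  # "abc123"
```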
`openshift/` is suite-level, structured per-component. The collector lives in `openshift/collector/`. When adding a new component:

- Create `openshift/<component>/` with its own `kustomization.yaml`.
- Add it to the `resources:` list in `openshift/kustomization.yaml`.
- Every container needs explicit `requests` AND `limits` for cpu and memory — no exceptions.
- Use `runAsNonRoot: true` and `readOnlyRootFilesystem: true`; no fixed `runAsUser` (OpenShift assigns a random UID per project).
If asked to add these, push back unless the user is explicit:
- Change failure rate / failed deployment recovery time (DORA's current term, formerly MTTR) — no reliable incident source yet; schema reserves room for rollback-proxy detection
- Backfill workers (forward-only ingestion only)
- Aggregation API or metric endpoints (collector ingests; reads are SQL or future siblings)
- Helm chart (Kustomize is enough for v1)
- Postgres deployment manifests