Skip to content

kurrent-io/kurrent-sqlalchemy

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

kurrent-sqlalchemy

SQLAlchemy dialect and Apache Superset engine spec for KurrentDB's Arrow Flight SQL endpoint.

Why this exists

KurrentDB ships an experimental Arrow Flight SQL surface for querying event streams via SQL. Pointing existing SQLAlchemy-based tools (Apache Superset, Metabase, Hex, etc.) at it doesn't work out of the box because:

  • The upstream flightsql-dbapi driver assumes a fully-implemented Flight SQL server. It does info.endpoints[0].ticket without guards on every metadata RPC. KurrentDB's Flight SQL returns empty endpoints for CommandGetSqlInfo, GetCatalogs, GetDbSchemas, and GetTables, so the driver crashes with IndexError: list index out of range on connection setup.
  • KurrentDB rejects context-free queries like SELECT 1, which SQLAlchemy and Superset use as liveness probes.
  • Apache Superset rewrites outgoing SQL through sqlglot (parsing as PostgreSQL but rendering with no dialect), which turns PostgreSQL JSON operators like data::json->>'field' into Trino-style JSON_EXTRACT_SCALAR(CAST(data AS JSON), '$.field') calls KurrentDB cannot execute.
  • Superset's DB_ENGINE_SPEC_OVERRIDES config setting only replaces existing engine entries β€” it doesn't add new ones β€” so a naive override of a non-built-in engine like KurrentDB is silently dropped.

This package solves all four problems. The dialect handles crashes; the Superset integration handles SQL rewriting, engine binding, and result-size caps.

Status

Beta. Tested against KurrentDB 26.1 RC2 experimental Flight SQL builds and Apache Superset 4.x.

Expect changes as KurrentDB's Flight SQL implementation matures β€” and the workarounds in this package to become unnecessary as that happens.

Installation

pip install kurrent-sqlalchemy

For Apache Superset deployments, include the optional extra:

pip install kurrent-sqlalchemy[superset]

Usage

Plain SQLAlchemy

The dialect auto-registers via entry point on install β€” no explicit import needed.

from sqlalchemy import create_engine, text

engine = create_engine("kurrentdb+flightsql://admin:changeit@localhost:2113")

with engine.connect() as conn:
    result = conn.execute(text(
        "SELECT data::json->>'OrderId' as order_id "
        "FROM kdb.records WHERE category = 'orders' LIMIT 10"
    ))
    for row in result:
        print(row)

KurrentDB Flight SQL connections are TLS-by-default. If KurrentDB uses a self-signed certificate, the client needs to trust it β€” set GRPC_DEFAULT_SSL_ROOTS_FILE_PATH to point at your CA bundle:

export GRPC_DEFAULT_SSL_ROOTS_FILE_PATH=/path/to/ca.crt

Apache Superset

1. Install into your Superset environment

If you're running Superset via the official Docker setup, add to docker/requirements-local.txt:

kurrent-sqlalchemy[superset]

Then restart all Superset services β€” web, worker, beat, init each run pip-install on startup independently:

docker compose stop superset superset-init superset-worker superset-worker-beat
docker compose up -d superset superset-init superset-worker superset-worker-beat

2. Wire it up in superset_config.py

from kurrent_sqlalchemy.superset import (
    KurrentDBEngineSpec,
    install_superset_patches,
)

DB_ENGINE_SPEC_OVERRIDES = {"kurrentdb": KurrentDBEngineSpec}
FLASK_APP_MUTATOR = install_superset_patches

If you already use FLASK_APP_MUTATOR for something else, chain them:

def my_app_mutator(app):
    install_superset_patches(app)
    # ... your existing logic ...

FLASK_APP_MUTATOR = my_app_mutator

3. If KurrentDB is running with TLS, mount the CA into the Superset containers

In your docker-compose.yml:

x-superset-volumes: &superset-volumes
  - ./docker:/app/docker
  - superset_home:/app/superset_home
  - /path/to/your/kurrentdb/certs:/certs:ro

x-superset-environment: &superset-environment
  GRPC_DEFAULT_SSL_ROOTS_FILE_PATH: /certs/ca/ca.crt

Add environment: *superset-environment to each of the four superset services.

4. Add the database in the Superset UI

  • Settings β†’ Database Connections β†’ + Database
  • Supported Databases β†’ Other
  • SQLAlchemy URI: kurrentdb+flightsql://admin:changeit@<host>:2113
    • Use the Docker container hostname or IP if Superset reaches KurrentDB across a Docker network, not localhost.

After clicking Test Connection (or running any SQL Lab query), verify the patches installed correctly by checking the logs:

docker compose logs superset 2>&1 | grep kurrentdb

You should see:

kurrentdb: installed Database.db_engine_spec bind patch
kurrentdb: installed Database.apply_limit_to_sql bypass
kurrentdb: installed SQLScript.format bypass
kurrentdb: all patches installed and verified

If you see kurrentdb: N patch(es) NOT installed, something went wrong β€” check the surrounding logs for the exception.

Configuration

Three environment variables tune behavior:

Variable Default Purpose
KURRENTDB_TEST_QUERY SELECT * FROM kdb.records LIMIT 1 Superset's connection-test query. Override if your KurrentDB doesn't have the kdb.records table populated, or set to a query against a known-stable stream.
KURRENTDB_MAX_ROWS 10000 Maximum rows for any KurrentDB query without an explicit LIMIT. Set to 0 to disable the cap (queries stream until exhausted).
GRPC_DEFAULT_SSL_ROOTS_FILE_PATH (system trust store) Path to a CA PEM file for KurrentDB TLS verification. Required when KurrentDB uses a self-signed certificate.

Schema browser

The dialect reports a fixed static schema until KurrentDB Flight SQL implements its metadata commands. You'll see a single table kdb.records in Superset's schema picker, with the columns KurrentDB Flight SQL exposes for events: log_position, commit_position, stream_revision, created_at, expires_at, stream, stream_hash, schema_name, category, deleted, schema_id, schema_format, record_id, data, metadata.

For typed views over event streams, register virtual datasets in Superset β€” paste the SQL that projects the JSON fields you care about, give it a name, it becomes available in Explore. Example virtual dataset SQL:

SELECT
  stream,
  data::json->>'OrderId' as order_id,
  data::json->>'Amount' as amount,
  created_at
FROM kdb.records
WHERE category = 'orders'

Known limitations

  • Schema is static. The dialect's get_table_names, get_columns, etc. return hand-rolled metadata. If KurrentDB Flight SQL adds support for CommandGetTables, future versions can switch to live reflection.
  • No transactions. Flight SQL has no transaction semantics. The dialect reports AUTOCOMMIT.
  • Server-side errors are opaque. When KurrentDB's Flight SQL handler throws an unhandled exception, gRPC returns Exception was thrown by handler. with no detail. Check KurrentDB's own logs for the real stack trace.
  • Superset's DEFAULT_SQLLAB_LIMIT doesn't apply. Superset's automatic LIMIT injection goes through sqlglot, which this package bypasses. Use KURRENTDB_MAX_ROWS instead.

Caveat about flightsql-dbapi

The upstream influxdata/flightsql-dbapi library was archived in March 2026. It still works and installs from PyPI, but won't receive further updates. A future major version of this package may switch to the Apache Arrow ADBC Flight SQL driver, which is maintained but doesn't currently ship its own SQLAlchemy dialect.

Contributing

Issues and pull requests welcome. Please include the KurrentDB version you're running against β€” the Flight SQL implementation is moving, and what works on RC2 may not work on later builds.

License

Apache 2.0. See LICENSE.

About

A repository of artifacts for using KurrentDB with SQLAlchemy and related toolsets

Resources

License

Contributing

Stars

Watchers

Forks

Packages

 
 
 

Contributors

Languages