Production-grade receipt & invoice data extraction API powered by Vision Language Models with intelligent OCR fallback.
Upload a receipt image → get structured JSON with vendor, date, total, line items, tax breakdown, and expense category.
API: api.ozkanceylan.dev
Demo UI: api.ozkanceylan.dev/demo
# 1. Clone and configure
git clone https://github.com/ozkannceylan/BelegPilot.git
cd BelegPilot
cp .env.example .env
# Edit .env with your OpenRouter API key
# 2. Start services
docker compose -f docker/docker-compose.yml up --build
# 3. Generate API key
docker compose -f docker/docker-compose.yml exec app python scripts/generate_api_key.py --name "dev"
# Save the key that's printed!
# 4. Extract a receipt
curl -X POST http://localhost:8000/api/v1/extract \
  -H "X-API-Key: riq_live_<your-key>" \
  -F "file=@receipt.jpg"

Upload a receipt/invoice image and get structured data back.
Headers: X-API-Key: riq_live_<your-key>
Body: Multipart form with file (JPEG, PNG, or PDF, max 10MB)
Response:
{
  "id": "uuid",
  "status": "success",
  "data": {
    "vendor": "REWE",
    "date": "2026-02-07",
    "total_amount": 47.83,
    "currency": "EUR",
    "tax_amount": 7.63,
    "tax_rate": 19.0,
    "line_items": [...],
    "payment_method": "Visa ****1234",
    "category": "groceries"
  },
  "confidence_score": 0.94,
  "extraction_method": "vlm",
  "processing_time_ms": 3500,
  "cost_usd": 0.002
}

Retrieve a previous extraction result.
Health check (no auth required).
Interactive API documentation (Swagger UI).
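
A client will usually want a typed view of the extraction response rather than a raw dictionary. The following is a minimal sketch using only the standard library. The `ExtractionSummary` type, the `parse_extraction` and `needs_review` helpers, and the 0.8 review threshold are illustrative assumptions, not part of the BelegPilot API; only the field names come from the example response above.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class ExtractionSummary:
    """Subset of the documented response fields (hypothetical client-side type)."""

    id: str
    status: str
    vendor: Optional[str]
    total_amount: Optional[float]
    currency: Optional[str]
    confidence_score: float
    extraction_method: str


def parse_extraction(payload: dict[str, Any]) -> ExtractionSummary:
    """Flatten the nested `data` object into a typed summary."""
    data = payload.get("data") or {}
    return ExtractionSummary(
        id=payload["id"],
        status=payload["status"],
        vendor=data.get("vendor"),
        total_amount=data.get("total_amount"),
        currency=data.get("currency"),
        confidence_score=float(payload.get("confidence_score", 0.0)),
        extraction_method=payload.get("extraction_method", "unknown"),
    )


def needs_review(summary: ExtractionSummary, threshold: float = 0.8) -> bool:
    """Flag results for manual review; the 0.8 threshold is an assumed value."""
    return summary.status != "success" or summary.confidence_score < threshold


# Sample payload copied from the response example in this README.
sample = {
    "id": "uuid",
    "status": "success",
    "data": {"vendor": "REWE", "total_amount": 47.83, "currency": "EUR"},
    "confidence_score": 0.94,
    "extraction_method": "vlm",
}
summary = parse_extraction(sample)
print(summary.vendor, summary.total_amount, needs_review(summary))
```

Missing fields inside `data` become `None` instead of raising, which suits receipts where the model could not read every value.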
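import httpx

# Open the file in a context manager so the handle is closed after the upload.
with open("receipt.jpg", "rb") as receipt:
    response = httpx.post(
        "https://api.ozkanceylan.dev/api/v1/extract",
        headers={"X-API-Key": "riq_live_<your-key>"},
        files={"file": receipt},
        timeout=30.0,  # extraction can take several seconds; httpx defaults to 5 s
    )
response.raise_for_status()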
print(response.json())

BelegPilot API request flow:
- Client (REST API) → FastAPI (ASGI)
  - API key authentication
  - Request validation (Pydantic)
  - Rate limiting
- Image Preprocessor
  - Auto-rotation (EXIF)
  - Contrast enhancement (OpenCV)
  - Noise reduction
  - Resolution optimization
- The preprocessor feeds three components:
  - VLM Extractor (primary path): Qwen2.5-VL-72B, GPT-4o-mini, OpenRouter API, cost tracking
  - OCR Extractor (fallback path): Tesseract OCR, DE/EN languages, regex parsing
  - Validator & Categorizer: field scoring, confidence calculation, auto-categorization, tax validation
- Data Layer
  - PostgreSQL 16 (async via asyncpg)
  - SQLAlchemy 2.0 ORM
  - Result persistence & retrieval
- Observability Stack
  - Phoenix UI: LLM traces via OpenTelemetry
  - Prometheus: metrics at /metrics
  - Structlog: JSON logs with request IDs
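
The primary/fallback split in the architecture above could be orchestrated roughly as follows. This is a sketch under stated assumptions, not the project's actual code: `ExtractionResult`, `extract_with_fallback`, the 0.7 confidence threshold, and the rule of keeping the more confident of the two results are all hypothetical, and the stub extractors stand in for the real VLM and Tesseract calls.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Awaitable, Callable


@dataclass
class ExtractionResult:
    data: dict[str, Any]
    confidence: float
    method: str  # "vlm" or "ocr"


Extractor = Callable[[bytes], Awaitable[ExtractionResult]]


async def extract_with_fallback(
    image: bytes,
    vlm: Extractor,
    ocr: Extractor,
    min_confidence: float = 0.7,  # assumed threshold, not taken from the project
) -> ExtractionResult:
    """Try the VLM first; use OCR when it fails or is not confident enough."""
    try:
        primary = await vlm(image)
    except Exception:
        # The VLM call failed outright (network error, budget cap, bad output).
        return await ocr(image)
    if primary.confidence >= min_confidence:
        return primary
    # Low-confidence VLM answer: run OCR too and keep the more confident result.
    fallback = await ocr(image)
    return max(primary, fallback, key=lambda r: r.confidence)


# Stub extractors used only for this demonstration.
async def fake_vlm_down(image: bytes) -> ExtractionResult:
    raise ConnectionError("OpenRouter unreachable")


async def fake_ocr(image: bytes) -> ExtractionResult:
    return ExtractionResult({"vendor": "REWE"}, 0.55, "ocr")


result = asyncio.run(extract_with_fallback(b"image-bytes", fake_vlm_down, fake_ocr))
print(result.method)
```

Passing the extractors in as callables keeps the orchestration testable without network access or a Tesseract install.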
Infrastructure
- Request path: Cloudflare DNS → Caddy (ozkanceylan.dev edge, auto-TLS) → Docker
- Public entry: api.ozkanceylan.dev
- Reverse proxy: shared ozkanceylan.dev Caddy via `belegpilot-edge` network
- Containers: belegpilot-app + belegpilot-db (Postgres 16)
- CI/CD: GitHub Actions → GHCR → pull-and-restart on VPS
See deploy/README.md for the full production deployment
guide (host prep, env file, Caddy wiring, Cloudflare DNS records).
- Vision Language Model primary path using Qwen2.5-VL-72B via OpenRouter
- Intelligent OCR fallback with Tesseract when VLM fails or returns low confidence
- Automatic model failover from premium to cost-effective models based on budget
- Hard budget limits - configurable daily/monthly caps on OpenRouter API spend
- Real-time cost tracking - per-request cost calculation and logging
- Automatic model downgrade when approaching budget limits
- API key authentication with bcrypt hashing (no plaintext storage)
- Rate limiting per API key
- Input validation with Pydantic v2 schemas
- Non-root container execution
- Distributed tracing instrumented via OpenTelemetry (optional Arize Phoenix backend; OTLP exporter fails silently when no collector is running)
- Prometheus metrics at the `/metrics` endpoint
- Structured JSON logging with request correlation IDs
- LLM call tracing with token counts and latency
- 56+ automated tests with pytest-asyncio
- 74% code coverage with pytest-cov
- Golden dataset evaluation for accuracy regression testing
- Type checking with mypy strict mode
- Linting with Ruff (fast Python linter)
- Multi-stage Docker builds for minimal image size
- Health checks with dependency verification
- Graceful degradation when external services fail
- Async throughout - non-blocking I/O with asyncio
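
The budget-related features in the list above (hard caps plus automatic downgrade near the limit) can be illustrated with a small model-selection function. This is a sketch only: `choose_model` and the 80% downgrade ratio are assumptions made for illustration, while the model identifiers and the 1.0/5.0 USD budgets mirror the defaults documented in the configuration table.

```python
from typing import Optional

PRIMARY_MODEL = "qwen/qwen2.5-vl-72b-instruct"
FALLBACK_MODEL = "openai/gpt-4o-mini"


def choose_model(
    spent_today_usd: float,
    spent_month_usd: float,
    daily_budget_usd: float = 1.0,
    monthly_budget_usd: float = 5.0,
    downgrade_ratio: float = 0.8,  # assumed: downgrade at 80% of either budget
) -> Optional[str]:
    """Return the model to call, or None when a hard budget cap is reached."""
    if daily_budget_usd <= 0 or monthly_budget_usd <= 0:
        raise ValueError("budgets must be positive")
    # The tighter of the two budgets decides which tier applies.
    used = max(
        spent_today_usd / daily_budget_usd,
        spent_month_usd / monthly_budget_usd,
    )
    if used >= 1.0:
        return None  # hard cap: skip the VLM entirely (for example, go to OCR)
    if used >= downgrade_ratio:
        return FALLBACK_MODEL
    return PRIMARY_MODEL


print(choose_model(0.10, 0.50))
print(choose_model(0.85, 1.00))
print(choose_model(0.20, 5.00))
```

Returning `None` at the cap leaves the caller free to decide what happens next, such as routing the request to the OCR path.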
| Technology | Purpose |
|---|---|
| Python 3.12 | Core language with modern type hints |
| FastAPI | High-performance async REST API framework |
| Pydantic v2 | Data validation & serialization |
| Uvicorn | ASGI server (HTTP/1.1 and WebSockets) |
| SQLAlchemy 2.0 | Async ORM with type-safe queries |
| asyncpg | High-performance PostgreSQL async driver |
| Technology | Purpose |
|---|---|
| OpenRouter API | LLM gateway for vision models |
| Qwen2.5-VL-72B | Primary Vision Language Model for extraction |
| GPT-4o-mini | Fallback VLM with cost optimization |
| Tesseract OCR | Open-source OCR engine (German + English) |
| OpenCV | Image preprocessing & enhancement |
| Pillow | Image format handling |
| Technology | Purpose |
|---|---|
| OpenTelemetry | Distributed tracing instrumentation |
| Arize Phoenix | LLM observability & trace visualization |
| Prometheus | Metrics collection & alerting |
| Structlog | Structured JSON logging |
| Technology | Purpose |
|---|---|
| bcrypt | Password & API key hashing |
| python-jose | JWT token handling |
| API Key Auth | Request authentication via X-API-Key header |
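
Storing only a salted hash of each API key means a database leak does not expose usable credentials. The sketch below shows that pattern with the standard library. Note that it swaps in PBKDF2-HMAC-SHA256 as a stand-in for bcrypt so the example runs without extra dependencies; the key layout after the `riq_live_` prefix, the helper names, and the iteration count are assumptions.

```python
import hashlib
import hmac
import secrets

KEY_PREFIX = "riq_live_"
PBKDF2_ITERATIONS = 100_000  # illustrative work factor, not the project's setting


def generate_api_key() -> str:
    """Create a random key; shown to the user once and never stored in plaintext."""
    return KEY_PREFIX + secrets.token_urlsafe(32)


def hash_api_key(api_key: str) -> str:
    """Return 'salt$digest' (both hex) for storage in the database."""
    salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", api_key.encode(), salt, PBKDF2_ITERATIONS)
    return f"{salt.hex()}${digest.hex()}"


def verify_api_key(api_key: str, stored: str) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    salt_hex, digest_hex = stored.split("$", 1)
    candidate = hashlib.pbkdf2_hmac(
        "sha256", api_key.encode(), bytes.fromhex(salt_hex), PBKDF2_ITERATIONS
    )
    return hmac.compare_digest(candidate, bytes.fromhex(digest_hex))


key = generate_api_key()
stored = hash_api_key(key)
print(verify_api_key(key, stored), verify_api_key(key + "x", stored))
```

With bcrypt the salt and work factor are embedded in the hash string itself, so the manual `salt$digest` encoding used here would not be needed.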
| Technology | Purpose |
|---|---|
| Docker | Multi-stage containerization |
| Docker Compose | Multi-service orchestration |
| PostgreSQL 16 | Primary datastore with health checks |
| Caddy | Edge reverse proxy with automatic Let's Encrypt TLS |
| Hetzner Cloud | Production hosting |
| Cloudflare | DNS (DNS-only, TLS terminated at Caddy) |
| Technology | Purpose |
|---|---|
| GitHub Actions | Automated CI/CD pipeline |
| GHCR | GitHub Container Registry for images |
| pytest | Async test framework with fixtures |
| pytest-cov | Code coverage reporting |
| Ruff | Fast Python linter & formatter |
| mypy | Static type checking |
| Technology | Purpose |
|---|---|
| httpx | Async HTTP client (built-in retries cover connection failures only) |
| tenacity | Retry logic with exponential backoff |
| python-multipart | Multipart file upload handling |
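
Retrying with exponential backoff, listed above as tenacity's role, means waiting progressively longer between attempts so a struggling upstream is not hammered. The hand-rolled sketch below illustrates the idea with the standard library only; it is not tenacity's API, and `retry_with_backoff` with its parameters is hypothetical.

```python
import asyncio
import random
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def retry_with_backoff(
    call: Callable[[], Awaitable[T]],
    attempts: int = 3,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> T:
    """Await `call`, retrying on any exception with doubling, capped delays."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return await call()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2**attempt)
            # Up to 10% jitter so concurrent clients do not retry in lockstep.
            await sleep(delay + random.uniform(0, delay / 10))
    raise AssertionError("unreachable")


# Demo: a call that fails twice and then succeeds.
state = {"calls": 0}


async def flaky_call() -> str:
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


print(asyncio.run(retry_with_backoff(flaky_call, base_delay=0.01)))
```

In production code it is usually better to retry only on specific exception types (timeouts, HTTP 429 or 5xx) rather than on every exception as this sketch does.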
This project demonstrates proficiency in:
- Backend Engineering: Async Python, RESTful API design, dependency injection, middleware patterns
- AI/ML Integration: Vision Language Models, prompt engineering, multi-model orchestration, cost optimization
- Database Design: Async ORMs, connection pooling, migration strategies, query optimization
- DevOps & Infrastructure: Docker multi-stage builds, container orchestration, reverse proxies, cloud deployment
- Observability: Distributed tracing, metrics collection, structured logging, LLM monitoring
- Security: Authentication schemes, secret management, input validation, secure defaults
- Testing: Async test patterns, fixtures, mocking external services, coverage analysis
- CI/CD: GitHub Actions workflows, container registries, automated deployments
# Run tests with coverage
docker compose -f docker/docker-compose.yml exec app pytest -v --cov=app --cov-report=term-missing
# Type checking
mypy app/ --ignore-missing-imports
# Linting & formatting
ruff check app/ tests/
ruff format app/ tests/
# Check OpenRouter spend
docker compose -f docker/docker-compose.yml exec app python scripts/check_cost.py
# Generate new API key
docker compose -f docker/docker-compose.yml exec app python scripts/generate_api_key.py --name "my-key"

| Variable | Description | Default |
|---|---|---|
| `OPENROUTER_API_KEY` | OpenRouter API key (required) | - |
| `OPENROUTER_DEFAULT_MODEL` | Primary VLM model | qwen/qwen2.5-vl-72b-instruct |
| `OPENROUTER_FALLBACK_MODEL` | Fallback VLM model | openai/gpt-4o-mini |
| `OPENROUTER_DAILY_BUDGET_USD` | Daily spend limit | 1.0 |
| `OPENROUTER_MONTHLY_BUDGET_USD` | Monthly spend limit | 5.0 |
| `DB_HOST` | PostgreSQL host | localhost |
| `DB_PASSWORD` | Database password | - |
| `ENVIRONMENT` | development / production | development |
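
To show how these variables and their defaults fit together, here is a minimal standard-library loader. It is a sketch, not the project's settings module: the `Settings` dataclass and `load_settings` function are hypothetical, and treating `DB_PASSWORD` as optional is an assumption, since the table gives no default for it and does not mark it as required.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    openrouter_api_key: str
    openrouter_default_model: str
    openrouter_fallback_model: str
    openrouter_daily_budget_usd: float
    openrouter_monthly_budget_usd: float
    db_host: str
    db_password: Optional[str]
    environment: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the documented variables, applying the documented defaults."""
    api_key = env.get("OPENROUTER_API_KEY")
    if not api_key:
        raise RuntimeError("OPENROUTER_API_KEY is required")
    environment = env.get("ENVIRONMENT", "development")
    if environment not in {"development", "production"}:
        raise ValueError(f"unsupported ENVIRONMENT: {environment!r}")
    return Settings(
        openrouter_api_key=api_key,
        openrouter_default_model=env.get(
            "OPENROUTER_DEFAULT_MODEL", "qwen/qwen2.5-vl-72b-instruct"
        ),
        openrouter_fallback_model=env.get(
            "OPENROUTER_FALLBACK_MODEL", "openai/gpt-4o-mini"
        ),
        openrouter_daily_budget_usd=float(env.get("OPENROUTER_DAILY_BUDGET_USD", "1.0")),
        openrouter_monthly_budget_usd=float(env.get("OPENROUTER_MONTHLY_BUDGET_USD", "5.0")),
        db_host=env.get("DB_HOST", "localhost"),
        db_password=env.get("DB_PASSWORD"),  # assumed optional
        environment=environment,
    )


# Pass an explicit mapping so the example does not depend on the real environment.
settings = load_settings({"OPENROUTER_API_KEY": "example-key"})
print(settings.openrouter_default_model, settings.openrouter_daily_budget_usd)
```

Failing fast on a missing API key or an unknown `ENVIRONMENT` value surfaces misconfiguration at startup rather than on the first extraction request.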
The production deployment uses a slim two-container stack (belegpilot-app +
belegpilot-db) that lives behind ozkanceylan.dev's shared Caddy edge. Phoenix
is not deployed in production: the OTLP exporter is wired in code but
fails silently when no collector is reachable, so observability stays optional.
# On the VPS (one-shot, as root)
sudo bash deploy/scripts/setup_server.sh # docker + /data dirs + edge net
# As deploy user
bash deploy/scripts/create_env.sh # generates deploy/.env
$EDITOR deploy/.env # set OPENROUTER_API_KEY
cd deploy && docker compose -f docker-compose.production.yml up -d

See deploy/README.md for the architecture rationale,
Caddy wiring, Cloudflare DNS instructions, and ops runbooks.
MIT