HeadlessX

Self-hosted scraping and search workflows powered by Camoufox

Setup Guide • API Reference • MCP


Overview

HeadlessX is a self-hosted scraping platform with a web dashboard, protected API, queue-backed workflows, and a remote MCP endpoint.

Current live surfaces:

  • Website scraping: scrape, crawl, map, content extraction, screenshots
  • Google SERP
  • Tavily
  • Exa
  • YouTube
  • Queue jobs, logs, API keys, proxy management, and config management
  • Remote MCP over /mcp

What Changed In v2.1.0

  • Simplified the dashboard around one global browser/runtime model
  • Added Tavily, Exa, and YouTube workspaces
  • Added queued crawl and job flows with Redis + worker support
  • Added remote MCP secured with normal dashboard-created API keys
  • Added setup and API guides aligned with the current route tree

Scrapers

Live Now

Scraper        Status
Website        Live
Google SERP    Live
Tavily         Live
Exa            Live
YouTube        Live

Coming Soon

Scraper        Status
Google Maps    Planned
Twitter / X    Planned
LinkedIn       Planned
Instagram      Planned
Amazon         Planned
Facebook       Planned
Reddit         Planned

UI Screenshots

Google SERP

[Screenshot: Google SERP UI]

Website

[Screenshot: Website UI]

Proof

[Screenshot: BrowserScan]

[Screenshot: Cloudflare Challenge]

[Screenshot: Pixelscan]

[Screenshot: Proxy Validation]

Quick Start

Requirements

  • Node.js 22+
  • pnpm 9+
  • PostgreSQL
  • Redis
  • Python/uv for yt-engine
  • Go for the HTML-to-Markdown sidecar
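The requirements above can be checked before installing anything. The following is a minimal sketch, not part of HeadlessX: it assumes the tools are invoked as `node`, `pnpm`, `uv`, and `go`, and that `node --version` and `pnpm --version` print a dotted version number. PostgreSQL and Redis are left out because they may run remotely or in Docker.

```python
import re
import shutil
import subprocess
from typing import Optional

# Minimum major versions from the requirements list. None means the list
# names the tool without stating a version.
REQUIRED_TOOLS = {"node": 22, "pnpm": 9, "uv": None, "go": None}


def parse_major(version_output: str) -> Optional[int]:
    """Extract the major version from output such as 'v22.3.0' or '9.12.1'."""
    match = re.search(r"(\d+)\.\d+", version_output)
    return int(match.group(1)) if match else None


def check_tool(name: str, minimum_major: Optional[int]) -> str:
    """Return a one-line status message for a single command-line tool."""
    path = shutil.which(name)
    if path is None:
        return f"{name}: missing from PATH"
    if minimum_major is None:
        return f"{name}: found at {path}"
    try:
        result = subprocess.run(
            [path, "--version"], capture_output=True, text=True, timeout=15
        )
    except (OSError, subprocess.SubprocessError):
        return f"{name}: found, but the version check failed"
    major = parse_major(result.stdout)
    if major is None:
        return f"{name}: found, version not recognised"
    verdict = "ok" if major >= minimum_major else f"too old, need {minimum_major}+"
    return f"{name}: major version {major} ({verdict})"


if __name__ == "__main__":
    for tool, minimum in REQUIRED_TOOLS.items():
        print(check_tool(tool, minimum))
```

The script only reports what it finds and changes nothing on the machine.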

Recommended Development Setup

Recommended for most developers:

  • PostgreSQL: Supabase or Docker
  • Redis: Docker
  • App runtime: pnpm dev or mise run dev

This keeps the infrastructure hosted or containerized while the app itself runs locally.

Local Development

  1. Clone and install:
git clone https://github.com/saifyxpro/HeadlessX.git
cd HeadlessX
pnpm install
  2. Create the root .env from the full example:
cp .env.example .env

Current root .env.example:

# ==============================================
# HEADLESSX V2.1.0 - LOCAL DEVELOPMENT
# ==============================================

# ------------------------------
# 1. DATABASE
# ------------------------------
DATABASE_URL="postgresql://postgres.xxxxx:YOUR_PASSWORD@aws-0-region.pooler.supabase.com:5432/postgres"

# ------------------------------
# 2. SERVER
# ------------------------------
PORT=8000
HOST=0.0.0.0
NODE_ENV=development

# ------------------------------
# 2A. SECURITY (REQUIRED)
# ------------------------------
# Used by the Next.js dashboard server to authenticate against the API.
DASHBOARD_INTERNAL_API_KEY=replace-with-a-long-random-string

# Used to encrypt stored credentials at rest.
CREDENTIAL_ENCRYPTION_KEY=replace-with-a-different-long-random-string

# ------------------------------
# 3. QUEUE / REDIS
# ------------------------------
# BullMQ uses Redis to persist async scrape, extract, and index jobs.
REDIS_URL=redis://localhost:6379

# Search providers and local engines
TAVILY_API_KEY=
EXA_API_KEY=
YT_ENGINE_URL=http://localhost:8090
YT_ENGINE_PORT=8090
YT_ENGINE_TIMEOUT_MS=45000
YT_ENGINE_TEMP_DIR=./tmp/yt-engine
YT_ENGINE_JOB_TTL_HOURS=12
HTML_TO_MARKDOWN_SERVICE_URL=http://localhost:8081
HTML_TO_MARKDOWN_PORT=8081
HTML_TO_MARKDOWN_TIMEOUT_MS=60000

# Optional queue tuning
BULLMQ_QUEUE_NAME=headlessx-jobs
QUEUE_WORKER_CONCURRENCY=2
QUEUE_JOB_ATTEMPTS=3
QUEUE_JOB_BACKOFF_MS=5000
QUEUE_STREAM_POLL_MS=1000
QUEUE_CONNECTION_RETRY_MS=10000

# Browser and anti-detection settings are managed from the dashboard

# ------------------------------
# 4. FRONTEND (Next.js)
# ------------------------------
WEB_PORT=3000

NEXT_PUBLIC_API_URL=http://localhost:8000
INTERNAL_API_URL=http://localhost:8000

# CORS: Add your frontend URL for custom deployments
FRONTEND_URL=http://localhost:3000

# ------------------------------
# 5. DEFAULT RUNTIME SETTINGS
# ------------------------------
BROWSER_TIMEOUT=60000
MAX_CONCURRENCY=5
STEALTH_MODE=advanced

If you run the services with Docker instead of locally, also copy the complete Docker env example:

cp infra/docker/.env.example infra/docker/.env
  3. Prepare services:
pnpm db:push
pnpm camoufox:fetch
  4. Start the workspace:
pnpm dev

This starts:

  • web
  • api
  • worker
  • HTML-to-Markdown service
  • yt-engine

Important:

  • pnpm dev does not provision PostgreSQL or Redis
  • Website Crawl requires both Redis and the worker

Docker

To run the current Docker stack:

cp infra/docker/.env.example infra/docker/.env
cd infra/docker
docker compose --profile all up --build -d

Important notes:

  • Use --profile all
  • Partial profile runs are not currently reliable because of depends_on relationships between services
  • The core Docker stack does not yet define a yt-engine container, so YouTube may still need to run locally

See docs/setup-guide.md for the full matrix:

  • no-Docker setup
  • mixed local setup
  • full Docker setup
  • MCP client configuration
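After docker compose returns, the API may need a moment before it accepts requests. The sketch below polls the unauthenticated GET /api/health route until it answers. It assumes the API is published at http://localhost:8000, matching PORT=8000 in the local env example; the Docker port mapping may differ. The `probe` and `sleep` parameters exist only to make the loop testable.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, Optional


def probe_health(url: str, timeout: float = 5.0) -> bool:
    """Return True if the URL answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, and non-2xx HTTP errors.
        return False


def wait_until_healthy(
    url: str,
    attempts: int = 30,
    delay_seconds: float = 2.0,
    probe: Callable[[str], bool] = probe_health,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[int]:
    """Poll until the probe succeeds.

    Returns the 1-based attempt number that succeeded, or None if every
    attempt failed.
    """
    for attempt in range(1, attempts + 1):
        if probe(url):
            return attempt
        if attempt < attempts:
            sleep(delay_seconds)
    return None
```

For example, `wait_until_healthy("http://localhost:8000/api/health")` returns the attempt number once the API responds, or `None` after roughly a minute with the defaults.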

API Summary

Every backend route except the health check requires an x-api-key header.

Core backend surfaces:

  • GET /api/health
  • GET/PATCH /api/config
  • GET /api/dashboard/stats
  • GET /api/logs
  • GET/POST/PATCH/DELETE /api/keys
  • proxy CRUD under /api/proxies
  • website routes under /api/website/*
  • Google SERP routes under /api/google-serp/*
  • Tavily routes under /api/tavily/*
  • Exa routes under /api/exa/*
  • YouTube routes under /api/youtube/*
  • queue job routes under /api/jobs/*
  • remote MCP endpoint at /mcp

See the full route reference in docs/api-endpoints.md.
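The header requirement can be illustrated with a small client built on the standard library. This is a sketch under assumptions: the base URL comes from the local env example, the key is a placeholder, and the response body is returned as raw text because this summary does not describe response shapes. `build_request` and `fetch_text` are hypothetical helper names, not part of HeadlessX.

```python
import urllib.request
from typing import Optional


def build_request(
    base_url: str, path: str, api_key: Optional[str] = None
) -> urllib.request.Request:
    """Build a GET request, attaching x-api-key when a key is supplied."""
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    request = urllib.request.Request(url, method="GET")
    if api_key:
        request.add_header("x-api-key", api_key)
    return request


def fetch_text(request: urllib.request.Request, timeout: float = 10.0) -> tuple:
    """Send the request and return (status, body text).

    urllib raises urllib.error.HTTPError for 4xx/5xx responses, for
    example when the key is missing or rejected.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8", errors="replace")
```

With the API running, `fetch_text(build_request("http://localhost:8000", "/api/health"))` needs no key, while `fetch_text(build_request("http://localhost:8000", "/api/config", "hx_your_dashboard_created_key"))` sends the key in the header.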

MCP

HeadlessX exposes a remote MCP endpoint from the API:

http://localhost:8000/mcp

Use a normal API key created from the dashboard API Keys page.

Do not use DASHBOARD_INTERNAL_API_KEY for MCP clients.

Example client config:

{
  "mcpServers": {
    "headlessx": {
      "transport": "http",
      "url": "http://localhost:8000/mcp",
      "headers": {
        "x-api-key": "hx_your_dashboard_created_key"
      }
    }
  }
}
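When several MCP clients or environments need this config, it can be generated rather than edited by hand. The sketch below mirrors the JSON shape shown above and refuses a key that equals the dashboard internal key. `mcp_client_config` is a hypothetical helper, and where a given MCP client expects the file to live is outside its scope.

```python
import json
from typing import Optional


def mcp_client_config(
    api_key: str,
    base_url: str = "http://localhost:8000",
    internal_key: Optional[str] = None,
) -> dict:
    """Return a client config dict matching the example shape.

    internal_key is optional; pass the value of DASHBOARD_INTERNAL_API_KEY
    to guard against using it for MCP by mistake.
    """
    if not api_key:
        raise ValueError("an API key created in the dashboard is required")
    if internal_key and api_key == internal_key:
        raise ValueError("DASHBOARD_INTERNAL_API_KEY must not be used for MCP clients")
    return {
        "mcpServers": {
            "headlessx": {
                "transport": "http",
                "url": base_url.rstrip("/") + "/mcp",
                "headers": {"x-api-key": api_key},
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(mcp_client_config("hx_your_dashboard_created_key"), indent=2))
```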

Monorepo Layout

apps/
  api/                    Express API + worker + MCP
  web/                    Next.js dashboard
  yt-engine/              Python YouTube engine
  go-html-to-md-service/  Go HTML-to-Markdown sidecar
docs/
  setup-guide.md
  api-endpoints.md
infra/docker/

Notes

  • The dashboard uses the internal dashboard key for server-side internal requests
  • MCP uses normal user-created API keys, not the dashboard internal key
  • Queue-backed features report a degraded or unavailable state when Redis is missing
  • Docker support is available for the core stack, but yt-engine still needs separate Docker wiring

Contributing

See CONTRIBUTING.md for the current contribution workflow, local setup expectations, pull request guidance, and commit message conventions.

License

MIT
