🚧 Data2Evidence is beta software. There might be breaking changes. 🚧
Transforming fragmented health data into actionable insights - with speed and precision.
Data2Evidence (D2E) is an open-source research platform that streamlines the entire journey from raw health data to reproducible scientific evidence.
Built for the OHDSI community and powered by the OMOP Common Data Model (CDM), it unifies data ingestion, transformation, quality assessment, and analysis within one consistent, end-to-end environment.
Modern health data is often fragmented across incompatible systems, making research time-consuming and error-prone.
Data2Evidence addresses these challenges directly, enabling researchers, data custodians, and institutions to standardize data, run quality checks, define cohorts, and analyze results quickly and transparently.
- End-to-End Workflows - Transition from raw data to research-ready cohorts, validation, and analytics - all within a single, unified environment.
- Design-Centric Usability - A clean, modern interface minimizes technical overhead and decision fatigue.
- Reproducibility & Trust - Built-in validation, provenance tracking, and consistent workflows ensure reliability.
- Accelerated Onboarding - Lightweight installation and sensible defaults make it easy to get started.
- Extensible Platform - Add custom workflows, ML models, and integrations to fit institutional needs.
- Data Custodians / Administrators - Deliver high-quality OMOP datasets ready for research. Scan, map, transform, and validate data, all in one place.
- Researchers & Clinicians - Explore the data, define cohorts visually, test feasibility in real-time, and run analyses in interactive notebooks.
- Collaborating Institutions - Run federated, privacy-preserving studies without moving sensitive data.
| Category | Description |
|---|---|
| Intuitive Cohort Building | Define, refine, and assess patient cohorts with a no-code interface and real-time feedback. |
| Integrated Analytics & AI | Run R or Python notebooks, assisted by an integrated coding chatbot. |
| Comprehensive Interoperability | Full OMOP CDM and FHIR support, adhering to FAIR data principles. |
| Powerful Dashboards | Monitor dataset quality and utility through dashboards. |
| Data Quality Checks | Perform standardized data quality checks on entire datasets or selected cohorts. |
| Federated & Collaborative Research | Share cohorts and analyses across institutions via Git-based workflows. |
| Flexible Deployment | Containerized architecture supports both on-premise and cloud setups. |
| Extensible Architecture | Add custom ETL steps, ML models, and integrations as research evolves. |
Data2Evidence supercharges the OHDSI ecosystem, combining familiar tools with new orchestration and governance features.
- ETL Pipeline Orchestration - Manage data pipelines with Prefect for transparency and reproducibility.
- Data Storage & User Management - Secure, compliant storage with role-based access control.
- FHIR Integration - Bridge clinical data exchange standards with OMOP.
- Federated Network Studies - Run analyses across sites without moving patient-level data.
- Docker ≥ 24
- Node.js ≥ 18
- Git
- Windows users: WSL2 or Ubuntu recommended
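The prerequisites above can be checked before installing. The following is a minimal sketch rather than part of Data2Evidence: the helper names `major_version`, `check_min_major`, and `check_prerequisites` are made up for illustration, and it assumes Bash with `docker`, `node`, and `git` expected on the `PATH`.

```shell
#!/usr/bin/env bash
# Sketch: check the listed prerequisites (Docker >= 24, Node.js >= 18, Git).
# All function names here are illustrative; they are not part of the d2e CLI.

# Print the major version from a string such as "v18.19.0" or "24.0.7".
major_version() {
  local v="${1#v}"            # drop a leading "v" (Node.js prints e.g. v18.19.0)
  printf '%s\n' "${v%%.*}"    # keep everything before the first dot
}

# Succeed if the major version in $2 is at least $3; $1 is a display name.
check_min_major() {
  local name="$1" version="$2" minimum="$3" major
  major="$(major_version "$version")"
  # A non-numeric or empty major makes the test fail, which lands in "else".
  if [ "$major" -ge "$minimum" ] 2>/dev/null; then
    echo "ok: $name $version"
  else
    echo "too old or unreadable: $name '$version' (need >= $minimum)"
    return 1
  fi
}

# Run all checks and return non-zero if any prerequisite is missing or too old.
check_prerequisites() {
  local failed=0
  if command -v docker >/dev/null 2>&1; then
    check_min_major "Docker" \
      "$(docker version --format '{{.Client.Version}}' 2>/dev/null)" 24 || failed=1
  else
    echo "missing: docker"; failed=1
  fi
  if command -v node >/dev/null 2>&1; then
    check_min_major "Node.js" "$(node --version)" 18 || failed=1
  else
    echo "missing: node"; failed=1
  fi
  if command -v git >/dev/null 2>&1; then
    echo "ok: git found"
  else
    echo "missing: git"; failed=1
  fi
  return "$failed"
}

# Usage (after sourcing this file): check_prerequisites && echo "ready to install"
```

Only the major version is compared, which is enough for the minimums listed here.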
Install the Data2Evidence CLI:

    npm i -g d2e

Create a folder for Data2Evidence:

    mkdir d2e
    cd d2e

Generate a .env file for Data2Evidence with randomly generated secrets and certificates:

    d2e init

Start the Data2Evidence services:

    d2e -e pull
    d2e -e start

Create and load the demo dataset:

    d2e setupdemo

Access via https://localhost:443
Default credentials: admin / Updatepassword12345
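The services may need some time to come up after they are started, so a script that automates the steps above might wait until the address responds before moving on. The sketch below is an assumption-laden illustration, not part of the d2e CLI: `retry_until_ok` and `probe_d2e` are hypothetical helper names, and the `-k` flag assumes the certificate produced by `d2e init` is self-signed.

```shell
#!/usr/bin/env bash
# Sketch: retry a probe command until it succeeds or the attempts run out.
# retry_until_ok and probe_d2e are illustrative helpers, not part of the d2e CLI.

# Usage: retry_until_ok <attempts> <delay-seconds> <command> [args...]
retry_until_ok() {
  local attempts="$1" delay="$2" i
  shift 2
  for ((i = 1; i <= attempts; i++)); do
    if "$@" >/dev/null 2>&1; then
      echo "ready after $i attempt(s)"
      return 0
    fi
    # Pause between attempts, but not after the final one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
  done
  echo "not ready after $attempts attempt(s)"
  return 1
}

# Probe the quick-start address. curl exits 0 as soon as the server returns
# any HTTP response, whatever the status code.
# -k skips certificate verification (assumption: self-signed local certificate).
probe_d2e() {
  curl -k -s -o /dev/null --max-time 5 https://localhost:443
}

# Usage: retry_until_ok 30 10 probe_d2e
```

Passing the probe as a command keeps the retry loop reusable for other checks, such as a specific container's health.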
Full guide: data2evidence.org/docs/getting_started
services/ - Backend microservices (auth, ETL, storage)
ui/ - Frontend web interface
flows/ - Prefect-based orchestration flows
functions/ - Analytical utilities and notebook helpers
- RESTful APIs for integration and automation
- Extensible plugin system for custom modules
- Supports on-premise, hybrid, and cloud deployments
We welcome community contributions!
- Open issues or feature requests on GitHub
- Submit pull requests
- Join our Slack for discussions
If you use Data2Evidence in your research, please cite it using the metadata in CITATION.cff. GitHub provides a "Cite this repository" shortcut in the sidebar that generates APA and BibTeX entries from this file.
Licensed under the Apache 2.0 License.
Developed by Data4Life, supported by the Hasso Plattner Foundation and global research partners.
Let's unlock the potential of global health data - together.
Explore now at data2evidence.org