Pipeline catalog · Lineage · Quality checks

Know every pipeline. Trust every dataset.

DataXPipe turns declarative pipeline specs into runnable artifacts and keeps a live catalog of pipelines, datasets, lineage edges, run history, and quality check results — all through one API.

Start free (2 pipelines included) · Explore features

No credit card required · Generate Airflow DAGs, SQL, and checks from YAML specs

orders_sync pipeline

GET /api/v1/pipelines/orders_sync
GET /api/v1/lineage/orders_raw
POST /api/v1/checks  →  { "status": "pass", "check_id": "row_count" }

Catalog: 12 pipelines · 34 datasets · 89 lineage edges
Last run: success · 3 quality checks passed

1 API

Catalog, runs, checks & lineage

Spec-first

Validate before you generate

SaaS-ready

Multi-tenant orgs & billing

Everything your data platform needs

From spec validation to production runs, DataXPipe connects generation, cataloging, lineage, and quality in one workflow.

Pipeline catalog

Register pipelines, datasets, and connections in a central metadata store. Query specs, owners, schedules, and environment tags from a versioned catalog API.

End-to-end lineage

Capture source-to-target edges as pipelines are registered. Trace upstream and downstream dependencies for any dataset to understand blast radius before you change a transform.
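As a rough illustration of what "blast radius" means here, the sketch below walks a small lineage graph breadth-first. The edge list, dataset names other than orders_raw, and the helper names are hypothetical; this is not DataXPipe's implementation, only one plausible way to trace dependencies from source-to-target edges.

```python
from collections import defaultdict, deque

# Hypothetical lineage edges as (source dataset, target dataset) pairs.
EDGES = [
    ("orders_raw", "orders_clean"),
    ("customers_raw", "orders_clean"),
    ("orders_clean", "orders_daily"),
    ("orders_daily", "revenue_report"),
]


def build_graph(edges):
    """Index edges in both directions so either traversal is a lookup."""
    upstream = defaultdict(set)
    downstream = defaultdict(set)
    for source, target in edges:
        downstream[source].add(target)
        upstream[target].add(source)
    return upstream, downstream


def reachable(start, adjacency):
    """Breadth-first walk returning every dataset reachable from start."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


UPSTREAM, DOWNSTREAM = build_graph(EDGES)

# Datasets affected if orders_raw changes (its blast radius).
print(sorted(reachable("orders_raw", DOWNSTREAM)))
# Datasets that orders_clean depends on.
print(sorted(reachable("orders_clean", UPSTREAM)))
```

Indexing edges in both directions keeps upstream and downstream queries symmetric, so the same traversal answers "what feeds this?" and "what breaks if this changes?".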

Quality checks

Attach SQL checks and runnable check scripts to pipeline runs. Store pass/fail results with row counts and sample rows so stakeholders can verify data health after every execution.
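To make the shape of a check result concrete, here is a minimal sketch of a row-count check run against an in-memory SQLite table. The function name, the min_rows and sample_size parameters, and the result fields beyond status and check_id are assumptions for illustration, not the product's actual schema.

```python
import sqlite3


def run_row_count_check(conn, table, min_rows=1, sample_size=3):
    """Run a simple SQL check and return a pass/fail result record.

    The table name is interpolated into the query, so pass only trusted
    identifiers. The result layout is an assumed example.
    """
    row_count = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
    sample_rows = conn.execute(
        f'SELECT * FROM "{table}" LIMIT ?', (sample_size,)
    ).fetchall()
    return {
        "check_id": "row_count",
        "status": "pass" if row_count >= min_rows else "fail",
        "row_count": row_count,
        "sample_rows": sample_rows,
    }


# Demo data standing in for a pipeline target table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders_raw (id INTEGER, amount REAL)")
conn.execute("CREATE TABLE orders_empty (id INTEGER, amount REAL)")
conn.executemany("INSERT INTO orders_raw VALUES (?, ?)", [(1, 9.5), (2, 20.0)])

result = run_row_count_check(conn, "orders_raw")
empty_result = run_row_count_check(conn, "orders_empty")
print(result["status"], result["row_count"])
print(empty_result["status"], empty_result["row_count"])
```

Keeping the row count and a few sample rows next to the status lets a reviewer see why a check passed or failed without rerunning the query.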

Spec-driven generation

Validate YAML or JSON specs against a JSON Schema, then generate Airflow DAGs, SQL transforms, test scripts, and metadata bundles — ready to deploy.

Run history & observability

Every pipeline run records status, timing, row counts, and linked check results. Prometheus metrics and structured logging integrate with your existing monitoring stack.

Multi-tenant & RBAC

Organizations get isolated API keys, plan-based limits, and role-aware permissions. Platform and admin roles control production deployments and sensitive operations.

How teams use DataXPipe

A repeatable workflow from declarative specs to observable production pipelines.

Define your pipeline spec

Author a YAML or JSON spec with sources, targets, lineage edges, and quality checks. DataXPipe validates it against a JSON Schema before anything runs.
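For a sense of what a spec and its validation could look like, the sketch below writes a spec as a Python dict (the JSON form) and checks it with a small hand-rolled validator. The field names, the edge layout, and validate_spec are all hypothetical, and the validator is a simplified stand-in for real JSON Schema validation, not the schema DataXPipe ships.

```python
# Hypothetical spec: field names are illustrative, not the real schema.
SPEC = {
    "name": "orders_sync",
    "sources": ["orders_raw"],
    "targets": ["orders_clean"],
    "lineage": [{"from": "orders_raw", "to": "orders_clean"}],
    "checks": [{"check_id": "row_count", "dataset": "orders_clean"}],
}

# Required top-level fields and their expected Python types.
REQUIRED_FIELDS = {
    "name": str,
    "sources": list,
    "targets": list,
    "lineage": list,
    "checks": list,
}


def validate_spec(spec):
    """Return a list of error strings; an empty list means the spec is valid."""
    errors = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in spec:
            errors.append(f"missing field: {field}")
        elif not isinstance(spec[field], expected):
            errors.append(f"{field}: expected {expected.__name__}")
    if errors:
        return errors  # skip cross-field checks on a malformed spec

    # Every lineage edge must connect a declared source to a declared target.
    for i, edge in enumerate(spec["lineage"]):
        if edge.get("from") not in spec["sources"]:
            errors.append(f"lineage[{i}]: unknown source {edge.get('from')!r}")
        if edge.get("to") not in spec["targets"]:
            errors.append(f"lineage[{i}]: unknown target {edge.get('to')!r}")
    return errors


print(validate_spec(SPEC))
print(validate_spec({"name": "broken", "sources": "orders_raw"}))
```

Returning a list of errors rather than raising on the first one lets an author fix every problem in a spec in a single pass.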

Generate & deploy artifacts

Get Airflow DAGs, SQL transforms, runnable check scripts, and metadata bundles. Generated DAGs notify the catalog on every run.

Catalog, trace & verify

Register pipelines in the catalog API, query lineage for any dataset, and review check results tied to each run — all from one place.
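The three calls shown in the orders_sync example could be prepared from Python roughly as follows. Only the endpoint paths come from this page; the base URL, the Bearer-token header, and the POST body fields are assumptions, and the sketch builds the requests without sending them.

```python
import json
import urllib.request

# Assumed values: replace with your organization's real host and API key.
BASE_URL = "https://dataxpipe.example.com"
API_KEY = "your-api-key"


def build_request(method, path, body=None):
    """Prepare (but do not send) an authenticated JSON request."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {
        # Assumed auth scheme; check your account's API documentation.
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


pipeline_req = build_request("GET", "/api/v1/pipelines/orders_sync")
lineage_req = build_request("GET", "/api/v1/lineage/orders_raw")
check_req = build_request(
    "POST",
    "/api/v1/checks",
    # Hypothetical payload: the request body format is not documented here.
    body={"pipeline": "orders_sync", "check_id": "row_count"},
)

for req in (pipeline_req, lineage_req, check_req):
    print(req.get_method(), req.full_url)

# To actually send a request against a real deployment:
# with urllib.request.urlopen(pipeline_req) as resp:
#     print(json.load(resp))
```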

Ready to catalog your pipelines?

Start with two pipelines on the Free plan. Upgrade when your team needs more connections, retention, and support.

Create your organization · View pricing