Also known as: grai.io
Open source version control for metadata to track data lineage across databases, pipelines, warehouses, APIs, and dashboards.
Tracking and testing data lineage across databases, pipelines, warehouses, APIs, and dashboards is difficult, making database changes risky and prone to outages.
Open source platform with connectors to map column-level lineage, run centralized tests in CI/CD, and predict impacts of data changes.
Company closed after Y Combinator Summer 2022 batch.
Event Year: 2022
Grai is an open source platform focused on metadata version control and data lineage. It enables teams to map data relationships across various systems, helping developers understand the impact of changes on machine learning models, APIs, and dashboards. The tool integrates with development workflows to make data management more reliable.
Grai simplifies data lineage by providing pre-built connectors for popular data tools, including Snowflake, BigQuery, Redshift, Postgres, MySQL, SQL Server, dbt, and Fivetran, among others. Metadata is synchronized automatically, so it stays current without manual updates. Data validation tests are defined centrally and run when upstream sources change, as part of GitHub-based CI/CD processes.
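To make the idea of tests that react to upstream changes more concrete, the sketch below compares two metadata snapshots and reports which columns were added, removed, or changed type. It is a minimal illustration only: the snapshot format (a mapping from `table.column` to a type name), the function `diff_snapshots`, and the sample data are all hypothetical and are not taken from Grai's schema or API.

```python
from typing import Dict, List

# Hypothetical snapshot format: {"table.column": "data_type"}.
# This is an assumption for illustration, not Grai's metadata schema.
Snapshot = Dict[str, str]


def diff_snapshots(old: Snapshot, new: Snapshot) -> Dict[str, List[str]]:
    """Return the columns that were added, removed, or changed type."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    # Columns present in both snapshots whose declared type differs.
    retyped = sorted(c for c in set(old) & set(new) if old[c] != new[c])
    return {"added": added, "removed": removed, "retyped": retyped}


# Made-up example snapshots taken before and after a schema migration.
old = {"users.id": "integer", "users.email": "text", "orders.total": "numeric"}
new = {"users.id": "bigint", "users.email": "text", "orders.amount": "numeric"}

changes = diff_snapshots(old, new)
for kind, columns in changes.items():
    print(kind, columns)
```

A CI job could run a comparison like this against the latest synchronized metadata and execute only the tests attached to the columns that changed.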
Because Grai is fully open source, it can be self-hosted, giving users control over their data and environments. Components include a backend server built on Django and Postgres, a React-based frontend, CLI tools, schema libraries, graph utilities, GitHub Actions, and integrations. Deployment options range from Docker images to building from source. A demo mode is available via the CLI for quick testing.
Total Raised: unknown (Y Combinator backed)
Last Round: Summer 2022
Open source (self-hosted); previously B2B infrastructure
Data engineers, ML teams, developers managing data pipelines
GitHub release v0.1.76 on Oct 18, 2024.
Hiring: No
Grai embeds into GitHub workflows, running validation tasks during CI/CD. This catches potential data issues early, such as impacts from production system changes on warehouses or dbt projects. Alerts notify teams of risks, promoting safer deployments. The tool supports column-level lineage spanning warehouses and services.
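The impact analysis described here can be pictured as a walk over a column-level lineage graph. The sketch below is a generic breadth-first traversal, not Grai's implementation: the `LINEAGE` mapping, the column names, and the `downstream` function are hypothetical and exist only to illustrate how a change to one production column could be traced to the warehouse, dbt models, dashboards, and ML features that depend on it.

```python
from collections import deque
from typing import Dict, List, Set

# Hypothetical lineage edges: source column -> columns derived from it.
# The names are invented for illustration and do not come from Grai.
LINEAGE: Dict[str, List[str]] = {
    "postgres.users.id": ["warehouse.dim_users.user_id"],
    "warehouse.dim_users.user_id": [
        "dbt.fct_orders.user_id",
        "dashboard.signups.user_count",
    ],
    "dbt.fct_orders.user_id": ["ml.churn_model.features"],
}


def downstream(changed: str, lineage: Dict[str, List[str]]) -> Set[str]:
    """Collect every column reachable from `changed` via lineage edges."""
    seen: Set[str] = set()
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for child in lineage.get(node, []):
            if child not in seen:  # guard against revisiting shared nodes or cycles
                seen.add(child)
                queue.append(child)
    return seen


affected = downstream("postgres.users.id", LINEAGE)
for column in sorted(affected):
    print("potentially affected:", column)
```

In a CI check, a non-empty result for a column touched by a pull request could be surfaced as a warning before the change is merged.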
The GitHub repository shows ongoing activity with releases up to October 2024, including version 0.1.76. A community roadmap invites feedback on features, documentation, bugs, and FAQs. Despite the company's closure, the core project remains accessible and usable in self-hosted form.
Grai addresses challenges in tracking data beyond production environments, whether in transformation pipelines, ML models, or dashboards. By providing visibility into dependencies, it reduces risks from data changes. The platform bridges isolated systems, offering a holistic view of data flows.