Research

Open source as practice.

We learn faster when the work is shared. Here's the research and tooling we've put into the community — and where we're heading next.

Why we contribute

We were built on open source. We give back.

The tools that make modern data engineering possible (Apache Spark, dbt, Delta Lake, Apache Iceberg, Apache Kafka, Apache Airflow, MLflow) were built in the open by communities of contributors. Our work runs on their work.

When we build something during an engagement that solves a problem we know other teams will hit too, our default is to open-source it. Code locked behind a single client's firewall helps one team. Code in the open helps everyone, including us: outside review tends to be stricter than internal review, so what we publish comes back better.

We don't open-source for marketing. We do it because it's how this part of the industry actually works.

How we contribute

Three ways our work moves into the open.

Accelerators in the open

Frameworks we've built across engagements (ingestion templates, transformation libraries, CI/CD scaffolding for data platforms), released as permissively licensed projects. Patterns that aren't client-specific belong in the community.

Reference models and implementations

Trained models, evaluation harnesses, and reference architectures for common ML use cases. Forecasting, classification, RAG patterns. Designed to be forked, adapted, and improved — not consumed as black boxes.

Upstream contributions

Bug fixes, feature pull requests, and documentation improvements to the open-source projects we depend on day to day — Databricks tooling, dbt packages, Apache Spark, MLflow, Delta Lake, and others.

Repositories

What's published so far.

We're consolidating our open-source work into a single GitHub organization. Once the first set of accelerators and reference models is published, every release will be listed here.

Public repositories coming soon.

We're packaging the first wave of our internal accelerators for public release. Each project will ship with documentation, an opinionated example, and a clear license. Drop us a note if you'd like to be told when the org goes live.

Get Notified

Collaborate

Want to build something together?

If you're working on a problem we'd recognize — or one we wouldn't — we'd like to hear about it. Open source gets better when more people contribute.

Get in Touch