Replies: 3 comments
🧱 1. Heavy dependencies. Proposal
The topic of Kedro’s large dependency footprint has a long history of discussion, most recently summarised in #3967. In the recent Tech Design, Simon Brugman clearly demonstrated that the number of mandatory dependencies can become a serious blocker for production deployment in large organisations, eventually limiting Kedro adoption. As highlighted, there is a clear functional split between:
Most heavy dependencies are only required for the latter, not for running pipelines.
🚀 Proposal direction
The proposal is to start a gradual effort towards dependency separation, in three phases:
Step 1 – Make the dependency boundaries explicit (non-breaking)
In the next minor release, update
Step 2 – Discuss swapping defaults before Kedro 2.0
Once the boundaries are defined and documented, consider flipping the default install behaviour:
This would reduce dependency load in production, while remaining easy to use for developers.
Step 3 – Split packages in Kedro 2.0
For Kedro 2.0, explore introducing a formal split, following the model used by other Python ecosystems:
I wanted to share a quick reflection from the conversation around Ordeq. Some of the ideas Simon mentioned (such as promoting a pure Python catalog, adopting a Python-first runner approach, and reducing the API surface by removing the framework layer) are all directions that have existed in Kedro’s past. If you look back at Kedro 0.13 (before it was open-sourced), it actually looked very similar to what Ordeq is proposing now. I’d really encourage you to dig up the internal documentation or code from that version if you can, since it provides a useful reference point. This raises an interesting question: did we make a mistake in moving Kedro toward being more framework-like, rather than staying closer to a lightweight library model? It might be worth revisiting that trade-off, even as a quick prototype, to see what that earlier simplicity could look like today.
I'd like to use this opportunity to cross-link some not-so-old ideas that we had about this 😉 #3659. Sad that I did a terrible job at selling the idea, but happy that the project is determined to take this direction.
This is a summary of the Kedro Tech Design session (22 Oct 2025) led by Simon Brugman.
🧭 Summary
In this Tech Design, @sbrugman, a long-time Kedro contributor and Technical Steering Committee member, presented his experience building ORDEQ, a lightweight pipeline framework.
Simon shared insights from using Kedro in production and explained how ORDEQ re-imagines Kedro’s core ideas around pipelines, datasets, and catalogs with a focus on simplicity, minimal dependencies, and a Python-first developer experience.
The discussion centred on what Kedro can learn from ORDEQ’s design — especially in reducing complexity, improving usability, and making deployment smoother in large enterprise environments.
🏦 Kedro at One Company: From Wide Adoption to Decline
At One Company, Kedro was initially adopted widely across data science and engineering teams — over 10 data scientists and multiple squads used it to structure their machine learning and data pipelines.
It became the de facto standard for reproducible experimentation, thanks to its clear project structure, modular pipelines, and strong integration with Kedro-Viz.
However, over time, adoption declined significantly, particularly on the engineering and production side.
While Kedro worked well for research and prototyping, it became hard to maintain and deploy in One Company’s highly controlled, cloud-based (GCP) environment.
🧩 From Kedro to ORDEQ: Closing the Gaps
To overcome these challenges, One Company decided to build their own lightweight framework — ORDEQ — inspired by Kedro’s core ideas but redesigned for simplicity and production reliability.
ORDEQ keeps Kedro’s strengths (pipelines, modularity, reproducibility) but removes or reimagines the parts that caused friction in day-to-day use.
Below are the key Kedro pain points and how ORDEQ addressed them:
🧱 Heavy dependencies → Zero-dependency core
⚙️ Complex dataset abstraction → Simple Pythonic classes
AbstractDataset and handling filesystem protocols, versioning, and hooks.
load()/save()), usually under 10 lines of code.
📁 YAML configuration → Pure Python catalog
catalog.yml and conf/base/parameters.yml files added indirection and were hard to navigate.
🚀 CLI and Session coupling → Python-first runner
kedro run) or a KedroSession.
🧩 Hard Airflow/GCP integration → Direct orchestration compatibility
🧠 Learning curve → Minimal API surface
📊 Visualization → Retain Kedro-Viz compatibility
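To make the dataset, catalog, and runner points above more concrete, here is a minimal sketch of what a dependency-free, Python-first design could look like. It is not ORDEQ's or Kedro's actual API: the names JsonDataset, Node, and run are hypothetical and chosen only for illustration. The sketch assumes a dataset is nothing more than an object with load() and save(), that the "catalog" is a set of plain Python variables, and that nodes are run in the order they are listed, without a CLI or session object.

```python
import json
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Callable


@dataclass(frozen=True)
class JsonDataset:
    """Hypothetical dataset: a file path plus load()/save(), standard library only."""

    path: Path

    def load(self) -> Any:
        return json.loads(self.path.read_text(encoding="utf-8"))

    def save(self, data: Any) -> None:
        self.path.write_text(json.dumps(data), encoding="utf-8")


@dataclass(frozen=True)
class Node:
    """Hypothetical node: a function wired to input datasets and one output dataset."""

    func: Callable[..., Any]
    inputs: tuple
    output: JsonDataset


def run(nodes: list) -> None:
    """Run nodes in the order given: load inputs, call the function, save the result."""
    for node in nodes:
        args = [dataset.load() for dataset in node.inputs]
        node.output.save(node.func(*args))


def double(values: list) -> list:
    return [value * 2 for value in values]


def total(values: list) -> int:
    return sum(values)


# The "catalog" is just Python variables, so an IDE can navigate and refactor it.
workdir = Path(tempfile.mkdtemp())
raw = JsonDataset(workdir / "raw.json")
doubled = JsonDataset(workdir / "doubled.json")
summed = JsonDataset(workdir / "sum.json")

raw.save([1, 2, 3])
run([Node(double, (raw,), doubled), Node(total, (doubled,), summed)])
result = summed.load()
```

Because run is an ordinary function, the same pipeline could in principle be called from a script, a test, or an orchestrator task. A real implementation would also need dependency resolution between nodes, error handling, and hooks, all of which this sketch leaves out.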