I recently made a presentation at a data conference and reviewed these exact tools.
My bottom line: decouple what you want from your data pipeline, and how you design it, from the choice of tool and framework.
That’s such a powerful takeaway, Romi — totally agree.
Too often people overfit to the tool before clarifying their pipeline’s actual needs.
Would love to hear what frameworks you personally leaned toward after your review!
I explored exactly these three frameworks: Airflow, Dagster and Prefect.
This is exactly why your post jumped out at me straight away! :)
Here's my slideshow where I mentioned them:
https://www.slideshare.net/slideshow/multi-tenant-data-pipeline-orchestration/278940078
My point was: you're probably using or will use one of them, but first understand what you will want in terms of orchestration and observability of your pipeline, and only then check how to implement it in that specific tool.
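To make that a bit more concrete, here is a minimal sketch of what "design first, tool second" could look like in plain Python. Everything in it is hypothetical and for illustration only (`Step`, `PIPELINE`, `run_locally` and the toy extract/transform/load functions are not from the slides or from any of the three tools): the pipeline is described as data plus ordinary functions, and the runner is the only part you would swap for Airflow, Dagster or Prefect.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Step:
    """A tool-agnostic description of one pipeline step (hypothetical)."""
    name: str
    run: Callable[[dict], dict]
    depends_on: List[str] = field(default_factory=list)


# Business logic as plain functions: no orchestrator imports here.
def extract(ctx: dict) -> dict:
    return {**ctx, "rows": [1, 2, 3]}


def transform(ctx: dict) -> dict:
    return {**ctx, "rows": [r * 10 for r in ctx["rows"]]}


def load(ctx: dict) -> dict:
    return {**ctx, "loaded": len(ctx["rows"])}


# The design: which steps exist and what they depend on.
PIPELINE = [
    Step("extract", extract),
    Step("transform", transform, depends_on=["extract"]),
    Step("load", load, depends_on=["transform"]),
]


def run_locally(steps: List[Step], log: List[str]) -> dict:
    """A stand-in runner; a real orchestrator would replace only this part.

    Steps run in list order, and each step's dependencies are checked.
    The log list is a placeholder for whatever observability you need.
    """
    done = set()
    ctx: dict = {}
    for step in steps:
        missing = [d for d in step.depends_on if d not in done]
        if missing:
            raise ValueError(f"{step.name} is missing dependencies: {missing}")
        log.append(f"start {step.name}")
        ctx = step.run(ctx)
        log.append(f"done {step.name}")
        done.add(step.name)
    return ctx


if __name__ == "__main__":
    events: List[str] = []
    result = run_locally(PIPELINE, events)
    print(result)
    print(events)
```

The idea is that the questions "which steps, in what order, and what do I want to see logged" get answered in `PIPELINE` and the log calls, before any framework-specific code is written.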
I love the slides, very informative. Is it okay if I share them on my LinkedIn? Please share your LinkedIn profile so I can tag you for credit.
Thank you, that will be super awesome!
Here's my LinkedIn:
https://www.linkedin.com/in/romik/
Oh, and here's me actually giving the talk with these slides:
https://www.youtube.com/watch?v=rWjt0FziRnU
Junior here, in the process of picking one. In DataTalks' Zoomcamp they work with Kestra. It seemed like an interesting, modern open-source option, but I was wondering what you think of it.
That’s a great question, Francisco!
Kestra is definitely gaining traction — it has a clean UI and modern event-driven architecture.
I haven’t deeply tested it yet, but it feels promising for teams starting fresh.
Curious: What’s your main use case — batch jobs, streaming, or something else?