Most data teams need data orchestration and DAGs. There are many commercial and open-source options available, including Airflow, Luigi, Oozie and many others.
Airflow is very popular at the moment, and rightly so: it is a very useful tool and is the backbone of very productive data teams. Argo is a relatively new challenger. It is a Kubernetes-native workflow engine.
At Canva, we evaluated both Airflow and Argo and chose Argo as our primary data orchestration system. In this talk I’ll briefly compare Airflow and Argo, then describe the evaluation process we undertook and how we came to our decision. Finally, I’ll talk about our experience using Argo so far: the things that have been good and the things that have been not so good.
Greg is the Data Engineering Lead at Canva. He founded the team and quickly grew it to build out scalable systems that enable advanced analytics and machine learning features inside the Canva product.
Previously, he was the co-founder and CTO of AirHelp, a Y Combinator-backed startup, where he built systems to process flight and booking data in real time.
Greg is passionate about technology and is frequently involved with meetups, such as the Sydney Data Engineering meetup.