Flyte is the backbone for large-scale Machine Learning and Data Processing (ETL) pipelines at Lyft. It is used across business-critical applications, including ETA, Pricing, Mapping, and Autonomous. At its core is an open source workflow engine that executes 15M+ containers per month as part of thousands of workflows.

The talk will focus on:
- Architecture of Flyte and its specification language to orchestrate compute and manage data flow across disparate systems like Spark, Flink, TensorFlow, and Hive
- Data Provenance and Lineage in Flyte
- How to leverage Flyte in various parts of machine learning pipelines

The talk will conclude with a demo of a machine learning pipeline built using the open source version of Flyte.

Haytham Abuelfutuh is an Engineering Manager in the Lyft ML Organization, leading the Flyte team. During his tenure at Lyft, Haytham has helped build Flyte from the ground up, built and shipped Kubernetes operators, and investigated and optimized Flyte system performance on Kubernetes. Before Lyft, Haytham gained his expertise in building cloud and massive distributed systems through a 3-year tenure at Google, using Borg and Flume, and a 7-year tenure at Microsoft, working on Office 365 and Azure Storage.