Lineage has long been a requirement for anyone processing data, whether for complying with regulations, ensuring data reliability or, to quote Marvin Gaye, plainly just knowing what’s going on, from provenance to impact analysis. However, our industry has historically had difficulty collecting data lineage reliably. From the early days of lineage powered by spreadsheets, we’ve come a long way towards standardizing lineage. We have evolved from painful, manual approaches to automated operational lineage extraction across batch and stream processing. Now, we’re on the brink of a new era in which lineage will be built into every data processing layer, whether ETL, data warehouse or AI, rather than added as an afterthought. In this panel, OpenLineage project lead Julien Le Dem will ask professionals from the data catalog and data observability space how their experience building products that rely on lineage has evolved over the past 10 years.
Julien Le Dem is the OpenLineage project lead at the LF AI & Data Foundation. He co-created the Parquet file format and is involved in several open source projects including OpenLineage, Marquez, Arrow, Iceberg and a few others. He was the Chief Architect of Astronomer and co-founded Datakin. Previously, he held technical leadership positions at WeWork; Dremio, on the founding team; Twitter, where he also obtained a two-character Twitter handle (@J_); and Yahoo!, where he received his Hadoop initiation. His French accent makes his talks particularly attractive.
Harel Shein is an Engineering Manager II at Datadog, a leading observability and security SaaS platform. He works on data lineage and integrations for Data Observability and is a TSC member and committer of OpenLineage. Prior to working at Datadog, he held product engineering leadership positions at Astronomer and data engineering leadership at WeWork.
Ernie Ostic, Chief Evangelist at Manta Software (manta.io), is a leading figure in metadata integration and lineage. With a four-year tenure at Manta, Ernie has gained deep insights into the evolving challenges of data management, emphasizing its critical role as a business asset. His passion lies in fostering meticulous data curation for organizational success.
Prior to Manta, Ernie spent over two decades at IBM, specializing in data integration within the Unified Governance and Integration team. His expertise spanned information governance, DataStage, and real-time integration on the InfoSphere Information Server platform. With a rich background at IBM and contributions to Information Builders, Ernie is dedicated to helping colleagues and clients extract maximum value from data solutions, offering insights into metadata extracts and enhancing decision-making through effective governance.
Ernie Ostic's wealth of experience and commitment to empowering individuals and organizations make him a sought-after speaker in the dynamic landscape of data management and integration.
Sheeri Cabral is a seasoned database professional with over 20 years of experience in open source databases. With a background in computer science, she has excelled in roles spanning startups to large enterprises, implementing efficient database solutions. Sheeri is a passionate advocate for open source and knowledge sharing, regularly speaking at industry conferences and hosting workshops. She is dedicated to excellence and innovation in the industry and committed to fostering a vibrant and supportive community.