Datasets are one of the fundamental concepts in data work: as data practitioners, we use the word all the time in day-to-day conversation. Many different data tools have independently converged on similar concepts. However, as with many concepts in the modern data stack, the exact meaning, properties, and capabilities of Datasets differ from tool to tool in subtle but important ways.
Concepts like Datasets will define the next generation of data work. The cornerstone of “the modern data stack” is a new set of tools, abstractions, and metadata that map more tightly to the real work that data practitioners need to do. This panel brings together leading tool builders and practitioners in the data community to discuss that evolution.
We’ll start by comparing and contrasting different approaches to Datasets. From there, we’ll branch out into an open discussion of the relative strengths and weaknesses of those approaches, and of the alignment (or lack thereof) between tools and systems. This panel will be useful for data practitioners looking to understand how the field is evolving and how new tools are enabling those changes.
James Campbell is the CTO at Superconductive, the company behind the open source data quality project Great Expectations, which he co-founded in 2017. Prior to that, he worked across a variety of quantitative and qualitative analytic roles in the US intelligence community. He studied Math and Philosophy at Yale, and is passionate about creating tools that help communicate uncertainty and build intuition about complex systems.
Nick Schrock is the founder and CEO of Elementl, the company behind Dagster. Previously, Nick worked at Facebook, where he co-created GraphQL. Nick believes deeply in the power of well-designed developer tools to make engineers more productive, accelerate their careers, make their lives more enjoyable, and transform the organizations in which they work.
Steven Hillion has been leading engineering and analytics teams for twenty years. At Astronomer, his team works with operational data to improve Apache Airflow. Previously, he founded Alpine Data, later acquired by TIBCO, to build a collaborative platform for data science. Before that, he built a global team of data scientists at Pivotal and developed a suite of open-source and enterprise machine learning software. Earlier, he led engineering at a series of start-ups that were acquired or went public. Steven is originally from Guernsey, in the British Isles. He received his Ph.D. in mathematics from the University of California, Berkeley, and before that read mathematics at Oxford University.