Software practitioners work to make their systems reliable, and we hear teams boasting of four or five 9s of uptime. Data systems, however, depend on data that can be out of date or late. Pipelines and automated jobs fail to run. Data sometimes arrives late, changing the outcomes of processing jobs. All of these situations are examples of Data Downtime, and they lead to misleading results and false reporting. As a Data Reliability Engineering (DRE) team, we borrowed tools and practices from Site Reliability Engineering (SRE) to build a better data system.
In this talk, we will explore real-world reliability situations from our data systems and address three major topics that can strengthen any pipeline:
Data Downtime: what Data Downtime is, how it affects your bottom line, and how to minimize it.
Data Service Level Metrics: the metadata to collect for your data pipeline, and how reporting on pipeline transactions can lead to preventative data engineering practices.
Data Monitoring: what to look out for, and how to tell system failures apart from data failures.
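As a rough illustration of the Data Downtime and service level ideas above, the sketch below adds up the time a dataset was unusable and expresses it as an availability fraction, the same way uptime "9s" are quoted for services. It is not taken from the talk: the `Incident` type, the function names, and the freshness threshold are all hypothetical, and it assumes incident windows do not overlap.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    """Hypothetical record of a window in which a dataset was stale, missing, or wrong."""
    start: datetime
    end: datetime


def data_downtime(incidents):
    """Total downtime across incidents (assumes the windows do not overlap)."""
    return sum((i.end - i.start for i in incidents), timedelta())


def availability(incidents, window):
    """Fraction of the reporting window in which the data was usable."""
    return 1 - data_downtime(incidents) / window


def is_stale(last_loaded, now, max_age):
    """Freshness check: True when the newest load is older than the allowed age."""
    return now - last_loaded > max_age
```

Under these assumptions, one two-hour and one one-hour incident in a 30-day window give three hours of Data Downtime, or about 99.58% availability. A check like `is_stale` looks at the data itself rather than at whether the job ran, which is one way to separate a data failure from a system failure.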
Miriah Peterson, a seasoned engineer with 6 years of expertise in Go programming, excels as a Data Reliability Engineer. Her professional journey includes crafting videos, tutorials, and courses, showcasing her mastery in Go and Data Engineering. A dynamic speaker, Miriah has delivered talks on Go, machine learning, and data engineering. As a board member of Forge Foundation Inc. and an organizer of the GoWest Conference, Utah Data Engineering, and Machine Learning Utah meetups, she actively shapes the tech community. Miriah earned her bachelor's degree in physics from Brigham Young University in 2017, laying a strong foundation for her multifaceted contributions to the field.