There are two phases to {ML, LLM, AI} Ops: (1) [Urgent] I need to get my model to production! and (2) [Urgent] I just got a model to production — now what? Most platforms make only one of these easy and ignore the other. This often forces a tough choice: either quickly ship a brittle, opaque model, or navigate a maze of technical jargon and complex infrastructure to deploy a more robust implementation. In this talk, we tear down this dichotomy and present Hamilton, the open-source library we developed at Stitch Fix, where we built the self-service ML platform used by over 100 data scientists and MLEs. We’ll dive into the wide array of challenges we’ve dealt with in getting a model to production and maintaining it, show how Hamilton addresses both phases, and connect and compare it to other tools and trends in the industry.
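To give a flavor of the approach, Hamilton's core idea is to express a pipeline as plain Python functions, where each function's name is an output and its parameter names are its dependencies, so the DAG is derived from signatures rather than declared by hand. The sketch below is a toy re-implementation of that idea using only the standard library — it is illustrative and is not Hamilton's actual API (the `execute` helper and function names here are invented for the example):

```python
import inspect

# Each function's name is a node in the DAG; its parameter
# names declare which upstream nodes (or inputs) it needs.
def spend_mean(spend: list) -> float:
    return sum(spend) / len(spend)

def spend_zero_mean(spend: list, spend_mean: float) -> list:
    return [s - spend_mean for s in spend]

def execute(funcs, inputs, outputs):
    """Compute requested outputs by recursively resolving dependencies."""
    cache = dict(inputs)  # seed with externally provided inputs
    by_name = {f.__name__: f for f in funcs}

    def resolve(name):
        if name in cache:
            return cache[name]
        fn = by_name[name]
        kwargs = {p: resolve(p) for p in inspect.signature(fn).parameters}
        cache[name] = fn(**kwargs)
        return cache[name]

    return {o: resolve(o) for o in outputs}

result = execute(
    [spend_mean, spend_zero_mean],
    inputs={"spend": [10, 20, 30]},
    outputs=["spend_zero_mean"],
)
# result["spend_zero_mean"] → [-10.0, 0.0, 10.0]
```

Because dependencies are just parameter names, each node is unit-testable in isolation and the full pipeline stays readable as it grows — the property the talk argues makes both "ship it" and "maintain it" tractable.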
Elijah has always enjoyed working at the intersection of math and engineering. More recently, he has focused his career on building tools to make data scientists and researchers more productive. At Two Sigma, he built infrastructure to help quantitative researchers efficiently turn ideas into production trading models. At Stitch Fix, he ran the Model Lifecycle team — a team focused on streamlining how data scientists create and ship machine learning models. He is now the CTO at DAGWorks, which aims to solve the problem of building and maintaining complex data pipelines. In his spare time, he enjoys geeking out about fractals, poring over antique maps, and playing jazz piano.