Microsoft’s Experimentation Platform team aims to enable users to make trustworthy decisions in A/B tests. Through case studies, I’ll describe how we use statistical evaluation and simulation frameworks to meet this promise of trustworthiness and to make the right platform decisions for our users’ needs. How should we choose an analysis window and inclusion criteria for a single-step vs. ramp-up policy for controlled feature rollout, and is there one “right” choice? How did we ground the performance of our platform’s variance reduction (VR) estimator and understand the cases where it performs poorly? How did we evaluate the complexity vs. efficacy tradeoffs of ML-assisted VR techniques head-to-head with simpler approaches?
Laura Cosgrove is a data scientist on Microsoft’s Experimentation Platform (ExP). As part of the ExP team, Laura enables product teams across Microsoft to run trustworthy experiments at scale. Before joining ExP, Laura was a Senior Data Scientist at CVS Health, where she designed personalized marketing experiments aiming to reduce medical costs and improve health. Prior to that, Laura received her M.S. in Biostatistics from the Columbia University Mailman School of Public Health. Laura’s research interests include experiment design and causal inference with observational data.