Hand labeling, a fundamental part of human-mediated machine intelligence, is today akin to scribes hand-copying books after Gutenberg. Worse, the process is naive, dangerous, and expensive in light of the ever-growing set of alternatives, which includes semi-supervised learning, weak supervision, and active learning.
The significant issues with hand labeling include the introduction of bias (hand labels are neither interpretable nor explainable), prohibitive costs (both financial and in subject matter experts' time), and the fact that there is no such thing as gold labels (even the most well-known hand-labeled datasets have label error rates of at least 5%).
We will explore how hand labeling is negatively impacting ML solutions in production today, survey the alternatives, and provide a framework for deciding when to turn to automation and when to rely on manual annotation.
Shayan Mohanty is the CEO and Co-Founder of Watchful, a company that largely automates the process of creating labeled training data. He has spent a decade leading data engineering teams at various companies, including Facebook, where he led the stream processing team responsible for processing 100% of the ads metrics data for all FB products. He is also a Guest Scientist at Los Alamos National Laboratory and has given talks on topics ranging from Automata Theory to Machine Teaching.