How do you take a platform designed for large-scale storage of unstructured key-value data and optimize it for the structured world of Spark? In this talk we'll look at the real-world lessons learned integrating Riak, the distributed key-value NoSQL database, with Spark, covering both the challenges of integrating these tools and the solutions. We'll also dive into more advanced topics we encountered while creating the open source Spark-Riak connector including:
John Musser is VP of Engineering for Basho Technologies, creators of the NoSQL database Riak. John is a recognized industry expert, having founded ProgrammableWeb, the leading online API resource for developers, as well as the DevOps service API Science. He is often quoted in the media, including the Wall Street Journal, New York Times, Forbes, and Wired, and speaks at conferences including OSCON, QCon, SXSW, Dreamforce, and Web 2.0. He also consults on API and big data strategy with clients including Google, Microsoft, AT&T, and Salesforce. He has taught at Columbia University and the University of Washington.