The DataOps Manifesto

Abstract: The list of failed big data projects is long. They leave end-users, data analysts, and data scientists frustrated with long lead times for changes. This presentation will illustrate how to make changes to big data, models, and visualizations quickly, with high quality, using the tools analytic teams love. We synthesize DevOps, Deming, and direct experience into the DataOps Manifesto.

To paraphrase an old saying: “It takes a village to get insights from data.” Data analysts, data scientists, and data engineers already work in teams to deliver insight and analysis, but how do you get the team to support experimentation and insight delivery without ending in failure? Christopher Bergh presents the seven shocking steps to get these groups of people working together. The steps are practical and doable, and they can help you achieve data agility.

After looking at trends in analytics and a brief review of Agile, Christopher outlines the steps to apply DevOps techniques from software development to create an Agile analytics operations environment, including how to add tests, modularize and containerize, do branching and merging, use multiple environments, parameterize your process, use simple storage, and use multiple workflows to deploy to production with W. Edwards Deming efficiency. Christopher also explains why “don’t be a hero” should be the motto of analytic teams, emphasizing that while being a hero can feel good, it is not the path to success for individuals in analytic teams.
Christopher’s goal is to teach analytic teams how to deliver business value quickly and with high quality. He illustrates how to apply Agile processes to your department. However, a process alone is not enough. Walking through the seven shocking steps will demonstrate how to create a technical environment that truly enables speed and quality by supporting DataOps.

Bio: Gil Benghiat is one of three founders of DataKitchen, a company on a mission to enable analytic teams to deliver value quickly and with high quality using the tools they love. Gil's career has always been data-oriented: collecting and displaying network data at AT&T Bell Laboratories, managing data at Sybase, collecting and cleaning clinical trial data at PhaseForward, integrating pharmaceutical sales data at LeapFrogRx, and liberating data at Solid Oak Consulting. Gil has an MS in Computer Science from Stanford University and a BS in Applied Mathematics/Biology from Brown University. He has hiked all 48 of New Hampshire's 4,000-foot peaks and is now working on the New England 67.