Causal Inference for Data Scientists

Abstract: Causal inference is an increasingly necessary skill set for data scientists and analysts. It is no longer enough to predict what happens under a given set of environmental conditions; internal business partners need to know how the decisions they make influence outcomes. For example, marketers need to know not only that spending more money drives more revenue, but also how much revenue they can expect at various levels of marketing spend. Understanding the causal relationship between spend and revenue lets decision makers optimize more accurately and quickly around crucial business goals such as ROI targets or revenue maximization. At DraftKings, we are always thinking about how to draw accurate conclusions from all of our tests. Our efforts include applying modern techniques as well as exploring new ideas and methods to improve our ability to learn.

Managers often assume that causal inference is a simple exercise for data scientists. Unfortunately, causal inference is not as simple as running A/B experiments. The purpose of this talk is to establish that causal inference is as much a philosophical exercise as it is a data exercise. Developing expertise in causal inference requires a deep understanding of the accepted framework, the ability to recognize when data violates the assumptions of that framework, and fluency with the tools and techniques that address many of the significant challenges in estimating unbiased effects of treatments on critical outcomes.

This session serves as an introduction to the practice of causal inference. We start with an overview of the Rubin Causal Model (RCM), the leading framework for establishing causality. Once attendees are comfortable with the philosophy, we explore how the commonly used A/B testing framework maps to the more robust RCM from both a mathematical and a philosophical perspective. In the final portion of the talk, we discuss several techniques developed by researchers that can be used to establish causality when an A/B test has been compromised or when a test is not feasible to implement. Throughout the talk, we use a general set of challenges faced by businesses to illustrate when issues arise and how these techniques mitigate them.
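
As a rough illustration of how an A/B test maps onto the potential-outcomes view of the RCM, the following minimal sketch simulates a population in which every unit has two potential outcomes, only one of which is ever observed. Everything in it is an assumption made for illustration and is not taken from the talk: the simulated data, the constant treatment effect of 1.5, and the helper names `simulate_units` and `difference_in_means`. It compares a randomized assignment with a hypothetical "compromised" assignment in which units with a high baseline outcome end up treated.

```python
import random
from statistics import mean

def simulate_units(n=20000, seed=0):
    """Return (y0, y1) pairs: the potential outcomes without and with treatment."""
    rng = random.Random(seed)
    units = []
    for _ in range(n):
        y0 = rng.gauss(10.0, 2.0)  # outcome if the unit is NOT treated
        y1 = y0 + 1.5              # outcome if treated (assumed constant effect)
        units.append((y0, y1))
    return units

def difference_in_means(units, assignment):
    """Naive estimate: mean observed outcome of treated minus that of control."""
    treated, control = [], []
    for (y0, y1), is_treated in zip(units, assignment):
        # Only one potential outcome is observed for each unit.
        if is_treated:
            treated.append(y1)
        else:
            control.append(y0)
    return mean(treated) - mean(control)

units = simulate_units()

# The true average treatment effect is only knowable because this is a simulation.
true_ate = mean(y1 - y0 for y0, y1 in units)

# Randomized assignment: treatment is independent of the potential outcomes.
rng = random.Random(1)
randomized = [rng.random() < 0.5 for _ in units]
randomized_estimate = difference_in_means(units, randomized)

# Compromised assignment: units with a high baseline outcome get treated.
self_selected = [y0 > 10.0 for y0, _ in units]
biased_estimate = difference_in_means(units, self_selected)

print(f"true ATE:            {true_ate:.2f}")
print(f"randomized estimate: {randomized_estimate:.2f}")
print(f"compromised estimate: {biased_estimate:.2f}")
```

Under these assumptions, the randomized difference in means should land close to the true effect, while the compromised assignment overstates it because treated units started from a higher baseline. That gap is the kind of bias the techniques in the final portion of the talk aim to address.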

Bio: Coming soon