While Causal Inference has been a well-developed research field for over two decades, its adoption in industry remains highly limited. A sound understanding of cause and effect is fundamental to decision making, particularly in today’s multi-faceted and increasingly complex environments. According to a survey conducted by Maastricht University & Copenhagen Business School in 2021, even though companies already live and breathe Big Data, causal methods are not yet widely adopted. One major roadblock is the lack of grounding in the theoretical frameworks and of the practical skills needed to apply them effectively. This education gap affects not only analysts and data scientists: management, too, has yet to realize the importance of causal data science methods for improving the business. This calls for more effort to bridge the gap and make knowledge of Causal Inference accessible to a broader public.
A Reason To Begin
Causal Inference provides powerful techniques for evaluating the impact of strategic actions and investigating the causal factors that influence various areas of the business. Data practitioners have long been familiar with the adage "correlation is not causation", but find it difficult to go beyond correlation and draw causal conclusions from their analyses. Without the tools and skills to dig deeper into the data goldmine, we data scientists must instead resort to statistical techniques that barely scratch the surface.
Why This Book
This handbook is intended for analysts and data scientists who have no background in Causal Inference but wish to learn how to integrate causal methods into their data analysis pipelines. This beginner's guide aims to give a general picture of how Causal Inference is applied in practice while equipping you with a minimally sufficient foundation to get started. It should therefore be treated only as a pocket tutorial and by no means as a replacement for formal training.
There are indeed plenty of good resources on Causal Inference, but very few of them are friendly to non-academics. I myself struggled to get through the materials in the beginning despite a strong background in statistics. In this book, Causal Inference is approached from a hands-on perspective. The discussion of each topic is restricted to the essential or closely related concepts that aid your understanding. Fortunately, we now have software tools and libraries that automate the implementation. I therefore focus on introducing the key ideas to build intuition and on highlighting the connections among frameworks, while leaving the technical details to the software.
Since this guide is not comprehensive, I also provide a handy list of materials to enrich your knowledge. You can always follow the references therein for further reading.
Contents
The book contains 4 parts:
- Part I: Fundamentals covers fundamental concepts of Causal Inference under the paradigm of probabilistic graphical models. This part clarifies what we mean by cause and effect and offers various methods for making causal claims from non-experimental data.
- Part II: Experimentation discusses the importance of experimentation in determining causal effects, along with several caveats in designing and conducting experiments that may affect the reliability of your causal results. Here, Causal Inference is explored under the Potential Outcome framework.
- Part III: Applications provides case studies on how tech firms such as LinkedIn, Apple, Facebook and Uber adopt Causal Inference for forecasting and product development. It also presents more complicated scenarios where causal methods are difficult to apply.
- Part IV: Tools introduces a number of tools and libraries that support the implementation of Causal Inference in practice. It includes coding tutorials and a comprehensive comparison of these tools.
Prerequisite
Basic knowledge of Statistics, Probability and Python is highly recommended. A bit of patience also helps on your first encounter with Causal Inference. If you struggle to understand something, please remember that this is very natural when acquiring new knowledge. I am happy to provide private tutoring if you need a refresher on any of these topics. Feel free to drop me an email at causalguide[at]gmail.com.
Resources
The following books have been particularly useful to me. They are formal readings that will solidify your understanding. I rank them from the easiest to the most difficult for beginners.
- Introduction to Causal Inference (Neal 2020)
- The Book of Why (Pearl & Mackenzie 2018)
- Causal Inference: What If (Hernán & Robins 2020)
- Elements of Causal Inference (Peters, Janzing, Schölkopf 2017)
- Causal Inference in Statistics: A Primer (Pearl, Glymour, Jewell 2016)
- Causality: Models, Reasoning and Inference (Pearl 2009)
About the author
Vy Vo graduated from Monash University with a Master of Data Science and is currently a research associate at the university. Her research interests relate to Causal Inference, Natural Language Processing and AI Reasoning. Learn more about Vy Vo here.