Causality-Driven Decision-Making: Bridging Data and Optimization

Over the past decade, data-driven decision-making (DDDM) has emerged as a fruitful approach for guiding business strategies and operations. While DDDM highlights the important role of data, it often overlooks causality in shaping decisions. Causality-driven decision-making (CDDM) bridges this gap by integrating causal reasoning into the decision-making process, ensuring that decisions are not only informed by data but also grounded in cause-and-effect relationships. This approach provides a robust decision framework for translating empirical insights into actionable solutions that guide optimal business decisions.

From Empirical Analysis to Optimal Decisions

In many cases, CDDM begins with empirical analysis, as it is essential to know whether a specific factor causally affects the outcomes of interest before taking any follow-up action. Such analysis is largely descriptive and can uncover important insights from the data; however, it may not always directly indicate the optimal course of action. For example, in the first chapter of my dissertation [1], we first identify the average treatment effects of faster delivery-speed promises on both short- and long-term outcomes using a difference-in-differences (DID) approach. The findings reveal that while faster delivery promises boost immediate spending, they come at the cost of reduced customer retention. Given the significant impact of delivery-speed promises and the inherent tension between short- and long-term objectives, how to strategically set this promise becomes an important decision.

To address this trade-off, we develop a customer lifetime model that uses estimates from the empirical analysis as inputs to optimize long-term profitability. By utilizing a structural model tailored to our business problem, we can translate those important descriptive insights into prescriptive recommendations that guide day-to-day operations.

The decision framework outlined above can be summarized as a two-step process: first identifying causal parameters of interest (e.g., the average treatment effect), and then feeding them into an optimization problem. I refer to this approach as identify-then-optimize (ITO) modeling. It mirrors the popular predict-then-optimize (PTO) approach in DDDM, where a statistical or machine learning model predicts some unknown parameters that are subsequently used in optimization. In ITO, the word "identify" highlights the crucial role of causal inference in determining the key parameters that guide decision-making.
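The two-step structure of ITO can be sketched on simulated data. The snippet below is a minimal illustration, not the dissertation's actual model: a simple 2x2 difference-in-differences comparison identifies the ATE (step one), which then feeds a hypothetical adopt-or-not decision with an assumed per-unit cost (step two).

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Step 1: identify the causal parameter (ATE) via a 2x2 DID ---
# Simulated panel: treated and control groups observed pre/post a policy change.
n = 5000
treated = rng.integers(0, 2, n)           # group indicator
post = rng.integers(0, 2, n)              # period indicator
true_ate = 2.0
y = (1.0 + 0.5 * treated + 0.8 * post     # group and time effects
     + true_ate * treated * post          # effect on the treated-post cell
     + rng.normal(0, 1, n))

# DID estimator: difference of before/after differences across groups
ate_hat = ((y[(treated == 1) & (post == 1)].mean()
            - y[(treated == 1) & (post == 0)].mean())
           - (y[(treated == 0) & (post == 1)].mean()
              - y[(treated == 0) & (post == 0)].mean()))

# --- Step 2: optimize, treating the identified ATE as an input ---
# Hypothetical decision rule: adopt the intervention only if the causal
# lift exceeds an assumed per-unit cost.
unit_cost = 1.5
decision = "adopt" if ate_hat > unit_cost else "do not adopt"
print(f"ATE estimate: {ate_hat:.2f} -> {decision}")
```

In a real ITO application, step two would be a richer structural model (as in [1]); the point here is only the hand-off of an identified causal parameter into a downstream optimization.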

From Machine Learning to Causal Machine Learning

PTO has gained popularity largely because it effectively leverages machine learning's accuracy and scalability, making optimization a seamless next step for turning predictions into actionable policies. In contrast, implementing ITO presents more challenges for several reasons: (1) establishing causality is inherently complex and requires more sophisticated identification conditions; (2) most causal inference methods are not designed for complex data and typically estimate only low-dimensional causal parameters, such as average treatment effects (ATE), limiting their applicability for subsequent optimization; and (3) optimization based on these low-dimensional parameters often demands more domain expertise to build highly customized structural models, as demonstrated in [1].

The recent emergence of causal machine learning (causal ML) offers a promising solution to some of these challenges by merging the strengths of causal inference and machine learning. Like machine learning, causal ML surpasses traditional causal methods (such as linear regression) by imposing fewer parametric assumptions and scaling more effectively, thus improving the accuracy of treatment effect estimation. For example, instead of being limited to the ATE, causal ML can efficiently estimate more complex, higher-dimensional targets such as individual treatment effects (ITE), which opens the door to personalization and other fine-grained optimization strategies. Meanwhile, like traditional causal inference, causal ML still requires rigorous identification conditions to ensure reliable results. Yet this restriction should be viewed as a strength rather than a weakness, as it is a necessary price to pay for credible causal insights. Causal ML brings these issues to the forefront, making the process rigorous and transparent.
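As a concrete, purely illustrative example of moving from the ATE to ITE with off-the-shelf tools, the sketch below uses a T-learner — separate outcome models for treated and control units, with ITE taken as the difference of their predictions — on simulated randomized data. The papers cited here do not necessarily use this estimator, and `GradientBoostingRegressor` is just one convenient choice of base learner.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(1)

# Simulated randomized data with heterogeneous effects: the treatment
# helps units with X[:, 0] > 0 and does nothing for the rest.
n, d = 4000, 3
X = rng.normal(size=(n, d))
T = rng.integers(0, 2, n)
tau_true = 2.0 * (X[:, 0] > 0)            # individual treatment effect
y = X @ np.array([1.0, -0.5, 0.2]) + tau_true * T + rng.normal(0, 0.5, n)

# T-learner: fit separate outcome models on treated and control samples,
# then take the difference of their predictions as the ITE estimate.
m1 = GradientBoostingRegressor().fit(X[T == 1], y[T == 1])
m0 = GradientBoostingRegressor().fit(X[T == 0], y[T == 0])
ite_hat = m1.predict(X) - m0.predict(X)

# Averaging the ITE estimates recovers an ATE-level summary.
print(f"mean estimated ITE: {ite_hat.mean():.2f}")
```

The per-unit `ite_hat` values, rather than the single averaged number, are what enable the fine-grained targeting discussed next.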

The second chapter of my dissertation [2] illustrates the use of causal ML to drive optimal decisions. We study the optimal targeting strategy for last-mile home delivery. After quantifying the causal impact of home delivery on Alibaba’s sales and revenue using a staggered DID identification strategy, we estimate ITE using causal ML. This shift from ATE to ITE enables us to develop a more granular optimization strategy—targeting the most responsive customers for home delivery while adhering to capacity constraints. This problem, framed as a large-scale knapsack optimization, provides an actionable plan for the operations team. Furthermore, we extend this optimization framework to address fairness concerns by adding a fairness constraint, ensuring that the targeting policy does not discriminate against certain demographic groups.
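A simplified version of such a capacity-constrained targeting problem can be sketched with a greedy knapsack heuristic. Everything below — the estimated ITEs, per-customer costs, group labels, and the minimum-share form of the fairness constraint — is simulated and illustrative; the paper's actual formulation and solver may differ.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical inputs: estimated ITEs, per-customer delivery costs, and a
# binary demographic group label, all simulated for illustration.
n = 1000
ite = rng.normal(1.0, 1.0, n)             # estimated lift from home delivery
cost = rng.uniform(0.5, 2.0, n)           # capacity consumed per customer
group = rng.integers(0, 2, n)
capacity = 300.0

def greedy_target(ite, cost, capacity, eligible):
    """Greedy knapsack relaxation: serve the highest lift-per-cost
    customers among `eligible` until capacity runs out."""
    chosen, used = [], 0.0
    for i in np.argsort(-ite / cost):
        if eligible[i] and ite[i] > 0 and used + cost[i] <= capacity:
            chosen.append(i)
            used += cost[i]
    return np.array(chosen, dtype=int)

# Unconstrained targeting under the capacity limit
sel = greedy_target(ite, cost, capacity, np.ones(n, dtype=bool))

# Fairness variant (one illustrative formulation): reserve a minimum
# share of capacity for each group, then fill the remainder greedily.
min_share = 0.3
sel_fair = []
for g in (0, 1):
    sel_fair.extend(greedy_target(ite, cost, min_share * capacity, group == g))
remaining = capacity - cost[sel_fair].sum()
mask = np.ones(n, dtype=bool)
mask[sel_fair] = False                    # exclude already-selected customers
sel_fair = np.concatenate([sel_fair, greedy_target(ite, cost, remaining, mask)])
```

The greedy lift-per-cost ordering is the classic relaxation of the knapsack problem and scales to millions of customers; an exact integer program could replace it when optimality matters more than speed.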

CDDM Beyond ITO

The term CDDM encompasses a wide range of business applications that explicitly consider causality in decision-making. While ITO represents a fruitful approach, CDDM extends beyond it. In some cases, separating identification and optimization can be inefficient or even infeasible. For example, the policy learning approach directly learns the optimal policy from data without explicitly estimating treatment effects, thereby improving learning efficiency—similar to the end-to-end approach in PTO, where prediction and optimization are unified under a single loss function.
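To make the contrast with ITO concrete, here is a toy policy-learning sketch: it scores candidate threshold policies with an inverse-propensity-weighted (IPW) value estimate and picks the best one, never estimating treatment effects as an intermediate step. The simulated data, the known propensity of 0.5, and the one-dimensional threshold policy class are all illustrative assumptions, not the method of any particular paper.

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulated randomized experiment (propensity 0.5): the treatment helps
# when x > 0 and hurts otherwise. We search over threshold policies
# "treat if x > c" by maximizing an IPW estimate of the policy's value.
n = 5000
x = rng.normal(size=n)
t = rng.integers(0, 2, n)
tau = np.where(x > 0, 1.0, -0.5)          # true effect, unknown to the learner
y = tau * t + rng.normal(0, 0.5, n)

def ipw_value(c):
    pi = (x > c).astype(int)              # candidate policy's action per unit
    # Keep units whose realized action matches the policy; reweight by the
    # (known) probability 0.5 of observing that action.
    return np.mean((t == pi) * y / 0.5)

grid = np.linspace(-2, 2, 41)
best_c = grid[np.argmax([ipw_value(c) for c in grid])]
print(f"learned threshold: {best_c:.2f}")
```

Note that no ITE or ATE is ever computed: the loss function directly ranks policies, which is the sense in which identification and optimization are unified.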

Another example where such separation is infeasible comes from the final chapter of my dissertation [3], where we examine the impact of a sequence of personalized treatments on long-term user engagement in online gaming. This problem can be viewed as a dynamic extension of the customer targeting problem in [2], though it is considerably more complicated. First, estimating causal effects in the presence of dynamic confounding poses unique challenges that require specialized techniques from the literature on dynamic treatment regimes. Second, we need to repeatedly estimate causal parameters of interest over time, and this process is intertwined with policy optimization: in dynamic settings, the effectiveness of current policies depends on future policies, making the identification of the optimal policy part of a dynamic programming problem.

Since resources are often limited in real-world applications, we further extend the framework to incorporate budget constraints. This extension is again nontrivial, as the budget alters future policies, which in turn affect today’s estimations and decisions. Our dynamic, causality-driven decision framework naturally accounts for these subtleties, prescribing optimal, state-dependent dynamic policies. Interestingly, the dynamic and forward-looking nature of this problem blurs the boundary between causal identification and policy optimization.
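The optimization side of such a budget-constrained dynamic problem can be illustrated with a small backward-induction sketch. Here the per-period causal lifts are taken as known constants, whereas in our setting they would themselves be (re-)estimated and entangled with the policy; the horizon, budget, and numbers below are all made up.

```python
import numpy as np

# Illustrative finite-horizon dynamic program: each period we decide
# whether to treat, treatment consumes one unit of budget, and the
# (assumed known) per-period causal lift declines over time.
T = 5                     # horizon
B = 3                     # total budget (units of treatment)
lift = np.array([1.0, 0.9, 0.7, 0.4, 0.2])   # hypothetical period lifts
baseline = 0.1            # per-period value without treatment

# V[t, b]: optimal value from period t onward with b budget units left.
V = np.zeros((T + 1, B + 1))
policy = np.zeros((T, B + 1), dtype=int)
for t in range(T - 1, -1, -1):
    for b in range(B + 1):
        no_treat = baseline + V[t + 1, b]
        treat = (baseline + lift[t] + V[t + 1, b - 1]) if b > 0 else -np.inf
        policy[t, b] = int(treat > no_treat)      # state-dependent decision
        V[t, b] = max(no_treat, treat)

# With declining lifts, the optimal plan front-loads the budget.
print("optimal value:", V[0, B])   # 5 * baseline + top-3 lifts = 3.1
```

The key structural feature carries over to the real problem: the value of treating today depends on `V[t + 1, ·]`, i.e., on what future policies will do with the remaining budget.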

The Role of Structural Models in CDDM

Structural models may serve different roles in CDDM. In [1], we develop a customer lifetime value model to guide optimal decisions. In this case, the model facilitates optimization but does not establish causality; causality remains data-based rather than model-based. In many cases, however, structural modeling implies a model-based approach to causality, where causal mechanisms are explicitly specified in mathematical form. This mechanistic approach stands in sharp contrast to the data-based empirical approach, which treats the underlying causal processes as a black box. The distinction has fueled a long-standing debate between reduced-form (i.e., causal inference) and structural models in fields like economics. Each approach has its own strengths and weaknesses, and the choice between them should depend on the specific problem and available data.

My research also develops structural models to address causality, evaluate counterfactual policies, and make optimal decisions [4,5,6]. In these studies, since model parameters are learned from data, either through calibration or structural estimation, the resulting decisions should be considered data-driven. As a result, they also belong to CDDM in a broad sense, as the decision-making is explicitly grounded in causality via structural models.

Research Agenda

Depending on the questions at hand and the data available, I apply a wide range of methodologies, including causal inference [1,2,3,4,5,7,8], machine learning [2,3,4], and structural modeling [4,5,6]. In terms of topics, my work spans platform operations (e.g., online retail [1,2], logistics and supply chain [2], digital entertainment [3], and food delivery [4]), firm productivity [5,6], and financial markets [7,8]. By drawing insights from data, my research aims to improve operational efficiency and business revenue at the micro level, while also improving market efficiency and economic productivity at the macro level. All of my studies aim to inform decision-making by explicitly considering causality, either through structural models where causal mechanisms are specified mathematically, or through empirical strategies where causal mechanisms are not explicitly specified but are inferred from data. Therefore, they all fall under the broad category of CDDM.

Looking ahead, my research will continue to advance CDDM, with a particular focus on the integration of causal inference, machine learning, and optimization, as well as their applications to platform operations. Below are several promising research directions I plan to explore:

  1. Network Effects and Interference: Exploring how network effects between units (e.g., customers, products) impact decision-making and how causal ML can address these complexities.

  2. Distributional Impact: Moving beyond average treatment effects to examine the distributional impacts of business strategies.

  3. Generative Modeling: Using generative models to simulate counterfactual scenarios in uncertain or incomplete data environments.

  4. Text Data and Large Language Models (LLMs): Leveraging unstructured data and LLMs to refine causal insights in business decisions.

While these methodological advancements are central to my research agenda, I am equally committed to tackling real-world business challenges using a diverse range of approaches. No single method can solve all problems. I will continue to explore and adopt innovative methods to address interesting and important research questions.

References

[1] Ruomeng Cui, Zhikun Lu, Tianshu Sun, and Joseph M. Golden. Sooner or later? Promising delivery speed in online retail. Manufacturing & Service Operations Management, 26(1):233–251, 2024.

[2] Zhikun Lu, Ruomeng Cui, Tianshu Sun, and Lixia Wu. The value of last-mile delivery in online retail. Available at SSRN 4590356, 2023.

[3] Zhikun Lu, Ruomeng Cui, and Yang Su. Incentives in online gaming: Optimal policy design with dynamic causal machine learning. Working Paper, 2024.

[4] Zhikun Lu, Ruomeng Cui, and Wenchang Zhang. Food delivery platform expansion strategies: A structural approach. Working Paper, 2024.

[5] Kaiji Chen, Yuxuan Huang, Xuewen Liu, Zhikun Lu, and Yong Wang. Preferential credit policy with sectoral markup heterogeneity. Available at SSRN 4847111, 2024.

[6] Xi Li, Xuewen Liu, Zhikun Lu, and Yong Wang. A model of China's economic vertical structure. Available at SSRN 4925056, 2024.

[7] Caroline Fohlin and Zhikun Lu. How contagious was the Panic of 1907? New evidence from trust company stocks. AEA Papers and Proceedings, 111:514–519, May 2021.

[8] Caroline Fohlin, Zhikun Lu, and Nan Zhou. Short sale bans may improve market quality during crises: New evidence from the 2020 COVID crash. Available at SSRN 4187052, 2022.