Whether we're analyzing transactional records, user behavior, or operational metrics, the ultimate value of data science lies in knowing which levers to pull to encourage favorable outcomes and reduce unfavorable ones.
Take transactional data: it tells us who bought what, when, and where. But the real power of transactional analysis comes from understanding why transactions happen—or fail to happen. It’s one thing to predict that a customer will churn based on recent activity; it’s another to know what can be done to prevent it.
Today’s most advanced AI architectures, trained on vast amounts of business data, deliver extraordinary predictive power and pattern recognition. But to realize the full promise of data science, we must equip AI with something greater: the ability to explain data.
A tale of two model types
At modell.ai, we distinguish between two fundamentally different modeling approaches: data models and causal models. Both rely on data, but their capabilities diverge.
Data models—such as transformer-based deep learning systems—detect and exploit correlations within the data [1]: they predict one part of the data from another. These models are powerful, but they remain blind to the processes that generate those correlations.
Causal models, on the other hand, combine data with explicit explanations about the mechanisms driving observed behavior. This distinction is critical for decision-making because it allows us to separate causation from mere correlation. For instance, if you want to know how much customer churn will decrease by reducing your app downtime, you need a causal model that describes how customer dissatisfaction follows from app downtime.
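To make the distinction concrete, here is a minimal sketch (not a modell.ai implementation, and with every name and parameter invented for illustration): a hand-rolled structural causal model in which heavy app usage acts as a confounder, driving both a customer’s exposure to downtime and their loyalty. We compare the correlational answer a data model would give with the interventional answer a causal model licenses.

```python
# Toy structural causal model: heavy usage (confounder) -> downtime, churn.
# All parameters are illustrative assumptions, not fitted values.
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Structural equations.
heavy_user = rng.random(n) < 0.4                      # confounder U
p_downtime = np.where(heavy_user, 0.6, 0.2)           # U raises downtime exposure
downtime = rng.random(n) < p_downtime
p_churn = 0.10 + 0.15 * downtime - 0.08 * heavy_user  # U lowers churn; downtime raises it
churn = rng.random(n) < p_churn

# Correlational (data-model) view: condition on the downtime we happened to observe.
obs_effect = churn[downtime].mean() - churn[~downtime].mean()

# Causal view: intervene, i.e. set downtime for everyone while holding U fixed.
churn_do1 = rng.random(n) < (0.10 + 0.15 - 0.08 * heavy_user)  # do(downtime = 1)
churn_do0 = rng.random(n) < (0.10 - 0.08 * heavy_user)         # do(downtime = 0)
do_effect = churn_do1.mean() - churn_do0.mean()

print(f"observed churn gap:    {obs_effect:.3f}")  # biased by the confounder
print(f"interventional effect: {do_effect:.3f}")   # ~0.15, the actual lever
```

Because heavy users both experience more downtime and churn less, the naive churn gap (about 0.12 in this simulation) understates the true interventional effect (0.15). A model that only learns correlations cannot tell the two apart; one that encodes the structural equations can.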
Data models remain the dominant paradigm in data science. But while they excel at pattern recognition, they offer limited insight into the levers that decision-makers can act on [2].
The leap forward
We believe the future of data science lies in bridging the pattern-finding power of deep learning with the structural clarity of causal modeling.
Consider the problem of understanding, not just predicting, repeated purchases of a consumer good. A purchase could result from low inventory, but it might also stem from a temporary price cut during a promotion. Which explanation fits a given case? The answer matters for deciding whether to keep running promotions or assessing a customer’s price sensitivity. Answering such questions requires more than historical patterns; it demands integrating realistic information on consumption rates, sales events, and other external influences.
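A minimal sketch of what such an integration might look like, with every probability invented for illustration: a small generative model in which a purchase can be triggered by a near-empty pantry or by a promotional price cut, and Bayes’ rule weighs the two explanations for an observed purchase.

```python
# Toy generative model of a repeat purchase. All probabilities are
# hypothetical assumptions chosen only to illustrate the inference.

P_LOW_INVENTORY = 0.3          # prior: the pantry is nearly empty this week
# Purchase probability given (low_inventory, promotion):
P_BUY = {
    (True,  True):  0.90,
    (True,  False): 0.70,
    (False, True):  0.25,
    (False, False): 0.05,
}

def posterior_low_inventory(promotion: bool) -> float:
    """P(low inventory | purchase, promotion) via Bayes' rule."""
    joint_low  = P_LOW_INVENTORY * P_BUY[(True, promotion)]
    joint_high = (1 - P_LOW_INVENTORY) * P_BUY[(False, promotion)]
    return joint_low / (joint_low + joint_high)

# The same purchase supports different explanations in different contexts:
print(f"P(low inventory | bought, promo):    {posterior_low_inventory(True):.2f}")
print(f"P(low inventory | bought, no promo): {posterior_low_inventory(False):.2f}")
```

In this toy model, a purchase made during a promotion leaves the genuine-need explanation at roughly 0.61, while the same purchase at full price pushes it to roughly 0.86: classic explaining-away. That difference is exactly what decision-makers need when judging whether promotions build demand or merely discount purchases that would have happened anyway.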
Such inferences are only possible if a model, like a human, establishes a causal framework before analyzing the data. This integration can transform data science from a passive observer of past behavior into an active, decision-ready partner—one that explains, simulates, and helps shape the outcomes that matter most.
[1] See the profiling opportunity for an example of a data model.
[2] In practice, real-world knowledge often enters the modeling process informally—through manual fixes and ad hoc adjustments layered onto data models. But this puts the cart before the horse. As Simpson’s paradox reminds us (see here and here), causal structure must come before statistical analysis if we aim to measure causal impact accurately. Without it, even the most advanced data model can lead us astray.
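A toy illustration of the paradox, with invented numbers: a retention discount is targeted mostly at at-risk customers, so in the pooled data discounted customers churn more, even though within every segment the discount lowers churn.

```python
# Simpson's paradox in a churn setting. Numbers are fabricated for
# illustration: the discount is targeted at at-risk customers.

#                                (customers, churned)
data = {
    ("at_risk", "discount"):    (800, 320),   # 40% churn
    ("at_risk", "no_discount"): (200, 100),   # 50% churn
    ("stable",  "discount"):    (200,  20),   # 10% churn
    ("stable",  "no_discount"): (800, 120),   # 15% churn
}

def churn_rate(rows):
    customers = sum(c for c, _ in rows)
    churned = sum(ch for _, ch in rows)
    return churned / customers

# Within each segment, the discount lowers churn:
for segment in ("at_risk", "stable"):
    for offer in ("discount", "no_discount"):
        c, ch = data[(segment, offer)]
        print(f"{segment:8s} {offer:12s} churn = {ch / c:.0%}")

# Pooled across segments, it appears to raise churn:
pooled_disc = churn_rate([data[(s, "discount")] for s in ("at_risk", "stable")])
pooled_none = churn_rate([data[(s, "no_discount")] for s in ("at_risk", "stable")])
print(f"pooled   discount     churn = {pooled_disc:.0%}")   # 34%: looks harmful
print(f"pooled   no_discount  churn = {pooled_none:.0%}")   # 22%: looks safer
```

Pooled, the discount looks harmful (34% vs. 22% churn); stratified by the segment that drives the targeting, it helps in both groups (40% vs. 50%, and 10% vs. 15%). Only the causal structure tells us which comparison answers the decision-maker’s question.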
For the same reason, efforts to make deep learning models more interpretable won’t necessarily make them more actionable. Transparency alone doesn’t surface the causal mechanisms required for decision-making when the structure it exposes reflects only correlations within the data.
Through focused partnerships and consulting, modell.ai can help engineer the next generation of causal, decision-ready data science.
We are particularly interested in exploring this opportunity with large transactional datasets. Let's talk!