Netlift modeling workflow
Access this AI accelerator on GitHub
Uplift modeling, also referred to as "netlift" modeling, is an approach often used in marketing to isolate the impact of a campaign on specific prospective customers’ propensity to purchase. The underlying example in this DataRobot AI Accelerator is exactly that, but more generally the approach can be used to isolate the impact of any “intervention” on the propensity for any positive response. The key challenge in uplift modeling is isolating the effect of the campaign, because no individual can be observed both receiving and not receiving the campaign. The accelerator addresses this challenge and also covers other tips and tricks for uplift modeling.
In many cases, the historical strategy for determining who received a campaign targeted those already likely to purchase the product (or, more generally, to produce a favorable response). Data gathered under such a strategy suggests a simple trend: receiving the campaign increases the likelihood to purchase. However, many other customer features may be confounding the isolated impact of the campaign. In fact, it's possible that a campaign that targeted already high-probability buyers actually reduced their probability of purchase. These are the so-called "sleeping dogs" in marketing lingo. From an ROI standpoint, increasing the probability to purchase for one group of prospects from 25% to 50% is just as valuable as increasing that probability for another group from 50% to 75% (assuming the groups are roughly the same size, with the same expected revenue values). So the question to ask of machine learning models is this: for which prospective customers will the campaign increase the probability of purchase by the greatest amount?
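The ROI comparison above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name, the group size of 1,000, and the revenue of 100 per purchase are assumptions made for this example, not values from the accelerator.

```python
def expected_incremental_revenue(n_prospects, p_without, p_with, revenue_per_purchase):
    """Expected extra revenue from sending the campaign to every prospect in a group."""
    uplift = p_with - p_without  # change in purchase probability attributed to the campaign
    return n_prospects * uplift * revenue_per_purchase


# Illustrative numbers only: group size and revenue per purchase are assumptions.
group_a = expected_incremental_revenue(1000, 0.25, 0.50, 100.0)
group_b = expected_incremental_revenue(1000, 0.50, 0.75, 100.0)
sleeping_dogs = expected_incremental_revenue(1000, 0.80, 0.70, 100.0)

print(group_a == group_b)  # both groups gain 25 percentage points
print(sleeping_dogs < 0)   # the campaign lowers expected revenue for this group
```

The baseline probability drops out of the calculation entirely; only the difference between the two probabilities matters, which is why uplift rather than raw propensity is the quantity to rank on.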
This accelerator uses a generic dataset where the favorable outcome is binary: whether or not a product was purchased. The "treatment", or campaign, is simple: a single campaign type that was sent randomly to some prospective buyers. The accelerator also discusses how these methods can be extended to the common case where there was selection bias in who received the campaign. Machine learning is used to find patterns in the types of people for whom the campaign is most effective, controlling for their baseline likelihood to purchase if they don't see a campaign. Uplift use cases require some additional post-processing to extract and evaluate the "uplift score", so this use case is an ideal candidate for the DataRobot programmatic API, which integrates powerful machine learning with a typical coding pipeline.
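One common way to do this post-processing with a single binary classifier is to score each prospect twice, once with the treatment flag set and once with it cleared, and take the difference. The sketch below assumes that approach; the notebook's exact method may differ. The function `predict_purchase_probability`, its coefficients, and the column names `prior_purchases` and `received_campaign` are hypothetical stand-ins for a trained model and its features.

```python
import math


def predict_purchase_probability(row):
    """Hypothetical stand-in for a trained classifier (coefficients are invented)."""
    z = (
        -1.0
        + 0.8 * row["prior_purchases"]
        + 0.5 * row["received_campaign"]
        # The interaction term lets the campaign effect vary by customer.
        - 0.3 * row["prior_purchases"] * row["received_campaign"]
    )
    return 1.0 / (1.0 + math.exp(-z))


def uplift_score(row, treatment_col="received_campaign"):
    """Score the same prospect as treated and as untreated, then subtract."""
    as_treated = {**row, treatment_col: 1}
    as_control = {**row, treatment_col: 0}
    return predict_purchase_probability(as_treated) - predict_purchase_probability(as_control)


new_prospect = {"prior_purchases": 0, "received_campaign": 0}
loyal_customer = {"prior_purchases": 3, "received_campaign": 0}

print(uplift_score(new_prospect) > 0)    # campaign helps under these made-up coefficients
print(uplift_score(loyal_customer) < 0)  # a "sleeping dog" under these made-up coefficients
```

Because the treatment flag is overwritten on a copy of the row, the prospect's actual campaign history does not affect the score, and the input record is left unchanged.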
Working through the provided Jupyter Notebook reinforces the following concepts and strategies:
- Data formatting tricks to extract the most from your uplift models.
- How to leverage DataRobot's API to integrate powerful machine learning into your code-first pipelines.
- How to extract uplift scores from a single, binary classification model.
- How to evaluate and understand those uplift scores, and their implied ROI.
- Considerations for cases where your historical training data exhibits selection bias, that is, where the campaign was not sent randomly.
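As a rough illustration of how uplift scores might be evaluated, the sketch below ranks prospects by score, splits them into buckets, and compares the treated and control purchase rates within each bucket. This is an assumed, simplified evaluation rather than the notebook's own code; it is only meaningful when the campaign was sent randomly, and the records are synthetic toy data with hypothetical field names.

```python
import math


def observed_uplift_by_bucket(records, n_buckets=2):
    """Treated minus control purchase rate per bucket, highest scores first."""
    if not records:
        return []
    ranked = sorted(records, key=lambda r: r["uplift_score"], reverse=True)
    size = math.ceil(len(ranked) / n_buckets)
    results = []
    for start in range(0, len(ranked), size):
        bucket = ranked[start:start + size]
        treated = [r["purchased"] for r in bucket if r["treated"]]
        control = [r["purchased"] for r in bucket if not r["treated"]]
        if not treated or not control:
            results.append(None)  # cannot compare without both groups present
            continue
        results.append(sum(treated) / len(treated) - sum(control) / len(control))
    return results


# Synthetic toy data, invented for illustration only.
records = [
    {"uplift_score": 0.30, "treated": 1, "purchased": 1},
    {"uplift_score": 0.25, "treated": 0, "purchased": 0},
    {"uplift_score": 0.20, "treated": 1, "purchased": 1},
    {"uplift_score": 0.15, "treated": 0, "purchased": 1},
    {"uplift_score": 0.05, "treated": 1, "purchased": 0},
    {"uplift_score": 0.00, "treated": 0, "purchased": 1},
    {"uplift_score": -0.05, "treated": 1, "purchased": 1},
    {"uplift_score": -0.10, "treated": 0, "purchased": 0},
]

print(observed_uplift_by_bucket(records, n_buckets=2))  # [0.5, 0.0]
```

If the scores are informative, observed uplift should fall from the first bucket to the last. Multiplying each bucket's observed uplift by its size and the expected revenue per purchase gives a rough estimate of the implied return from targeting that bucket.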