## Give it a Lift: Uplift Modeling

- 03/09/2015

**Published In**

- Big Data
- Analytics
- Business Intelligence

Uplift modeling, also known as incremental modeling, true lift modeling, or net-lift modeling, is a predictive modeling technique that directly models the incremental impact of a treatment (such as a direct marketing action) on an individual’s behavior.

Uplift modeling has applications in customer relationship management for up-sell, cross-sell and retention modeling. It has also been applied to personalized medicine. Unlike the related Differential Prediction concept in psychology, uplift modeling assumes an active agent.

**Introduction**

Uplift modeling uses a randomized scientific control to not only measure the effectiveness of a marketing action but also to build a predictive model that predicts the incremental response to the marketing action. It is a data mining technique that has been applied predominantly in the financial services, telecommunications and retail direct marketing industries to up-sell, cross-sell, churn and retention activities.

**Measuring uplift**

The uplift of a marketing campaign is usually defined as the difference in response rate between a treated group and a randomized control group. This allows a marketing team to isolate the effect of a marketing action and measure the effectiveness or otherwise of that individual marketing action. Honest marketing teams will only take credit for the incremental effect of their campaign.
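As a minimal sketch of this definition, the uplift can be computed from the response counts of the two groups. The function names and the group sizes and counts below are hypothetical and chosen only for illustration; they are not taken from any campaign in this article.

```python
def response_rate(responses: int, group_size: int) -> float:
    """Fraction of a group that responded."""
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    return responses / group_size


def campaign_uplift(treated_responses: int, treated_size: int,
                    control_responses: int, control_size: int) -> float:
    """Uplift = treated response rate minus control response rate."""
    return (response_rate(treated_responses, treated_size)
            - response_rate(control_responses, control_size))


# Hypothetical campaign: 120 of 1,000 treated customers responded,
# against 80 of 1,000 customers in the randomized control group.
uplift = campaign_uplift(120, 1000, 80, 1000)
print(f"uplift = {uplift:.3f}")
```

The result is a difference of rates, so it is best read in percentage points rather than as a relative percentage.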

**The uplift problem statement**

The uplift problem can be stated as follows. Given:

- cases $P = \{1, \dots, n\}$,

- treatments $J = \{1, \dots, U\}$,

- the expected return $R_{it}$ for each case $i$ and treatment $t$,

- non-negative integers $n_1, \dots, n_U$ such that

$$n_1 + \cdots + n_U = n,$$

find a treatment assignment

$$f: P \to J$$

such that the total return

$$\sum_{i=1}^{n} R_{i f(i)}$$

is maximized, subject to the constraint that the number of cases assigned to treatment $j$ does not exceed $n_j$ $(j = 1, \dots, U)$.

Example: Marketing action case

- $P$: a group of customers,

- two treatments:

**1. treatment 1:** exercise some marketing action; $R_{i1}$ is the expected return if treatment 1 is given to customer $i$,

**2. treatment 2:** exercise no marketing action; $R_{i2}$ is the expected return if treatment 2 is given to customer $i$.

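The total return can be split into two sums:

$$\sum_{i=1}^{n} R_{i f(i)} = \sum_{i:\, f(i)=1} (R_{i1} - R_{i2}) + \sum_{i=1}^{n} R_{i2}$$

The second sum does not involve $f$, so maximizing the total return is equivalent to maximizing the first term:

$$\sum_{i:\, f(i)=1} (R_{i1} - R_{i2})$$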

The solution to the problem follows directly. To attain the maximum return:

- assign treatment 1 to the customers with the largest values of Ri1-Ri2

- assign treatment 2 to the remaining customers

The difference Ri1-Ri2 is called net lift, uplift, incremental response, differential response, etc.
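The two-treatment solution above can be sketched as a simple ranking. This is an illustration only: the function name `assign_treatments`, the parameter `n_treated` (standing in for the limit on treatment 1) and the return values are hypothetical, and the expected returns are assumed to be known, which in practice they are not.

```python
def assign_treatments(return_if_treated, return_if_untreated, n_treated):
    """Give treatment 1 to the n_treated cases with the largest uplift
    (return_if_treated - return_if_untreated) and treatment 2 to the rest."""
    if len(return_if_treated) != len(return_if_untreated):
        raise ValueError("both return lists must have the same length")
    if not 0 <= n_treated <= len(return_if_treated):
        raise ValueError("n_treated must be between 0 and the number of cases")

    # Uplift of each case: return under treatment 1 minus return under treatment 2.
    uplift = [r1 - r2 for r1, r2 in zip(return_if_treated, return_if_untreated)]

    # Rank case indices by uplift, largest first, and treat the top n_treated.
    ranked = sorted(range(len(uplift)), key=lambda i: uplift[i], reverse=True)
    treated = set(ranked[:n_treated])
    return [1 if i in treated else 2 for i in range(len(uplift))]


# Hypothetical expected returns for four customers.
r1 = [5.0, 3.0, 4.0, 1.0]   # return if the marketing action is exercised
r2 = [4.5, 1.0, 4.0, 2.0]   # return if no marketing action is exercised
assignment = assign_treatments(r1, r2, n_treated=2)
total_return = sum(r1[i] if t == 1 else r2[i] for i, t in enumerate(assignment))
print(assignment, total_return)
```

Note that ranking on `r1` alone would pick customers 0 and 2, whose uplift is small or zero, which is the pitfall of targeting on the response to treatment 1 only.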

If we consider only the response to treatment 1, and base targeting on a model built from responses to previous marketing actions, we are in effect maximizing $\sum_{i:\, f(i)=1} R_{i1}$ rather than the total return $\sum_{i=1}^{n} R_{i f(i)}$. Such maximization would not, in general, yield the maximum return. We also need to consider the return from cases subjected to no marketing action.

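Now, consider a model with a binary response, e.g., Yes = 1, No = 0. Then the net lift is:

$$\text{Prob}(1) = \frac{\exp(\text{score}_1)}{1 + \exp(\text{score}_1)}$$

$$\text{Prob}(0) = \frac{\exp(\text{score}_0)}{1 + \exp(\text{score}_0)}$$

$$\text{netlift} = \text{Prob}(1) - \text{Prob}(0)$$

where $\text{score}_1$ and $\text{score}_0$ are the model scores (log-odds) behind Prob(1) and Prob(0). For the difference to be a net lift, Prob(1) must be read as the probability of a response when the treatment is given and Prob(0) as the probability of a response when it is not. The pseudocode below uses them in exactly this way, adding prob1 to the expected mail response and prob0 to the responses expected anyway.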
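The score-to-probability step can be sketched as follows, assuming the two scores are log-odds produced by separately fitted models for treated and untreated customers. The function names are hypothetical, and the logistic function is written in a form that avoids overflow for large scores.

```python
import math


def logistic(score: float) -> float:
    """exp(score) / (1 + exp(score)), computed without overflow."""
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    e = math.exp(score)
    return e / (1.0 + e)


def net_lift(score_1: float, score_0: float) -> float:
    """Probability of response if treated minus probability if untreated."""
    return logistic(score_1) - logistic(score_0)


# Hypothetical scores: odds of 3:1 when treated, even odds when untreated.
lift = net_lift(math.log(3), 0.0)
print(f"net lift = {lift:.2f}")
```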

The incremental response lift, with all initial values set to 0, can be obtained using the following pseudocode:

prob1 = exp(score_1)/(1+exp(score_1));

prob0 = exp(score_0)/(1+exp(score_0));

netlift = prob1 - prob0;

if treatment_flag = 1 then do;

mail_total = mail_total + 1;

if response = 1 then mail_response = mail_response + 1;

expected_netlift_mailed = expected_netlift_mailed + netlift;

expected_anyway_mailed = expected_anyway_mailed + prob0;

expected_mail_response = expected_mail_response + prob1;

end;

else do;

nomail_total = nomail_total + 1;

if response = 1 then nomail_response = nomail_response + 1;

end;

The associated empirical probabilities and net lift are computed at the last record of each pentile, using the following pseudocode:

if last.pentile;

empirical_prob_mail = mail_response/mail_total;

empirical_prob_nomail = nomail_response/nomail_total;

empirical_netlift = empirical_prob_mail - empirical_prob_nomail;

percent_gain = 100*empirical_netlift/empirical_prob_nomail;

empirical_expected_buyanyway_mailed = mail_total*empirical_prob_nomail;

empirical_expected_netlift = mail_response - empirical_expected_buyanyway_mailed;
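The two pseudocode fragments above can be sketched in Python for a single group of records (for example, one pentile). The record layout `(treatment_flag, response, prob1, prob0)`, the class and function names, and the sample data are assumptions made for this illustration. The sketch also assumes that both groups are non-empty and that the no-mail group has at least one response, since otherwise the divisions are undefined.

```python
from dataclasses import dataclass


@dataclass
class LiftTotals:
    mail_total: int = 0
    mail_response: int = 0
    nomail_total: int = 0
    nomail_response: int = 0
    expected_netlift_mailed: float = 0.0
    expected_anyway_mailed: float = 0.0
    expected_mail_response: float = 0.0


def accumulate(records) -> LiftTotals:
    """Run the accumulation step over (treatment_flag, response, prob1, prob0)."""
    t = LiftTotals()
    for treatment_flag, response, prob1, prob0 in records:
        if treatment_flag == 1:
            t.mail_total += 1
            if response == 1:
                t.mail_response += 1
            t.expected_netlift_mailed += prob1 - prob0
            t.expected_anyway_mailed += prob0
            t.expected_mail_response += prob1
        else:
            t.nomail_total += 1
            if response == 1:
                t.nomail_response += 1
    return t


def summarize(t: LiftTotals) -> dict:
    """Empirical probabilities and net lift for the accumulated group."""
    prob_mail = t.mail_response / t.mail_total
    prob_nomail = t.nomail_response / t.nomail_total
    netlift = prob_mail - prob_nomail
    buy_anyway = t.mail_total * prob_nomail
    return {
        "empirical_prob_mail": prob_mail,
        "empirical_prob_nomail": prob_nomail,
        "empirical_netlift": netlift,
        "percent_gain": 100 * netlift / prob_nomail,
        "empirical_expected_buyanyway_mailed": buy_anyway,
        "empirical_expected_netlift": t.mail_response - buy_anyway,
    }


# Hypothetical records: four mailed customers, four not mailed.
records = [
    (1, 1, 0.6, 0.4), (1, 0, 0.3, 0.2), (1, 1, 0.7, 0.5), (1, 0, 0.2, 0.2),
    (0, 1, 0.5, 0.4), (0, 0, 0.3, 0.2), (0, 0, 0.4, 0.3), (0, 0, 0.2, 0.1),
]
totals = accumulate(records)
summary = summarize(totals)
print(summary)
```

Comparing `expected_netlift_mailed` (from the model) with `empirical_expected_netlift` (from the observed counts) gives a rough check of how well the modeled net lift matches what the campaign delivered.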

The table below shows the number of responses and the calculated response rate for a hypothetical marketing campaign. This campaign would be defined as having a response rate uplift of 5 percentage points. It has created 50,000 incremental responses (100,000 - 50,000).

**Traditional response modeling**

Traditional response modeling typically takes a group of treated customers and attempts to build a predictive model that separates the likely responders from the non-responders through the use of one of a number of predictive modeling techniques. Typically this would use decision trees or regression analysis.

Such a model is built using only the treated customers.

In contrast, uplift modeling uses both the treated and control customers to build a predictive model that focuses on the incremental response. To understand this type of model, it is proposed that there is a fundamental segmentation that separates customers into the following groups (Lo, 2002):

- The Persuadables: customers who only respond to the marketing action because they were targeted

- The Sure Things: customers who would have responded whether they were targeted or not

- The Lost Causes: customers who will not respond irrespective of whether or not they are targeted

- The Do Not Disturbs or Sleeping Dogs: customers who are less likely to respond because they were targeted

The only segment that provides true incremental responses is the Persuadables.

Uplift modeling provides a scoring technique that can separate customers into the groups described above. Traditional response modeling often targets the Sure Things, being unable to distinguish them from the Persuadables.
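The four segments can be illustrated with a small sketch that treats each customer's behavior as two yes/no outcomes: whether they respond if targeted and whether they respond if not targeted. This deterministic view is a simplification made for illustration (the segment definitions above speak of likelihoods), and the function names are hypothetical. In a real campaign only one of the two outcomes is ever observed for a given customer, which is why the segments have to be estimated from treated and control groups.

```python
def segment(responds_if_targeted: bool, responds_if_not_targeted: bool) -> str:
    """Map the two hypothetical outcomes of one customer to a segment name."""
    if responds_if_targeted and not responds_if_not_targeted:
        return "Persuadable"
    if responds_if_targeted and responds_if_not_targeted:
        return "Sure Thing"
    if not responds_if_targeted and not responds_if_not_targeted:
        return "Lost Cause"
    return "Do Not Disturb"


def individual_uplift(responds_if_targeted: bool, responds_if_not_targeted: bool) -> int:
    """Incremental responses gained (+1), lost (-1) or unchanged (0) by targeting."""
    return int(responds_if_targeted) - int(responds_if_not_targeted)


for targeted in (True, False):
    for not_targeted in (True, False):
        print(segment(targeted, not_targeted),
              individual_uplift(targeted, not_targeted))
```

Only the Persuadable case yields a positive uplift, and the Do Not Disturb case yields a negative one, which matches the statement that the Persuadables are the only source of true incremental responses.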

**Example: Simulation-Educators.com**

The majority of direct marketing campaigns are based on purchase propensity models, selecting customer email, paper mail or other marketing contact lists based on customers’ probability of making a purchase. Simulation-Educators.com offers training courses in modeling and simulation topics. The following is an example of standard purchase propensity model output for a mailing campaign for such courses.

**Table 1**. Example of standard purchase propensity model output used to generate direct campaign mailing list at Simulation-Educators.com

This purchase propensity model had a ‘nice’ lift (rank’s response rate over total response rate) for the top 4 ranks on the validation data set. Consequently, we would contact customers in the top 4 ranks. After the catalog campaign had been completed, we conducted a post analysis of mailing list performance vs. the control group. The control group consisted of customers who were not contacted, grouped by the same purchase probability scoring ranks.

**Table 2.** Campaign post analysis

As shown in Table 2, the top four customer ranks selected by the propensity model perform well for both the mailing group and the control group. However, even though the mailing/test group response rate was at a decent level, 16.7%, our incremental response rate (mailing group net of control group) for the combined top 4 ranks was only 0.15%. With such a low incremental response rate, our undertaking would likely generate a negative ROI.

Why did our campaign show such poor incremental results? The purchase propensity model did its job well, and we did send an offer to people who were likely to make a purchase. Apparently, modeling based on expected purchase propensity is not always the right solution for a successful direct marketing campaign. Since there was almost no increase in response rate over the control group, we could have been contacting customers who would have bought our product without promotional direct mail. Customers in the top ranks of the purchase propensity model may not need a nudge, or they may be buying in response to a contact via other channels. If that is the case, the customers in the lower purchase propensity ranks would be more ‘responsive’ to a marketing contact.

We should be predicting incremental impact – additional purchases generated by a campaign, not purchases that would be made without the contact. Our marketing mailing can be substantially more cost efficient if we don’t mail customers who are going to buy anyway.

Since customers very rarely use promo codes from catalogs or click on web display ads, it is difficult to identify undecided, swing customers based on promotion codes or web display click-throughs.

Net lift models predict which customer segments are likely to make a purchase ONLY if prompted by a marketing undertaking.

Purchasers from the mailing group include customers who needed a nudge; however, none of the purchasers in the holdout/control group needed our catalog to make their purchasing decision. All purchasers in the control group can be classified as ‘need no contact’. Since we need a model that separates ‘need contact’ purchasers from ‘no contact’ purchasers, net lift models look at the differences between purchasers in the mailing (contact) group and purchasers in the control group.

In order to classify our customers into these groups, we need mailing group and control group purchase results from similar prior campaigns. If there are no comparable historic undertakings, we have to run a small-scale trial before the main rollout.

**Uplift modeling approach—probability decomposition models**

Segments used in probability decomposition models:

**Figure 2.** Segments in probability decomposition models

Standard purchase propensity models are only capable of predicting all purchasers (combined segments A and B). The probability decomposition model predicts the purchaser segment that needs to be contacted (segment A) by leveraging two logistic regression models, as shown in the formula below (Zhong, 2009).

**Summary of probability decomposition modeling process:**

1. Build a stepwise logistic regression purchase propensity model (M1) and record the model score for every customer in the modeled population.

2. Use past campaign results or small-scale trial campaign results to create a dataset with two equal-size sections of purchasers from the contact group and the control group. Build a stepwise logistic regression model predicting which purchasers are from the contact group. The main task of this model is to penalize the score of the model built in step 1 when a purchaser is not likely to need contact.

3. Calculate the net purchaser score based on the probability decomposition formula.

**Results of the probability decomposition modeling process**

**Table 3.** Post analysis of campaign leveraging probability decomposition model for Simulation-Educators.com

Scoring ranks 1 through 6 show positive incremental response rates. The scoring ranks are ordered by incremental response rate.

**Return on investment**

Because uplift modeling focuses on incremental responses only, it provides very strong return on investment cases when applied to traditional demand generation and retention activities. For example, by targeting only the persuadable customers in an outbound marketing campaign, contact costs can be reduced and hence the return per unit spend dramatically improved (Radcliffe & Surry, 2011).

**Removal of negative effects**

One of the most effective uses of uplift modeling is in the removal of negative effects from retention campaigns. In both the telecommunications and financial services industries, retention campaigns can often trigger customers to cancel a contract or policy. Uplift modeling allows these customers, the Do Not Disturbs, to be removed from the campaign.

**History of uplift modeling**

True response modeling appears to have first been described in the work of Radcliffe and Surry (Radcliffe & Surry, 1999).

Victor Lo also published on this topic in The True Lift Model (Lo, 2002), and more recently Radcliffe (Radcliffe, Using Control Groups to Target on Predicted Lift: Building and Assessing Uplift Models, 2007). Radcliffe also provides a very useful frequently asked questions (FAQ) section on his web site, Scientific Marketer (Uplift Modelling FAQ, 2007).

Similar approaches have been explored in personalized medicine (Cai, Tian, Wong, & Wei, 2009).

Uplift modeling is a special case of the older psychology concept of Differential Prediction. In contrast to differential prediction, uplift modeling assumes an active agent, and uses the uplift measure as an optimization metric.

**Implementations**

- uplift package for R

- JMP by SAS

- Portrait Uplift by Pitney Bowes

- Uplift node for KNIME by Dymatrix


#### Jeffrey Strickland

Predictive Analytics Consultant at Clarity Solution Group

Opinions expressed by Grroups members are their own.
