S Learner Uplift Model for Individual Treatment Effect and Customer Segmentation in Python

Uplift model using the s-learner meta-learner for heterogeneous individual treatment effect (ITE) estimation and marketing customer segmentation

S-learner is a meta-learner that uses a single machine learning model to estimate the individual-level causal treatment effect. In this tutorial, we will talk about:

How to implement s-learner in Python?
How to make individual treatment effect (ITE) estimation using an s-learner?
How to segment customers based on the individual treatment effect (ITE) estimation?

Let’s get started!

Step 1: Install and Import Libraries

In step 1, we will install and import the Python libraries.

Firstly, let’s install causalml, which we use to create the synthetic dataset.

# Install package
!pip install causalml

After the installation is completed, we can import the libraries.

pandas is imported for data processing.
numpy is imported for setting the random seed.
synthetic_data is imported for synthetic data creation.
seaborn is for visualization.
LGBMRegressor is for the machine learning model training.

# Data processing
import pandas as pd
import numpy as np

# Create synthetic data
from causalml.dataset import synthetic_data

# Visualization
import seaborn as sns

# Machine learning model
from lightgbm import LGBMRegressor

Step 2: Create Dataset

In step 2, we will create a synthetic dataset for the s-learner uplift model.

Firstly, a random seed is set to make the synthetic dataset reproducible.
Then, using the synthetic_data function from the causalml Python package, we create a dataset with five features, one treatment variable, and one continuous outcome variable.
After that, the dataset is saved in a pandas dataframe.
Finally, using value_counts on the treatment variable, we can see that out of 1000 samples, 512 units received the treatment and 488 did not.

# Set a seed for reproducibility
np.random.seed(42)

# Create a synthetic dataset
y, X, treatment, _, _, _ = synthetic_data(mode=1, n=1000, p=5, sigma=1.0)

# Save the data in a pandas dataframe
df = pd.DataFrame({
    'y': y,
    'X1': X[:, 0], 'X2': X[:, 1], 'X3': X[:, 2], 'X4': X[:, 3], 'X5': X[:, 4],
    'treatment': treatment,
})

# Check treatment
df['treatment'].value_counts()

Output:

1    512
0    488
Name: treatment, dtype: int64
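
To get a feel for what such a dataset contains without installing causalml, here is a minimal sketch that builds a small stand-in with numpy. It is not the generator behind synthetic_data: the function name make_toy_uplift_data, the outcome formula, and the assumed true effect of 0.5 + 0.5 * X1 are all hypothetical choices for illustration. Unlike real data, the toy dataset keeps the true effect in a column, so estimates can be checked against it.

```python
import numpy as np
import pandas as pd


def make_toy_uplift_data(n=1000, p=5, sigma=1.0, seed=42):
    """Hypothetical stand-in for a synthetic uplift dataset (not causalml's generator)."""
    rng = np.random.default_rng(seed)
    # Covariates drawn uniformly from [0, 1)
    X = rng.uniform(size=(n, p))
    # Randomized binary treatment with probability 0.5
    treatment = rng.binomial(1, 0.5, size=n)
    # Assumed true individual effect: depends on the first covariate only
    tau = 0.5 + 0.5 * X[:, 0]
    # Continuous outcome = baseline + treatment effect + noise
    y = X[:, 1] + X[:, 2] + treatment * tau + rng.normal(scale=sigma, size=n)
    df = pd.DataFrame(X, columns=[f'X{i + 1}' for i in range(p)])
    df['treatment'] = treatment
    df['y'] = y
    df['true_tau'] = tau
    return df


toy_df = make_toy_uplift_data()
print(toy_df['treatment'].value_counts())
```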

Step 3: S-Learner Model Training

In step 3, we will train an s-learner model.

S-learner trains a single machine learning model, and that’s why it’s called s-learner.

The model features (a.k.a. predictors or independent variables) include covariates and the treatment variable. We created a variable called features that includes X1, X2, X3, X4, X5, and treatment.
The dependent variable (a.k.a. label) is the outcome variable. We created a variable called y that contains the outcome column y.

# Features
features = df.loc[:, ['X1', 'X2', 'X3', 'X4', 'X5', 'treatment']]

# Dependent variable (a 1-D Series, the shape most regressors expect)
y = df['y']

Model selection and hyperparameter tuning are important for the performance of an s-learner, because the quality of the model predictions directly determines the accuracy of the individual treatment effect (ITE) estimates.

Many machine learning algorithms can be used to build the s-learner, including but not limited to LASSO regression, Ridge regression, random forest, XGBoost, and neural networks.

A LightGBM model is used in this example, and the process is the same for other machine learning algorithms.

After instantiating LGBMRegressor with its default hyperparameters, we fit the model using the features and the outcome variable.

# Instantiate the LightGBM model
s_learner = LGBMRegressor()

# Fit the model
s_learner.fit(features, y);
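
To illustrate why the choice of the single model matters, here is a minimal sketch of an s-learner whose model is an ordinary least squares fit done with numpy instead of LightGBM. The toy data, the assumed true effect, and the helper names design_matrix and s_learner_ite are all hypothetical. A linear model without treatment interactions can only return the same effect for everyone, so the sketch also fits a version with treatment-by-covariate interaction columns.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
X = rng.uniform(size=(n, 2))
w = rng.binomial(1, 0.5, size=n)
true_tau = 1.0 + 2.0 * X[:, 0]  # assumed true effect for this toy example
y = X[:, 1] + w * true_tau + rng.normal(scale=0.1, size=n)


def design_matrix(X, w, interactions):
    """Intercept, covariates, treatment, and optionally treatment * covariates."""
    cols = [np.ones(len(X)), X, w]
    if interactions:
        cols.append(X * w[:, None])
    return np.column_stack(cols)


def s_learner_ite(X, w, y, interactions):
    # Fit ONE model on the covariates plus the treatment indicator
    coef, *_ = np.linalg.lstsq(design_matrix(X, w, interactions), y, rcond=None)
    # Predict for everyone under treatment and under control, then subtract
    ones, zeros = np.ones(len(X)), np.zeros(len(X))
    y1 = design_matrix(X, ones, interactions) @ coef
    y0 = design_matrix(X, zeros, interactions) @ coef
    return y1 - y0


ite_plain = s_learner_ite(X, w, y, interactions=False)
ite_inter = s_learner_ite(X, w, y, interactions=True)
print(ite_plain.std(), np.corrcoef(ite_inter, true_tau)[0, 1])
```

In this toy run the fit without interactions gives every row the same effect, while the fit with interactions tracks the assumed true effect. Tree ensembles can pick up such interactions without manual feature engineering, which is one reason they are a common choice for the single model.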

Step 4: S-Learner Model Predictions

In step 4, we will make predictions using an s-learner model.

To estimate the treatment effect, two separate predictions need to be made using the trained model:

In the first prediction, the treatment variable is set to 1 for all the samples. This gives us the predicted outcome values if all the samples received the treatment.
In the second prediction, the treatment variable is set to 0 for all the samples. This gives us the predicted outcome values if none of the samples received the treatment.

# Set treatment value to 1
with_treatment = features.assign(treatment = 1)

# With treatment predictions
with_treatment_predict = s_learner.predict(with_treatment)

# Set treatment value to 0
without_treatment = features.assign(treatment = 0)

# Without treatment predictions
without_treatment_predict = s_learner.predict(without_treatment)
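
The two prediction passes above can be wrapped in a small helper. This is a sketch under assumptions: predict_counterfactuals and ToyModel are hypothetical names, and ToyModel only stands in for a fitted regressor so that the example runs without LightGBM. Any object with a predict method that accepts a dataframe should work the same way.

```python
import numpy as np
import pandas as pd


def predict_counterfactuals(model, features, treatment_col='treatment'):
    """Return (predictions with treatment=1, predictions with treatment=0)."""
    # assign() returns a copy and keeps the column order, so the
    # original features stay untouched
    treated = features.assign(**{treatment_col: 1})
    control = features.assign(**{treatment_col: 0})
    return np.asarray(model.predict(treated)), np.asarray(model.predict(control))


class ToyModel:
    """Hypothetical stand-in for a fitted regressor: y = X1 + 2 * treatment."""

    def predict(self, data):
        return (data['X1'] + 2 * data['treatment']).to_numpy()


toy_features = pd.DataFrame({'X1': [0.1, 0.4, 0.9], 'treatment': [1, 0, 1]})
with_t, without_t = predict_counterfactuals(ToyModel(), toy_features)
print(with_t - without_t)
```

With the objects from this tutorial, the call would be predict_counterfactuals(s_learner, features).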

Step 5: S-Learner Individual Treatment Effect (ITE)

In step 5, we will calculate the individual treatment effect (ITE) using the s-learner predictions.

Individual treatment effect (ITE) is the difference between the predicted outcomes with and without treatment.
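
In symbols, the estimate for individual i can be written as follows. The notation is an assumption introduced here for illustration: \hat{\mu} is the fitted model, x_i is the covariate vector of individual i, and the second argument is the value given to the treatment variable.

```latex
\hat{\tau}_i = \hat{\mu}(x_i, 1) - \hat{\mu}(x_i, 0)
```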

After calculating the individual treatment effect (ITE), the data is saved in a dataframe.

# ITE
ite = with_treatment_predict - without_treatment_predict

# Save ITE data in a pandas dataframe
ite_df = pd.DataFrame({
    'ITE': ite,
    'with_treatment_predict': with_treatment_predict,
    'without_treatment_predict': without_treatment_predict,
})

# Take a look at the data
ite_df.head()

[Figure: S-Learner Individual Treatment Effect (ITE)]

The histogram of the estimated individual treatment effect (ITE) shows an approximately normal, bell-shaped distribution.

The average treatment effect is around 0.5.
Most individuals in the dataset have a positive estimated treatment effect.
Some individuals have a negative estimated treatment effect.

# Visualization
ite_df.hist(column='ITE', bins=50, grid=True, figsize=(12, 8))

[Figure: S-Learner Individual Treatment Effect (ITE) Visualization]
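
A histogram can be complemented with a few numbers that describe the same distribution. The sketch below is one way to do that; summarize_ite is a hypothetical helper and example_ite is made-up data, not output from the model above.

```python
import numpy as np
import pandas as pd


def summarize_ite(ite):
    """Hypothetical helper: basic shape statistics for estimated ITEs."""
    s = pd.Series(np.asarray(ite, dtype=float))
    return {
        'mean': s.mean(),
        'median': s.median(),
        'share_positive': (s > 0).mean(),
        'share_negative': (s < 0).mean(),
        'skewness': s.skew(),  # near 0 for a symmetric distribution
    }


example_ite = np.array([-0.4, 0.1, 0.3, 0.5, 0.6, 0.9])
summary = summarize_ite(example_ite)
print(summary)
```

With the variables from this tutorial, the call would be summarize_ite(ite).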

Step 6: S-Learner Average Treatment Effect (ATE)

In step 6, we will estimate the average treatment effect (ATE) using the s-learner predictions.

The average treatment effect (ATE) for the population is the average of the individual treatment effects (ITE). In this example, the estimated ATE is 0.47.

To learn more about the definition and calculation for the average treatment effect (ATE), please check out my previous tutorial ATE vs CATE vs ATT vs ATC for Causal Inference.

# Calculate ATE
ATE = ite.mean()

# Print out results
print(f'The average treatment effect (ATE) is {ATE:.2f}')

Output:

The average treatment effect (ATE) is 0.47
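
The same averaging idea extends to subgroups. As a minimal sketch, the hypothetical helper average_effects below averages the estimated ITEs over everyone (ATE), over the treated units only (ATT), and over the control units only (ATC). The example arrays are made up for illustration.

```python
import numpy as np


def average_effects(ite, treatment):
    """Average the estimated ITEs over everyone, the treated, and the untreated."""
    ite = np.asarray(ite, dtype=float)
    treated = np.asarray(treatment) == 1
    return {
        'ATE': ite.mean(),
        'ATT': ite[treated].mean(),   # average effect among treated units
        'ATC': ite[~treated].mean(),  # average effect among control units
    }


example_ite = np.array([0.2, 0.8, 0.4, 1.0])
example_treatment = np.array([1, 1, 0, 0])
effects = average_effects(example_ite, example_treatment)
print(effects)
```

With the variables from this tutorial, the call would be average_effects(ite, df['treatment']).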

Step 7: Customer Segmentation Using S-Learner Individual Treatment Effect (ITE)

In step 7, we will talk about how to segment customers using individual treatment effect (ITE) from the s-learner.

A common use of the uplift model is for marketing campaigns, where the treatments can be emails, displays, and promotions. An uplift model divides customers into four groups based on the incremental change caused by the intervention:

Persuadables are those who buy with treatment and do not buy without treatment.
Lost Causes are those who do not buy with or without treatment.
Sure Things are those who buy with or without treatment.
Sleeping Dogs are those who buy without treatment and do not buy with treatment.
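
The four definitions above can be written as a small lookup on a pair of potential outcomes. This is only an illustration with a hypothetical function name, uplift_group. In practice both outcomes are never observed for the same customer, which is why the treatment effect has to be estimated with a model.

```python
def uplift_group(buys_with_treatment, buys_without_treatment):
    """Map a pair of (hypothetical) potential outcomes to an uplift group."""
    if buys_with_treatment and not buys_without_treatment:
        return 'Persuadable'
    if not buys_with_treatment and not buys_without_treatment:
        return 'Lost Cause'
    if buys_with_treatment and buys_without_treatment:
        return 'Sure Thing'
    # Remaining case: buys without treatment, does not buy with treatment
    return 'Sleeping Dog'


for with_t in (True, False):
    for without_t in (True, False):
        print(with_t, without_t, uplift_group(with_t, without_t))
```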

Building on the uplift model customer segmentation, we divide the individuals into three groups using their estimated individual treatment effect (ITE). The cutoffs of -0.5 and 0.5 are the ones used in this example.

sleeping-dog is the segment with an individual treatment effect (ITE) less than -0.5. These are customers who would have made the purchase had they not received, and been annoyed by, the marketing messages.
persuadable is the segment with an individual treatment effect (ITE) greater than 0.5. These are customers who would make the purchase if they receive the marketing messages, but would not make it otherwise.
no_impact is the segment with an individual treatment effect (ITE) between -0.5 and 0.5, inclusive. Because these effects are close to 0, we consider the marketing messages to have no impact on their purchasing decisions. This segment includes both the lost causes and the sure things.

# Create the segmentation function
def segmentation(x):
    if x < -0.5:
        return 'sleeping-dog'
    elif x > 0.5:
        return 'persuadable'
    else:
        return 'no_impact'

# Apply the function
ite_df['segment'] = ite_df['ITE'].apply(segmentation)

# Take a look at the data
ite_df.head()

[Figure: Customer Segmentation Using S-Learner]
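
The same three-way split can also be written in vectorized form, which makes it easy to try other cutoffs. The sketch below is one possible variant, not the approach used in this tutorial: segment_ite and its threshold parameter are hypothetical, and the example values are made up. As in the function above, values exactly at -0.5 or 0.5 fall into no_impact.

```python
import numpy as np
import pandas as pd


def segment_ite(ite, threshold=0.5):
    """Vectorized three-way split; threshold is an adjustable assumption."""
    ite = np.asarray(ite, dtype=float)
    conditions = [ite < -threshold, ite > threshold]
    labels = ['sleeping-dog', 'persuadable']
    return np.select(conditions, labels, default='no_impact')


example = pd.DataFrame({'ITE': [-0.9, -0.5, 0.0, 0.5, 0.7]})
example['segment'] = segment_ite(example['ITE'])
print(example)
```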

After creating the customer segments, we save them as a new column in the dataframe. The visualization of the segments shows that nearly half of the individuals are persuadable, very few are sleeping dogs, and more than half are not impacted by the marketing campaign.

These segments can help optimize the marketing strategies and increase the return on investment (ROI).

# Visualization
sns.countplot(x=ite_df['segment'])

[Figure: Customer Segmentation Using S-Learner Visualization]
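
Alongside the count plot, a small table with the size, share, and mean estimated effect of each segment can help when comparing segments. This is a sketch with a hypothetical helper, segment_summary, run on made-up rows. It assumes a dataframe with ITE and segment columns like the ite_df built above.

```python
import pandas as pd


def segment_summary(ite_df):
    """Count, share, and mean estimated ITE per segment."""
    summary = ite_df.groupby('segment')['ITE'].agg(count='count', mean_ite='mean')
    summary['share'] = summary['count'] / summary['count'].sum()
    # Largest estimated effect first
    return summary.sort_values('mean_ite', ascending=False)


example_df = pd.DataFrame({
    'ITE': [0.9, 0.7, 0.1, -0.2, 0.3, -0.8],
    'segment': ['persuadable', 'persuadable', 'no_impact',
                'no_impact', 'no_impact', 'sleeping-dog'],
})
segment_table = segment_summary(example_df)
print(segment_table)
```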

If you prefer to use a Python package instead of manually building an s-learner, please check out the tutorial Explainable S-Learner Uplift Model Using Python Package CausalML.
