avatarTerence Shin, MSc, MBA

Summary

The provided content offers a comprehensive guide on building and interpreting a Marketing Mix Model (MMM) using Python, emphasizing its importance in understanding and optimizing marketing strategies for businesses.

Abstract

The article delves into the concept of Marketing Mix Modeling, a statistical technique used to quantify the impact of various marketing channels on sales or conversions. It underscores the shift of marketing towards a more data-driven approach with the advent of online marketing and big data. The guide outlines the steps to construct an MMM from scratch in Python, using a dataset from Kaggle to illustrate the process. It includes exploratory data analysis (EDA) with correlation matrices, pair plots, and feature importance assessments, followed by the actual model building using an Ordinary Least Squares (OLS) approach. The article also discusses how to evaluate the model by comparing predicted and actual sales values and provides insights on interpreting the model's output, including the significance of predictors and the model's explanatory power. The author concludes by encouraging readers to apply these learnings to their businesses and offers further reading on related data science topics.

Opinions

  • The author posits that marketing is becoming increasingly mathematical and methodical due to the influence of data science and machine learning.
  • Marketing Mix Models are presented as superior to Attribution Modeling for their ability to measure the impact of traditionally immeasurable marketing channels like TV, radio, and newspapers.
  • The article suggests that the insights gained from an MMM can significantly enhance budget optimization and predictive capabilities for future marketing efforts.
  • Feature importance analysis is highlighted as a critical step in understanding the relative influence of different marketing channels on sales outcomes.
  • The author expresses confidence in the predictive accuracy of the demonstrated MMM, as evidenced by the close alignment between actual and predicted sales values.
  • The interpretation of the model's summary, particularly the Adj. R-squared and p-values, is emphasized to assess the model's effectiveness and the significance of individual predictors.
  • The author encourages continued learning in data science, hinting at the importance of staying updated with analytical techniques for professional growth in the field of marketing analytics.

A Complete Guide to Building a Marketing Mix Model

Fully understand what a Marketing Mix Model is and how to use one

Photo by Joshua Earle on Unsplash

Introduction

Marketing was always considered more of an art than a science. But with the emergence of online marketing and big data, marketing is more mathematical and methodical than ever. In fact, one of the biggest areas of opportunities for data science and machine learning applications is marketing!

This article is going to focus on an extremely prevalent and powerful marketing science technique called Marketing Mix Modeling. This article will cover what it is, why it’s so useful, how to build one in Python, and most importantly, how to interpret it.

What is a Marketing Mix Model?

A Marketing Mix Model is a modeling technique used to determine market attribution, the estimated impact of each marketing channel that a given company is using.

Unlike Attribution Modeling, another technique used for marketing attribution, Marketing Mix Models attempt to measure the impact of immeasurable marketing channels, like TV, radio, and newspapers.

Generally, your output variable will be sales or conversions, but can also be things like website traffic. Your input variables typically consist of marketing spend by channel by period (day, week, month, quarter, etc…), but can also include other variables which we’ll get to later.

Why is it so Useful?

You can tap into the power of a marketing mix model in a number of ways, including:

  • To get a better understanding of the relationships between your marketing channels and your target metric (i.e. conversions).
  • To distinguish high ROI marketing channels from low ones and ultimately better optimize your marketing budget.
  • To predict future conversions based on given inputs.

Each of these insights can offer a ton of value as you scale your business. Let’s dive into what it takes to build one with Python. 👀

Building a Marketing Mix Model

To get a better intuition for marketing mix models, this section will walk through building a marketing mix model from scratch in Python. This marketing mix model is going to be built off of this dataset from Kaggle.

Step 1: Import all relevant libraries and data.

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
df = pd.read_csv("../input/advertising.csv/Advertising.csv")

Step 2: Perform some EDA

Generally, you’d conduct a lot more exploratory data analyses, but for this tutorial we’ll focus on the three most common (and powerful, in my experience):

  1. Correlation matrices: a table that shows the correlation values for each pair-relationship
  2. Pair plots: a simple way to visualize the relationships between each variable
  3. Feature importance: techniques that assign a score for each feature based on how useful they are at predicting the target variable

Correlation Matrix

To reiterate, a correlation matrix is a table that shows the correlation values for each pair-relationship. It’s a very fast and efficient way of understanding feature relationships. Here’s the code for our matrix.

corr = df.corr()
sns.heatmap(corr, xticklabels = corr.columns, yticklabels = corr.columns, annot = True, cmap = sns.diverging_palette(220, 20, as_cmap=True))

The correlation matrix above shows that there’s a strong correlation between TV and sales (0.78), a moderate correlation between radio and sales (0.58), and a weak correlation between newspaper and sales (0.23). It’s still too early to conclude anything but this is good to keep into consideration moving forward.

Pair plot

A pair plot is a simple way to visualize the relationships between each variable — it’s similar to a correlation matrix except it shows a graph for each pair-relationship instead of a correlation. Now let’s take a look at the code for our pair plot.

sns.pairplot(df)

We can see some consistency between our pair plot and our original correlation matrix. It looks like there’s a strong positive relationship between TV and sales, less for radio, and even less for newspapers.

Feature Importance

Feature importance allows you to determine how “important” each input variable is to predict the output variable. A feature is important if shuffling its values increases model error because this means the model relied on the feature for the prediction.

# Setting X and y variables
X = df.loc[:, df.columns != 'sales']
y = df['sales']# Building Random Forest model
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error as mae
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.25, random_state=0)
model = RandomForestRegressor(random_state=1)
model.fit(X_train, y_train)
pred = model.predict(X_test)# Visualizing Feature Importance
feat_importances = pd.Series(model.feature_importances_, index=X.columns)
feat_importances.nlargest(25).plot(kind='barh',figsize=(10,10))

The code above is first creating a random forest model with sales as the target variable and the marketing channels as the feature inputs. Once the model is created, we then calculate the feature importance of each predictor and plot it on a bar chart.

There seems to be a pattern, where TV is the most important, followed by radio, leaving newspaper last. Next comes actually building the model.

Step 3: Build the Marketing Mix Model (aka. OLS model)

It’s time to build our marketing mix model! Another way to refer to the model we’re building is an OLS model, short for ordinary least squares, which is a method used to estimate the parameters in a linear regression model. An OLS model is a type of regression model that is most commonly used when building marketing mix models.

What makes Python so amazing is that it already has a library that you can use to create an OLS model:

import statsmodels.formula.api as sm
model = sm.ols(formula="sales~TV+radio+newspaper", data=df).fit()
print(model.summary())

The code above creates our ordinary least squares regression model, which specifies that we’re predicting sales based on TV, radio, and newspaper marketing dollars.

Step 4: Plot Actual vs Predicted Values

Next, let’s graph the predicted sales values with the actual sales values to visually see how our model performs. This is a particularly useful thing to do in a business use case if you’re trying to see how well your model reflects what’s actually happening — in this case, if you’re trying to see how well your model predicts sales based on the amount spent in each marketing channel.

from matplotlib.pyplot import figure
y_pred = model.predict()
labels = df['sales']
df_temp = pd.DataFrame({'Actual': labels, 'Predicted':y_pred})
df_temp.head()
figure(num=None, figsize=(15, 6), dpi=80, facecolor='w', edgecolor='k')
y1 = df_temp['Actual']
y2 = df_temp['Predicted']plt.plot(y1, label = 'Actual')
plt.plot(y2, label = 'Predicted')
plt.legend()
plt.show()
Actual vs Predicted values

Not bad! It seems like this model does a good job of predicting sales given TV, radio, and newspaper spend.

How to Interpret a Marketing Mix Model

Going back to the output from .summary(), there are a couple of things to focus on:

  1. .summary() provides us with an abundance of insights on our model. Going back to the output from .summary(), we can see a few areas to focus in on (you can reference these insights against the OLS regression results below):
  2. The Adj. R-squared is 0.896. This means that approximately 90% of the total variation in the data can be explained by the model. This also means that the model doesn’t account for 10% of the data used — this could be due to missing variables, for example if there was another marketing channel that wasn’t included, or simply due to noise in the data.
  3. At the top half, you can see Prob (F-statistic): 1.58e-96. This probability value (p-value) represents the likelihood that there are no good predictors of the target variable — in this case, there are no good predictors of sales. Since the p-value is close to zero, we know that there is at least one predictor in the model that is a good predictor of sales.

If you look at the column, P>|t|, you can see the p-values for each predictor. The p-values for TV and radio are less than 0.000, but the p-value for newspapers is 0.86, which indicates that newspaper spend has no significant impact on sales. Generally, you want the p-value to be less than 1% or 5%, which are the two standards in practice.

Thanks for Reading!

By reading this, you should now know what a marketing mix model is, how to build one, and how to impactfully use it for your business. As always, I wish you the best in your learning endeavors :)

Not sure what to read next? I’ve picked another article for you:

and another one!

Terence Shin

Data Science
Programming
Python
Marketing
Artificial Intelligence
Recommended from ReadMedium