avatarJacky Yuan

Summary

The provided content is a comprehensive guide on Marketing Mix Modeling (MMM), detailing its purpose, data requirements, preparation, transformation, variable selection, model building, findings, media contribution analysis, budget optimization, and common pitfalls to avoid.

Abstract

The article offers an in-depth exploration of Marketing Mix Modeling (MMM), a statistical technique used to analyze the impact of various marketing tactics on sales and to forecast future marketing effectiveness. It emphasizes the importance of MMM in understanding the business metrics of a product and increasing ROI by estimating the effectiveness of different marketing channels. The guide covers the common data types used in MMM, such as base variables, incremental variables, and sales data, and outlines the best practices for data preparation and transformation, including handling lag, carryover, and curve effects. It also discusses the process of variable selection, particularly addressing multicollinearity using VIF, and the steps to build a linear regression model using media transformation techniques. The article further illustrates how to interpret model findings, calculate media incremental contributions, and optimize media budgets for maximum ROI. It concludes with a caution against common MMM pitfalls and reiterates the significance of MMM in making informed marketing investment decisions.

Opinions

  • The author suggests that MMM is particularly useful for optimizing advertising budgets over medium to long-term periods, whereas Multi-Touch Attribution (MTA) models are better suited for short-term digital budget allocation.
  • There is an opinion that a weekly time period is the industry standard for MMM data granularity, as it balances the variability and accuracy of the model.
  • The article posits that advertising effects are not immediate due to consumer response delays, and therefore, the model should account for lag and carryover effects.
  • The author expresses that the power curve is the most widely used sales response function in MMM due to its ability to explain ad saturation levels and diminishing returns.
  • It is the author's view that in the real business world, multicollinearity is a common issue in MMM, and VIF scores are a useful diagnostic tool.
  • The author advocates for a systematic approach to media transformation, involving the application of lag, power curve, and decay value, and suggests trying different parameter combinations to find the best-fit model.
  • The article conveys the opinion that positive media correlation to sales is typically expected, and a correlation above 0.3 is considered good in the context of advertising investments.
  • The author emphasizes the importance of selecting the best linear regression formula based on positive media correlation, T values, and adjusted R Square.
  • There is an opinion that the final model should have a high adjusted R Square to ensure it captures actual sales well.
  • The author suggests that media contribution analysis can reveal the effectiveness and ROI of different channels, potentially leading to strategic budget reallocations.
  • The article concludes with the opinion that using tools like Excel Solver for budget optimization can lead to more efficient media spending and improved ROI.

Marketing Mix Model Guide With Dataset Using Python, R, and Excel

1.0 Introduction

What is the Marketing/Media Mix model?

According to Wikipedia, Marketing mix modeling (MMM) is a statistical analysis such as multivariate regressions on sales and marketing time series data to estimate the impact of various marketing tactics (marketing mix) on sales and then forecast the impact of future sets of tactics.

The key purpose: By estimating the effectiveness of different marketing channels activities, MMM helps to better understand how various marketing activities are driving the business metrics of a product and increase ROI.

What’s the difference between Marketing mix modeling (MMM) and Multi-touch attribution modeling(MTA)?

For offline media like TV, radio, or magazine, it is impossible to track individual impressions or clicks. The MMM model can use historical data to measure the total attributions of each offline channel with online channels.

In the digital world, the data is richer and most of the time, we are able to capture individual clicks and impression levels data.

In other words, MMM is an aggregate model while MTA is a user-level model.

What questions the MMM and MTA models can answer?

MMM model is often used to optimize advertising dollar amounts for the next quarter or year. Let the CMO and financial team make spending and allocation decisions. And also make investment decisions in terms of incremental spending.

While MTA most of the time is for digital budget allocation and investment for a short time. And further, understand which digital channels or campaigns work or didn’t work.

See my other article here Multi-Touch Attribution Marketing Model — The Shapley value approach.

Common Data Types for MMM

Base Variables:

  1. Seasonality: For example, sales during the Christmas season are generally high than average sales.
  2. Macroeconomic Data: CCI, inflation, unemployment rate, GDP, etc.
  3. Product Sales Price: base price, Avg. sales price
  4. Distribution: No. of stores or No. of locations the product is available. The distribution chain can impact business outcomes. For example, the stock product may cause sales to decrease.

Incremental Variables:

  1. Advertising data: TV / Radio/ NewsPpapers / Magzines / Search/ Display / Social Media / sponsorship / Affiliation / Content Marketing etc.
  2. Promotion data: No. of offers/days for which offers are provided. e.g. a price promotion like free delivery, 0% APR or cashback, etc is not supported by any advertisement that would also drive sales.

Others:

  1. Sales: It is impossible to build a MMM model without the sales variable. is generally considered as a dependent variable in MMM. Sales can be in volume in units as well as revenue($).
  2. Advertising spends data: Get the spend data from the internal marketing team or through an external marketing agency.

2.0 Data Preparation and Data Transformation

The industry standard typically will pick a weekly time period. This is because monthly data granularity is too long and daily level data has too much variation which leads to poor accuracy. Therefore, aggregate data at a weekly level is the best practice for creating a MMM model.

The Lag, Carryover Effect, and Shape Effect

Let’s talk about the media variables transformation. In a typical decision model, the response of sales to a media variable tends to linear. However, it is widely known that advertising effects will not immediately take effect due to delayed consumer response and so on.

Therefore, we will consider those effects on our media transformation.

Lag: The advertisement impact may affect certain times later, for example, 1 week.

Exhibit 1: Lag Effects

Carryover Effect: The “retention rate” of which advertising has an impact on periods after the ad took place. Decay=1 - Carryover

For example, if the Decay is 0.7, formula = This week activity*0.7 + Last week activity*(1–0.7). The first row's last activity will be 0.

Exhibit 2: Decay = 1-Carryover Effect

Curve: The response curve to explain ad saturation levels and diminishing returns.

In the demo example that I will give it later, we will use the power curve which the concave curve below in exhibit 3. The concave shape is the most widely used in the marketing mix model.

See the article here that further explain why this technique can be transformed the non-linear model into an estimation model using linear regression techniques.

Exhibit 3: Shapes of different sales response functions — Credit for https://www.ashokcharan.com/

We will use the calculation method in 3.2 Media Transformation — TV, Radio, Newspaper chapter.

The demo data Advertising.csv and all other files can be download here.

3.0 Variable Selection Process

3.1 Detecting Multicollinearity using VIF

Multicollinearity occurs when two or more independent variables are highly correlated with one another in a regression model.

In the real business world, it is very common that we will have multicollinearity when we were building the marketing mix model. Because marketing campaigns usually tend to happen at the same time. TV is supported by ratio, the newspaper may be supported by magazines.

Fortunately, the demo advertising data, VIF does not exceed 5 which means all media variables do not have a big multicollinearity issue so that we will keep all three media variables in this demo.

Exhibit 8: VIF Scores

VIF score of an independent variable represents how well the variable is explained by other independent variables.

VIF exceeding 5 or 10 indicates high multicollinearity between this independent variable and the others.

3.2 Media Transformation — TV, Radio, Newspaper

I personally use python here to conduct the media transformation. But there are tons of other tools that can do it. For example, Excel with VBA.

We will apply lag first, then build the power curve and finally calculate the decay value. The reason why we want to give a range of the different parameters is we never know which parameter will the best-fit parameters for our model. Therefore, we want to try different combinations. For example, I give each parameter five different ranges, in that case, we will have 5x5x5 = 125 combinations for each media.

Exhibit 4: The Raw Data
Exhibit 5: First Transform — The Lag
Exhibit 6: Second Transform — Power Curve Applied
Exhibit 7: Third Transform — Calculate the Decay Value

We will do the same for the remaining other media channels.

3.3 Choose the top 3 from 125 transformations by sorting the highest correlations

  1. Find out the correlations of the media with sales

2. Select the top three highest positive correlation transform from each media channel

The reason why we are looking for a positive media correlation to sales is that we typically expect the advertising has a positive effect on sales. A correlation above 0.3 is commonly good because advertising may not invest always.

I used python to perform the correlation selection, you may use R or Excel to do the selections.

Exhibit 8: Top 9 Radio transformations
Exhibit 9: Top 9 TV transformations
Exhibit 10: Top 9 Newspaper transformations

See the final file named ‘All Iterations.csv’ where I combined the top 3 media transformations from each media channel.

Exhibit 11: All Iterations.csv

3.4 Build a TRUE/FALSE Matrix in R

We are going to build a matrix that will use later where we will apply to generate linear regression formulas.

Since we only will need one media variable among each top three variables, we can filter the possibilities of duplicate variables in each channel.

In that way, we will reduce the combination from 84 to only 3x3x3= 27 combinations. As we saw in the matrix below, every three columns will appear true only one time.

Exhibit 12: Matrix

3.5 Apply Liner Regression Formulas

Exhibit 13: Apply ‘All Iterations’ columns name to generate 27 formulas

3.6 Get Results of final lists of Linear Regression Formulas

Few key things we are looking for, positive media correlation to sales in the model, T values, Adjusted R Square.

Exhibit 14: 27 Model Results

See results.csv here.

3.7 Select the best Linear Regression Formula

After filter only keeping the positive media correlation and correlation, I got 2 final formulas. And I decide to choose the one with the highest adjusted R Square 0.9163961.

The final model is:

sales ~ 1(0.5912092) + TVLag0Alpha0.7Decay1.0(0.2761494) + radioLag0Alpha0.9Decay1.0(0.2859621)+ NS_Lag0Alpha0.8Decay1.0(0.0009608111)

See Final.csv.

3.8 Get the model prediction results

By using the final model, we got the prediction results.

4.0 Model Findings (Actual vs. Model)

Now we finished the most complicated part of the MMM, which is to find a model that fits our sales.

By observing the graph, our model captures the actual sales pretty well.

Download the AVM.csv

Exhibit 15: Actual vs Model

5.0 Media Incremental Contribution

5.1 Contribution Calculation

Let’s see how each media lift the sales.

Get the contribution = coefficient(from model) * activity(from final.csv)

As we noticed that newspaper contributions are very very limited.

Exhibit 16: Contribution

Next step we will unpivot the contribution as well as the media activity.

Combine contribution, media activity, and spend.

See contribution_final.csv

Note:

  1. To make it easier to visualize in Tableau, I changed the ‘week 1’ to ‘01/06/2016’, and so on.
  2. Intercept activity can be set as 1, and spend is $0.
  3. Let’s assume the cost per activity is: TV $3000/activity, Radio $2000/activity, Newspaper $1000/activity.

5.1 Impact of different media channels

Media Contribution Impact % YOY

Exhibit 17: Media Contribution Impact % YOY

In this case, we can see that the intercept is the base sales(without media), the newspaper has a very limited contribution to the sales. Please note, this is demo data.

In the real world, I will highly suspect the TV contribution because the TV contribution is very influential to sales. Typically, I think 10% — 20% is normal.

5.1 Media Effectiveness & Media Efficiency ROI

The Media Effectiveness refers to the incremental volume or extra volume above the base volume gained from one unit of the activity being measured.

Media Effectiveness = Contribution(Incremental Volume)/Activity

The Media Efficiency, AKA ‘return on investment’, is the marketing spend required to generate one incremental unit.

ROI = Revenue(Contribution x Product Price ) / Spend

I assume the product price unit is $75,000, you may regard this product as a luxury car.

Exhibit 18: Media key Metrics
Exhibit 19: Media Effectiveness & ROI

Here, I can make a quick conclusion here TV made the most contribution(1,733), Radio has the best ROI(766%) and effectiveness(0.2042), however, the newspaper performed very poorly with only 3% ROI!

6.0 Budget Optimization Solution

And then the CMO and CGO pop up the question, how do we spend the money more wisely? Let’s solve it together just simply use the Excel Solver.

Exhibit 20: Optimized ROI
Exhibit 21: Radio — actual vs optimized spend

We will use Excel Solver to optimized the ROI. All the media transformations will still keep the same. By applying the Excel Solver, with the same media spend. We increased the original ROI from 1.94 to 2.30.

Therefore, we may apply the modeling for future media planning and sales forecast.

Solver condition:

  1. The objective is to maximize ROI
  2. We are going to change the media spend(optimized spending)
  3. Optimized spend no more than or less than 20% of the original media budget
  4. Spend total should keep the same as original

You may see Solver.xlsx

7.0 Avoiding common pitfalls

  1. Bad data in, bad data out: First, A good MMM requires sales and marketing data collected at similar frequencies. The common mistake is to record marketing efforts less frequently than sales. Second, sometimes the data itself is bad, it may conclude outliers or missing values. Third, media units may incompatible. For example, TV media will be counted as a Gross Rating Point (GRP) while not equivalent to digital world display ads which use impressions.
  2. Collinearity: As we mentioned in the previous chapter, it is important to check multicollinearity. Because it is common for advertisers to simultaneously increase/decrease/pause the amount of their marketing spends on multiple channels. For example, TV and radio spending at the same time and by a similar amount, this collinearity makes it difficult for MMM to determine whether TV or radio spending was more impactful on sales.
  3. Omitted Variable Bias: For example, when the price change of the product decrease or increase sales substantially, however, if we remove the price factor in the model, its impact may falsely be attributed to a simultaneous change that occurred in some marketing spending, leading to an incorrect conclusion about the advertising ROI and furthermore, may mislead the marketing strategy.

8.0 Summary

In this post, I showed you the basic knowledge of MMM with demo data. The real-world case can be more complicated.

As a marketing professional, it is extremely important to understand the marketing environment of the organization and help the marketing team to minimize the risk associated with unnecessary media spending.

The father of modern advertising John Wanamaker said “ Half the money I spend on advertising is wasted; the trouble is I don’t know which half.” Hopefully, you will be the person that tells CMO/CGO that which channel we should stop investing in and where we waste the media spending.

Let me know your thoughts below. If you like this post, please hit the claps button below, and don’t forget to share it on social media.

Check my multi-touch attribution article as well if you are interested. Cheers!

Marketing Mix
Marketing Mix Modeling
Media Mix Modeling
Media Mix Modelling
Media Mix
Recommended from ReadMedium