Marketing Mix Modeling

Understand MMM

A statistical method to study the impact of marketing campaigns on different pre-defined KPI metrics, e.g., traffic, clicks, visits, conversions, sales. Broader use: to better understand factors that affect business KPI over time.
- Privacy-friendly (important: it does not need user-level data but instead runs off aggregated data, e.g. campaign-level data).
- Can overcome signal loss (whereas MTA depends on online signal).
- Holistic: a measurement solution for cross-channel measurements. Measure all channels together on one analysis. Works for both marketing (online and offline) and non-marketing activities (price, promotion, seasonality, distribution).
- Flexible: business type (gaming, e-com, digital). KPI types (revenue, sales, website activity, etc.)
- Help allocate marketing budget across different channels, products, regions –> forecast the impact of future campaigns.
It is not a new approach.
Every MMM model is more or less a regression.
- Xs: campaign spends+activities, an other control variables:
  - Macro-economic factors (GPD, unemployment rate)
  - Seasonal factors: holidays, events.
  - Competing variables: competitor’s spending.
  - Price changes or promotions.
  - External factor: Weather.
  - Group indication: region, city
- Y: ROI, KPIs.
  - revenue, conversion, customer acquisition.
- Usually: based on MLE.
- Overall Steps at a high level:
  - External factors and marketing inputs.
  - Impression and business metrics
  - MMM statistical regression analysis.
  - Data outputs.

Why bother?

Understands your past campaigns to make decisions in the future: budget allocation.
What was my ROI? How did a certain channel (e.g., Facebook) drive my revenue or other KPI of interest?

Meridian (Google)

Why Meridian?

Traditional MMMs: use standard regression –> MLE: need a lot of data to be stable. No info about uncertainty.
- Provides only point estimates of each coefficient.
Bayesian MMMs: Bayes Theorem + MCMC sampling to estimate coeff.
- Provides est. of distribution of coeff.
- Bayesian models work better when having less data or with missing values, or being too sparse.
  - Thanks to assumption about Prior.

Concepts

Adstock: models the effect of spend on sales being not instantaneous but accumulating over time.
Saturation: models the effect of spend onf sales being not linear but saturates at some point (i.e., diminishing returns).
Other features
Geo-level modeling: Meridian can perform hierarchical modeling.
- Meaning you can model multiple regions together.
- This is a good tradeoff between having separate models (unpooled) vs one single model that averages all regions (pooled).

Components

Y: signal
Xs:
- Adstock (the main x). It’s called Media variables.
  - with slope, Hill function, geometric decay rate.
- External control covariates (z).
- Intercept, Stochastic Intercept.
  - time-varying, to account for trend, seasonality.

Example use cases

Effect of different ad channel spending on weekly sales.
- pip install google-meridian
- Control variables: holidays.
- Can viz the contribution by baseline and marketing channels on top of baseline.
- Results can be sensitive to different model’s parameters.
- Can have ROI for each channel.
- Can use the result to optimize budget: it propose an optimal allocation of spends to maximize revenue.
  - Use response curves: they describe the relationship between spend and the resulting incremental revenue.
  - Use that to viz diminishing points. But Meridian offers optimizer for you!

Robyn by Meta

Advanced marketing mix modeling using Meta Open Source project Robyn.

MMM predicts outcomes (e.g., revenues, conversions) using historical time-series data.
MMM is aggregated on different dimensions (e.g., channels, campaigns, time units) and doesn’t require event-level data with PII.
MMM is supported by 2 other measurement methods from Meta: MTA, Lift. These 3 are contained in “360 Measurement Approach”. Use all 3 to calibrate each other and see a complete picture of “Incrementality”.
- MMM: useful for cross-media measurement (because it is resilient to changes in the ecosystem). Broad budget allocation. Annual / Semi-annual.
- Lift: use it to calibrate MMM to make it closer to incrementality. Ground truth ROI and channel optimization. Episodic.
- MTA: use it for real-time optimization. But, it will have gaps in data, which MMM can help. MMM can account for key business drivers and consider broad allocation. So MTA: Cross-publisher optimization. Always-on.

How Robyn addresses the challenges of classic MMM.

Traditional MMM	Robyn
Multicollinearity and feature selection require manual treatment	Regularization and multi-objective optimization to mitigate multicollinearity and assist feature selection
Manual trend, seasonality and dummy variables	Automated trend, seasonal and dummy decomposition with Prophet
Usually mandatory 2+ years data	<1 year data possible with daily time unit
Concave saturation only	Concave and Convex with Hill function
Single loss function	Multiple loss functions
Manual budget allocation	Automated budget allocation

The main features of Robyn.
- Variable transformations: Based on 2 core assumptions of MMM –> Adstock transformation + Saturation transformation.
- Trend and seasonality decomposition with Prophet: to reduce human bias in the modeling process.
- Mutl-objective hyperparameter optimization with Nevergrad: use 3 objective fn for hyper-opt: Prediction error, Business error, Calibration error.
- Calibration with causal experiments: i.e., compare with lift results and use the difference between it and the predicted media contribution to calibrate.
- Model one-pager: auto exports various graphical and tabular outputs. A one-pager per Pareto-optimal model is exported.
- Budget allocation: 2 options:
  - Maximum response: Given a total budget and media-level constraints, the allocator calculates the optimum cross-media budget split by maximizing the response. For example: “What’s the optimum media split if I have budget X?”
  - Target efficiency (ROAS or cost per acquisition): Given a target ROAS (return on ad spend) or cost per acquisition and media-level constraints, the allocator calculates the optimum cross-media budget split and the total budget by maximizing the response. For example: “What’s the optimum media split and how much budget do I need if I want to hit ROAS X?”

How Facebook Robyn works
An Analyst’s Guide to MMM
Announcing the Python version of Project Robyn. Code on PyPI. Code on GitHub
Introduction to MMMM using Robyn
- Detailed explanation with R code example.

Bayesian MMM

Bayesian MMM.
- Use Bayesian Statistics, esp. MCMC sampling which is flexible to incorporate prior domain knowledge. –> Can be complex. –> Hence, there are a few best practices to help with building a Bayesian Regression Model.
- Incremental Return on Marketing Investment (ROMI).
  - Diminishing Returns (Saturation Rates) of ROMI. Model using Hill Function.
- Adstock Rates (Delay / Carryover Effect) - Estimating this effect is a core challenge in MMM.
  - Use pdf of a negative binomial distr. to model the shape of the adstock effect.
  - Pdf is defined on non-neg integer –> good, since we can use days to index the number of day from the date of spend.
  - Its pdf can also accommodate a range of shapes that can represent our intuitions about how the effect of spend should be distribted over time.
  - Check our their R code on carryover effect.
- Seasonality and Holidays
- Use the Bayesian MCMC library Stan. R code in Colab. bayesian_mmm_r.ipynb
  - This creates a function for our adstocks, creates the parameters for our model, handles transformations for saturation and adstocks, then builds the model.

Alternative Tools

Pymc3, numpyro.
Robyn by Meta.

Tho Le

Marketing Mix Modeling

Understand MMM

Why bother?

Meridian (Google)

Why Meridian?

Concepts

Other features

Components

Example use cases

Robyn by Meta

Bayesian MMM

Alternative Tools

Ref

Related Posts