Tho Le

A Data Scientist. Looking for knowledge!

Marketing Mix Modeling

07 Feb 2025 » marketing, models, retails

Understand MMM

  • A statistical method to study the impact of marketing campaigns on different pre-defined KPI metrics, e.g., traffic, clicks, visits, conversions, sales. Broader use: to better understand factors that affect business KPI over time.
    • Privacy-friendly (important: it does not need user-level data but instead runs off aggregated data, e.g. campaign-level data).
    • Can overcome signal loss (whereas MTA depends on online signal).
    • Holistic: a measurement solution for cross-channel measurements. Measure all channels together on one analysis. Works for both marketing (online and offline) and non-marketing activities (price, promotion, seasonality, distribution).
    • Flexible: business type (gaming, e-com, digital). KPI types (revenue, sales, website activity, etc.)
    • Help allocate marketing budget across different channels, products, regions –> forecast the impact of future campaigns.
  • It is not a new approach.
  • Every MMM model is more or less a regression.
    • Xs: campaign spends+activities, an other control variables:
      • Macro-economic factors (GPD, unemployment rate)
      • Seasonal factors: holidays, events.
      • Competing variables: competitor’s spending.
      • Price changes or promotions.
      • External factor: Weather.
      • Group indication: region, city
    • Y: ROI, KPIs.
      • revenue, conversion, customer acquisition.
    • Usually: based on MLE.
    • Overall Steps at a high level:
      • External factors and marketing inputs.
      • Impression and business metrics
      • MMM statistical regression analysis.
      • Data outputs.

Why bother?

  • Understands your past campaigns to make decisions in the future: budget allocation.
  • What was my ROI? How did a certain channel (e.g., Facebook) drive my revenue or other KPI of interest?

Meridian (Google)

Why Meridian?

  • Traditional MMMs: use standard regression –> MLE: need a lot of data to be stable. No info about uncertainty.
    • Provides only point estimates of each coefficient.
  • Bayesian MMMs: Bayes Theorem + MCMC sampling to estimate coeff.
    • Provides est. of distribution of coeff.
    • Bayesian models work better when having less data or with missing values, or being too sparse.
      • Thanks to assumption about Prior.

Concepts

  • Adstock: models the effect of spend on sales being not instantaneous but accumulating over time.
  • Saturation: models the effect of spend onf sales being not linear but saturates at some point (i.e., diminishing returns).

    Other features

  • Geo-level modeling: Meridian can perform hierarchical modeling.
    • Meaning you can model multiple regions together.
    • This is a good tradeoff between having separate models (unpooled) vs one single model that averages all regions (pooled).

Components

  • Y: signal
  • Xs:
    • Adstock (the main x). It’s called Media variables.
      • with slope, Hill function, geometric decay rate.
    • External control covariates (z).
    • Intercept, Stochastic Intercept.
      • time-varying, to account for trend, seasonality.

Example use cases

  • Effect of different ad channel spending on weekly sales.
    • pip install google-meridian
    • Control variables: holidays.
    • Can viz the contribution by baseline and marketing channels on top of baseline.
    • Results can be sensitive to different model’s parameters.
    • Can have ROI for each channel.
    • Can use the result to optimize budget: it propose an optimal allocation of spends to maximize revenue.
      • Use response curves: they describe the relationship between spend and the resulting incremental revenue.
      • Use that to viz diminishing points. But Meridian offers optimizer for you!

Robyn by Meta

  • Advanced marketing mix modeling using Meta Open Source project Robyn.
    • MMM predicts outcomes (e.g., revenues, conversions) using historical time-series data.
    • MMM is aggregated on different dimensions (e.g., channels, campaigns, time units) and doesn’t require event-level data with PII.
    • MMM is supported by 2 other measurement methods from Meta: MTA, Lift. These 3 are contained in “360 Measurement Approach”. Use all 3 to calibrate each other and see a complete picture of “Incrementality”.
      • MMM: useful for cross-media measurement (because it is resilient to changes in the ecosystem). Broad budget allocation. Annual / Semi-annual.
      • Lift: use it to calibrate MMM to make it closer to incrementality. Ground truth ROI and channel optimization. Episodic.
      • MTA: use it for real-time optimization. But, it will have gaps in data, which MMM can help. MMM can account for key business drivers and consider broad allocation. So MTA: Cross-publisher optimization. Always-on.
    • How Robyn addresses the challenges of classic MMM.

      Traditional MMMRobyn
      Multicollinearity and feature selection require manual treatmentRegularization and multi-objective optimization to mitigate multicollinearity and assist feature selection
      Manual trend, seasonality and dummy variablesAutomated trend, seasonal and dummy decomposition with Prophet
      Usually mandatory 2+ years data<1 year data possible with daily time unit
      Concave saturation onlyConcave and Convex with Hill function
      Single loss functionMultiple loss functions
      Manual budget allocationAutomated budget allocation
    • The main features of Robyn.
      • Variable transformations: Based on 2 core assumptions of MMM –> Adstock transformation + Saturation transformation.
      • Trend and seasonality decomposition with Prophet: to reduce human bias in the modeling process.
      • Mutl-objective hyperparameter optimization with Nevergrad: use 3 objective fn for hyper-opt: Prediction error, Business error, Calibration error.
      • Calibration with causal experiments: i.e., compare with lift results and use the difference between it and the predicted media contribution to calibrate.
      • Model one-pager: auto exports various graphical and tabular outputs. A one-pager per Pareto-optimal model is exported.
      • Budget allocation: 2 options:
        • Maximum response: Given a total budget and media-level constraints, the allocator calculates the optimum cross-media budget split by maximizing the response. For example: “What’s the optimum media split if I have budget X?”
        • Target efficiency (ROAS or cost per acquisition): Given a target ROAS (return on ad spend) or cost per acquisition and media-level constraints, the allocator calculates the optimum cross-media budget split and the total budget by maximizing the response. For example: “What’s the optimum media split and how much budget do I need if I want to hit ROAS X?”
  • How Facebook Robyn works
  • An Analyst’s Guide to MMM
  • Announcing the Python version of Project Robyn. Code on PyPI. Code on GitHub
  • Introduction to MMMM using Robyn
    • Detailed explanation with R code example.

Bayesian MMM

  • Bayesian MMM.
    • Use Bayesian Statistics, esp. MCMC sampling which is flexible to incorporate prior domain knowledge. –> Can be complex. –> Hence, there are a few best practices to help with building a Bayesian Regression Model.
    • Incremental Return on Marketing Investment (ROMI).
      • Diminishing Returns (Saturation Rates) of ROMI. Model using Hill Function.
    • Adstock Rates (Delay / Carryover Effect) - Estimating this effect is a core challenge in MMM.
      • Use pdf of a negative binomial distr. to model the shape of the adstock effect.
      • Pdf is defined on non-neg integer –> good, since we can use days to index the number of day from the date of spend.
      • Its pdf can also accommodate a range of shapes that can represent our intuitions about how the effect of spend should be distribted over time.
      • Check our their R code on carryover effect.
    • Seasonality and Holidays
    • Use the Bayesian MCMC library Stan. R code in Colab. bayesian_mmm_r.ipynb
      • This creates a function for our adstocks, creates the parameters for our model, handles transformations for saturation and adstocks, then builds the model.

Alternative Tools

  • Pymc3, numpyro.
  • Robyn by Meta.

Ref