Exponential Regression
Definition
Exponential regression is a parametric regression method used when the target changes multiplicatively rather than additively.
A common model form is:
\[Y = ae^{bX}\]
where:
- $a$ controls the starting scale.
- $b$ controls the growth or decay rate.
- $e$ is Euler’s number.
Model Form
The basic exponential regression model is:
\[Y = ae^{bX} + \varepsilon\]
For prediction:
\[\hat Y = \hat a e^{\hat b X}\]
If $b > 0$, the relationship shows exponential growth.
If $b < 0$, the relationship shows exponential decay.
Log-Linear Form
Taking natural logs of the noise-free relationship $Y = ae^{bX}$ gives:
\[\log(Y) = \log(a) + bX\]
This transforms the exponential model into a linear model on the log scale.
Let:
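\[\alpha = \log(a)\]
Then:
\[\log(Y) = \alpha + bX\]
So exponential regression is often fitted by applying Linear Regression to the transformed target $\log(Y)$. Fitting this way treats the error as additive on the log scale, which corresponds to a multiplicative error in $Y$ rather than the additive $\varepsilon$ written above.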
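The log-linear fit can be sketched in a few lines of standard-library Python. This is a minimal illustration, not a reference implementation: the function name `fit_exponential` and the noise-free data generated from $a = 2.0$, $b = 0.3$ are assumptions made for the example.

```python
import math

def fit_exponential(xs, ys):
    """Fit Y = a * exp(b * X) by ordinary least squares on log(Y).

    Returns (a_hat, b_hat). Every y must be strictly positive.
    """
    if any(y <= 0 for y in ys):
        raise ValueError("all target values must be positive to take logs")
    log_ys = [math.log(y) for y in ys]
    n = len(xs)
    x_mean = sum(xs) / n
    z_mean = sum(log_ys) / n
    # Slope and intercept of the least-squares line log(Y) = alpha + b * X.
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxz = sum((x - x_mean) * (z - z_mean) for x, z in zip(xs, log_ys))
    b_hat = sxz / sxx
    alpha_hat = z_mean - b_hat * x_mean
    # Back-transform the intercept: a = exp(alpha).
    return math.exp(alpha_hat), b_hat

# Illustrative, noise-free data generated from a = 2.0 and b = 0.3.
xs = [0, 1, 2, 3, 4, 5]
ys = [2.0 * math.exp(0.3 * x) for x in xs]
a_hat, b_hat = fit_exponential(xs, ys)
print(f"a_hat = {a_hat:.4f}, b_hat = {b_hat:.4f}")
```

Because the example data contain no noise, the fit recovers the generating parameters up to floating-point error. With real data the estimates would only approximate them.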
Core Idea
Linear regression assumes additive change:
\[Y = \beta_0 + \beta_1X\]
Exponential regression assumes multiplicative change:
\[Y = ae^{bX}\]
This means each unit increase in $X$ multiplies $Y$ by a constant factor.
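The contrast between additive and multiplicative change can be checked numerically. The parameter values below are hypothetical and chosen only to make the pattern visible: consecutive differences are constant under the linear model, while consecutive ratios are constant under the exponential model.

```python
import math

# Hypothetical parameters, chosen only for illustration.
beta0, beta1 = 10.0, 3.0   # linear model Y = beta0 + beta1 * X
a, b = 10.0, 0.2           # exponential model Y = a * exp(b * X)

xs = range(5)
linear = [beta0 + beta1 * x for x in xs]
exponential = [a * math.exp(b * x) for x in xs]

# Differences between consecutive linear values are constant (beta1).
linear_steps = [y1 - y0 for y0, y1 in zip(linear, linear[1:])]
# Ratios between consecutive exponential values are constant (exp(b)).
exp_ratios = [y1 / y0 for y0, y1 in zip(exponential, exponential[1:])]

print(linear_steps)
print([round(r, 4) for r in exp_ratios])
```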
Interpretation of Coefficients
In the model:
\[Y = ae^{bX}\]
an increase of one unit in $X$ multiplies the expected value of $Y$ by:
\[e^b\]
If $e^b = 1.10$, then each unit increase in $X$ is associated with a 10% increase in $Y$.
If $e^b = 0.90$, then each unit increase in $X$ is associated with a 10% decrease in $Y$.
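Converting a fitted rate $b$ into a percentage change is a one-line calculation, $(e^b - 1) \times 100$. The helper name `percent_change_per_unit` is an assumption introduced for this sketch.

```python
import math

def percent_change_per_unit(b):
    """Percentage change in Y for a one-unit increase in X, given rate b."""
    return (math.exp(b) - 1.0) * 100.0

# The two cases from the text: e^b = 1.10 and e^b = 0.90.
print(round(percent_change_per_unit(math.log(1.10)), 2))
print(round(percent_change_per_unit(math.log(0.90)), 2))
# A small rate, where the percentage change is close to 100 * b.
print(round(percent_change_per_unit(0.05), 2))
```

For small $|b|$ the percentage change is close to $100b$, which is why log-scale coefficients are often read directly as approximate percentages.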
Example: Retail Basket Value
Retail basket values are often strongly right-skewed.
A log transformation can make the distribution easier to model:
\[\log(\text{basket value})\]
A model such as:
\[\log(Y) = \alpha + bX\]
means that $X$ has a multiplicative effect on the original basket value: each unit increase in $X$ multiplies it by $e^b$.
Example: Demand Growth or Decay
Let:
- $Y$ = number of units sold.
- $X$ = time.
An exponential model can represent fast growth or decay:
\[\hat Y = \hat a e^{\hat b X}\]
This can be useful for products with rapidly increasing or decreasing demand.
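A short sketch of how such a model might be used for forecasting. The fitted values below are hypothetical, not taken from any dataset, and the helpers `forecast_units` and `time_to_scale` are names introduced here. The second helper follows from solving $e^{bT} = \text{factor}$ for $T$, which gives the doubling time under growth and the half-life under decay.

```python
import math

def forecast_units(a, b, time):
    """Predicted units sold at a given time under Y_hat = a * exp(b * time)."""
    return a * math.exp(b * time)

def time_to_scale(b, factor):
    """Time needed for the prediction to change by `factor` (e.g. 2 or 0.5)."""
    return math.log(factor) / b

# Hypothetical fitted values, not estimated from real data.
growth = {"a": 100.0, "b": 0.10}
decay = {"a": 100.0, "b": -0.10}

for week in (0, 4, 8):
    up = forecast_units(**growth, time=week)
    down = forecast_units(**decay, time=week)
    print(week, round(up, 1), round(down, 1))

print(round(time_to_scale(growth["b"], 2.0), 2))  # doubling time
print(round(time_to_scale(decay["b"], 0.5), 2))   # half-life
```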
Relation to Lognormal Data
If:
\[\log(Y)\]
is approximately normally distributed, then $Y$ is approximately lognormal.
Many financial and retail variables behave this way, especially transaction values.
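The link between lognormal data and the log transform can be illustrated with simulated values. The parameters (log-scale mean 3.0, standard deviation 0.8) are assumptions for the sketch, not estimates from real transactions.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the illustration is repeatable

# Simulated values whose logs are normal with mean 3.0 and std dev 0.8.
values = [random.lognormvariate(3.0, 0.8) for _ in range(10_000)]
logs = [math.log(v) for v in values]

# On the raw scale the mean sits well above the median (right skew).
print(round(statistics.mean(values), 2), round(statistics.median(values), 2))
# On the log scale the mean and median nearly coincide (roughly symmetric).
print(round(statistics.mean(logs), 2), round(statistics.median(logs), 2))
```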
Strengths
- Useful for multiplicative relationships.
- Handles positive skew better than raw-scale linear regression.
- Easy to fit using a log transformation.
- Coefficients have percentage-change interpretations.
Weaknesses
- Requires positive target values.
- Breaks down on zeros, since $\log(0)$ is undefined, and workarounds such as adding a small constant can distort the fit.
- Can underpredict or overpredict extreme values after back-transformation.
- Assumes a constant multiplicative effect.
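The back-transformation issue can be seen in the simplest possible case, a model with no predictor. Averaging on the log scale and then exponentiating yields the geometric mean, which never exceeds the arithmetic mean, so a naive back-transform tends to sit below the raw-scale average for skewed data. The basket values below are made up for illustration.

```python
import math
import statistics

# Hypothetical right-skewed basket values with one large outlier.
values = [5.0, 8.0, 12.0, 20.0, 150.0]

# An intercept-only fit on the log scale predicts the mean of log(Y).
log_mean = statistics.mean([math.log(v) for v in values])

back_transformed = math.exp(log_mean)  # geometric mean of the values
raw_mean = statistics.mean(values)     # arithmetic mean of the values

print(round(back_transformed, 2), round(raw_mean, 2))
```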
Important Warning
Do not apply a log transform to values that can be zero or negative.
For example, this is invalid when $Y \leq 0$:
\[\log(Y)\]
In retail data, returns and cancellations may produce negative transaction values, so they must be handled separately before exponential regression.
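One way to guard against this is to separate the rows before taking logs. This is a sketch under assumptions: the helper `split_for_log` and the sample transactions are hypothetical, and what to do with the non-positive rows (model them separately, drop them, or something else) depends on the analysis.

```python
import math

def split_for_log(values):
    """Separate strictly positive values from zeros and negatives."""
    positive = [v for v in values if v > 0]
    non_positive = [v for v in values if v <= 0]
    return positive, non_positive

# Hypothetical transaction values; the negative one stands in for a return.
transactions = [25.0, 0.0, 13.5, -40.0, 8.25]
positive, non_positive = split_for_log(transactions)

# Safe: every value in `positive` is greater than zero.
log_values = [math.log(v) for v in positive]
print(len(log_values), non_positive)
```

In Python, calling `math.log` on zero or a negative number raises `ValueError`, so filtering first also avoids a runtime failure.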
Diagnostics
Useful checks include:
- Histogram of $Y$.
- Histogram of $\log(Y)$.
- Residual plot on the log scale.
- Predicted vs actual values.
- Error after back-transformation.
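The numeric side of these checks can be sketched without a plotting library. The observations and the fitted parameters ($a = 10$, $b = 0.5$) below are assumptions for illustration. The residuals would normally be plotted against $X$ or the fitted values to look for patterns.

```python
import math

def log_residuals(ys, preds):
    """Residuals on the log scale: log(actual) - log(predicted)."""
    return [math.log(y) - math.log(p) for y, p in zip(ys, preds)]

def rmse(ys, preds):
    """Root mean squared error on the original scale."""
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys))

# Hypothetical observations and an assumed fitted model (a = 10, b = 0.5).
xs = [0, 1, 2, 3, 4]
ys = [11.0, 15.5, 28.0, 43.0, 80.0]
preds = [10.0 * math.exp(0.5 * x) for x in xs]

# Predicted vs actual values, with the log-scale residual for each row.
for x, y, p, r in zip(xs, ys, preds, log_residuals(ys, preds)):
    print(x, y, round(p, 2), round(r, 3))

# Error after back-transformation, measured on the original scale.
print(round(rmse(ys, preds), 2))
```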
Exercises
- Plot basket values before and after applying $\log_{10}$.
- Fit a linear model using $\log(\text{basket value})$ as the target.
- Explain why exponential regression may be better than raw linear regression for basket value.