Poisson Regression

Definition

Poisson regression is a parametric regression method used when the target variable is a count.

It is used for outcomes such as:

Number of purchases.
Number of items.
Number of visits.
Number of events in a time period.

The target variable takes non-negative integer values:

\[Y \in \{0, 1, 2, 3, \ldots\}\]

Model Form

Poisson regression models the expected count:

\[E[Y \mid X=x] = \lambda(x)\]

The most common form uses a log link:

\[\log(\lambda(x)) = \beta_0 + \beta_1x_1 + \cdots + \beta_px_p\]

Equivalently:

\[\lambda(x) = e^{\beta_0 + \beta_1x_1 + \cdots + \beta_px_p}\]

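Because the exponential function is always positive, the predicted expected count $\lambda(x)$ is strictly positive for any coefficients and predictor values.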
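A minimal sketch of the log link in Python may make this concrete. The function name `expected_count` and the coefficient values are assumptions chosen for illustration, not part of the model definition.

```python
import math

def expected_count(beta0, betas, x):
    """Return lambda(x) = exp(beta0 + sum_j beta_j * x_j)."""
    eta = beta0 + sum(b * xj for b, xj in zip(betas, x))  # linear predictor
    return math.exp(eta)  # inverse of the log link

# Hypothetical coefficients, chosen only for illustration.
beta0, betas = -1.0, [0.5, -2.0]

lam_a = expected_count(beta0, betas, [1.0, 0.0])  # linear predictor is -0.5
lam_b = expected_count(beta0, betas, [0.0, 3.0])  # linear predictor is -7.0
```

Both linear predictors are negative, yet both expected counts are positive, because the exponential maps any real number to a positive value.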

Core Idea

Linear regression can predict negative counts, which is not valid.

Poisson regression avoids this by modeling the log of the expected count.

The model predicts:

\[\hat Y = \hat \lambda(x)\]

where $\hat \lambda(x)$ is the estimated expected count.

Poisson Distribution

Poisson regression assumes that the count variable follows a Poisson distribution conditional on $X$:

\[Y \mid X=x \sim \text{Poisson}(\lambda(x))\]

Given $X = x$, the probability of observing count $y$ is:

\[P(Y = y \mid X = x) = \frac{e^{-\lambda(x)}\lambda(x)^y}{y!}\]
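The probability mass function can be evaluated directly. The sketch below uses only the Python standard library. The helper name `poisson_pmf` and the value of `lam` are assumptions for illustration. It also checks numerically that the mean and variance of the distribution both equal $\lambda$.

```python
import math

def poisson_pmf(y, lam):
    """P(Y = y) for a Poisson variable with mean lam > 0."""
    # Work on the log scale to avoid overflow in lam**y and y!.
    return math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))

lam = 2.5  # hypothetical expected count
probs = [poisson_pmf(y, lam) for y in range(60)]  # tail beyond 59 is negligible here

total = sum(probs)
mean = sum(y * p for y, p in enumerate(probs))
var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
```

Here `total` is numerically 1, and `mean` and `var` both agree with `lam` up to floating-point error.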

Interpretation of Coefficients

A coefficient $\beta_j$ represents the change in the log expected count for a one-unit increase in $X_j$, holding the other predictors fixed.

Exponentiating gives the multiplicative effect:

\[e^{\beta_j}\]

If $e^{\beta_j} = 1.2$, then a one-unit increase in $X_j$ multiplies the expected count by $1.2$.

That means a 20% increase in expected count.
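The multiplicative effect does not depend on the starting value of $X_j$. The sketch below checks this for a single predictor. The intercept and coefficient are hypothetical, with the coefficient chosen so that $e^{\beta_j} = 1.2$.

```python
import math

beta0 = 0.3              # hypothetical intercept
beta_j = math.log(1.2)   # hypothetical coefficient, so exp(beta_j) = 1.2

def lam(xj):
    """Expected count for a model with one predictor."""
    return math.exp(beta0 + beta_j * xj)

# Ratio of expected counts after a one-unit increase, at several starting points.
ratios = [lam(x + 1) / lam(x) for x in (0.0, 2.0, 10.0)]

# Percentage change implied by the coefficient.
percent_change = (math.exp(beta_j) - 1) * 100
```

Every entry of `ratios` equals 1.2 up to floating-point error. A negative coefficient works the same way: $e^{\beta_j} < 1$ shrinks the expected count by the factor $e^{\beta_j}$.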

Example: Basket Item Count Prediction

Let:

$Y$ = final basket item count.
$X$ = current observed basket size.

A Poisson regression model could estimate:

\[E[\text{final item count} \mid \text{current item count}]\]

Because the final basket size is a count, Poisson regression may be more appropriate than ordinary linear regression.
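As a rough sketch, a one-predictor Poisson regression can be fitted by Newton-Raphson on the Poisson log-likelihood. The data below are made up for illustration, and the function name `fit_poisson` is an assumption. In practice a statistics library would normally be used instead of hand-written code.

```python
import math

def fit_poisson(xs, ys, iterations=25):
    """Fit log(lambda) = b0 + b1 * x by Newton-Raphson on the Poisson log-likelihood."""
    # Start from the intercept-only fit: b0 = log(mean(y)), b1 = 0.
    b0, b1 = math.log(sum(ys) / len(ys)), 0.0
    for _ in range(iterations):
        mus = [math.exp(b0 + b1 * x) for x in xs]
        # Gradient of the log-likelihood (the score).
        g0 = sum(y - m for y, m in zip(ys, mus))
        g1 = sum((y - m) * x for x, y, m in zip(xs, ys, mus))
        # Negative Hessian (Fisher information), a symmetric 2x2 matrix.
        h00 = sum(mus)
        h01 = sum(m * x for x, m in zip(xs, mus))
        h11 = sum(m * x * x for x, m in zip(xs, mus))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system H * delta = g.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < 1e-10:
            break
    return b0, b1

# Made-up data: current basket size and final item count.
current = [1, 2, 3, 4, 5, 6, 7, 8]
final = [2, 3, 3, 5, 6, 8, 9, 12]

b0, b1 = fit_poisson(current, final)
fitted = [math.exp(b0 + b1 * x) for x in current]
```

Every value in `fitted` is positive. Because the model has an intercept, the fitted counts sum to the observed counts at the maximum likelihood estimate.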

Example: Purchase Frequency

Let:

$Y$ = number of orders placed by a customer in the next 30 days.
$X$ = customer history features.

Poisson regression estimates:

\[E[\text{orders in next 30 days} \mid X]\]

This makes it useful for demand, frequency, and retention analysis.

Assumptions

The main assumptions are:

The target is a count.
Counts are independent given the predictors.
The log expected count is linear in the predictors.
The conditional mean and variance are equal (equidispersion), at least approximately in practice:

\[E[Y \mid X] \approx \mathrm{Var}(Y \mid X)\]

Overdispersion

Real data often has variance larger than the mean:

\[\mathrm{Var}(Y \mid X) > E[Y \mid X]\]

This is called overdispersion.
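One common rough check is the Pearson dispersion statistic: the Pearson chi-square divided by the residual degrees of freedom. Values near 1 are consistent with the Poisson assumption. Values well above 1 suggest overdispersion. The sketch below assumes fitted means are already available. The counts and fitted means are made up for illustration.

```python
def pearson_dispersion(ys, mus, n_params):
    """Pearson chi-square divided by residual degrees of freedom."""
    chi2 = sum((y - m) ** 2 / m for y, m in zip(ys, mus))
    return chi2 / (len(ys) - n_params)

# Made-up observed counts and fitted means from a hypothetical two-parameter model.
observed = [0, 1, 9, 2, 14, 0, 7, 3]
fitted_means = [2.0, 3.0, 4.0, 4.0, 6.0, 3.0, 5.0, 5.0]

phi = pearson_dispersion(observed, fitted_means, n_params=2)
```

For these made-up numbers `phi` is well above 1, which is the pattern overdispersed data would show.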

If overdispersion is strong, alternatives include:

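Quasi-Poisson regression, which keeps the Poisson mean model and estimates a dispersion parameter.
Negative binomial regression, which adds a parameter that lets the variance exceed the mean.
Zero-inflated models, when the excess variance comes from more zeros than a Poisson distribution predicts.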

Strengths

Natural for count data.
Predictions are non-negative.
Interpretable through multiplicative effects.
Useful for event rates and purchase counts.

Weaknesses

Can perform badly with overdispersion: standard errors become too small, so effects look more certain than they are.
Can underfit highly variable retail baskets.
Assumes a specific count distribution.
Sensitive to extreme counts.

Diagnostics

Useful checks include:

Residual deviance.
Mean vs variance comparison.
Predicted vs actual counts.
Overdispersion test.
Residual plots.
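The residual deviance and the deviance residuals can be computed from observed counts and fitted means. The sketch below uses the standard Poisson unit deviance $2\,[\,y \log(y/\mu) - (y - \mu)\,]$, with the log term taken as zero when $y = 0$. The function names and the data are assumptions for illustration.

```python
import math

def unit_deviance(y, mu):
    """Contribution of one observation to the Poisson deviance."""
    term = y * math.log(y / mu) if y > 0 else 0.0  # y*log(y/mu) is 0 when y = 0
    # Guard against tiny negative values caused by floating-point rounding.
    return max(2.0 * (term - (y - mu)), 0.0)

def deviance_residual(y, mu):
    """Signed square root of the unit deviance."""
    return math.copysign(math.sqrt(unit_deviance(y, mu)), y - mu)

# Made-up observed counts and fitted means.
observed = [0, 3, 5, 1]
fitted = [2.0, 3.0, 3.0, 1.5]

residuals = [deviance_residual(y, m) for y, m in zip(observed, fitted)]
residual_deviance = sum(r * r for r in residuals)
```

A residual is negative when the model over-predicts and positive when it under-predicts. Plotting `residuals` against the fitted means is one way to build the residual plots listed above.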

See

Parametric Regression

Logistic Regression

Exponential Regression
