Logistic Regression

Definition

Logistic regression is a parametric regression method used when the target variable is binary.

It models the probability that an observation belongs to class $1$:

\[P(Y = 1 \mid X = x)\]

Instead of predicting $Y$ directly, it predicts a probability between $0$ and $1$.

Model Form

For one input variable:

\[P(Y = 1 \mid X = x) = \frac{1}{1 + e^{-(\beta_0 + \beta_1x)}}\]

For multiple input variables:

\[P(Y = 1 \mid X = x) = \frac{1}{1 + e^{-(\beta_0 + \beta_1x_1 + \cdots + \beta_px_p)}}\]

Logit Form

Logistic regression can also be written as:

\[\log\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1x_1 + \cdots + \beta_px_p\]

where:

$p$ is the probability of class $1$.
$\frac{p}{1-p}$ is the odds.
$\log\left(\frac{p}{1-p}\right)$ is the log-odds.

Core Idea

Linear regression can predict values below $0$ or above $1$, which is not valid for probabilities.

Logistic regression fixes this by passing the linear score through the logistic function:

\[\sigma(z) = \frac{1}{1 + e^{-z}}\]

This maps any real number to the interval $(0, 1)$.

Prediction Rule

The model first predicts a probability:

\[\hat p = P(Y = 1 \mid X = x)\]

Then it can classify using a threshold, commonly $0.5$:

\[\hat Y = \begin{cases} 1 & \text{if } \hat p \geq 0.5 \\ 0 & \text{if } \hat p < 0.5 \end{cases}\]

Interpretation of Coefficients

A coefficient $\beta_j$ represents the change in log-odds for a one-unit increase in $X_j$.

Exponentiating the coefficient gives the odds ratio:

\[e^{\beta_j}\]

If $e^{\beta_j} > 1$, increasing $X_j$ increases the odds of class $1$.

If $e^{\beta_j} < 1$, increasing $X_j$ decreases the odds of class $1$.

Example: Customer Return Prediction

Let:

$Y = 1$ if a customer returns within 30 days.
$Y = 0$ otherwise.
$X$ = features such as basket value, item count, recency, and purchase frequency.

A logistic regression model estimates:

\[P(\text{return within 30 days} \mid X)\]

This makes it useful for churn, retention, and repeat-purchase analysis.

Relation to Conditional Mean

For binary $Y$, the conditional mean is equal to the probability of class $1$:

\[E[Y \mid X=x] = P(Y = 1 \mid X=x)\]

So logistic regression estimates a conditional mean, but forces it to stay between $0$ and $1$.

Strengths

Interpretable.
Good baseline for binary classification.
Produces probabilities.
Fast to train.
Works well when effects are approximately linear on the log-odds scale.

Weaknesses

Assumes a linear relationship in log-odds.
Can underfit nonlinear patterns.
Sensitive to highly correlated predictors.
Classification depends on the chosen threshold.

Diagnostics

Useful checks include:

Confusion matrix.
Accuracy.
Precision and recall.
ROC curve.
AUC.
Calibration plot.
Log loss.

Exercises

Fit a logistic regression model to predict whether a customer returns within 30 days.
Interpret one coefficient as an odds ratio.
Explain why logistic regression is more appropriate than linear regression for binary outcomes.

Logistic Regression

Definition

Model Form

Logit Form

Core Idea

Prediction Rule

Interpretation of Coefficients

Example: Customer Return Prediction

Relation to Conditional Mean

Strengths

Weaknesses

Diagnostics

Exercises

See

Parametric Regression

Linear Regression

Poisson Regression

Logistic Regression

Definition

Model Form

Logit Form

Core Idea

Prediction Rule

Interpretation of Coefficients

Example: Customer Return Prediction

Relation to Conditional Mean

Strengths

Weaknesses

Diagnostics

Exercises

See

Parametric Regression

Linear Regression

Poisson Regression

Sessions by Day

Productivity by Hour

Session Completion Rate

Time Spent by Task

Sessions by Day of Week

Session Duration Distribution