Conditional Expectation
Definition
Conditional expectation is the expected value of a random variable after some information is known.
The most important form for regression is:
\[E[Y \mid X=x]\]This means:
the expected value of $Y$ among observations where $X=x$.
Interpretation
Ordinary expected value answers:
\[E[Y]\]what is the average value of $Y$ overall?
Conditional expectation answers:
\[E[Y \mid X=x]\]what is the average value of $Y$ among cases where the input value is $x$?
Discrete Formula
If $Y$ is discrete:
\[E[Y \mid X=x] = \sum_y y P(Y=y \mid X=x)\]This is an average of possible $Y$ values, weighted by their conditional probabilities.
Regression Function
The function:
\[m(x) = E[Y \mid X=x]\]is called the regression function or conditional mean function.
A regression model tries to estimate this function from data:
\[\hat m(x) \approx E[Y \mid X=x]\]Basket Size Example
Let:
- $X$ = current observed basket size.
- $Y$ = final basket size.
Then:
\[E[Y \mid X=3]\]means:
the expected final basket size, given that the current basket has 3 observed items.
A simple estimator is:
\[\hat m(3) = \frac{1}{|S_3|}\sum_{i \in S_3}Y_i\]where $S_3$ is the set of historical baskets with current size $3$.
Binary Case
If $Y$ is binary, then:
\[Y \in \{0,1\}\]and:
\[E[Y \mid X=x] = P(Y=1 \mid X=x)\]So probability prediction is a special case of conditional expectation.
Relation to Regression
Regression is mainly the problem of estimating:
\[E[Y \mid X=x]\]Different regression methods estimate it in different ways.
Examples:
- Linear Regression assumes a straight-line form.
- Polynomial Regression assumes a polynomial form.
- Logistic Regression estimates a conditional probability.
- Conditional Mean Estimation estimates it by averaging similar observations.
Exercises
- Explain the difference between $E[Y]$ and $E[Y \mid X=x]$.
- Why is conditional expectation useful for prediction?
- For binary $Y$, explain why $E[Y \mid X=x]$ equals $P(Y=1 \mid X=x)$.