Polynomial Regression
Definition
Polynomial regression is a parametric regression method that models the target variable as a polynomial function of the input variable.
For degree $d$:
\[Y = \beta_0 + \beta_1X + \beta_2X^2 + \cdots + \beta_dX^d + \varepsilon\]
Core Idea
Polynomial regression extends linear regression by adding powers of the input variable.
For example, a quadratic model is:
\[Y = \beta_0 + \beta_1X + \beta_2X^2 + \varepsilon\]
This allows the fitted curve to bend.
Why It Is Still Parametric
Although the fitted curve can be nonlinear in $X$, the model still has a fixed number of parameters, $d + 1$ in total:
\[\beta_0, \beta_1, \beta_2, \ldots, \beta_d\]
Once the degree $d$ is chosen, the model structure is fixed.
Prediction Function
The fitted model is:
\[\hat Y = \hat\beta_0 + \hat\beta_1X + \hat\beta_2X^2 + \cdots + \hat\beta_dX^d\]
Example
A quadratic model for final basket size could be:
\[\hat Y = \hat\beta_0 + \hat\beta_1k + \hat\beta_2k^2\]
where:
- $k$ is the current observed basket size.
- $\hat Y$ is the predicted final basket size.
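As a minimal sketch of how the quadratic prediction function is evaluated, the snippet below plugs a current basket size $k$ into $\hat\beta_0 + \hat\beta_1k + \hat\beta_2k^2$. The coefficient values and the function name `predict_final_basket_size` are hypothetical, chosen only for illustration; they are not estimates from any real data.

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Hypothetical fitted coefficients (illustrative only, not estimated from data),
# ordered by increasing power: [beta0_hat, beta1_hat, beta2_hat].
beta_hat = np.array([2.0, 1.5, -0.05])

def predict_final_basket_size(k, beta_hat):
    """Evaluate beta0_hat + beta1_hat*k + beta2_hat*k^2 for basket size(s) k."""
    k = np.asarray(k, dtype=float)
    # polyval expects coefficients in increasing-degree order.
    return P.polyval(k, beta_hat)

# Predictions for a single basket size and for several at once.
print(predict_final_basket_size(4, beta_hat))
print(predict_final_basket_size([0, 10], beta_hat))
```

The same function works for any degree, because the length of `beta_hat` determines how many powers of $k$ are used.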
Degree of the Polynomial
The degree $d$ controls model flexibility: a polynomial of degree $d$ can have at most $d - 1$ turning points.
| Degree | Model | Shape |
|---|---|---|
| 1 | Linear | Straight line |
| 2 | Quadratic | One bend |
| 3 | Cubic | Up to two bends |
| Higher | High-degree polynomial | Very flexible, but prone to overfitting |
Overfitting Risk
A high-degree polynomial can fit the training data very closely, including its noise, yet predict poorly on new data.
This is overfitting.
Signs of overfitting:
- Very wavy fitted curve.
- Low training error but high test error.
- Extreme predictions outside the observed range.
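The second sign in the list above can be checked directly by fitting several degrees and comparing errors on training data and held-out data. The sketch below uses synthetic data (a quadratic trend plus noise, with an arbitrary seed and sample sizes), so the numbers it prints are illustrative only. For least-squares fits, the training error cannot increase as the degree grows, because each lower-degree model is a special case of the higher-degree one; the test error has no such guarantee and often rises for high degrees.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    """Synthetic data: quadratic trend plus Gaussian noise (assumed for illustration)."""
    x = rng.uniform(-1.0, 1.0, size=n)
    y = 1.0 + 2.0 * x - 1.5 * x**2 + rng.normal(scale=0.3, size=n)
    return x, y

x_train, y_train = make_data(15)   # small training set
x_test, y_test = make_data(200)    # larger held-out set

def mse(x, y, beta_hat):
    """Mean squared error of the polynomial with coefficients beta_hat."""
    design = np.vander(x, N=len(beta_hat), increasing=True)
    return float(np.mean((y - design @ beta_hat) ** 2))

results = {}
for degree in (1, 2, 3, 10):
    # Columns of the design matrix are 1, x, x^2, ..., x^degree.
    design = np.vander(x_train, N=degree + 1, increasing=True)
    beta_hat, *_ = np.linalg.lstsq(design, y_train, rcond=None)
    results[degree] = (mse(x_train, y_train, beta_hat),
                       mse(x_test, y_test, beta_hat))
    print(degree, results[degree])  # (train MSE, test MSE)
```

A large gap between the two numbers for a given degree suggests that the model is fitting noise rather than the underlying trend.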
Relation to Linear Regression
Polynomial regression is linear in the parameters even though it is nonlinear in $X$.
For example:
\[Y = \beta_0 + \beta_1X + \beta_2X^2 + \varepsilon\]
is linear in:
\[\beta_0, \beta_1, \beta_2\]
Therefore, it can be fitted using ordinary least squares.
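To illustrate why linearity in the parameters matters, the sketch below treats $1, X, X^2, \ldots, X^d$ as separate columns of a design matrix and solves an ordinary least-squares problem with NumPy. The helper name `fit_polynomial_ols` and the data are assumptions made for this example; the data are noise-free and generated from $Y = 1 + 2X + 3X^2$, so the fit should recover those coefficients up to floating-point error.

```python
import numpy as np

def fit_polynomial_ols(x, y, degree):
    """Return OLS estimates [beta0_hat, ..., betad_hat] for a degree-d polynomial."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Design matrix with columns 1, x, x^2, ..., x^degree.
    design = np.vander(x, N=degree + 1, increasing=True)
    # Least-squares solution of design @ beta = y.
    beta_hat, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta_hat

# Synthetic, noise-free data from Y = 1 + 2X + 3X^2 (made up for illustration).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 1.0 + 2.0 * x + 3.0 * x**2

beta_hat = fit_polynomial_ols(x, y, degree=2)
print(beta_hat)
```

Nothing in the fitting step is specific to polynomials: the powers of $X$ are simply extra predictors in a linear regression.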
Strengths
- More flexible than simple linear regression.
- Still easy to fit.
- Useful when the relationship has smooth curvature.
Weaknesses
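- Can overfit, especially at high degrees.
- Can behave erratically near the edges of the observed range and when extrapolating beyond it.
- Sensitive to outliers, because least squares penalizes large residuals heavily.
- Degree choice is subjective unless it is validated, for example on held-out data or by cross-validation.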
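One common way to make the degree choice less subjective is k-fold cross-validation: fit each candidate degree on part of the data and score it on the part left out. The following is a minimal sketch under stated assumptions, not a definitive procedure. The helper `cv_mse`, the number of folds, the candidate degrees, and the synthetic quadratic data are all choices made for this example.

```python
import numpy as np

def cv_mse(x, y, degree, n_folds=5, seed=0):
    """Average held-out mean squared error of a degree-`degree` OLS polynomial fit."""
    rng = np.random.default_rng(seed)
    # Shuffle the indices once and split them into n_folds groups.
    folds = np.array_split(rng.permutation(len(x)), n_folds)
    errors = []
    for held_out in folds:
        train = np.setdiff1d(np.arange(len(x)), held_out)
        design = np.vander(x[train], N=degree + 1, increasing=True)
        beta_hat, *_ = np.linalg.lstsq(design, y[train], rcond=None)
        pred = np.vander(x[held_out], N=degree + 1, increasing=True) @ beta_hat
        errors.append(np.mean((y[held_out] - pred) ** 2))
    return float(np.mean(errors))

# Synthetic data: quadratic trend plus noise (assumed for illustration).
rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, size=60)
y = 1.0 + 2.0 * x - 1.5 * x**2 + rng.normal(scale=0.3, size=60)

# Score candidate degrees 1..6 and pick the one with the lowest CV error.
scores = {d: cv_mse(x, y, d) for d in range(1, 7)}
best_degree = min(scores, key=scores.get)
print(scores)
print(best_degree)
```

Using the same `seed` for every degree keeps the folds identical across candidates, so the scores are compared on the same splits.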
Example: Prediction Error Curve
If a basket-size predictor performs well for small and medium baskets but underestimates large baskets, a polynomial model may capture some curvature better than a straight line.
However, if the large baskets belong to a different population, polynomial regression alone may not solve the problem.
Exercises
- Fit polynomial regressions of degree 1, 2, and 3.
- Compare train error and test error.
- Explain why a high-degree polynomial may not be reliable for extreme basket sizes.