Nonparametric Regression
Definition
Nonparametric regression estimates the relationship between input variables and a target variable without assuming a fixed global mathematical form such as a line or polynomial.
The main target is often the conditional mean:
\[m(x) = E[Y \mid X=x]\]A nonparametric regression method estimates this function directly from data:
\[\hat m(x) \approx E[Y \mid X=x]\]Main Idea
Instead of assuming a model like:
\[Y = \beta_0 + \beta_1X + \varepsilon\]nonparametric regression asks:
What did similar observations do in the past?
The prediction is usually based on nearby or similar data points.
Why It Is Called Nonparametric
It does not mean there are no parameters at all.
It means there is no fixed finite-dimensional parameter vector that fully determines the whole shape of the regression function.
Instead, model complexity can grow with the amount of data.
Common Examples
- Conditional Mean Estimation
- Kernel Regression
- k-NN Regression
- Local Smoothing
- Regression trees
- Random forests
- Splines
Example: Basket Size Prediction
Let:
- $X = k$ be the current observed basket size.
- $Y$ be the final basket size.
A nonparametric estimator may predict final basket size by averaging historical baskets that had similar current basket size:
\[\hat m(k) = \frac{1}{|S_k|}\sum_{i \in S_k}Y_i\]where $S_k$ is the set of similar historical baskets.
Strengths
- Flexible.
- Can capture nonlinear patterns.
- Useful when the true relationship is unknown.
- Good exploratory modeling tool.
Weaknesses
- Needs more data.
- Less interpretable than simple parametric models.
- Can be unstable in sparse regions.
- Sensitive to the definition of similarity.
- May perform badly for rare extreme cases.
Bias-Variance Tradeoff
Nonparametric methods are controlled by smoothing choices.
Too little smoothing:
- Low bias.
- High variance.
- Noisy predictions.
Too much smoothing:
- High bias.
- Low variance.
- Overly flat predictions.
Relation to Data Mining
Nonparametric regression is common in data mining because it can discover patterns without imposing a strict formula beforehand.
For retail data, it can be used for:
- Basket size prediction.
- Customer return probability.
- Demand smoothing.
- Recommendation systems.
Exercises
- Explain why averaging historical outcomes for similar baskets is nonparametric.
- Give one reason nonparametric regression may fail for very large baskets.
- Compare parametric and nonparametric regression for basket-size prediction.