When is quantile regression worse than OLS?

by user14281   Last Updated May 01, 2018 08:19 AM

Apart from some unique circumstances where we absolutely must understand the conditional mean relationship, what are the situations where a researcher should pick OLS over Quantile Regression?

I don't want the answer to be "if there is no use in understanding the tail relationships", as we could just use median regression as the OLS substitute.



6 Answers


If you are interested in the mean, use OLS; if in the median, use quantile regression.

One big difference is that the mean is more affected by outliers and other extreme data. Sometimes, that is what you want. One example is if your dependent variable is the social capital in a neighborhood. The presence of a single person with a lot of social capital may be very important for the whole neighborhood.
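
To illustrate the point about outliers, here is a minimal sketch (not part of the original answer) that simulates data with a single extreme observation and compares an OLS fit with a median (0.5-quantile) regression fit using statsmodels; the data and variable names are purely illustrative.

```python
# Compare OLS and median regression on data with one extreme outlier.
import numpy as np
import statsmodels.api as sm
from statsmodels.regression.quantile_regression import QuantReg

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 100)
y = 2 + 3 * x + rng.normal(0, 1, 100)   # true coefficients: intercept 2, slope 3
y[0] += 100                             # a single extreme observation

X = sm.add_constant(x)
ols_fit = sm.OLS(y, X).fit()
med_fit = QuantReg(y, X).fit(q=0.5)

print("True coefficients:    [2, 3]")
print("OLS estimates:       ", ols_fit.params)   # shifted by the single outlier
print("Median reg estimates:", med_fit.params)   # largely unaffected
```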

Peter Flom
October 09, 2012 12:50 PM

  1. When there are outliers in our data set.
  2. When the distribution of the data is skewed.
  3. When we want a comprehensive view of the relationship between the predictors and the outcome of interest.
  4. When the goal is to construct reference ranges for an outcome (see the sketch after this list).
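
As a rough illustration of item 4, the sketch below (not part of the original answer) fits several conditional quantiles with statsmodels to form a reference range for a simulated outcome; the variables (`age`, `outcome`) and the heteroscedastic data-generating process are assumptions made up for the example.

```python
# Build a 5th-95th percentile reference range for an outcome as a function of age.
import numpy as np
import statsmodels.api as sm
from statsmodels.regression.quantile_regression import QuantReg

rng = np.random.default_rng(1)
age = rng.uniform(20, 70, 500)
outcome = 50 + 0.8 * age + rng.normal(0, 5 + 0.1 * age)  # spread grows with age

X = sm.add_constant(age)
x_new = np.array([[1.0, 40.0]])  # intercept plus age = 40

for q in (0.05, 0.5, 0.95):
    fit = QuantReg(outcome, X).fit(q=q)
    print(f"{int(q * 100)}th percentile of the outcome at age 40: "
          f"{fit.predict(x_new)[0]:.1f}")
```
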
Abolfazl
February 01, 2013 20:18 PM

There seems to be some confusion in the premise of the question. The second paragraph says, "we could just use median regression as the OLS substitute". Note that regressing the conditional median on X is (a form of) quantile regression.

If the error in the underlying data generating process is normally distributed (which can be assessed by checking if the residuals are normal), then the conditional mean equals the conditional median. Moreover, any quantile you may be interested in (e.g., the 95th percentile, or the 37th percentile), can be determined for a given point in the X dimension with standard OLS methods. The main appeal of quantile regression is that it is more robust than OLS. The downside is that if all assumptions are met, it will be less efficient (that is, you will need a larger sample size to achieve the same power / your estimates will be less precise).
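
A minimal sketch of this idea, assuming the errors really are normal: any conditional quantile can be recovered from the OLS fit as $X\hat\beta + z_q \hat\sigma$. The data and names below are illustrative, not from the answer.

```python
# Recover a conditional quantile from an OLS fit under normal errors.
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(2)
x = rng.uniform(0, 10, 1000)
y = 1 + 2 * x + rng.normal(0, 3, 1000)

X = sm.add_constant(x)
ols_fit = sm.OLS(y, X).fit()
sigma_hat = np.sqrt(ols_fit.scale)      # estimated residual standard deviation

x0 = np.array([1.0, 5.0])               # intercept plus x = 5
cond_mean = x0 @ ols_fit.params
q95 = cond_mean + stats.norm.ppf(0.95) * sigma_hat
print("Estimated conditional 95th percentile at x = 5:", round(q95, 2))
# True value: 1 + 2*5 + 1.645*3, roughly 15.9
```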

gung
February 01, 2013 21:00 PM

Both OLS and quantile regression (QR) are techniques for estimating the coefficient vector $\beta$ in a linear regression model $$ y = X\beta + \varepsilon $$ (for the case of QR see Koenker and Bassett (1978), p. 33, second paragraph).

For certain error distributions (e.g. those with heavy tails), the QR estimator $\hat\beta_{QR}$ is more efficient than the OLS estimator $\hat\beta_{OLS}$; recall that $\hat\beta_{OLS}$ is efficient only in the class of linear unbiased estimators. This is the main motivation of Koenker and Bassett (1978), who suggest using QR in place of OLS in a variety of settings. I think that for estimating any moment of the conditional distribution $P_Y(y|X)$ we should use whichever of $\hat\beta_{OLS}$ and $\hat\beta_{QR}$ is more efficient (please correct me if I am wrong).

Now to answer your question directly, QR is "worse" than OLS (and thus $\hat\beta_{OLS}$ should be preferred over $\hat\beta_{QR}$) when $\hat\beta_{OLS}$ is more efficient than $\hat\beta_{QR}$. One such example is when the error distribution is Normal.
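
A small simulation sketch of this point (illustrative, not from the original answer): compare the sampling variability of the OLS and median-regression slope estimates under normal errors and under heavy-tailed $t_2$ errors.

```python
# Compare the precision of OLS and median-regression slope estimates
# under normal and heavy-tailed errors.
import numpy as np
import statsmodels.api as sm
from statsmodels.regression.quantile_regression import QuantReg

rng = np.random.default_rng(3)
n, n_sim = 200, 200
x = rng.uniform(0, 10, n)
X = sm.add_constant(x)

for label, draw in [("normal errors", lambda: rng.normal(0, 1, n)),
                    ("t(2) errors  ", lambda: rng.standard_t(2, n))]:
    ols_slopes, qr_slopes = [], []
    for _ in range(n_sim):
        y = 1 + 2 * x + draw()
        ols_slopes.append(sm.OLS(y, X).fit().params[1])
        qr_slopes.append(QuantReg(y, X).fit(q=0.5).params[1])
    print(label,
          "SD of OLS slope:", round(np.std(ols_slopes), 3),
          "SD of QR slope:", round(np.std(qr_slopes), 3))
# Expect OLS to be more precise under normal errors and quantile regression
# to be more precise under the heavy-tailed t(2) errors.
```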

References:

  • Koenker, Roger, and Gilbert Bassett Jr. "Regression quantiles." Econometrica: Journal of the Econometric Society (1978): 33-50.
Richard Hardy
December 14, 2016 14:35 PM

Peter Flom gave a great and concise answer; I just want to expand on it. The most important part of the question is how to define "worse".

In order to define "worse", we need some metric, and the functions that measure how good or bad a fit is are called loss functions.

We can have different definitions of the loss function; no single definition is right or wrong, but different definitions satisfy different needs. Two well-known loss functions are squared loss and absolute value loss:

$$L_{sq}(y,\hat y)=\sum_i (y_i-\hat y_i)^2$$ $$L_{abs}(y,\hat y)=\sum_i |y_i-\hat y_i|$$

If we use squared loss as a measure of success, quantile regression will be worse than OLS. On the other hand, if we use absolute value loss, quantile regression will be better.
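
As a hedged illustration (not from the original answer), the sketch below evaluates an OLS fit and a median-regression fit under both loss functions defined above; in-sample, each fit wins under the loss it minimizes. The simulated data are purely illustrative.

```python
# Evaluate OLS and median-regression fits under squared and absolute loss.
import numpy as np
import statsmodels.api as sm
from statsmodels.regression.quantile_regression import QuantReg

rng = np.random.default_rng(4)
x = rng.uniform(0, 10, 300)
y = 2 + 3 * x + rng.standard_t(3, 300)   # mildly heavy-tailed errors

X = sm.add_constant(x)
ols_pred = sm.OLS(y, X).fit().predict(X)
med_pred = QuantReg(y, X).fit(q=0.5).predict(X)

sq_loss = lambda yhat: np.sum((y - yhat) ** 2)
abs_loss = lambda yhat: np.sum(np.abs(y - yhat))

print("Squared loss  - OLS:", round(sq_loss(ols_pred), 1),
      " median reg:", round(sq_loss(med_pred), 1))   # OLS wins in-sample
print("Absolute loss - OLS:", round(abs_loss(ols_pred), 1),
      " median reg:", round(abs_loss(med_pred), 1))  # median regression wins
```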

Which is what Peter Flom's answer says:

If you are interested in the mean, use OLS, if in the median, use quantile.

hxd1011
December 14, 2016 16:53 PM

What distributions other than the normal distribution can the random error term of the regression have if we want to apply quantile regression? Can we use the Weibull distribution?

s_haider
May 01, 2018 07:31 AM
