Next: Bibliography Up: Appendix: Robust Estimation Methodology Previous: Robust Overdispersed Binomial Model


Background

The classical linear regression model for $ n$ observations and $ k$ regressors has the form $ y_i = x'_i\beta + \epsilon_i$, $ i=1,\dots,n$, where $ \beta$ is a vector of unknown coefficients, the data $ y_i$ and $ x_i$ are observed, and the unobserved disturbance $ \epsilon_i$ has conditional mean $ E(\epsilon_i \mid x_i)=0$ and variance $ \sigma^2$. We seek parameter estimates $ \hat{\beta}$ that converge in probability to $ \beta$ as $ n$ gets large; that is, we want consistent estimates.$ ^{27}$ Least squares (LS) chooses $ \hat{\beta}$ to minimize the sum of the squared residuals $ r_i=y_i-x'_i\hat{\beta}$ over all $ i=1,\dots,n$:

$\displaystyle \hat{\beta}_{\text{LS}} = \operatornamewithlimits{argmin}_{\hat{\beta}} \sum_{i=1}^{n} r^{2}_i .$ (10)

Because even one contaminated data point can cause $ \hat{\beta}_{\text{LS}}$ to take values arbitrarily different from $ \beta$, LS has a breakdown point of $ 1/n$ (asymptotically, 0). LS is not robust.
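The breakdown behaviour of LS can be illustrated with a small numerical sketch. The data, the single-regressor setup and the helper name `ls_fit` are all illustrative assumptions, not part of the original analysis; the point is only that changing one observation moves the LS slope arbitrarily far from the true value.

```python
def ls_fit(x, y):
    """Closed-form least-squares (intercept, slope) for a single regressor."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


# Hypothetical data lying exactly on the line y = 1 + 2x.
x = [float(i) for i in range(10)]
y = [1.0 + 2.0 * xi for xi in x]
clean = ls_fit(x, y)

# Contaminate a single observation; making it larger moves the slope further.
y_bad = list(y)
y_bad[-1] = 1.0e6
contaminated = ls_fit(x, y_bad)

print("clean fit (intercept, slope):       ", clean)
print("contaminated fit (intercept, slope):", contaminated)
```

Because the contaminating value can be made as large as desired, the distance between the LS estimate and the true slope is unbounded, which is what a breakdown point of $ 1/n$ means.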

For the regression model the maximum possible breakdown point is, asymptotically, 0.5. One popular estimator that achieves that maximum is least median of squares (LMS):

$\displaystyle \hat{\beta}_{\text{LMS}} = \operatornamewithlimits{argmin}_{\hat{\beta}} \operatornamewithlimits{med}_i r^{2}_i$ (11)

(Rousseeuw and Leroy, 1987; Rousseeuw, 1984).$ ^{28}$ LMS has two important defects. First, although $ \hat{\beta}_{\text{LMS}}$ is consistent for $ \beta$, the estimator converges at the slow rate of $ n^{-1/3}$. Second, LMS is inefficient when the disturbance $ \epsilon_i$ is an independently and identically distributed Gaussian random variable (i.e., there are no $ \epsilon_i$ outliers) and the model is otherwise correctly specified. One way to achieve greater efficiency is to use LMS estimates as starting values for a redescending $ M$-estimator (Hampel et al., 1981).
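As a rough illustration of the LMS criterion in equation (11), the sketch below evaluates the median of squared residuals for a line with one regressor and searches over the lines through every pair of observations. This pairwise search is a common approximation and is an assumption here, not necessarily the algorithm used in the cited works; the names `lms_objective` and `lms_fit_pairs` and the data are hypothetical.

```python
from itertools import combinations
from statistics import median


def lms_objective(beta, x, y):
    """Median of squared residuals for the line y = b0 + b1 * x."""
    b0, b1 = beta
    return median([(yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)])


def lms_fit_pairs(x, y):
    """Approximate LMS: try the line through each pair of points and keep
    the candidate with the smallest median squared residual."""
    best, best_val = None, float("inf")
    for i, j in combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue  # a vertical line is not a valid candidate
        b1 = (y[j] - y[i]) / (x[j] - x[i])
        b0 = y[i] - b1 * x[i]
        val = lms_objective((b0, b1), x, y)
        if val < best_val:
            best, best_val = (b0, b1), val
    return best, best_val


# Hypothetical data: y = 1 + 2x with 3 of the 10 observations contaminated.
x = [float(i) for i in range(10)]
y = [1.0 + 2.0 * xi for xi in x]
for i in (7, 8, 9):
    y[i] = 100.0

beta_lms, val_lms = lms_fit_pairs(x, y)
print("approximate LMS (intercept, slope):", beta_lms)
```

With 30 percent of the observations contaminated, the candidate through any two uncontaminated points still has a median squared residual of zero, so the search recovers the uncontaminated line.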

Other estimators achieve the maximum breakdown point while having an $ n^{-1/2}$ convergence rate and better Gaussian efficiency than LMS. The LQD estimator (Croux et al., 1994) is defined by choosing $ \hat{\beta}$ to minimize what is approximately the first quartile of the absolute differences between pairs of residuals. Let

$\displaystyle Q_{n} = \{ \vert r_i - r_j \vert : i< j \} _{\binom{h_{k}}{2}:\binom{n}{2}}$ (12)

denote the $ \binom{h_{k}}{2}$-th order statistic of the set $ \{ \vert r_i - r_j \vert : i< j \}$ of $ \binom{n}{2}$ absolute differences between pairs of residuals. The LQD estimator is

$\displaystyle \hat{\beta}_{\text{LQD}} = \operatornamewithlimits{argmin}_{\hat{\beta}} Q_{n} .$ (13)

For the linear regression model, LQD does not estimate the intercept term, because the differences $ r_i-r_j$ do not depend on the overall mean. If an intercept is needed, it must be estimated separately.$ ^{29}$ For the remaining elements of $ \beta$, LQD is consistent with an $ n^{-1/2}$ convergence rate and a Gaussian efficiency of 67.1% (Croux et al., 1994). In addition to the efficiency gain over LMS, LQD provides a superior estimate of the scale (i.e., $ \sigma$) when $ \epsilon_i$ has an asymmetric distribution, because LQD does not estimate the scale by measuring a symmetric spread of the residuals around a central location value (Rousseeuw and Croux, 1993).
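The objective $ Q_n$ in equation (12) and its invariance to the intercept can be sketched as follows. The text here does not define $ h_k$; the code assumes $ h_k = \lfloor (n+k+1)/2 \rfloor$, which should be checked against Croux et al. (1994) before use. The function names and data are hypothetical.

```python
from itertools import combinations
from math import comb


def lqd_objective(residuals, k):
    """Q_n: the comb(h_k, 2)-th smallest absolute pairwise residual difference."""
    n = len(residuals)
    h = (n + k + 1) // 2  # ASSUMPTION: h_k = floor((n + k + 1) / 2)
    rank = comb(h, 2)     # 1-based rank of the order statistic
    diffs = sorted(abs(a - b) for a, b in combinations(residuals, 2))
    return diffs[rank - 1]


def slope_residuals(slope, x, y, intercept=0.0):
    """Residuals of the line y = intercept + slope * x."""
    return [yi - intercept - slope * xi for xi, yi in zip(x, y)]


# Hypothetical data: y = 1 + 2x with 3 of the 10 observations contaminated.
x = [float(i) for i in range(10)]
y = [1.0 + 2.0 * xi for xi in x]
for i in (7, 8, 9):
    y[i] = 100.0

q_true = lqd_objective(slope_residuals(2.0, x, y), k=2)  # at the true slope
q_flat = lqd_objective(slope_residuals(0.0, x, y), k=2)  # at a wrong slope

# Changing the intercept shifts every residual by the same constant, so the
# pairwise differences, and hence Q_n, are unchanged.
q_shift = lqd_objective(slope_residuals(2.0, x, y, intercept=5.0), k=2)

print(q_true, q_flat, q_shift)
```

The last comparison shows why the intercept cannot be recovered from $ Q_n$: any intercept value gives the same objective for a given slope.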

The LQD objective function is difficult to optimize. Because high breakdown point estimators are not smooth functions of the data, optimization techniques that are based solely on derivative information, such as Newton-Raphson, are highly unreliable (Stromberg, 1993). In general, high breakdown point objective functions have multiple minima, so local optimization techniques are not reliable on their own. In our application, however, the derivatives do appear to contain useful local hill-climbing information. We therefore use a global optimizer that makes use of derivative information: GENetic Optimization Using Derivatives (GENOUD) (Sekhon and Mebane, 1998). GENOUD combines evolutionary algorithm methods with a derivative-based, quasi-Newton method to solve difficult unconstrained optimization problems.$ ^{30}$
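The general idea of pairing an evolutionary search with derivative-based local steps can be sketched on a toy problem. The code below is not GENOUD and does not reproduce its operators or its quasi-Newton step: it substitutes plain gradient steps with a numerical gradient, and the objective `toy`, the function names and all tuning values are assumptions chosen for illustration.

```python
import random


def num_grad(f, x, h=1e-6):
    """Central-difference approximation to the gradient of f at x."""
    grad = []
    for i in range(len(x)):
        up = x[:i] + [x[i] + h] + x[i + 1:]
        dn = x[:i] + [x[i] - h] + x[i + 1:]
        grad.append((f(up) - f(dn)) / (2.0 * h))
    return grad


def local_refine(f, x, steps=30, lr=0.01, max_move=0.1):
    """Gradient steps with a capped move; a step is kept only if f decreases."""
    fx = f(x)
    for _ in range(steps):
        g = num_grad(f, x)
        cand = [xi + max(-max_move, min(max_move, -lr * gi))
                for xi, gi in zip(x, g)]
        fc = f(cand)
        if fc < fx:
            x, fx = cand, fc
        else:
            lr *= 0.5  # shrink the step when no improvement is found
    return x, fx


def hybrid_minimize(f, bounds, pop_size=20, generations=15, sigma=0.5, seed=0):
    """Toy hybrid: refine every member locally, keep the better half,
    and refill the population with Gaussian mutations of the survivors."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scored = []
    for _ in range(generations):
        scored = sorted((local_refine(f, x) for x in pop), key=lambda t: t[1])
        survivors = [x for x, _ in scored[: pop_size // 2]]
        children = [[xi + rng.gauss(0.0, sigma) for xi in x] for x in survivors]
        pop = survivors + children
    return scored[0]


def toy(x):
    """Smooth objective with a local minimum near x = 1 and a lower,
    global minimum near x = -1."""
    return (x[0] ** 2 - 1.0) ** 2 + 0.3 * x[0]


best_x, best_f = hybrid_minimize(toy, [(-3.0, 3.0)])
print("best point:", best_x, "objective:", best_f)
```

A purely local method started in the right-hand basin would stop at the inferior minimum near $ x=1$; the population lets some members start in the other basin, and the derivative steps then sharpen whichever candidates survive selection.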


Jasjeet S. Sekhon 2001-03-04