General Linear Model Specification Error Test with Missing Data

Document Type : Original Paper

Authors

1 Department of Statistics and Computer Sciences, University of Mohaghegh Ardabili, Ardabil, Iran

2 Department of Statistics and Computer Sciences, University of Mohaghegh Ardabili, Ardabil, Iran

3 Department of Statistics, Shahid Beheshti University, Tehran, Iran

Abstract

In this paper, we consider a general linear model in which missing data may occur in the response and covariate variables. We propose a new test, based on Ramsey's test, to assess the goodness of fit of the general linear model with missing data. We show that, under the null hypothesis, our test statistics for complete-case analysis follow a Fisher distribution, while the test statistic for available-data analysis converges in distribution to a quasi-Fisher distribution. Furthermore, we compare the proposed test statistics through simulation studies and apply our methods to a real data set.
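For orientation, the complete-case idea can be illustrated with the classical Ramsey RESET statistic computed only on rows with no missing values. This is a minimal sketch and not the test proposed in the paper. The function name `reset_f_complete_case`, the use of `NaN` to mark missing entries, and the choice of squared and cubed fitted values as the added regressors are all assumptions made for this example.

```python
import numpy as np

def reset_f_complete_case(y, X, max_power=3):
    """Classical RESET F statistic on complete cases (illustrative only).

    y: (n,) response; X: (n,) or (n, p) covariates without an intercept.
    NaN marks a missing value (an assumption of this sketch).
    Returns (F, df1, df2).
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]

    # Complete-case analysis: drop any row with a missing y or covariate.
    keep = ~np.isnan(y) & ~np.isnan(X).any(axis=1)
    y, X = y[keep], X[keep]
    n = y.shape[0]

    # Restricted (null) model: intercept plus the original covariates.
    Z = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    fitted = Z @ beta
    rss0 = np.sum((y - fitted) ** 2)

    # Augmented model: add powers 2..max_power of the fitted values.
    powers = np.column_stack([fitted ** j for j in range(2, max_power + 1)])
    Za = np.column_stack([Z, powers])
    beta_a, *_ = np.linalg.lstsq(Za, y, rcond=None)
    rss1 = np.sum((y - Za @ beta_a) ** 2)

    df1 = powers.shape[1]      # number of added regressors
    df2 = n - Za.shape[1]      # residual df of the augmented model
    F = ((rss0 - rss1) / df1) / (rss1 / df2)
    return F, df1, df2

# Simulated data (not from the paper): the true mean is quadratic in x,
# and the first 20 covariate values are set to missing.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y_quad = 1 + 2 * x + 1.5 * x ** 2 + rng.normal(size=200)
x_miss = x.copy()
x_miss[:20] = np.nan
F_quad, df1, df2 = reset_f_complete_case(y_quad, x_miss)
```

With normal errors and a correctly specified linear mean, the classical statistic is compared with an F distribution on `(df1, df2)` degrees of freedom. Whether that reference distribution remains valid after deleting incomplete rows depends on the missingness mechanism, which this sketch does not model.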


[1] Madsen, H. and Thyregod, P. (2010). Introduction to general and generalized linear models. CRC Press.
[2] Ramsey, J. B. (1969). Tests for specification errors in classical linear least-squares regression analysis. Journal of the Royal Statistical Society, Series B, 31, 350-371.
[3] Griffith, D. A. and Chun, Y. (2016). Evaluating eigenvector spatial filter corrections for omitted georeference variables. Econometrics, 21, 1-12.
[4] Shukur, G. and Mantalos, P. (2004). Size and power of the RESET test as applied to systems of equations: A bootstrap approach. Journal of Modern Applied Statistical Methods, 3, 370-385.
[5] Sapra, S. (2005). A regression error specification test (RESET) for generalized linear model. Economics Bulletin, 3, 1-6.
[6] Rubin, D. B. (1976). Inference and missing data. Biometrika, 63, 581-592.
[7] Little, R. J. A. and Rubin, D. B. (2002). Statistical analysis with missing data. Second Edition, Wiley-Interscience, New York.
[8] Basilevsky, A., Sabourin, D., Hum, D. and Anderson, A. (1985). Missing data estimators in the general linear model: an evaluation of simulated data as an experimental design. Communications in Statistics-Simulation and Computation, 14(2), 371-394.
[9] Little, R. J. A. (1992). Regression with missing X's: A review. Journal of the American Statistical Association, 87, 1227-1237.
[10] Wang, S. and Wang, C.Y. (2001). A note on kernel assisted estimators in missing covariate regression. Statistics and Probability Letters, 55, 439-449.
[11] Hardle, W. and Mammen, E. (1993). Comparing nonparametric versus parametric regression fits. Annals of Statistics, 21, 1921-1947.
[12] Hardle, W., Mammen, E. and Muller, M. (1998). Testing parametric versus semiparametric modeling in generalized linear models. Journal of the American Statistical Association, 93(444), 1461-1474.
[13] Zhu, L.X. and Cui, H.J. (2005). Testing lack-of-fit for general linear errors in variables models. Statistica Sinica, 15, 1049-1068.
[14] Guo, X. and Xu, W. (2012). Goodness-of-fit tests for general linear models with covariates missed at random. Journal of Statistical Planning and Inference, 142, 2047-2058.
[15] Li, X. (2012). Lack-of-fit testing of a regression model with response missing at random. Journal of Statistical Planning and Inference, 142(1), 155-170.
[16] Zhao, L. P. and Lipsitz, S. (1992). Design and analysis of two-stage studies. Statistics in Medicine, 11, 769-782.
[17] Carpenter, J. R. and Kenward, M. G. (2006). A comparison of multiple imputation and doubly robust estimation for analyses with missing data. Journal of the Royal Statistical Society, 169, 571-584.
[18] Chambers, J. M., Cleveland, W. S., Kleiner, B. and Tukey, P. A. (1983). Graphical methods for data analysis. Belmont, CA: Wadsworth.