The Pearson and likelihood ratio statistics are commonly used to test goodness of fit for models applied to data from a multinomial distribution. When the data come from a table formed by cross-classifying a large number of variables, these statistics may have low power and an inaccurate Type I error level because the cells of the table are sparse. It has been proposed to assess model fit with a new version of the GFfit statistic, based on orthogonal components of the Pearson chi-square statistic, as a diagnostic for examining fit on two-way subtables. However, when variables have a large number of categories and the sample size is small, even the GFfit statistic may have low power and an inaccurate Type I error level because the two-way subtables themselves are sparse. In this paper, a method based on choosing different orthogonal components for the GFfit statistic on the subtables is developed to improve its performance. Simulation results for power and Type I error rate are presented for several cases, along with comparisons to other diagnostics.
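As a rough illustration of the two classical statistics named in the abstract, the sketch below computes the Pearson chi-square and likelihood ratio statistics for multinomial counts against a set of model cell probabilities. The function name `pearson_and_lr` and the example counts are hypothetical, chosen only for illustration; the GFfit statistic itself is not implemented here, since this record gives no formula for it.

```python
import math

def pearson_and_lr(observed, expected_probs):
    """Return (X2, G2) for multinomial counts against model cell probabilities."""
    n = sum(observed)
    x2 = 0.0
    g2 = 0.0
    for o, p in zip(observed, expected_probs):
        e = n * p  # expected count under the model
        x2 += (o - e) ** 2 / e
        if o > 0:  # an empty cell contributes 0 to G2 (limit of o * log(o / e))
            g2 += 2.0 * o * math.log(o / e)
    return x2, g2

# Hypothetical 2x2 subtable flattened to four cells, with equal model probabilities.
observed = [30, 20, 20, 30]
probs = [0.25, 0.25, 0.25, 0.25]
x2, g2 = pearson_and_lr(observed, probs)

# A sparse version of the same layout: few observations, two empty cells.
sparse_x2, sparse_g2 = pearson_and_lr([3, 0, 0, 1], probs)
```

With well-filled cells the two statistics are close to each other; in the sparse case they diverge, which is one symptom of the chi-square approximation breaking down that the abstract refers to.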
Title: | A Power Study of the GFfit Statistic as a Lack-of-Fit Diagnostic for Sparse Two-Way Subtables |
Author(s): | Junfei Zhu; Mark Reiser; Maduranga Dassanayake; Silvia Cagnone |
Year: | 2016 |
Book title: | 2016 JSM Proceedings: Papers Presented at the Joint Statistical Meetings, Chicago, Illinois, July 30-August 4, 2016, and Other ASA-Sponsored Conferences |
First page: | 2401 |
Last page: | 2410 |
Final status date: | 23 Mar 2017 |
Appears in types: | 4.01 Contribution in conference proceedings |