
Linking Scales in Item Response Theory with Covariates

Valentina Sansivieri
2019

Abstract

When test forms are administered to non-equivalent groups of examinees and scored using item response theory (IRT), the item parameters estimated separately for the two groups must be placed on the same scale. IRT models that include covariates about the examinees have two parameters, which model uniform and non-uniform differential item functioning (DIF), and these must also be placed on the same scale. The aim of this study is to propose conversion equations for putting the uniform and non-uniform DIF parameters on the same scale. The coefficients of the conversion equations are estimated with four methods: mean/mean, mean/sigma, Haebara, and Stocking-Lord. We present a simulation study and an empirical example. The simulation study shows that the coefficients of the conversion equations are essentially equal for the Haebara and Stocking-Lord methods, while they differ for the other methods. The empirical example shows that IRT with covariates produces a more informative test than IRT without covariates at high ability values, and that the mean/mean and mean/sigma methods yield more informative tests than concurrent calibration.
Valentina Sansivieri; Marie Wiberg

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: http://hdl.handle.net/11585/706053