Sometimes, integrating different data sources is the only suitable solution to a shortage of microdata. Among the several data integration methodologies, Statistical Matching (SM) imputation makes it possible to integrate different datasets when the same records are not uniquely identifiable through the observed variables and/or beyond a modelled rescaling procedure from an observed sample. In particular, non-parametric micro SM imputation (“hot deck”) techniques allow researchers both to work exclusively with observed (real) data and to avoid model misspecification bias. Nevertheless, non-parametric methods still lack a proper theoretical formalisation and a sound methodology for evaluating imputation quality. Therefore, we propose new combinations of distance functions and “hot deck” techniques, analyse how they perform in different donor-recipient dataset scenarios, and elaborate a robust, recursive strategy for validating the imputation.
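To make the idea of pairing a distance function with a “hot deck” technique concrete, the following is a minimal sketch of an unconstrained nearest-neighbour distance hot deck, a standard variant of the technique. It is not the specific combination proposed in the paper. The function names (`nn_hot_deck`, `euclidean`, `manhattan`), the data layout, and the toy values are all assumptions made for illustration.

```python
import math


def manhattan(a, b):
    """Manhattan (city-block) distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nn_hot_deck(recipients, donors, distance=euclidean):
    """Impute each recipient with the target value of its closest donor.

    recipients: list of tuples holding the common matching variables.
    donors: list of (matching_variables, observed_target) pairs.
    Hypothetical helper: donors can be reused (unconstrained matching).
    """
    imputed = []
    for rec in recipients:
        # min() keeps the first minimum, so ties go to the earliest donor.
        _, target = min(donors, key=lambda d: distance(rec, d[0]))
        imputed.append(target)
    return imputed


# Toy data (illustrative only): two matching variables, one target variable.
donors = [((1.0, 2.0), 10.0), ((4.0, 6.0), 20.0), ((9.0, 9.0), 30.0)]
recipients = [(1.5, 2.5), (8.0, 8.0)]

imputed = nn_hot_deck(recipients, donors)
imputed_manhattan = nn_hot_deck(recipients, donors, distance=manhattan)
```

Because every imputed value is copied from an actual donor record, the recipient dataset contains only observed values, which is the property of hot deck methods highlighted above. Swapping the `distance` argument is one simple way to compare distance functions on the same donor-recipient pair.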
D'Alberto, R., Raggi, M. (2017). Non-parametric micro Statistical Matching techniques: some developments (Tecniche micro non-parametriche per Statistical Matching: alcuni sviluppi). Firenze University Press.
Non-parametric micro Statistical Matching techniques: some developments (Tecniche micro non-parametriche per Statistical Matching: alcuni sviluppi)
Riccardo D'Alberto; Meri Raggi (Supervision)
2017