In this paper we propose the use of a Small Area Estimation (SAE) approach to obtain reliable estimates of the number of “recruits replacing employees leaving the firm” (substitute recruits – SR) and of “recruits filling new positions” (new recruits – NR). These estimates are useful for analysing the composition of the human capital required by Italian Industrial Districts, and subsequently for evaluating the ability of local economic systems to achieve a long-term competitive edge by expanding and/or renewing human capital. As we are interested in evaluating the dynamics of Industrial Districts compared with those of non-Districts, we have chosen to focus on the manufacturing sector. We use model-based estimators relying on “area level” models, because UNIONCAMERE, owing to privacy constraints, does not release individual data but only domain-level data on request. In order to specify a suitable SAE model, we first need to deal with the following problems. Firstly, our estimation target is the number of recruits. In this case, the standard Normal-Normal model does not hold, and so we need to use a small-area model for count data. Secondly, in order to increase the reliability of the estimates, we take account of the correlation between the sampling errors associated with the NR and SR estimators by adopting a multivariate SAE model that borrows strength not only from other areas but also from this correlation. As sample correlations between NR and SR are mainly negative, we suggest the use of an SAE model based on the multivariate Poisson-Log Normal distribution. Unlike other multivariate distributions for counts proposed in the literature, this distribution allows for unconstrained (both positive and negative) correlations. Finally, we must address the instability of the estimated variances of direct estimators. Usually, an approximately unbiased estimate of the variance of direct estimators is available.
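The claim that the Poisson-Log Normal distribution admits negative correlations between counts can be illustrated with a small simulation. What follows is only a sketch under stated assumptions, not the model fitted in the paper: the function names, the common log-scale mean and standard deviation, the latent correlation `rho`, and the sample size are all hypothetical choices made for illustration. Two correlated Normal log-rates are drawn, and each count is then drawn from a Poisson distribution with the corresponding rate.

```python
import math
import random


def poisson_draw(rate, rng):
    """Draw one Poisson variate with Knuth's multiplication method."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_bivariate_pln(mean_log, sd_log, rho, n, seed=0):
    """Simulate n pairs of counts from a bivariate Poisson-Log Normal.

    Hypothetical illustration: both components share the same log-scale
    mean and standard deviation; rho is the latent (log-scale) correlation.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        # Correlated Normal log-rates built from two independent normals.
        log_rate_1 = mean_log + sd_log * z1
        log_rate_2 = mean_log + sd_log * (rho * z1 + math.sqrt(1 - rho**2) * z2)
        pairs.append((poisson_draw(math.exp(log_rate_1), rng),
                      poisson_draw(math.exp(log_rate_2), rng)))
    return pairs


def pearson(pairs):
    """Sample Pearson correlation of a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)


pairs = simulate_bivariate_pln(mean_log=1.0, sd_log=math.sqrt(0.5), rho=-0.8, n=5000)
sample_corr = pearson(pairs)
print(f"sample correlation between the two counts: {sample_corr:.3f}")
```

With a negative latent correlation the simulated counts are themselves negatively correlated, although the Poisson noise makes the correlation between counts weaker in absolute value than `rho`.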
In SAE modelling, the variance of direct estimators is generally assumed to be known and equal to its estimate. With large samples this assumption is common and largely accepted, whereas with small samples the variance estimate is unstable, as are the direct point estimates. Linked to this, a further problem arises from the rarity of recruitment in certain domains. When direct estimates of SR or of NR (or of both) are equal to zero, the estimated sampling variances and covariances are also equal to zero. Note that an estimated variance of zero does not necessarily imply a high degree of accuracy of the estimate. This problem is known in the literature, but no solution has been proposed as yet. Some studies (that of the National Research Council, 2000, for example) adopt a logarithmic transformation of the direct estimates of the mean (or total) of the count data in order to use a linear SAE model: these studies simply discard the estimates equal to zero. In this way the problem of “zero variance” is avoided, although this solution can lead to biased estimates of the model parameters and discards part of the sample in a setting where the sample size is already insufficient. To deal with both unstable estimated variances and estimated variances equal to zero, we propose smoothing the covariance matrix, again on the basis of the Poisson-Log Normal distribution, following a Generalized Variance Function approach. We adopt a fully hierarchical Bayesian approach. In this framework, relatively complex models, such as the multivariate one, can be easily implemented; furthermore, posterior distributions can be approximated using MCMC algorithms. Moreover, computing small-area multivariate estimators and, above all, estimates of their MSE could prove difficult within a frequentist context.
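The “zero variance” problem described above can be seen in a small numerical sketch. The sampling design of the underlying survey is not described here, so the example assumes simple random sampling without replacement within a domain and uses the textbook expansion estimator of a total together with its usual variance estimator; the function name and the data are hypothetical.

```python
from statistics import mean, variance


def direct_total_and_variance(sample_counts, population_size):
    """Expansion estimator of a domain total and its estimated variance.

    Assumes simple random sampling without replacement (an assumption of
    this sketch, not a description of the actual survey design).
    """
    n = len(sample_counts)
    fpc = 1 - n / population_size  # finite population correction
    total_hat = population_size * mean(sample_counts)
    var_hat = population_size**2 * fpc * variance(sample_counts) / n
    return total_hat, var_hat


# Hypothetical domain of 40 firms, 5 of which are sampled.
rare = direct_total_and_variance([0, 0, 0, 0, 0], population_size=40)
common = direct_total_and_variance([0, 2, 1, 0, 3], population_size=40)

print("no recruits observed:  ", rare)
print("some recruits observed:", common)
```

When no sampled firm reports a recruit, both the direct estimate and its estimated variance are zero, even though five firms out of forty say little about the other thirty-five. Treating that zero as a known sampling variance would overstate the accuracy of the estimate.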
M.R. Ferrante, C. Trivisano (2006). Estimating firms’ labour demand forecasts using small area models for count data. MILANO : FRANCO ANGELI.
Estimating firms’ labour demand forecasts using small area models for count data
FERRANTE, MARIA;TRIVISANO, CARLO
2006