Abstract
Missing data occur in almost all surveys and may create serious problems, because restricting the analysis to complete cases leads to a loss of precision and to invalid inferences. Hence, missing data are commonly treated by imputation, that is, they are filled in with plausible values. In previous work we proposed a copula-based imputation method that accounts for both the (complex) dependence structure underlying the data and the shape of the margins. The method employs the conditional density functions of the missing variables given the observed ones. These functions are derived analytically once parametric models for the margins and the copula are specified. In this paper, we extend our method in a semiparametric fashion, in that the margins are estimated nonparametrically through local likelihood methods. We compare the performance of the two versions of the imputation method in terms of the preservation of both the dependence structure and the microdata in different simulated scenarios, varying the copula, the marginal distributions and the level of the dependence parameter. The method has a wide range of applicability and has been implemented in the R package CoImp.
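
As an illustration of the conditional densities mentioned above, the following sketch shows how such a density follows from a copula representation of the joint density. The notation is assumed here for illustration and is not taken from the paper: X = (X_1, ..., X_d) is absolutely continuous with marginal distribution functions F_j, marginal densities f_j and copula density c; O and M denote the index sets of the observed and missing components; and c_O is the density of the marginal copula of the observed components.

```latex
% Joint density via Sklar's theorem (absolutely continuous case)
\[
  f(x_1,\dots,x_d) \;=\; c\bigl(F_1(x_1),\dots,F_d(x_d)\bigr)\,\prod_{j=1}^{d} f_j(x_j).
\]
% Density of the observed components, using the marginal copula density c_O
\[
  f_O(x_O) \;=\; c_O\bigl(F_j(x_j),\, j \in O\bigr)\,\prod_{j \in O} f_j(x_j).
\]
% Conditional density of the missing components given the observed ones:
% the observed marginal densities cancel in the ratio f / f_O
\[
  f(x_M \mid x_O) \;=\; \frac{f(x_1,\dots,x_d)}{f_O(x_O)}
  \;=\; \frac{c\bigl(F_1(x_1),\dots,F_d(x_d)\bigr)}{c_O\bigl(F_j(x_j),\, j \in O\bigr)}\,
  \prod_{j \in M} f_j(x_j).
\]
```

Under these assumptions, the conditional density depends on the data only through the copula and the margins. This is consistent with the idea that a parametric or a nonparametric estimate of the margins can be plugged in without changing the rest of the construction.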