Combining two estimates applied in a survey of copyright volumes at higher educational institutions in Norway using the bootstrap

Publication details

This survey is designed to estimate several types of copyright volume at different educational institutions in Norway. Applying a calibration factor, equal to the ratio of the true number of machine pages taken from all machines at an institution to the corresponding estimated number, makes the estimates less biased and less variable. There are two reasonable estimates. In addition to suggesting a bootstrap procedure for selecting one of them, we propose a weighted average of the two. The weight is estimated through bootstrapping by minimizing the mean square error or the variance coefficient of the combined estimate, summed over all types of copyright material. We expect more weight to be assigned to the estimate with the smaller mean square error or variance coefficient. It is, however, not straightforward to analyze the bias and variance of the estimates theoretically: such an analysis would require a simultaneous model for the number of machine pages and the number of original pages taken by each person. Analyzing the data by bootstrapping gave significantly better performance for the combined estimate than for the best of the two estimates chosen by bootstrap selection. However, setting the weights equal to 0.5 gave the best performance overall.
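The weight selection described above can be illustrated with a minimal sketch. It is not the authors' procedure: the function name `bootstrap_weight`, the toy estimators (column mean and column median), the simulated Poisson page counts, and the use of a known reference value in the mean square error are all assumptions made for illustration. The sketch resamples persons with replacement, computes both estimates on each replicate, and picks the weight on a grid that minimizes the mean square error of the combined estimate summed over material types.

```python
import numpy as np


def bootstrap_weight(est1, est2, data, reference, n_boot=500, grid=None, seed=0):
    """Pick the weight w for w*est1 + (1-w)*est2 by bootstrap (hypothetical helper).

    est1, est2 : functions mapping a (n_persons, n_types) array to a
                 length-n_types vector of estimates.
    reference  : length-n_types vector used as the target in the MSE
                 (an assumption; the source does not specify it).
    Returns the selected weight and the summed MSE for every grid value.
    """
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.0, 1.0, 21) if grid is None else np.asarray(grid, dtype=float)
    reference = np.asarray(reference, dtype=float)
    n = len(data)

    reps1 = np.empty((n_boot, reference.size))
    reps2 = np.empty((n_boot, reference.size))
    for b in range(n_boot):
        # Resample persons (rows) with replacement.
        sample = data[rng.integers(0, n, size=n)]
        reps1[b] = est1(sample)
        reps2[b] = est2(sample)

    # Combined estimates, shape (n_weights, n_boot, n_types).
    w = grid[:, None, None]
    combined = w * reps1 + (1.0 - w) * reps2

    # MSE over bootstrap replicates, then summed over material types.
    mse = ((combined - reference) ** 2).mean(axis=1).sum(axis=1)
    return float(grid[np.argmin(mse)]), mse


# Toy data: 200 persons, 3 material types (simulated, not survey data).
true_means = np.array([20.0, 5.0, 2.0])
pages = np.random.default_rng(1).poisson(lam=true_means, size=(200, 3)).astype(float)

w_hat, mse_curve = bootstrap_weight(
    lambda s: s.mean(axis=0),        # stand-in for the first estimate
    lambda s: np.median(s, axis=0),  # stand-in for the second estimate
    pages,
    reference=true_means,
)
print("selected weight:", w_hat)
print("summed MSE at w = 0.5:", mse_curve[10])
```

Because the default grid contains 0, 0.5 and 1, the same `mse_curve` also lets one compare the selected weight against choosing a single estimate (w = 0 or w = 1) and against the fixed weight 0.5 mentioned above. Replacing the squared error with a variance coefficient criterion would only change the line that computes `mse`.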