Durham Research Online
Confidence intervals, missing data and imputation: a salutary illustration.

Gorard, S. (2014) 'Confidence intervals, missing data and imputation: a salutary illustration', International Journal of Research in Educational Methodology, 5 (3), pp. 693-698.


This paper confirms that confidence intervals (CIs) are not a generally useful measure or estimate of anything in practice. CIs are recursive in definition and reversed in logic, meaning that they are widely misunderstood. Perhaps most importantly, they should not be used with cases that do not form a complete and true random sample from a known population; the latter is a key premise underlying their calculation. This means that, whatever their merits, CIs should not be used in the vast majority of real-life social science analyses. The second part of the paper illustrates the dangers of ignoring this premise, perhaps on some purported pragmatic grounds. Using 100 simulations of a sample of 100 integers from a uniform population with members in the range 0 to 9, it shows that CIs are very misleading as soon as there is deviation from randomness. For example, when 5% of the cases in each sample are deleted, a reported 95% CI would be no better than a 66% CI in reality. If the 10% of cases with the lowest scores are replaced with the achieved mean for the sample, then a reported 95% CI would be more like a 43% CI in reality. In addition, the simulation shows that the mean and standard deviation (SD) for any sample are correlated (an issue of linked scale). This illustrates that using the sample SD as an estimate for the SD of the sampling distribution, in order to assess whether the sample mean is close to the mean of the sampling distribution, will simply make matters worse. The best and only available estimate of the sampling distribution mean, in practice, is the sample mean.
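
The kind of simulation described in the abstract can be sketched roughly as follows. This is an illustrative reconstruction, not the paper's own code: the function names, the normal-approximation interval (z = 1.96), the choice to delete the lowest-scoring cases, and the choice to impute with the mean of the retained cases are all assumptions, since the abstract does not specify them. The coverage figures it prints depend on those assumptions and on the random seed, so they should not be expected to match the 66% and 43% reported above.

```python
import random
import statistics

POP_MEAN = 4.5  # mean of a uniform population of the integers 0..9


def draw_sample(rng, n=100):
    """Draw n integers uniformly at random from 0..9."""
    return [rng.randint(0, 9) for _ in range(n)]


def drop_lowest(sample, fraction):
    """Assumed attrition mechanism: delete the lowest-scoring cases."""
    k = round(len(sample) * fraction)
    return sorted(sample)[k:]


def impute_lowest_with_mean(sample, fraction):
    """Assumed imputation: replace the lowest-scoring cases with the
    mean of the cases that were retained."""
    k = round(len(sample) * fraction)
    retained = sorted(sample)[k:]
    return [statistics.mean(retained)] * k + retained


def ci_covers(sample, true_mean=POP_MEAN, z=1.96):
    """True if the nominal 95% CI around the sample mean contains true_mean."""
    mean = statistics.mean(sample)
    se = statistics.stdev(sample) / len(sample) ** 0.5
    return mean - z * se <= true_mean <= mean + z * se


def coverage(transform, n_sims=100, seed=0):
    """Proportion of simulated samples whose nominal 95% CI covers the
    population mean after `transform` has been applied to each sample."""
    rng = random.Random(seed)
    hits = sum(ci_covers(transform(draw_sample(rng))) for _ in range(n_sims))
    return hits / n_sims


if __name__ == "__main__":
    print("complete random samples:", coverage(lambda s: s))
    print("lowest 5% deleted:      ", coverage(lambda s: drop_lowest(s, 0.05)))
    print("lowest 10% mean-imputed:", coverage(lambda s: impute_lowest_with_mean(s, 0.10)))
```

Because every condition is run with the same seed, each one is applied to the same underlying samples, so any difference in coverage comes from the deletion or imputation step rather than from sampling variation.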

Item Type: Article
Keywords: Confidence intervals, Credible intervals, Attrition, Missing data.
Full text: (VoR) Version of Record
Publisher statement: This work is licensed under a Creative Commons Attribution 3.0 License.
Date accepted: No date available
Date deposited: 02 June 2014
Date of first online publication: April 2014
Date first made open access: No date available
