Confidence Interval for mean difference with non-mutually exclusive 1 to many matches in R

by Skye   Last Updated September 14, 2018 15:19 PM

This is partly a programming question and partly a stats question, so sorry for the overlap (although the two fields seem to overlap heavily in professional practice). I have a dataset with ~1000 cases matched to ~100,000 controls (each case is matched to multiple controls). However, the matches are not mutually exclusive: a control that is matched to one case may also be matched to another case. I do not consider this many-to-many matching, because I can't match the cases to each other (explaining why would make this post unbearably long, so I'll spare it if possible).

What I would like to do is compute a mean difference (essentially a grand mean difference) with a confidence interval. Let's assume the variable of interest is normally distributed. How can I construct this confidence interval in R so that it properly accounts for the matching scheme? I'm open to bootstrapping if necessary, but would like to avoid it if possible (mainly in the interest of time).
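To make the setup concrete, here is a minimal sketch of the data layout and the point estimate in base R. Everything in it is an assumption made for illustration: the object and column names (`outcome`, `matches`, `case_id`, `control_id`), the toy numbers, and the reading of "grand mean difference" as the average over cases of (case outcome minus the mean outcome of that case's matched controls). It only computes the point estimate and counts reused controls; it does not construct the confidence interval being asked about.

```r
# Toy data; all names and numbers are hypothetical, not the real dataset.
# One row per unit with its outcome.
outcome <- data.frame(
  id = c("case1", "case2", "ctl1", "ctl2", "ctl3"),
  y  = c(10, 12, 7, 9, 8),
  stringsAsFactors = FALSE
)

# One row per (case, control) match. ctl2 is matched to both cases,
# so the matched sets are not mutually exclusive.
matches <- data.frame(
  case_id    = c("case1", "case1", "case2", "case2"),
  control_id = c("ctl1",  "ctl2",  "ctl2",  "ctl3"),
  stringsAsFactors = FALSE
)

# Named vector for looking up outcomes by id.
y <- setNames(outcome$y, outcome$id)

# Mean outcome of the controls matched to each case.
ctl_mean <- tapply(y[matches$control_id], matches$case_id, mean)

# Per-case difference: case outcome minus the mean of its matched controls.
case_diff <- y[names(ctl_mean)] - ctl_mean

# Assumed definition of the grand mean difference: average of per-case differences.
grand_diff <- mean(case_diff)

# Number of controls that appear in more than one matched set.
n_shared <- sum(table(matches$control_id) > 1)
```

The shared controls (here `ctl2`) are what make the per-case differences dependent on each other, which is why an interval that treats the values in `case_diff` as independent may be too narrow.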
