Calculating CI's using bootstrapping on the holdout test dataset

by guest.who.worries   Last Updated January 02, 2019 21:19 PM

I’m trying to calculate 95% confidence intervals for the sensitivity and specificity of a decision model that I’m building.

I’ve split my dataset 90/10 into train and test sets. I used the 90% train set to perform hyperparameter tuning, and then used the optimal decision model selected within that 90% train set to evaluate the 10% holdout set, which is fully independent and was not used in the hyperparameter tuning process.

My problem is: what's the best approach to obtain 95% confidence intervals on the holdout test dataset? Should I bootstrap multiple resamples of the test data, score each one with the optimal model identified during hyperparameter tuning, and use those resampled metrics for the calculation? In the example uses of bootstrapping that I found, resamples are drawn from both the train and test data. However, I don’t want to do that, because I want my evaluation to be on a truly held-out dataset.
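The approach described above — keeping the fitted model fixed and resampling only the holdout predictions — can be sketched as a percentile bootstrap. This is a minimal sketch in Python, assuming binary labels and that the model's holdout predictions are already computed; the function name `bootstrap_ci` and all parameter defaults are illustrative, not from any particular library:

```python
import numpy as np

def bootstrap_ci(y_true, y_pred, n_boot=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap 95% CIs for sensitivity and specificity,
    computed by resampling a fixed set of holdout predictions.
    The model is never refit; only the test cases are resampled."""
    rng = np.random.default_rng(seed)
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    n = len(y_true)
    sens, spec = [], []
    for _ in range(n_boot):
        # Draw n test cases with replacement.
        idx = rng.integers(0, n, size=n)
        t, p = y_true[idx], y_pred[idx]
        tp = np.sum((t == 1) & (p == 1))
        fn = np.sum((t == 1) & (p == 0))
        tn = np.sum((t == 0) & (p == 0))
        fp = np.sum((t == 0) & (p == 1))
        # Skip resamples with no positives (or no negatives),
        # where the metric is undefined.
        if tp + fn > 0:
            sens.append(tp / (tp + fn))
        if tn + fp > 0:
            spec.append(tn / (tn + fp))
    lo, hi = 100 * alpha / 2, 100 * (1 - alpha / 2)
    return np.percentile(sens, [lo, hi]), np.percentile(spec, [lo, hi])
```

Because the model and its predictions are held fixed, the resulting intervals quantify only the sampling variability of the test set, which matches the stated goal of evaluating against a truly held-out dataset.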
