How can I implement 5-times repeated 10-fold cross-validation using the randomForest package (instead of caret)?

by Willow9898   Last Updated May 22, 2020 22:19 PM

I would eventually like to use the PIMP algorithm (Permutation Variable Importance Measure) to get p-values for the variables' importance. However, the function

          PIMP(X, y, rForest, S = 100, parallel = FALSE, ncores = 0, seed = 123, ...)

requires rForest, which must be an object of class randomForest.

I can carry out the 5-times repeated 10-fold cross-validation without problems using caret:

   library(caret)

   rf.fit <- train(T2DS ~ .,
                   data = mod_train.new,
                   method = "rf",
                   importance = TRUE,
                   trControl = trainControl(method = "repeatedcv",
                                            number = 10,
                                            repeats = 5))

However, I cannot find any examples or documentation showing how to implement this using randomForest directly. The attempt below is incorrect, since trControl is an argument of caret's train(), not of randomForest():

   rf.fit.try <- randomForest(T2DS ~., data=mod_train.new, importance=TRUE, 
      trControl=trainControl(method="repeatedcv", number=10, repeats=5))

Could anybody suggest how repeated cross-validation can be done using the randomForest package, or an alternative way to calculate p-values for my variable importances following permutation?
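For reference, here is a minimal sketch of what doing the resampling by hand with randomForest might look like. It is not a drop-in replacement for the caret call: the built-in iris data stands in for mod_train.new, the binary outcome y stands in for T2DS, and names such as fold_id, acc and rf_final are hypothetical. Note that cross-validation only estimates performance; the single object of class randomForest that PIMP asks for would still come from one final fit, here on all rows.

```r
library(randomForest)

set.seed(123)

# Stand-in data (assumption): replace with mod_train.new and T2DS.
dat <- iris
dat$y <- factor(dat$Species == "versicolor", labels = c("no", "yes"))
dat$Species <- NULL

n_repeats <- 5
n_folds   <- 10
acc <- matrix(NA_real_, nrow = n_repeats, ncol = n_folds)

for (r in seq_len(n_repeats)) {
  # New random (unstratified) assignment of rows to folds for each repeat
  fold_id <- sample(rep(seq_len(n_folds), length.out = nrow(dat)))
  for (k in seq_len(n_folds)) {
    train_set <- dat[fold_id != k, ]
    held_out  <- dat[fold_id == k, ]
    fit  <- randomForest(y ~ ., data = train_set)
    pred <- predict(fit, newdata = held_out)
    acc[r, k] <- mean(pred == held_out$y)  # accuracy on the held-out fold
  }
}

# Cross-validated accuracy averaged over all 5 x 10 held-out folds
cv_accuracy <- mean(acc)

# One final forest on all rows; this object has class "randomForest"
X <- dat[, names(dat) != "y"]
y <- dat$y
rf_final <- randomForest(x = X, y = y, importance = TRUE)

# rf_final could then be tried as the rForest argument in the PIMP
# signature quoted above, e.g. PIMP(X, y, rf_final, S = 100) (not run here).
```

The folds here are drawn without stratification, which is a simplification of this sketch. It may also be worth checking whether rf.fit$finalModel from the caret fit, which is itself a randomForest object, is accepted by PIMP; I have not verified that it is.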


