trainControl:

Used to control how models are tested: it specifies the resampling method (boot, cv, repeatedcv, etc.) and its parameters.

By default, simple bootstrap resampling is used for line 3 in the algorithm above. Other methods are available, such as repeated K-fold cross-validation and leave-one-out. The function trainControl can be used to specify the type of resampling:

bootstrap: randomly sample observations from the dataset with replacement

  control <- trainControl(method="boot", number=100)

cross-validation: the k-fold cross-validation method splits the dataset into k subsets. Each subset is
held out in turn while the model is trained on the remaining subsets. This repeats until every subset
has been held out once, and an overall accuracy estimate is computed.

  control <- trainControl(method="cv", number=10)

repeatedcv: k-fold cross-validation repeated multiple times

  control <- trainControl(method="repeatedcv", number=10, repeats=3)

LOOCV: leave-one-out cross-validation, where each single observation is held out in turn

  control <- trainControl(method="LOOCV")
A complete example, evaluating a naive Bayes model on the iris dataset with bootstrap resampling:

  library(caret)
  # load the iris dataset
  data(iris)
  # define training control
  control <- trainControl(method="boot", number=100)
  # evaluate the model
  fit <- train(Species~., data=iris, trControl=control, method="nb")
  # display the results
  print(fit)
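caret stores the resampling estimates on the fitted object itself; a brief sketch of how to inspect them, assuming a fit object like the one in the example above:

```r
# Per-resample performance of the final model
# (one row per bootstrap sample)
head(fit$resample)

# Aggregated performance for each candidate tuning value
fit$results
```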

train: fits a model and estimates its performance using the resampling scheme defined by trainControl.

  set.seed(825)
  gbmFit1 <- train(Class ~ ., data = training,
                   method = "gbm",
                   trControl = fitControl,
                   ## This last option is actually one
                   ## for gbm() that passes through
                   verbose = FALSE)
  gbmFit1

Tuning:

trainControl(search = "random") enables random search over the tuning parameters.

train(tuneLength = number) sets how many parameter combinations to evaluate.

  # Random Search
  control <- trainControl(method="repeatedcv", number=10, repeats=3, search="random")
  set.seed(seed)
  rfRandom <- train(Class~., data=dataset, method="rf", metric=metric,
                    tuneLength=15, trControl=control)
  print(rfRandom)
  plot(rfRandom)

Grid search: specify the exact parameter combinations to evaluate with expand.grid and pass them to train via tuneGrid.

  gbmGrid <- expand.grid(interaction.depth = c(1, 5, 9),
                         n.trees = (1:30)*50,
                         shrinkage = 0.1,
                         n.minobsinnode = 20)
  nrow(gbmGrid)
  set.seed(825)
  gbmFit2 <- train(Class ~ ., data = training,
                   method = "gbm",
                   trControl = fitControl,
                   verbose = FALSE,
                   ## Now specify the exact models
                   ## to evaluate:
                   tuneGrid = gbmGrid)
  gbmFit2
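After grid tuning, the winning parameter combination and the full performance profile are available on the fitted object; a minimal sketch, assuming the gbmFit2 object from above:

```r
# The parameter combination with the best resampled performance
gbmFit2$bestTune

# Performance across the whole tuning grid
plot(gbmFit2)
```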