KFold provides train/test indices to split data into train and test sets. It splits the dataset into k
consecutive folds (without shuffling by default). Each fold is then used as a validation set once, while the k - 1
remaining folds form the training set (source).
Let's say you have 10 data points. If you use n_folds=k, then in the i'th iteration (i<=k) you get the i'th fold as test indices and the remaining (k-1) folds (without that i'th fold) together as train indices.
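To make the rule above concrete, here is a minimal pure-Python sketch of consecutive, unshuffled folds. The helper name consecutive_folds is hypothetical and not part of scikit-learn; it assumes that when the sample count is not divisible by k, the first folds each receive one extra sample.

```python
def consecutive_folds(n_samples, k):
    """Yield (train_indices, test_indices) for k consecutive, unshuffled folds.

    Hypothetical helper for illustration only; not a scikit-learn function.
    """
    # Assumption: the first (n_samples % k) folds get one extra sample.
    base, extra = divmod(n_samples, k)
    start = 0
    for i in range(k):
        stop = start + base + (1 if i < extra else 0)
        test = list(range(start, stop))
        # Everything outside the current fold becomes training data.
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, test
        start = stop


# 10 samples, k = 5: each fold of 2 samples is the test set exactly once.
for i, (train, test) in enumerate(consecutive_folds(10, 5), start=1):
    print("Fold {}: train={} test={}".format(i, train, test))
```

Each index lands in the test set exactly once across the k iterations, which is the property the description above relies on.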
An example
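import numpy as np
from sklearn.cross_validation import KFold  # pre-0.20 import path

x = [1,2,3,4,5,6,7,8,9,10,11,12]
kf = KFold(12, n_folds=3)  # 12 samples, 3 consecutive folds
for fold, (train_index, test_index) in enumerate(kf, start=1):
    print("Fold {}: {} {}".format(fold, train_index, test_index))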
Output
Fold 1: [ 4 5 6 7 8 9 10 11] [0 1 2 3]
Fold 2: [ 0 1 2 3 8 9 10 11] [4 5 6 7]
Fold 3: [0 1 2 3 4 5 6 7] [ 8 9 10 11]
Import Update for sklearn 0.20:
The KFold object lives in the sklearn.model_selection module, and the old sklearn.cross_validation module was removed in version 0.20. To import KFold in sklearn 0.20+, use from sklearn.model_selection import KFold. Current KFold documentation: source
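For reference, here is a sketch of the same example using the newer import path. Note that the call pattern also differs from the original snippet: in sklearn.model_selection, the number of folds is passed as n_splits, and the data is passed to split() rather than giving the sample count to the constructor. This is an adaptation of the example above, not code from the original answer.

```python
import numpy as np
from sklearn.model_selection import KFold

x = np.arange(1, 13)  # the same 12 values as in the example above

# n_splits replaces the old n_folds argument; no sample count is needed here.
kf = KFold(n_splits=3)

# split() yields (train_index, test_index) pairs of positional indices into x.
for fold, (train_index, test_index) in enumerate(kf.split(x), start=1):
    print("Fold {}: {} {}".format(fold, train_index, test_index))
```

The indices are the same as in the output shown earlier: the folds are consecutive because shuffling is off by default.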