It is really the same issue as in the post you link to. preProcess works only on numeric data, and you have:
> str(etitanic)
'data.frame': 1046 obs. of 6 variables:
$ pclass : Factor w/ 3 levels "1st","2nd","3rd": 1 1 1 1 1 1 1 1 1 1 ...
$ survived: int 1 1 0 0 0 1 1 0 1 0 ...
$ sex : Factor w/ 2 levels "female","male": 1 2 1 2 1 2 1 2 1 2 ...
$ age : num 29 0.917 2 30 25 ...
$ sibsp : int 0 1 1 1 1 0 1 0 2 0 ...
$ parch : int 0 2 2 2 2 0 0 0 0 0 ...
You can't center and scale pclass or sex as-is, so they need to be converted to dummy variables. You can use model.matrix or caret's dummyVars to do this:
> new <- model.matrix(survived ~ . - 1, data = etitanic)
> colnames(new)
[1] "pclass1st" "pclass2nd" "pclass3rd" "sexmale" "age"
[6] "sibsp" "parch"
The -1 gets rid of the intercept. Now you can run preProcess on this object.
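As a rough sketch of that last step (assuming etitanic is loaded from the earth package and that centering and scaling are the transformations you want), the whole sequence might look like this:

```r
library(caret)
library(earth)   # assumption: etitanic comes from the earth package
data(etitanic)

# Expand the factors into dummy columns; "- 1" drops the intercept
new <- model.matrix(survived ~ . - 1, data = etitanic)

# Estimate the centering/scaling parameters from the numeric matrix
pp <- preProcess(new, method = c("center", "scale"))

# Apply the transformation
scaled <- predict(pp, new)

# Each column should now have mean ~0 and standard deviation ~1
round(colMeans(scaled), 3)
round(apply(scaled, 2, sd), 3)
```

Note that the dummy columns get centered and scaled along with age, sibsp and parch; whether that is desirable depends on the model you fit afterwards.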
btw, making preProcess ignore non-numeric data is on my "to do" list, but it might cause errors for people not paying attention.
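For completeness, the dummyVars route mentioned above could be sketched as follows. This again assumes etitanic comes from the earth package; the object names dv and new2 are just illustrative. Unlike model.matrix, dummyVars by default produces a column for every factor level, so the column names and count will differ from the output shown earlier.

```r
library(caret)
library(earth)   # assumption: etitanic comes from the earth package
data(etitanic)

# Build the dummy-variable encoder; the outcome (survived) is left out
dv <- dummyVars(survived ~ ., data = etitanic)

# Generate the numeric predictor matrix
new2 <- predict(dv, newdata = etitanic)
colnames(new2)

# The result is all numeric, so preProcess can be applied to it
pp2 <- preProcess(new2, method = c("center", "scale"))
scaled2 <- predict(pp2, new2)
```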
Max