large-scale regression in R with a sparse feature matrix

asked Oct 24, 2021 in Technique by 深蓝 (71.8m points)

I'd like to do large-scale regression (linear/logistic) in R with many (e.g. 100k) features, where each example is relatively sparse in the feature space, e.g. about 1k non-zero features per example.

It seems like the slm function in the SparseM package should do this, but I'm having difficulty converting from the sparseMatrix format to an slm-friendly format.

I have a numeric vector of labels y and a sparseMatrix of features X with entries in {0,1}. When I try

model <- slm(y ~ X)

I get the following error:

Error in model.frame.default(formula = y ~ X) : 
invalid type (S4) for variable 'X'

presumably because slm wants a SparseM object instead of a sparseMatrix.

Is there an easy way to either a) populate a SparseM object directly or b) convert a sparseMatrix to a SparseM object? Or perhaps there's a better/simpler way to do this?
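For option b), one possible route is to rebuild the matrix from its compressed-column slots. The following is only a sketch under assumptions: it assumes X is a dgCMatrix (what Matrix::sparseMatrix returns for numeric entries), that SparseM's "matrix.csc" class takes the same compressed-column layout with 1-based indices, and that slm.fit accepts a matrix.csr design matrix directly, bypassing the formula interface that raised the error. The toy data and the names dense, X_csc and X_csr are made up for illustration.

```r
library(SparseM)

# Toy 0/1 design matrix, built the same way as in the question (sizes are illustrative)
set.seed(1)
n <- 100; p <- 5
dense <- matrix(rbinom(n * p, 1, 0.3), n, p)
idx <- which(dense == 1, arr.ind = TRUE)
X <- Matrix::sparseMatrix(i = idx[, 1], j = idx[, 2], x = 1, dims = c(n, p))
y <- rnorm(n)

# A dgCMatrix stores 0-based row indices (i) and column pointers (p);
# SparseM's matrix.csc expects the same layout with 1-based indices.
X_csc <- new("matrix.csc",
             ra = X@x,
             ja = X@i + 1L,
             ia = X@p + 1L,
             dimension = X@Dim)

# Convert compressed-column to compressed-row, the format slm works with internally
X_csr <- as.matrix.csr(X_csc)

# Fit without a formula (assumption: slm.fit takes the csr matrix as-is; no intercept is added)
fit <- slm.fit(X_csr, y)
```

If X is a pattern matrix (ngCMatrix) it has no x slot, so it would need to be coerced to a numeric sparse matrix first.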

(I suppose I could explicitly code the solutions for linear regression using X and y, but it would be nice to have slm working.)
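For reference, the explicit route for linear regression can be sketched with the Matrix package alone by solving the normal equations X'X b = X'y. This is a minimal sketch on made-up toy data, assuming X has full column rank and that no intercept is wanted; the names XtX, Xty and beta are illustrative.

```r
library(Matrix)

# Toy sparse 0/1 design matrix and response (sizes and coefficients are illustrative)
set.seed(1)
n <- 200; p <- 5
dense <- matrix(rbinom(n * p, 1, 0.3), n, p)
X <- Matrix(dense, sparse = TRUE)
y <- drop(dense %*% c(1, -2, 0.5, 0, 3)) + rnorm(n)

# Normal equations: crossprod() keeps X'X sparse and symmetric
XtX <- crossprod(X)        # X'X, p x p
Xty <- crossprod(X, y)     # X'y, p x 1

# Solve X'X b = X'y and flatten the result to a plain numeric vector
beta <- drop(as.matrix(solve(XtX, Xty)))
```

With 100k features X'X can fill in considerably, so this is mainly useful as a check against a packaged solver.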

1 Answer

深蓝 · Answer 1 · 2021-10-23T17:45:42+0000

A belated answer: glmnet also supports sparse matrices and both of the regression models requested (linear and logistic). It accepts the sparse matrices produced by the Matrix package directly, so no conversion is needed. I advise looking into regularized models via this package. In sparse data, some variables often have very little support (only a handful of non-zero observations), and L1 regularization is useful for knocking these out of the model. That is often safer than accepting spurious parameter estimates for variables with very low support.
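A minimal sketch of what this could look like, on simulated data rather than the asker's: the sizes, the true coefficients and the variable names (beta_true, fit_lin, fit_log, cv_log) are all made up for illustration. alpha = 1 selects the pure L1 (lasso) penalty.

```r
library(Matrix)
library(glmnet)

# Simulated sparse 0/1 feature matrix (about 5% non-zero entries)
set.seed(42)
n <- 500; p <- 200
X <- rsparsematrix(n, p, density = 0.05, rand.x = function(k) rep(1, k))

# Only the first five features carry signal in this toy setup
beta_true <- c(2, -2, 1.5, -1.5, 1, rep(0, p - 5))
eta <- as.vector(X %*% beta_true)
y_lin <- eta + rnorm(n)              # continuous response
y_bin <- rbinom(n, 1, plogis(eta))   # binary response

# L1-regularized linear and logistic regression, fitted on the sparse matrix as-is
fit_lin <- glmnet(X, y_lin, family = "gaussian", alpha = 1)
fit_log <- glmnet(X, y_bin, family = "binomial", alpha = 1)

# Cross-validation to pick the penalty strength for the logistic model
cv_log <- cv.glmnet(X, y_bin, family = "binomial", alpha = 1)

# Number of coefficients (including the intercept) left non-zero at the chosen lambda
n_nonzero <- sum(coef(cv_log, s = "lambda.min") != 0)
```

Features with little support tend to get coefficients of exactly zero under the L1 penalty, which is the knocking-out behaviour described above.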
