Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
1.1k views
in Technique[技术] by (71.8m points)

r - Combing a list of unequal data.frames

I'm trying to combine a list of unequal data.frames; the obvious do.call(rbind, df.lst) fails but the real problem is padding it out with NAs.

df.lst <- list(A=data.frame(a=c(1,2),b=c(5,4),d=c(2,3),e=c(1,1),f=c(1,2),g=c(1,2)),
               B=data.frame(a=c(1,2),b=c(3,2),d=c(2,3)),
               C=data.frame(a=c(1,2),b=c(4,3),d=c(1,2),e=c(1,3))
               )

I can see that I need to find the maximum number of columns in the longest data.frame; I can do this with the following code,

max(sapply(df.lst,ncol))

but after that I'm stuck. It is suggested that it is possible to index over the list and this automatically fills it with NAs.

Once I have the padded list, I anticipate a simple do.call() as described earlier. (I'm trying to keep the answer to base R and although there are many similar questions I can't seem to find an answer to this precise one).

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

If you want to stick with base R, you can do something like this:

### Get all the columns names
col <- unique(unlist(sapply(df.lst, names)))
col
## [1] "a" "b" "d" "e" "f" "g"

### Fill the missing columns with NA
df.lst <- lapply(df.lst, function(df) {
  df[, setdiff(col, names(df))] <- NA
  df
})

### Then Bind it
do.call(rbind, df.lst)
##     a b d  e  f  g
## A.1 1 5 2  1  1  1
## A.2 2 4 3  1  2  2
## B.1 1 3 2 NA NA NA
## B.2 2 2 3 NA NA NA
## C.1 1 4 1  1 NA NA
## C.2 2 3 2  3 NA NA

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...