I'm trying to count an indicator within several (actually hundreds of) grouping variables separately (NOT over all combinations of the groups). I'll demonstrate with a simplified example.
Assume I have this dataset:
data <- cbind(c(1,1,1,2,2,2),
              c(1,1,2,2,2,3),
              c(3,2,1,2,2,3))
> data
     [,1] [,2] [,3]
[1,]    1    1    3
[2,]    1    1    2
[3,]    1    2    1
[4,]    2    2    2
[5,]    2    2    2
[6,]    2    3    3
and an indicator vector with one entry per row:
some_indicator<-c(1,0,0,1,0,1)
Then I want to run, without an explicit loop (e.g. with an apply over columns), the equivalent of:
aggregate(some_indicator,list(data[,1]),sum)
aggregate(some_indicator,list(data[,2]),sum)
aggregate(some_indicator,list(data[,3]),sum)
and merge the results into the following matrix (rows are the group values 1, 2, 3; columns correspond to the columns of data):
     [,1] [,2] [,3]
[1,]    1    1    0
[2,]    2    1    1
[3,]    0    1    2
i.e. for each column (the set of values does not change much between columns), sum the indicator by value and merge the results.
Currently I do this with a loop over the columns, but I need a much more efficient way, since there are a lot of columns and it takes over an hour.
Thanks in advance,
Michael.
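One possible loop-free approach, offered as a minimal sketch rather than a definitive answer: because the set of distinct values is small compared with the number of columns, it may be cheaper to iterate over the values and let colSums handle all columns at once. The names vals and result are introduced here for illustration only, and the sketch assumes data is a numeric matrix with at least two columns and some_indicator has one entry per row.

```r
data <- cbind(c(1,1,1,2,2,2),
              c(1,1,2,2,2,3),
              c(3,2,1,2,2,3))
some_indicator <- c(1,0,0,1,0,1)

# All distinct group values that occur anywhere in the matrix
# (assumption: there are few of them relative to the number of columns)
vals <- sort(unique(as.vector(data)))

# For each value v, (data == v) marks the cells belonging to group v.
# Multiplying by some_indicator recycles it down each column, so
# colSums() yields the indicator sum for group v in every column at once.
result <- t(sapply(vals, function(v) colSums((data == v) * some_indicator)))

result
#      [,1] [,2] [,3]
# [1,]    1    1    0
# [2,]    2    1    1
# [3,]    0    1    2
```

Row i of result corresponds to vals[i], and a group value that never occurs in a given column simply gets 0 there. Whether this is actually faster than the column loop depends on the real data, so it is worth timing on a subset first.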