r - Consolidate duplicate rows

Question

asked Oct 17, 2021 in Technique by 深蓝 (71.8m points)

I have a data frame where one column is species' names, and the second column is abundance values. Due to the sampling procedure, some species appear more than once (i.e., there is more than one row with Species X in it). I would like to consolidate those entries and sum their abundances.

For example, given this data frame:

set.seed(6)
df=data.frame(
  x=c("sp1","sp2","sp3","sp3","sp4","sp2","sp3"),
  y=rpois(7,2)); df

which produces:

    x y
1 sp1 2
2 sp2 4
3 sp3 1
4 sp3 1
5 sp4 3
6 sp2 5
7 sp3 5

I would like to instead produce:

    x y
1 sp1 2    
2 sp2 9     (5+4)
3 sp3 7     (5+1+1)
5 sp4 3

Thanks in advance for any help you can provide!

1 Answer

深蓝 · Answer 1 · 2021-10-17T00:13:32+0000

This works:

library(plyr)
ddply(df,"x",numcolwise(sum))

In words: (1) split the data frame df by the "x" column; (2) for each chunk, take the sum of each numeric-valued column; (3) stick the results back into a single data frame. (The dd in ddply stands for "take a data frame as input, return a data frame".)
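The three steps above can also be spelled out in base R, which may make it clearer what ddply is doing. This is only a rough sketch, not how plyr is implemented internally; the names chunks, sums and result are made up for illustration, and the data are typed in literally so the result does not depend on the random seed.

```r
# Example data from the question, entered by hand
df <- data.frame(
  x = c("sp1", "sp2", "sp3", "sp3", "sp4", "sp2", "sp3"),
  y = c(2, 4, 1, 1, 3, 5, 5),
  stringsAsFactors = FALSE
)

# (1) split the data frame into one chunk per value of "x"
chunks <- split(df, df$x)

# (2) for each chunk, keep the species name and sum the abundances
sums <- lapply(chunks, function(chunk) {
  data.frame(x = chunk$x[1], y = sum(chunk$y), stringsAsFactors = FALSE)
})

# (3) stick the per-species results back into a single data frame
result <- do.call(rbind, sums)
rownames(result) <- NULL
result
```

With the example data this should give one row per species, with sp2 summed to 9 and sp3 summed to 7.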

Another, possibly clearer, approach:

aggregate(y~x,data=df,FUN=sum)
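In the aggregate call, the formula y~x can be read as "y, grouped by x". A small sketch on the example data, typed in literally (so the seed does not matter), with tapply shown as one more base R alternative; the names agg and totals are made up for illustration.

```r
# Example data from the question, entered by hand
df <- data.frame(
  x = c("sp1", "sp2", "sp3", "sp3", "sp4", "sp2", "sp3"),
  y = c(2, 4, 1, 1, 3, 5, 5),
  stringsAsFactors = FALSE
)

# Sum y within each level of x; the result is a data frame with columns x and y
agg <- aggregate(y ~ x, data = df, FUN = sum)
agg
#     x y
# 1 sp1 2
# 2 sp2 9
# 3 sp3 7
# 4 sp4 3

# tapply gives the same totals as a named vector rather than a data frame
totals <- tapply(df$y, df$x, sum)
totals
```

If a data frame is needed downstream, aggregate is probably the more convenient of the two; tapply is handy when a named vector of totals is enough.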

See "quick/elegant way to construct mean/variance summary table" for a related (slightly more complex) question.
