I have two data frames:
df1 has these columns: participantid, formid, c1, c2, c3, c4
df2 has these columns: participantid, c5, c6, c7, c8
I want the union of all participantids from df1 where formid == 'some value' and all participantids from df2. I am only interested in a list of participantids, not in any of the other columns (c1, c2, c3, c4, ...).
I have tried:
union(df1[df1$formid == "some value", "participantid"], df2["participantid"])
union(df1[df1$formid == "some value", "participantid"], df2[["participantid"]])
union(df1[df1$formid == "some value", "participantid"], df2$participantid)
None of these worked.
Any pointers?
Thank you in advance!
Edit: I have tried the following code and it works:
df1 <- data.frame(participantid = c("A1", "A2", "A3", "A4"),
                  formid = c("F1", "F1", "F1", "F2"),
                  c1 = c(0, 0, 0, 0))
df2 <- data.frame(participantid = c("B1", "B2", "B3", "B4"),
                  c2 = c(0, 0, 0, 0))
union(df1[df1$formid == "F1", "participantid"], df2$participantid)
When I run class(df2$participantid) or class(df1[df1$formid == "F1", "participantid"]) on this dummy data, it returns [1] "factor".
My real data comes from CSV files. When I run class(df1[df1$formid == "F1", "participantid"]) on the real data, it returns [1] "tbl_df" "tbl" "data.frame", and when I run class(df2$participantid), it returns [1] "character". Do you guys know why that is?
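A possible explanation, sketched here with base R only: the "tbl_df" class suggests the CSV files were loaded as tibbles (for example via readr::read_csv, which is an assumption about how the real data was read). A tibble keeps a single-column selection made with [ , ] as a data frame, whereas a base data.frame drops it to a plain vector. Passing drop = FALSE to a base data.frame mimics the tibble behaviour, so the difference can be seen without extra packages. The names as_vector and as_frame are only illustrative.

```r
# Dummy data; stringsAsFactors = FALSE is an assumption added here so the
# ids are character regardless of the R version.
df1 <- data.frame(participantid = c("A1", "A2", "A3", "A4"),
                  formid = c("F1", "F1", "F1", "F2"),
                  c1 = c(0, 0, 0, 0),
                  stringsAsFactors = FALSE)

# Base data.frame default: a single selected column is dropped to a vector.
as_vector <- df1[df1$formid == "F1", "participantid"]
class(as_vector)

# drop = FALSE keeps a one-column data frame, which is how a tibble
# behaves by default for the same [ , ] call.
as_frame <- df1[df1$formid == "F1", "participantid", drop = FALSE]
class(as_frame)
```

The first call yields a character vector and the second a data frame, which matches the mismatch between the two class() results on the real data.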
Edit #2: I was able to reproduce my predicament using dummy CSV files:
df1 CSV file:
participantid,formid,c1
A1,F1,0
A2,F1,0
A3,F1,0
A4,F2,0
df2 CSV file:
participantid,c2
B1,0
B2,0
B3,0
B4,0
When I run the union command above I get this:
[[1]]
[1] "A1" "A2" "A3"
[[2]]
[1] "B1"
[[3]]
[1] "B2"
[[4]]
[1] "B3"
[[5]]
[1] "B4"
The result has a length() of 5, when it should have been 7. Does this make sense?
I was expecting the output to be either
"A1" "A2" "A3" "B1" "B2" "B3" "B4"
or
"A1"
"A2"
"A3"
"B1"
"B2"
"B3"
"B4"
Edit #3: I am going to answer my own question. This worked for me in the end:
union(df1[df1$formid == "F1",]$participantid, df2$participantid)
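A minimal sketch of why this works, using the dummy data (stringsAsFactors = FALSE is added as an assumption so the ids stay character): $ and [[ ]] always return the column as a plain vector, for a base data.frame and for a tibble alike, so union() receives two vectors. The names ids1, ids2 and all_ids are only illustrative.

```r
df1 <- data.frame(participantid = c("A1", "A2", "A3", "A4"),
                  formid = c("F1", "F1", "F1", "F2"),
                  c1 = c(0, 0, 0, 0),
                  stringsAsFactors = FALSE)
df2 <- data.frame(participantid = c("B1", "B2", "B3", "B4"),
                  c2 = c(0, 0, 0, 0),
                  stringsAsFactors = FALSE)

# Extract the column as a vector first, then filter it by formid.
ids1 <- df1[["participantid"]][df1[["formid"]] == "F1"]
ids2 <- df2[["participantid"]]

# union() on two character vectors returns their distinct values.
all_ids <- union(ids1, ids2)
all_ids
length(all_ids)  # 7
```

Extracting with [[ ]] before filtering avoids depending on whether the single-column [ , ] selection is dropped to a vector.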
question from:
https://stackoverflow.com/questions/65928488/union-between-two-dataframe-columns