Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share

regex - Remove words in one column present in another column in R

I have a dataframe that is in this format:

A <- c("John Smith", "Red Shirt", "Family values are better")
B <- c("John is a very highly smart guy", "We tried the tea but didn't enjoy it at all", "Family is very important as it gives you values")

df <- data.frame(A, B, stringsAsFactors = FALSE)

I want to remove from B every word that appears in the same row of A, so that the result looks like this:

ID   A                           B
1    John Smith                  is a very highly smart guy
2    Red Shirt                   We tried the tea but didn't enjoy it at all
3    Family values are better    is very important as it gives you

I have tried:

test<-df %>% filter(sapply(1:nrow(.), function(i) grepl(A[i], B[i])))

But it doesn't give me what I want.

Any suggestions/help?


1 Answer


One solution is to use mapply together with strsplit.

The trick is to split each element of df$A into separate words, collapse those words into a single string separated by |, and use that string as the pattern in gsub, replacing every match in the corresponding element of df$B with "".

lst <- strsplit(df$A, split = " ")

df$B <- mapply(function(x, y) {
  gsub(paste0(x, collapse = "|"), "", df$B[y])
}, lst, seq_along(lst))
df
#                          A                                           B
# 1               John Smith                  is a very highly smart guy
# 2                Red Shirt We tried the tea but didn't enjoy it at all
# 3 Family values are better          is very important as it gives you 
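
To make the intermediate step concrete, here is a small sketch that builds the patterns on their own and applies the first one to a single string. The variable name patterns is only illustrative and does not appear in the code above.

```r
A <- c("John Smith", "Red Shirt", "Family values are better")

# Split each element of A into words, then join the words with "|"
lst <- strsplit(A, split = " ")
patterns <- vapply(lst, paste0, character(1), collapse = "|")
patterns
# [1] "John|Smith"               "Red|Shirt"
# [3] "Family|values|are|better"

# Applying the first pattern removes "John" but keeps the space after it
first <- gsub(patterns[1], "", "John is a very highly smart guy")
first
# [1] " is a very highly smart guy"
```

Note that the removed word leaves its neighbouring space behind, which is why the results above can start or end with a space.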

Another option is to build the pattern directly by replacing each space in df$A with |:

df$B <- mapply(function(x, y) gsub(x, "", y), gsub(" ", "|", df$A), df$B)
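
Both versions match substrings rather than whole words (a word such as "is" in A would also be removed from inside "this") and they leave stray spaces where a word was removed. If that matters, a possible variant is sketched below. It assumes the words in A contain no regex metacharacters, and remove_words and B_clean are hypothetical names introduced only for this illustration.

```r
A <- c("John Smith", "Red Shirt", "Family values are better")
B <- c("John is a very highly smart guy",
       "We tried the tea but didn't enjoy it at all",
       "Family is very important as it gives you values")
df <- data.frame(A, B, stringsAsFactors = FALSE)

# Hypothetical helper: remove whole words only, then tidy the whitespace
remove_words <- function(words, text) {
  # "\\b" anchors each alternative at word boundaries
  pattern <- paste0("\\b(", gsub(" ", "|", words), ")\\b")
  out <- gsub(pattern, "", text, perl = TRUE)
  # Collapse runs of spaces and strip leading/trailing whitespace
  trimws(gsub(" +", " ", out))
}

# unname() drops the names that mapply derives from df$A
df$B_clean <- unname(mapply(remove_words, df$A, df$B))
df$B_clean
# [1] "is a very highly smart guy"
# [2] "We tried the tea but didn't enjoy it at all"
# [3] "is very important as it gives you"
```

The result is written to a new column here so that the original B stays available for comparison.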

Data:

A <- c("John Smith", "Red Shirt", "Family values are better")
B <- c("John is a very highly smart guy", "We tried the tea but didn't enjoy it at all", "Family is very important as it gives you values")

df <- data.frame(A, B, stringsAsFactors = FALSE)
