Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

R - creating a data frame from a string of names and other information

Trying to clean up some data and having trouble with the code for this case. The strings look like this:

string <- 'Inactive: UTA Jarrell Brantley, Juwan Morgan DET Killian Hayes, Derrick Rose '

I want to make them into a dataframe that looks like this:

output <- data.frame('player' = c('Jarrell Brantley','Juwan Morgan','Killian Hayes','Derrick Rose'),
'team' = c('UTA','UTA','DET','DET'), 'status' = c('INACTIVE','INACTIVE','INACTIVE','INACTIVE'))

This is running through a for loop with many different strings, but the pattern of the string is always like this: "INACTIVE: team1 player_name1, player_name2, player_name3, team2 player_name4, player_name5 " (always space after final player_name)

I already have each team1 and team2 defined as objects team_away and team_home respectively, so those can be used as the 'UTA' and 'DET' strings in this case. Note that the number of players after team1 or team2 is not constant; sometimes there are 2 each, sometimes 4 different players after team1 with no mention of team2, etc.

I have tried different sub() calls but I'm struggling with the proper syntax. Any help would be greatly appreciated!



1 Answer


I'll suggest a tidyverse pipe.

I'd think the status field is self-evident, so I'll skip that part for now. The rest:

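library(dplyr)
library(tidyr)
# Backslashes must be doubled inside R string literals ("\\s", not "\s").
# The split below assumes team blocks are separated by two or more spaces.
gsub("^.*inactive:", "", string, ignore.case = TRUE) %>%   # drop the status prefix
  trimws(.) %>%
  strsplit(., "\\s{2,}") %>%                               # one chunk per team
  lapply(., strcapture, pattern = "^\\s*(\\S+)\\s+(.*)$", proto = list(team="", player="")) %>%
  bind_rows(.) %>%
  mutate(player = strsplit(player, ",\\s*")) %>%           # list column of player names
  unnest(player)                                           # one row per player
# # A tibble: 4 x 2
#   team  player          
#   <chr> <chr>           
# 1 UTA   Jarrell Brantley
# 2 UTA   Juwan Morgan    
# 3 DET   Killian Hayes   
# 4 DET   Derrick Rose    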
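
If the gaps between team blocks are single spaces (as in the string shown in the question), splitting on runs of whitespace cannot tell a team code from a name. Since the question says team_away and team_home are already known, one alternative is to split on those codes instead and fill in the status column at the same time. The following is only a sketch in base R, not the approach from the answer above: the function name parse_status_string and its teams argument are made up for illustration, and it assumes a team code never appears as a standalone word inside a player name and contains no regex metacharacters.

```r
# Hypothetical helper (not part of the original answer): split a status
# string into one row per player, using the known team codes as delimiters.
parse_status_string <- function(string, teams) {
  # Status is the text before the first colon, e.g. "Inactive" -> "INACTIVE"
  status <- toupper(trimws(sub(":.*$", "", string)))
  body <- trimws(sub("^[^:]*:", "", string))

  # Find each team code as a whole word (assumption: codes are plain letters)
  pattern <- paste0("\\b(", paste(teams, collapse = "|"), ")\\b")
  m <- gregexpr(pattern, body, perl = TRUE)[[1]]
  if (m[1] == -1) {
    # No team code found: return an empty data frame with the same columns
    return(data.frame(player = character(), team = character(),
                      status = character(), stringsAsFactors = FALSE))
  }
  starts <- as.integer(m)
  lens <- attr(m, "match.length")
  codes <- substring(body, starts, starts + lens - 1)

  # Each team's players run from the end of its code to the next code (or the end)
  ends <- c(starts[-1] - 1, nchar(body))
  chunks <- substring(body, starts + lens, ends)

  # Split on commas, trim whitespace, and drop empty pieces
  players <- lapply(strsplit(chunks, ","), trimws)
  players <- lapply(players, function(p) p[nzchar(p)])

  data.frame(
    player = as.character(unlist(players)),
    team = rep(codes, lengths(players)),
    status = rep(status, sum(lengths(players))),
    stringsAsFactors = FALSE
  )
}

# Usage with the values from the question
team_away <- "UTA"
team_home <- "DET"
string <- 'Inactive: UTA Jarrell Brantley, Juwan Morgan DET Killian Hayes, Derrick Rose '
result <- parse_status_string(string, c(team_away, team_home))
print(result)
```

For the example string this should give four rows with the player, team and status columns from the desired output. It also handles a string that mentions only one team, because the last chunk simply runs to the end of the string. Inside the for loop, each result could be collected in a list and combined at the end with do.call(rbind, ...).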
