Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

r - Create a list (names/values) of value pairs of a data frame

I have this kind of data:

print(weights)

   RECAGE  RECAGE_Q RECQ3   RECQ3_Q Q3A       Q3A_Q Q5_1_REC Q5_1_REC_Q Q5_2_REC Q5_2_REC_Q
1       1 0.1219512     1 0.2132521   1 0.132723112        1 0.55411585        1  0.5923664
2       2 0.3384146     2 0.2437167   2 0.161708619        2 0.20503049        2  0.1396947
3       3 0.3582317     3 0.2505712   3 0.048054920        3 0.16692073        3  0.1320611
4       4 0.1814024     4 0.2924600   4 0.025934401        4 0.07393293        4  0.1358779
5      NA        NA    NA        NA   5 0.007627765       NA         NA       NA         NA
6      NA        NA    NA        NA   6 0.027459954       NA         NA       NA         NA
7      NA        NA    NA        NA   7 0.078565980       NA         NA       NA         NA
8      NA        NA    NA        NA   8 0.016781083       NA         NA       NA         NA
9      NA        NA    NA        NA   9 0.092295957       NA         NA       NA         NA
10     NA        NA    NA        NA  10 0.221205187       NA         NA       NA         NA
11     NA        NA    NA        NA  11 0.051106026       NA         NA       NA         NA
12     NA        NA    NA        NA  12 0.012204424       NA         NA       NA         NA
13     NA        NA    NA        NA  13 0.043478261       NA         NA       NA         NA
14     NA        NA    NA        NA  14 0.021357742       NA         NA       NA         NA
15     NA        NA    NA        NA  15 0.035850496       NA         NA       NA         NA
16     NA        NA    NA        NA  16 0.023646072       NA         NA       NA         NA

My goal is to create a list (without the NA values) for every pair of columns in the data frame (e.g. RECAGE and RECAGE_Q, RECQ3 and RECQ3_Q, Q3A and Q3A_Q, etc.).

Within each pair, the names should be the values of the first column and the values should be the values of the second column.

The result should look like this:

library(dplyr)
library(tidyr)  # drop_na() comes from tidyr, not dplyr

RECAGE = weights %>%
  select(RECAGE, RECAGE_Q) %>%
  drop_na()

RECQ3 = weights %>%
  select(RECQ3, RECQ3_Q) %>%
  drop_na() 

Q3A = weights %>%
  select(Q3A, Q3A_Q) %>%
  drop_na() 


Q5_1_REC = weights %>%
  select(Q5_1_REC, Q5_1_REC_Q) %>%
  drop_na() 

Q5_2_REC = weights %>%
  select(Q5_2_REC, Q5_2_REC_Q) %>%
  drop_na() 

a = RECAGE$RECAGE_Q
names(a) = as.numeric(RECAGE$RECAGE)

b = Q3A$Q3A_Q
names(b) = as.numeric(Q3A$Q3A)

c = Q5_1_REC$Q5_1_REC_Q
names(c) = as.numeric(Q5_1_REC$Q5_1_REC)

d = Q5_2_REC$Q5_2_REC_Q
names(d) = as.numeric(Q5_2_REC$Q5_2_REC)


t <- list(a, b, c, d)
names(t) <- c("RECAGE", "Q3A", "Q5_1_REC", "Q5_2_REC")

I want to write a function so that I don't have to build the list manually.

One approach I tried is this:

input_weights = lapply(weights, function(x) as.vector(x))

input_weights = lapply(input_weights, function(x) x[!is.na(x)])

...but from here I can't get any further.

Thanks for any advice.

Here is the dput() output:

structure(list(RECAGE = c(1, 2, 3, 4, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA), RECAGE_Q = c(0.121951219512195, 0.338414634146341, 
0.358231707317073, 0.18140243902439, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA), RECQ3 = c(1, 2, 3, 4, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA), RECQ3_Q = c(0.213252094440213, 
0.243716679360244, 0.250571210967251, 0.292460015232293, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), Q3A = c(1, 2, 3, 
4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16), Q3A_Q = c(0.132723112128146, 
0.161708619374523, 0.0480549199084668, 0.0259344012204424, 0.007627765064836, 
0.0274599542334096, 0.0785659801678108, 0.0167810831426392, 0.0922959572845156, 
0.221205186880244, 0.0511060259344012, 0.0122044241037376, 0.0434782608695652, 
0.0213577421815408, 0.0358504958047292, 0.0236460717009916), 
    Q5_1_REC = c(1, 2, 3, 4, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA), Q5_1_REC_Q = c(0.554115853658537, 0.205030487804878, 
    0.166920731707317, 0.0739329268292683, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA), Q5_2_REC = c(1, 2, 3, 4, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), Q5_2_REC_Q = c(0.592366412213741, 
    0.13969465648855, 0.13206106870229, 0.13587786259542, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA)), row.names = c(NA, 
16L), class = "data.frame")
  asked by Banjo, translated from SO

1 Answer

One way is to use split.default to split the data frame into consecutive pairs of columns. For each pair, drop the rows where the second column is NA, then build a named vector that uses the first column as names and the second column as values.
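
To see what the grouping does before applying it to the real data, here is a minimal sketch on a made-up data frame. The name `toy` and its columns are hypothetical and only serve the illustration; `gl()` and `split.default()` are the same base R functions used in the answer.

```r
# Hypothetical toy data: two code/weight column pairs
toy <- data.frame(A = c(1, 2, NA), A_Q = c(0.4, 0.6, NA),
                  B = c(1, 2, 3),  B_Q = c(0.2, 0.3, 0.5))

# gl(n, 2) repeats each of n factor levels twice,
# giving one level per consecutive pair of columns
grp <- gl(ncol(toy) / 2, 2)
print(grp)

# split.default() treats the data frame as a list of columns,
# so every piece is itself a two-column data frame
pieces <- split.default(toy, grp)
print(lapply(pieces, names))
```

This relies on the columns being laid out strictly as (code, weight), (code, weight), ... and on the column count being even.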

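# gl(ncol(weights)/2, 2) yields 1,1,2,2,...: one group per consecutive column pair
out <- lapply(split.default(weights, gl(ncol(weights)/2, 2)), function(x) {
   inds <- !is.na(x[[2]])                # keep rows where the value column is present
   setNames(x[[2]][inds], x[[1]][inds])  # values named by the first column
})
# Name each list element after the first column of its pair
names(out) <- names(weights)[c(TRUE, FALSE)]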

#$RECAGE
#      1       2       3       4 
#0.12195 0.33841 0.35823 0.18140 

#$RECQ3
#      1       2       3       4 
#0.21325 0.24372 0.25057 0.29246 

#$Q3A
#        1         2         3         4         5         6         7 
#0.1327231 0.1617086 0.0480549 0.0259344 0.0076278 0.0274600 0.0785660 
#        8         9        10        11        12        13        14 
#0.0167811 0.0922960 0.2212052 0.0511060 0.0122044 0.0434783 0.0213577 
#       15        16 
#0.0358505 0.0236461 

#$Q5_1_REC
#       1        2        3        4 
#0.554116 0.205030 0.166921 0.073933 

#$Q5_2_REC
#      1       2       3       4 
#0.59237 0.13969 0.13206 0.13588 
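
Since the question asks for a reusable function, the same idea could be wrapped up roughly as follows. This is only a sketch, not part of the original answer: the helper name `pairs_to_list`, its `suffix` argument and the `demo` data frame are hypothetical, and it assumes every weight column is named after its code column plus "_Q", as in the data above. It pairs columns by name instead of by position, so the column order does not matter.

```r
# Hypothetical helper: pair each column <name> with <name><suffix> by name
pairs_to_list <- function(df, suffix = "_Q") {
  cols <- names(df)
  # Columns holding the weights, e.g. "RECAGE_Q"
  value_cols <- cols[endsWith(cols, suffix)]
  # Matching code columns, e.g. "RECAGE"
  key_cols <- substr(value_cols, 1, nchar(value_cols) - nchar(suffix))
  stopifnot(all(key_cols %in% cols))

  out <- Map(function(key, value) {
    # Keep only rows where both the code and the weight are present
    keep <- !is.na(df[[key]]) & !is.na(df[[value]])
    setNames(df[[value]][keep], df[[key]][keep])
  }, key_cols, value_cols)

  setNames(out, key_cols)
}

# Made-up example data with the pairs deliberately not adjacent
demo <- data.frame(RECAGE   = c(1, 2, NA),
                   Q3A      = c(1, 2, 3),
                   RECAGE_Q = c(0.25, 0.75, NA),
                   Q3A_Q    = c(0.5, 0.3, 0.2))

res <- pairs_to_list(demo)
str(res)
```

With the data from the question, the call would be `pairs_to_list(weights)`. Unlike the positional split, this version drops a row when either the code or the weight is NA.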
