Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share

r - pass variables and names to data.table function

I have a report that needs to be applied to different data.tables and to different columns within them [both j and by]. The only way I have got it to work is by wrapping each argument in eval(substitute(value)), which makes the code less readable. The first output column is currently hard-coded as "variable", but I would like to pass the name of the j argument on to setnames instead.

So, the questions are:

Is there a way to avoid the eval(substitute(value)) construction?

Can I pass the j argument to the setnames function?

library(data.table)
library(ggplot2)
data(diamonds, package = "ggplot2")
dt = as.data.table(diamonds)

var.report = function(df, value, by.value) {
  var.report = df[, list( .N,
                    sum(is.finite(eval(substitute(value)))), # count values
                    sum(is.na(eval(substitute(value)))) # count NA
  ), by = eval(substitute(by.value))]

  setnames(var.report, c("variable", "N","n.val","n.NA"))

  return(var.report)
}


var.report(dt, depth, clarity)


1 Answer


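How about wrapping the entire body of the function in a single eval(substitute(...)) (or just the data.table calculation, if you want to be more specific)? substitute replaces df, value and by.value with the expressions supplied by the caller, so they only need to be written once: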

var.report = function(df, value, by.value) {
  eval(substitute({
    var.report = df[, list( .N,
                      sum(is.finite(value)), # count values
                      sum(is.na(value)) # count NA
    ), by = by.value]

    setnames(var.report, c("variable", "N","n.val","n.NA"))

    return(var.report)
  }))
}

var.report(dt, depth, clarity)
#   variable     N n.val n.NA
#1:      SI2  9194  9194    0
#2:      SI1 13065 13065    0
#3:      VS1  8171  8171    0
#4:      VS2 12258 12258    0
#5:     VVS2  5066  5066    0
#6:     VVS1  3655  3655    0
#7:       I1   741   741    0
#8:       IF  1790  1790    0
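
To see why a single substitute is enough, it can help to look at the call it builds before anything is evaluated. The following is only a sketch: show.call is a hypothetical helper that is not part of the answer, and it returns the substituted expression instead of evaluating it.

```r
# Hypothetical helper, for illustration only: it returns the substituted
# call rather than evaluating it, so no data.table is needed to run it.
show.call = function(df, value, by.value) {
  substitute(df[, list(.N, sum(is.na(value))), by = by.value])
}

# The arguments are never evaluated; only their expressions are captured.
show.call(dt, depth, clarity)
# dt[, list(.N, sum(is.na(depth))), by = clarity]
```

The printed call is what eval then runs, which is why the column names can be written plainly inside the function body.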

I don't really understand the second question. I would normally assign the names in the original expression, which makes it easier to keep track of things, like so:

var.report = df[, list(N     = .N,
                       n.val = sum(is.finite(value)), # count values
                       n.NA  = sum(is.na(value)) # count NA
                      )
                , by = list(variable = by.value)]
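
If the second question means "use the name of the j argument in the output column names", one possible approach is to capture it with deparse(substitute(value)) and hand the resulting string to setnames. This is a sketch under that assumption, not necessarily what the asker intended. The toy table and the value.name variable are made up for illustration, and the table is kept small so the example runs without ggplot2.

```r
library(data.table)

# Small made-up table so the sketch runs without ggplot2.
toy = data.table(grp = c("a", "a", "b", "b"),
                 x   = c(1, NA, 3, 4))

var.report = function(df, value, by.value) {
  # Capture the unevaluated j argument as a string, e.g. "x".
  value.name = deparse(substitute(value))

  # Build and evaluate the data.table call with the caller's expressions.
  res = eval(substitute(
    df[, list(N     = .N,
              n.val = sum(is.finite(value)),  # count finite values
              n.NA  = sum(is.na(value))),     # count NA
       by = by.value]
  ))

  # Prefix the count columns with the name of the j variable.
  setnames(res, c("n.val", "n.NA"),
           paste(value.name, c("n.val", "n.NA"), sep = "."))
  res
}

var.report(toy, x, grp)
```

Here the grouping column keeps its own name (grp) because by receives the bare symbol, and the count columns become x.n.val and x.n.NA.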
