Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
420 views
in Technique[技术] by (71.8m points)

r - How to find the percentage of NAs in a data.frame?

I am trying to find the percentage of NAs in columns as well as inside the whole dataframe:

The first method which I have commented gives me zero and the second method which is not commented gives me a matrix. Not sure what I am missing. Any hint is truly appreciated!

cp.2006<-read.csv(file="cp2006.csv",head=TRUE)

#countNAs <- function(x) { 
#  sum(is.na(x)) 
#} 
#total=0
#for (i in col(cp.2006)) {
#  total=countNAs(i)+total
#}
#print(total)
count<-apply(cp.2006, 1, function(x) sum(is.na(x)))
dims<-dim(cp.2006)
num<-dims[1]*dims[2]
NApercentage<-(count/num) * 100
print(NApercentage)
See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)
x = data.frame(x = c(1, 2, NA, 3), y = c(NA, NA, 4, 5))

For the whole dataframe:

sum(is.na(x))/prod(dim(x))

Or

mean(is.na(x))

For columns:

apply(x, 2, function(col)sum(is.na(col))/length(col))

Or

colMeans(is.na(x))

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...