R - Aggregating a data.frame with NAs using data.table
I have a large data.frame with a character column and several numeric columns that contain NA's.
Here are a few example rows:
df <- data.frame(id = rep("a", 3), v1 = c(NA, 1, NA), v2 = c(2, 5, 2), v3 = c(NA, NA, NA), v4 = c(0, 0, 0), stringsAsFactors = FALSE)
Since df$id repeats, I want to aggregate df by df$id and apply sum to the other columns.
I did this:
require(data.table)
setDT(df)[, lapply(.SD, function(x) sum(x, na.rm = TRUE)), by = .(id)]
And I'm getting this:
   id v1 v2 v3 v4
1:  a  1  9  0  0
So column v3 is all NA's in df and hence gets a value of 0, which poses a problem for me: in such a case I'd like to keep the NA value, whereas in other cases (where the aggregation is on a mix of numerics and NA's) I'd like to remove the NA's, since otherwise the sum would be NA. As the example shows (df$v4), I do have columns that are all 0, and therefore I can't just replace the 0's with NA's in the aggregated data.frame.
In other words, my desired outcome is:
   id v1 v2 v3 v4
1:  a  1  9 NA  0
Any idea how to get data.table's .SD aggregation to achieve this?
df[, lapply(.SD, function(x) ifelse(all(is.na(x)), NA, sum(x, na.rm = TRUE))), by = .(id)]
   id v1 v2 v3 v4
1:  a  1  9 NA  0
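
The 0 in the original attempt comes from sum() over an empty set: once na.rm = TRUE drops every value, nothing is left to add. The same idea as the answer above can also be written as a small named helper. The sketch below is only one possible variant, not the answerer's code: the name sum_or_na and the two-group sample table are made up for illustration, and returning NA_real_ with as.numeric() is an assumption on my part so that every group yields the same column type.

```r
library(data.table)

# Hypothetical helper (not from the original post): returns NA when every
# value is missing, otherwise the sum of the non-missing values.
sum_or_na <- function(x) {
  if (all(is.na(x))) NA_real_ else as.numeric(sum(x, na.rm = TRUE))
}

# Made-up sample data with two ids, so the helper is applied per group.
dt <- data.table(
  id = c("a", "a", "a", "b", "b"),
  v1 = c(NA, 1, NA, 4, NA),
  v3 = c(NA, NA, NA, NA, 2)
)

# Apply the helper to every non-grouping column, grouped by id.
res <- dt[, lapply(.SD, sum_or_na), by = id]
print(res)
```

Here v3 for id "a" should stay NA because all three of its values are missing, while the mixed columns are summed with the NA's dropped. A plain if/else is used instead of ifelse() because the condition is a single TRUE/FALSE per group.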