Sum based on variable names in R
Say df contains id, gender, several numeric variables (x1 to x5), and max1, max2, max3, where
max1 = name of the variable holding the largest value among x1, x2, x3, x4, x5
max2 = name of the variable holding the second-largest value among x1, x2, x3, x4, x5
max3 = name of the variable holding the third-largest value among x1, x2, x3, x4, x5
### generate data
set.seed(123)
id <- c(1,2,3,4,5,6,7,8,9,10)
gender <- c("m", "m", "m", "f", "f", "m", "m", "f", "f", "m")
x1 <- rnorm(10, 0, 1)
x2 <- rnorm(10, 0, 1)
x3 <- rnorm(10, 0, 1)
x4 <- rnorm(10, 0, 1)
x5 <- rnorm(10, 0, 1)
df <- data.frame(id, gender, x1, x2, x3, x4, x5)

### for each row, names of the three largest values among x1..x5
maxes <- t(sapply(1:nrow(df), function(i) {
  names(sort(df[i,3:7], decreasing=TRUE)[1:3])
}))
colnames(maxes) <- c("max1","max2", "max3")
df <- cbind(df, maxes)

Now I need to create a new column (call it m_sum) that holds the sum of the values in the columns named by max1 and max2.
For example, for id=1, max1 = x2 and max2 = x4, so m_sum should equal 1.2240818 + 0.42646422 = 1.650546.
How can I do this with a single call to apply?
df$m_sum <- apply(df, 1, function(x) as.double(x[x["max1"]]) + as.double(x[x["max2"]]))
#[1] 1.65054602 0.15189652 2.45383397 3.04708946 2.02954308 3.50197809 1.39170465 0.09146139 1.48132102
#[10] 1.17044583
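A note on the apply() call above: because df mixes character and numeric columns, apply() converts it to a character matrix first, so every row reaches the function as strings. That is why as.double() is needed, and the numbers may be rounded during that conversion. As an alternative, here is a minimal sketch that keeps the values numeric by using matrix indexing. The helper names vars, x and rows are my own and not part of the question, and max1/max2 are built with apply() on the numeric columns only rather than with the row-wise sort from the question.

```r
# Rebuild the example data from the question
set.seed(123)
df <- data.frame(
  id = 1:10,
  gender = c("m", "m", "m", "f", "f", "m", "m", "f", "f", "m"),
  x1 = rnorm(10), x2 = rnorm(10), x3 = rnorm(10),
  x4 = rnorm(10), x5 = rnorm(10)
)

# Names of the numeric columns to rank (assumed to be x1..x5)
vars <- paste0("x", 1:5)

# For each row, the names of the three largest values
maxes <- t(apply(df[vars], 1, function(row) {
  names(sort(row, decreasing = TRUE))[1:3]
}))
colnames(maxes) <- c("max1", "max2", "max3")
df <- cbind(df, maxes)

# Numeric matrix of the values, so nothing is coerced to character
x <- as.matrix(df[vars])
rows <- seq_len(nrow(df))

# Look up (row, column) pairs: match() turns a column name into its position
# as.character() guards against max1/max2 being stored as factors
df$m_sum <- x[cbind(rows, match(as.character(df$max1), vars))] +
            x[cbind(rows, match(as.character(df$max2), vars))]
```

Indexing a matrix with a two-column matrix of (row, column) positions picks one cell per row, so this is vectorized and does not need apply() for the summing step at all.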