R - Collapsing repeated observations, counting the sum of a given variable up to a certain point
id   bleed  episodes
j1   0      0
j1   0      1
j1   0      1
j1   yes    0
j2   0      0
j2   0      1
j2   0      1
j2   0      1
j2   yes    0
j2   0      0
j3   0      1
j3   0      1
j3   0      0
j3   0      1
j3   yes    0
j3   0      0
I want to collapse the data and count the number of episodes before a bleed occurs for every individual, like this:
id   episodes
j1   2
j2   3
j3   3
The observations were made at different times. I didn't include a time variable, but they are weekly.
With the sample input:
dd <- structure(list(
  id = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L),
                 .Label = c("j1", "j2", "j3"), class = "factor"),
  bleed = structure(c(1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 1L),
                    .Label = c("0", "yes"), class = "factor"),
  episodes = c(0L, 1L, 1L, 0L, 0L, 1L, 1L, 1L, 0L, 0L, 1L, 1L, 0L, 1L, 0L, 0L)),
  .Names = c("id", "bleed", "episodes"),
  class = "data.frame", row.names = c(NA, -16L))
you can accomplish this task with dplyr:
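library(dplyr)

dd %>%
  group_by(id) %>%
  mutate(bleed_count = cumsum(bleed == "yes")) %>%  # running count of bleeds within each id
  filter(bleed_count == 0) %>%                      # keep only rows before the first bleed
  summarize(episodes = sum(episodes))               # total episodes per id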
This uses cumsum() on a Boolean value to track when a bleed occurs, then sums the episode values before the first bleed.
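To see what the cumsum() step is doing, here is a minimal base R sketch using only the j1 rows from the sample data. The standalone vectors `bleed` and `episodes` are made up for illustration; they are not part of the original answer.

```r
# j1's rows from the sample data (standalone vectors, for illustration only)
bleed    <- c("0", "0", "0", "yes")
episodes <- c(0L, 1L, 1L, 0L)

# TRUE/FALSE is treated as 1/0, so the running total stays at 0
# until the first "yes" and is at least 1 from that row onward
bleed_count <- cumsum(bleed == "yes")
print(bleed_count)
# [1] 0 0 0 1

# keep only the rows before the first bleed and add up their episodes
before_bleed <- sum(episodes[bleed_count == 0])
print(before_bleed)
# [1] 2
```

Because the running total never drops back to 0, rows recorded after the first bleed are excluded as well, which appears to be what the question asks for.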