sql - How to spread projects across active years? -
i have excel spread sheet on 10000 projects , each project has start date , end date different other. need spread projects across years active in order have exact count year. how do starting example?
start date end date entityno amount 4/1/2001 8/31/2012 1 500 1/1/2005 12/31/2007 2 100
the solution thought of difference between start date , end date (end date - start date + 1) in excel have duration in days each project. after getting number of days, e.g., 4100 days, divide duration total number of days in year(365.25 days) give me range of years got stuck when came assigning each project correct years.
expected output dataset of on 10,000 projects:
start date end date no of years entityno amount 4/1/2001 8/31/2012 11 1 500 2002 2003 2004 2005 2006 2007 upto 8/31/2012
followed project different entity number
start date end date entity no amount 1/1/2005 12/31/2007 2 100
output
years entity no amount 1/1/2005 2 100 2006 2 100 2/31/2007 2 100
this solution needs r package lubridate
.
library(tidyverse) library(lubridate) # create example data frame dat <- tribble( ~`start date`, ~`end date`, ~`entityno`, ~`amount`, "4/1/2001", "8/31/2012", 1, 500, "1/1/2005", "12/31/2007", 2, 100 ) dat %>% mutate(`start date` = mdy(`start date`), `end date` = mdy(`end date`)) %>% mutate(`start year` = year(`start date`), `end year` = year(`end date`)) %>% mutate(`no of years` = `end year` - `start year`) %>% select(`start date`, `end date`, `no of years`, entityno, amount)
or can use string manipulation method.
dat %>% mutate(`no of years` = as.numeric(substring(`end date`, nchar(`end date`) - 3)) - as.numeric(substring(`start date`, nchar(`start date`) - 3))) %>% select(`start date`, `end date`, `no of years`, entityno, amount)
Comments
Post a Comment