r - Using a tibble to apply linear models by group to new data -

September 15, 2011

Let's say I have two datasets for the same group of irises from two years:

# Create data for reproducible results.
iris.2007 <- iris
iris.2008 <- iris
iris.2008[1:4] <- 2 * iris.2008[1:4]  # let's make the 2008 data different

I want to fit a separate linear model for each species in the 2007 data, and I can do this:

library(dplyr)  # %>%, group_by(), mutate()
library(tidyr)  # nest()
library(purrr)  # map()

# First nest by Species.
iris.2007.nested <- iris.2007 %>%
  group_by(Species) %>%
  nest()

# Apply the linear model call by group using the data.
iris.2007.nested <- iris.2007.nested %>%
  mutate(models = map(data, ~ lm(Petal.Length ~ Petal.Width, data = .)))

When I look at the results, they make sense as a nicely organized tibble.

head(iris.2007.nested)
# A tibble: 3 × 3
     Species              data   models
      <fctr>            <list>   <list>
1     setosa <tibble [50 × 4]> <S3: lm>
2 versicolor <tibble [50 × 4]> <S3: lm>
3  virginica <tibble [50 × 4]> <S3: lm>

Now let's do the same thing for the 2008 data.

# First nest by Species.
iris.2008.nested <- iris.2008 %>%
  group_by(Species) %>%
  nest()

# Apply the linear model call by group using the data.
iris.2008.nested <- iris.2008.nested %>%
  mutate(models = map(data, ~ lm(Petal.Length ~ Petal.Width, data = .)))

Again, we end up with a nice tibble.

head(iris.2008.nested)
# A tibble: 3 × 3
     Species              data   models
      <fctr>            <list>   <list>
1     setosa <tibble [50 × 4]> <S3: lm>
2 versicolor <tibble [50 × 4]> <S3: lm>
3  virginica <tibble [50 × 4]> <S3: lm>

Now I want to use the linear models from the 2008 data to predict results using the 2007 data. I was thinking the best way to do this would be to combine the two datasets (retaining the group structure), but here is what happens when I try to merge the two nested tibbles:

iris.both.nested <- merge(iris.2007.nested, iris.2008.nested, by = 'Species')

As you can see below, the result no longer seems to have the same format as the individual tibbles above. Specifically, the organization is hard to discern (note that I am not including the full output in this chunk, just enough to give the idea).

head(iris.both.nested)
     Species
1     setosa
2 versicolor
3  virginica

data.x
1 5.1, 4.9, 4.7, 4.6, 5.0, 5.4, 4.6, 5.0, ...
... <truncated>
1 1.327563, 0.5464903, -0.03686145, -0.03686145, -0.1368614, 0.06313855, ...
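One likely reason for the untidy printing is that base `merge()` returns a plain data.frame, which prints the full contents of list-columns, whereas a tibble summarizes them. A minimal sketch of the same combination with `dplyr::inner_join()`, which keeps the tibble class, might look like the following. The helper name `fit_by_species` is hypothetical and only there to avoid repeating the nesting code; the `ungroup()` call is an assumption added so the result behaves the same across tidyr versions.

```r
library(dplyr)
library(tidyr)
library(purrr)

iris.2007 <- iris
iris.2008 <- iris
iris.2008[1:4] <- 2 * iris.2008[1:4]

# Hypothetical helper: nest by Species and fit one lm per group.
fit_by_species <- function(df) {
  df %>%
    group_by(Species) %>%
    nest() %>%
    ungroup() %>%
    mutate(models = map(data, ~ lm(Petal.Length ~ Petal.Width, data = .x)))
}

iris.2007.nested <- fit_by_species(iris.2007)
iris.2008.nested <- fit_by_species(iris.2008)

# inner_join() returns a tibble, so the list-columns still print compactly.
# The suffixes reproduce the data.x / models.y names that merge() creates.
iris.both.nested <- inner_join(
  iris.2007.nested, iris.2008.nested,
  by = "Species", suffix = c(".x", ".y")
)

iris.both.nested
```

The joined object has one row per species with the columns `data.x`, `models.x`, `data.y` and `models.y`, so the rest of the workflow can stay unchanged.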

And although I can still apparently use the models fitted to the 2008 data (as models.y) on the data from 2007 (as data.x):

iris.both.nested.pred <- iris.both.nested %>%
  mutate(pred = map2(models.y, data.x, predict))

the result is again not a nicely organized tibble (again, not showing the full output):

head(iris.both.nested.pred)
     Species
1     setosa
2 versicolor
3  virginica

data.x
1 5.1, 4.9, 4.7, 4.6, 5.0, 5.4, ...
... <truncated>
1 1.327563, 0.5464903, -0.03686145, -0.03686145, -0.1368614, ...
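The prediction step itself appears sound: `map2(models.y, data.x, predict)` passes each 2007 data set as the second positional argument of `predict()`, which for an `lm` object is `newdata`. A sketch that makes this explicit and then flattens the result into one long tibble could look like this. It assumes tidyr 1.0.0 or later for the `unnest(c(...))` form, and `fit_by_species` is again a hypothetical helper, not something from the original post.

```r
library(dplyr)
library(tidyr)
library(purrr)

iris.2007 <- iris
iris.2008 <- iris
iris.2008[1:4] <- 2 * iris.2008[1:4]

# Hypothetical helper: nest by Species and fit one lm per group.
fit_by_species <- function(df) {
  df %>%
    group_by(Species) %>%
    nest() %>%
    ungroup() %>%
    mutate(models = map(data, ~ lm(Petal.Length ~ Petal.Width, data = .x)))
}

iris.both.nested <- inner_join(
  fit_by_species(iris.2007), fit_by_species(iris.2008),
  by = "Species", suffix = c(".x", ".y")
)

iris.pred <- iris.both.nested %>%
  # .x is a 2008 model, .y is the 2007 data for the same species.
  mutate(pred = map2(models.y, data.x, ~ unname(predict(.x, newdata = .y)))) %>%
  select(Species, data.x, pred) %>%
  # One row per original 2007 observation, with its prediction alongside.
  unnest(c(data.x, pred))

iris.pred
```

Naming `newdata` guards against silently getting the fitted values back, which is what `predict()` returns when no new data reaches it.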

So my question is: is this process working even though the tibbles become strangely organized after the merge? Or am I missing something? Thanks!

I would double nest first, and apply the models later.

# Data
iris.2007 <- iris
iris.2008 <- iris
iris.2008[1:4] <- 2 * iris.2008[1:4]

# Stack both years, tagging each row with its data set name.
joined <- bind_rows(
  cbind(dset = rep("iris.2007", length(iris.2007$Species)), iris.2007),
  cbind(dset = rep("iris.2008", length(iris.2008$Species)), iris.2008)
)

# Double nesting: first by data set, then by Species within each data set.
joined_nested <- joined %>%
  group_by(dset) %>%
  nest(.key = "data1") %>%
  mutate(data1 = map(data1, ~ .x %>% group_by(Species) %>% nest()))

# Apply the linear model call by group using the data.
joined_nested_models <- joined_nested %>%
  mutate(data1 = map(data1, ~ .x %>%
    mutate(models = map(data, ~ lm(Petal.Length ~ Petal.Width, data = .)))))

joined_nested_models %>% unnest(data1)
# # A tibble: 6 × 4
#        dset    Species              data   models
#       <chr>     <fctr>            <list>   <list>
# 1 iris.2007     setosa <tibble [50 × 4]> <S3: lm>
# 2 iris.2007 versicolor <tibble [50 × 4]> <S3: lm>
# 3 iris.2007  virginica <tibble [50 × 4]> <S3: lm>
# 4 iris.2008     setosa <tibble [50 × 4]> <S3: lm>
# 5 iris.2008 versicolor <tibble [50 × 4]> <S3: lm>
# 6 iris.2008  virginica <tibble [50 × 4]> <S3: lm>

Which is a tidier version of inner_join.
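The answer stops before the prediction step. One way to finish from the stacked layout, sketched here under assumptions, is to nest by `dset` and `Species` in a single step and then pair each 2008 model with the 2007 data for the same species. The names `models_long` and `pred_2007` are hypothetical, and the single-step nesting is a simplification of the double nesting above rather than the answer's own code.

```r
library(dplyr)
library(tidyr)
library(purrr)

iris.2007 <- iris
iris.2008 <- iris
iris.2008[1:4] <- 2 * iris.2008[1:4]

# Stack both years; .id stores the argument names in a "dset" column.
models_long <- bind_rows(iris.2007 = iris.2007, iris.2008 = iris.2008,
                         .id = "dset") %>%
  group_by(dset, Species) %>%
  nest() %>%
  ungroup() %>%
  mutate(models = map(data, ~ lm(Petal.Length ~ Petal.Width, data = .x)))

# Pair each 2008 model with the 2007 data for the same species.
pred_2007 <- models_long %>%
  filter(dset == "iris.2008") %>%
  select(Species, models) %>%
  inner_join(
    models_long %>% filter(dset == "iris.2007") %>% select(Species, data),
    by = "Species"
  ) %>%
  mutate(pred = map2(models, data, ~ unname(predict(.x, newdata = .y))))

pred_2007
```

Every intermediate object stays a tibble here, so the list-columns keep printing as compact summaries.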
