Getting percentage of character variables and re-group small parts in R


I have been trying to write a small piece of code that can:

  • take a character variable ->
  • get the percentage of each possible value taken by the variable ->
  • rename the values with small percentages to "other" instead of their original value.

I am working in R. For example:

# toy data x:
x <- c("other","other","other","","office","other","other",
       "other","other","sales","","office","other",
       "mgr","other","other","mgr","","other","office",
       "other","profexe","mgr","mgr","other")

x_freq <- plyr::count(x)
names(x_freq) <- c("modality","count")
x_freq$prob <- x_freq$count/sum(x_freq$count)
small <- x_freq$modality[...]

The ... stands for a condition: if a value's probability does not reach a given level, small takes that value's name so it can be renamed to "other". My code is not neat or clean, so I wonder if there is a simpler way to code it.

How about writing a function:

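small_to_other <- function(x, min.fraction=.05) {
    # proportion of each distinct value in x
    counts <- table(x)/length(x)
    # replace every value whose proportion is below the threshold
    x[x %in% names(counts)[counts<min.fraction]] <- "other"
    x
}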

Here the default is set to 5%, so any category that makes up less than 5% of the values gets renamed to "other". You can call

small_to_other(x) # changes "sales" and "profexe" to "other"

If you wanted to get rid of everything below 15%, you can do

small_to_other(x, .15) # changes "sales", "profexe", "office" and "" to "other"
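
To check what the function did, you can compare the proportions before and after regrouping. This is only a minimal sketch: it reuses the toy x from the question and the function above, and the names before and after are made up for illustration.

```r
# Toy data from the question
x <- c("other","other","other","","office","other","other",
       "other","other","sales","","office","other",
       "mgr","other","other","mgr","","other","office",
       "other","profexe","mgr","mgr","other")

# Function from the answer above
small_to_other <- function(x, min.fraction = .05) {
  counts <- table(x) / length(x)
  x[x %in% names(counts)[counts < min.fraction]] <- "other"
  x
}

# Share of each value before regrouping (hypothetical helper name)
before <- prop.table(table(x))
print(round(before, 2))

# Share of each value after regrouping at the default 5% threshold
after <- prop.table(table(small_to_other(x)))
print(round(after, 2))
```

After regrouping, no remaining category should fall below the threshold, because the small ones have been folded into "other".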
