function - Using optimize() to find the shortest interval that takes 95% area under a curve in R -

February 15, 2014

background:

i have curve y-values produced small r function below (neatly annotated). if run entire r code, see curve (but remember, it's function if changed argument values, different curve):

question:

obviously, 1 can determine/assume many intervals cover/take 95% of total area under curve. using, optimize(), how can find shortest (in x-value units) of these many possible 95% intervals? corresponding x-values the 2 ends of shortest 95% interval?

note: idea of shortest interval uni-modal curve mine makes sense. in reality, shortest 1 1 tends toward middle height (y-value) larger, x-value doesn't need large intended interval cover/take 95% of total area under curve.

here r code (please run entire code):

ppp <- function(f, n, df1, df2, petasq, alpha, beta) {   pp <- function(petasq) dbeta(petasq, alpha, beta)  ll <- function(petasq) df(f, df1, df2, (petasq * n) / (1 - petasq) )   marg <- integrate(function(x) pp(x)*ll(x), 0, 1)[[1]]  po <- function(x) pp(x)*ll(x) / marg return(po(petasq) )  } ## @@@ end of r function.  # use function above y-values plot:  petasq  <- seq(0, 1, = .0001) ## these x-values plot f  <- 30       # function needed argument df1 <- 3       # function needed argument df2 <- 108     # function needed argument n  <- 120      # function needed argument alpha = 5      # function needed argument beta = 4       # function needed argument   ## use ppp() function y-values x-value range above: y.values <- ppp(f, n, df1, df2, petasq, alpha, beta)  ## plot petasq (as x-values) against y.values: plot(petasq, y.values, ty="l", lwd = 3 )

based on revised question, found optimization minimizes shortest distance (in x-value units) between left , right boundaries:

ppp <- function(petasq, f, n, df1, df2, alpha, beta) {   pp <- function(petasq) dbeta(petasq, alpha, beta)  ll <- function(petasq) df(f, df1, df2, (petasq * n) / (1 - petasq) )   marg <- integrate(function(x) pp(x)*ll(x), 0, 1)[[1]]  po <- function(x) pp(x)*ll(x) / marg return(po(petasq) ) }  petasq  <- seq(0, 1, = .0001) ## these x-values plot f  <- 30       # function needed argument df1 <- 3       # function needed argument df2 <- 108     # function needed argument n  <- 120      # function needed argument alpha = 5      # function needed argument beta = 4       # function needed argument  optim_func <- function(x_left) {     int_function <- function(petasq) {         ppp(petasq, f=f, n=n, df1=df1, df2=df2, alpha=alpha, beta=beta)     }      # every left value, find corresponding right value gives 95% area.        find_95_right <- function(x_right) {         (0.95 - integrate(int_function, lower=x_left, upper=x_right, subdivisions = 10000)$value)^2     }     x_right_obj <- optimize(f=find_95_right, interval=c(0.5,1))     if(x_right_obj$objective > .machine$double.eps^0.25) return(100)      #return distance between left , right     return(x_right_obj$minimum - x_left) }  #minimize distance between left , right x_left <- optimize(f=optim_func, interval=c(0.30,0.40))$minimum find_95_right <- function(x_right) {     (0.95 - integrate(int_function, lower=x_left, upper=x_right, subdivisions = 10000)$value)^2 }     int_function <- function(petasq) {         ppp(petasq, f=f, n=n, df1=df1, df2=df2, alpha=alpha, beta=beta)     } x_right <- optimize(f=find_95_right, interval=c(0.5,1))$minimum

see comments in code. satisfies question :) results:

> x_right [1] 0.5409488 > x_left [1] 0.3201584

also, can plot distance between left , right function of left boundary:

left_x_values <- seq(0.30, 0.335, 0.0001) distance <- sapply(left_x_values, optim_func)  plot(left_x_values, distance, type="l")

Search This Blog

MOno