function - Using optimize() to find the shortest interval that takes 95% area under a curve in R -
background:
i have curve y-values produced small r function below (neatly annotated). if run entire r code, see curve (but remember, it's function if changed argument values, different curve):
question:
obviously, 1 can determine/assume many intervals cover/take 95% of total area under curve. using, optimize(), how can find shortest (in x-value units) of these many possible 95% intervals? corresponding x-values the 2 ends of shortest 95% interval?
note: idea of shortest interval uni-modal curve mine makes sense. in reality, shortest 1 1 tends toward middle height (y-value) larger, x-value doesn't need large intended interval cover/take 95% of total area under curve.
here r code (please run entire code):
ppp <- function(f, n, df1, df2, petasq, alpha, beta) { pp <- function(petasq) dbeta(petasq, alpha, beta) ll <- function(petasq) df(f, df1, df2, (petasq * n) / (1 - petasq) ) marg <- integrate(function(x) pp(x)*ll(x), 0, 1)[[1]] po <- function(x) pp(x)*ll(x) / marg return(po(petasq) ) } ## @@@ end of r function. # use function above y-values plot: petasq <- seq(0, 1, = .0001) ## these x-values plot f <- 30 # function needed argument df1 <- 3 # function needed argument df2 <- 108 # function needed argument n <- 120 # function needed argument alpha = 5 # function needed argument beta = 4 # function needed argument ## use ppp() function y-values x-value range above: y.values <- ppp(f, n, df1, df2, petasq, alpha, beta) ## plot petasq (as x-values) against y.values: plot(petasq, y.values, ty="l", lwd = 3 )
based on revised question, found optimization minimizes shortest distance (in x-value units) between left , right boundaries:
ppp <- function(petasq, f, n, df1, df2, alpha, beta) { pp <- function(petasq) dbeta(petasq, alpha, beta) ll <- function(petasq) df(f, df1, df2, (petasq * n) / (1 - petasq) ) marg <- integrate(function(x) pp(x)*ll(x), 0, 1)[[1]] po <- function(x) pp(x)*ll(x) / marg return(po(petasq) ) } petasq <- seq(0, 1, = .0001) ## these x-values plot f <- 30 # function needed argument df1 <- 3 # function needed argument df2 <- 108 # function needed argument n <- 120 # function needed argument alpha = 5 # function needed argument beta = 4 # function needed argument optim_func <- function(x_left) { int_function <- function(petasq) { ppp(petasq, f=f, n=n, df1=df1, df2=df2, alpha=alpha, beta=beta) } # every left value, find corresponding right value gives 95% area. find_95_right <- function(x_right) { (0.95 - integrate(int_function, lower=x_left, upper=x_right, subdivisions = 10000)$value)^2 } x_right_obj <- optimize(f=find_95_right, interval=c(0.5,1)) if(x_right_obj$objective > .machine$double.eps^0.25) return(100) #return distance between left , right return(x_right_obj$minimum - x_left) } #minimize distance between left , right x_left <- optimize(f=optim_func, interval=c(0.30,0.40))$minimum find_95_right <- function(x_right) { (0.95 - integrate(int_function, lower=x_left, upper=x_right, subdivisions = 10000)$value)^2 } int_function <- function(petasq) { ppp(petasq, f=f, n=n, df1=df1, df2=df2, alpha=alpha, beta=beta) } x_right <- optimize(f=find_95_right, interval=c(0.5,1))$minimum see comments in code. satisfies question :) results:
> x_right [1] 0.5409488 > x_left [1] 0.3201584 also, can plot distance between left , right function of left boundary:
left_x_values <- seq(0.30, 0.335, 0.0001) distance <- sapply(left_x_values, optim_func) plot(left_x_values, distance, type="l") 

Comments
Post a Comment