Type: | Package |
Title: | Visualization and Analysis of Statistical Measures of Confidence |
Version: | 1.9.1 |
Maintainer: | Christopher Weld <ceweld241@gmail.com> |
Imports: | graphics, stats, statmod, fitdistrplus, pracma, rootSolve, utils |
Description: | Enables: (1) plotting two-dimensional confidence regions, (2) coverage analysis of confidence region simulations, (3) calculating confidence intervals and the associated actual coverage for binomial proportions, (4) calculating the support values and the probability mass function of the Kaplan-Meier product-limit estimator, and (5) plotting the actual coverage function associated with a confidence interval for the survivor function from a randomly right-censored data set. Each is given in greater detail next. (1) Plots the two-dimensional confidence region for probability distribution parameters (supported distribution suffixes: cauchy, gamma, invgauss, logis, llogis, lnorm, norm, unif, weibull) corresponding to a user-given complete or right-censored dataset and level of significance. The crplot() algorithm plots more points in areas of greater curvature to ensure a smooth appearance throughout the confidence region boundary. An alternative heuristic plots a specified number of points at roughly uniform intervals along its boundary. Both heuristics build upon the radial profile log-likelihood ratio technique for plotting confidence regions given by Jaeger (2016) <doi:10.1080/00031305.2016.1182946>, and are detailed in a publication by Weld et al. (2019) <doi:10.1080/00031305.2018.1564696>. (2) Performs confidence region coverage simulations for a random sample drawn from a user-specified parametric population distribution, or for a user-specified dataset and point of interest with coversim(). (3) Calculates confidence interval bounds for a binomial proportion with binomTest(), calculates the actual coverage with binomTestCoverage(), and plots the actual coverage with binomTestCoveragePlot(). Calculates confidence interval bounds for the binomial proportion using an ensemble of constituent confidence intervals with binomTestEnsemble(). Calculates confidence interval bounds for the binomial proportion using a complete enumeration of all possible transitions from one actual coverage acceptance curve to another which minimizes the root mean square error for n <= 15 and follows the transitions for well-known confidence intervals for n > 15 using binomTestMSE(). (4) The km.support() function calculates the support values of the Kaplan-Meier product-limit estimator for a given sample size n using an induction algorithm described in Qin et al. (2023) <doi:10.1080/00031305.2022.2070279>. The km.outcomes() function generates a matrix containing all possible outcomes (all possible sequences of failure times and right-censoring times) of the value of the Kaplan-Meier product-limit estimator for a particular sample size n. The km.pmf() function generates the probability mass function for the support values of the Kaplan-Meier product-limit estimator for a particular sample size n, probability of observing a failure h at the time of interest expressed as the cumulative probability percentile associated with X = min(T, C), where T is the failure time and C is the censoring time under a random-censoring scheme. The km.surv() function generates multiple probability mass functions of the Kaplan-Meier product-limit estimator for the same arguments as those given for km.pmf(). (5) The km.coverage() function plots the actual coverage function associated with a confidence interval for the survivor function from a randomly right-censored data set for one or more of the following confidence intervals: Greenwood, log-minus-log, Peto, arcsine, and exponential Greenwood. The actual coverage function is plotted for a small number of items on test, stated coverage, failure rate, and censoring rate. The km.coverage() function can print an optional table containing all possible failure/censoring orderings, along with their contribution to the actual coverage function. |
Depends: | R (≥ 4.0.0) |
License: | GPL (≥ 2) |
Encoding: | UTF-8 |
RoxygenNote: | 7.2.3 |
Suggests: | knitr, rmarkdown, spelling |
VignetteBuilder: | knitr |
Language: | en-US |
NeedsCompilation: | no |
Packaged: | 2024-05-05 19:12:31 UTC; christopherweld |
Author: | Christopher Weld, Kexin Feng, Hayeon Park, Yuxin Qin, Xingyu Wang, Heather Sasinowska, Larry Leemis |
Repository: | CRAN |
Date/Publication: | 2024-05-05 21:10:15 UTC |
conf: Visualization and Analysis of Statistical Measures of Confidence
Description
Enables:
confidence region plots in two-dimensions corresponding to a user given dataset, level of significance, and parametric probability distribution (supported distribution suffixes: cauchy, gamma, invgauss, lnorm, llogis, logis, norm, unif, weibull),
coverage simulations (if a point of interest is within or outside of a confidence region boundary) for either random samples drawn from a user-specified parametric distribution or for a user-specified dataset and point of interest,
calculating confidence intervals and the associated actual coverage for binomial proportions,
calculating the support values and the probability mass function of the Kaplan-Meier product-limit estimator, and
plotting the actual coverage function for a randomly right-censored data set with exponential failure times and exponential censoring times.
Request from authors: Please properly cite any use of this package and/or its algorithms, which are detailed in the corresponding publications by Weld et al. (2019) <doi:10.1080/00031305.2018.1564696>, Park and Leemis (2019) <doi:10.1002/sim.8189>, Feng et al. (2022) <doi:10.1007/s00180-021-01183-3>, and Qin et al. (2023) <doi:10.1080/00031305.2022.2070279>. Additionally, we welcome and appreciate your feedback and insights as to how this resource is being leveraged to improve whatever it is you do. Please include your name and academic and/or business affiliation in your correspondence.
Details
This package includes the functions:
confidence intervals for binomial proportions: binomTest,
actual coverage calculation for binomial proportions: binomTestCoverage,
actual coverage plots for binomial proportions: binomTestCoveragePlot,
ensemble confidence intervals for binomial proportions: binomTestEnsemble,
minimum root mean square confidence intervals for binomial proportions: binomTestMSE,
confidence region coverage analysis: coversim,
confidence region plots: crplot,
actual coverage plot and table: km.coverage,
enumeration of Kaplan-Meier product-limit estimator outcomes: km.outcomes,
probability mass function of the Kaplan-Meier product-limit estimator: km.pmf,
Kaplan-Meier product-limit estimator support values: km.support, and
probability mass functions of the Kaplan-Meier product-limit estimator: km.surv.
Vignettes
The CRAN website https://CRAN.R-project.org/package=conf contains links for vignettes on the crplot, coversim, km.outcomes, km.pmf, km.support, and km.surv functions.
Acknowledgments
The lead author thanks The Omar Bradley Fellowship for Research in Mathematics for funding that partially supported this work.
Author(s)
Christopher Weld, Kexin Feng, Hayeon Park, Yuxin Qin, Xingyu Wang, Heather Sasinowska, Larry Leemis
Maintainer: Christopher Weld <ceweld241@gmail.com>
Confidence Intervals for Binomial Proportions
Description
Generates lower and upper confidence interval limits for a binomial proportion using different types of confidence intervals.
Usage
binomTest(n, x,
alpha = 0.05,
intervalType = "Clopper-Pearson")
Arguments
n | sample size
x | number of successes
alpha | significance level for confidence interval
intervalType | type of confidence interval used; either "Clopper-Pearson", "Wald", "Wilson-Score", "Jeffreys", "Agresti-Coull", "Arcsine", or "Blaker"
Details
Generates a lower and upper confidence interval limit for a binomial proportion using
various types of confidence intervals,
various sample sizes, and
various numbers of successes.
When the binomTest function is called, it returns a two-element vector in which
the first element is the lower bound of the confidence interval, and
the second element is the upper bound of the confidence interval.
This confidence interval is constructed by calculating lower and upper bounds associated with the confidence interval procedure specified by the intervalType argument. Lower bounds that are negative are set to 0 and upper bounds that are greater than 1 are set to 1.
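For illustration, the Clopper-Pearson bounds can be sketched directly from beta quantiles. The following is a minimal sketch (the helper name cpSketch is hypothetical, and this is not the package's internal code); the x = 0 and x = n cases show the clamping of bounds to [0, 1]:
## Clopper-Pearson bounds via beta quantiles (illustrative sketch)
cpSketch <- function(n, x, alpha = 0.05) {
  lower <- if (x == 0) 0 else qbeta(alpha / 2, x, n - x + 1)
  upper <- if (x == n) 1 else qbeta(1 - alpha / 2, x + 1, n - x)
  c(lower, upper)
}
cpSketch(10, 6)   ## compare with binomTest(10, 6)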
Author(s)
Hayeon Park (hpark031@gmail.com), Larry Leemis (leemis@math.wm.edu)
Examples
binomTest(10, 6)
binomTest(100, 30, intervalType = "Agresti-Coull")
Actual Coverage Calculation for Binomial Proportions
Description
Calculates the actual coverage of a confidence interval for a binomial proportion for a particular sample size n and a particular value of the probability of success p for several confidence interval procedures.
Usage
binomTestCoverage(n, p,
alpha = 0.05,
intervalType = "Clopper-Pearson")
Arguments
n | sample size
p | population probability of success
alpha | significance level for confidence interval
intervalType | type of confidence interval used; either "Clopper-Pearson", "Wald", "Wilson-Score", "Jeffreys", "Agresti-Coull", "Arcsine", or "Blaker"
Details
Calculates the actual coverage of a confidence interval procedure at a particular value of p for
various types of confidence intervals,
various probabilities of success p, and
various sample sizes n.
The actual coverage for a particular value of p, the probability of success of interest, is
c(p) = \sum_{x=0}^n I(x, p) {n \choose x} p^x (1-p)^{n-x},
where I(x, p) is an indicator function that determines whether a confidence interval covers p when X = x (see Vollset, 1993).
The binomial distribution with arguments size = n and prob = p has probability mass function
p(x) = {n \choose x} p^x (1-p)^{n-x}
for x = 0, 1, 2, \ldots, n.
The algorithm for computing the actual coverage for a particular probability of success begins by calculating all possible lower and upper bounds associated with the confidence interval procedure specified by the intervalType argument. The appropriate binomial probabilities are summed to determine the actual coverage at p.
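The sum above can be assembled directly from binomTest; the sketch below (an illustration with a hypothetical helper name, not the package's implementation) computes c(p) by summing the binomial probabilities of those x whose interval covers p:
## actual coverage c(p) assembled from binomTest intervals (illustrative sketch)
coverageSketch <- function(n, p, alpha = 0.05) {
  covered <- sapply(0:n, function(x) {
    ci <- binomTest(n, x, alpha = alpha)
    ci[1] <= p && p <= ci[2]        ## indicator I(x, p)
  })
  sum(dbinom(0:n, size = n, prob = p)[covered])
}
coverageSketch(10, 0.3)             ## compare with binomTestCoverage(10, 0.3)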
Author(s)
Hayeon Park (hpark031@gmail.com), Larry Leemis (leemis@math.wm.edu)
References
Vollset, S.E. (1993). Confidence Intervals for a Binomial Proportion. Statistics in Medicine, 12, 809-824.
Examples
binomTestCoverage(6, 0.4)
binomTestCoverage(n = 10, p = 0.3, alpha = 0.01, intervalType = "Wilson-Score")
Coverage Plots for Binomial Proportions
Description
Generates plots for the actual coverage of a binomial proportion using various types of confidence intervals. Plots the actual coverage for a given sample size and stated nominal coverage 1 - alpha.
Usage
binomTestCoveragePlot(n,
alpha = 0.05,
intervalType = "Clopper-Pearson",
plo = 0,
phi = 1,
clo = 1 - 2 * alpha,
chi = 1,
points = 5 + floor(250 / n),
showTrueCoverage = TRUE,
gridCurves = FALSE)
Arguments
n | sample size
alpha | significance level for confidence interval
intervalType | type of confidence interval used; either "Clopper-Pearson", "Wald", "Wilson-Score", "Jeffreys", "Agresti-Coull", "Arcsine", or "Blaker"
plo | lower limit for percentile (horizontal axis)
phi | upper limit for percentile (horizontal axis)
clo | lower limit for coverage (vertical axis)
chi | upper limit for coverage (vertical axis)
points | number of points plotted in each segment of the plot; if default, varies with 'n' (see above)
showTrueCoverage | logical; if TRUE (default), a solid red horizontal line is plotted at the height of the stated coverage
gridCurves | logical; if TRUE, the acceptance curves giving all possible actual coverage values are displayed as gray curves
Details
Generates an actual coverage plot for binomial proportions using
various types of confidence intervals, and
various sample sizes.
When the function is called with default arguments,
the horizontal axis is the percentile at which the coverage is evaluated,
the vertical axis is the actual coverage percentage at each percentile, that is, the probability that the true value at a percentile is contained in the corresponding confidence interval, and
the solid red line is the stated coverage of 1 - alpha.
The actual coverage for a particular value of p, the percentile of interest, is
c(p) = \sum_{x=0}^n I(x, p) {n \choose x} p^x (1-p)^{n-x},
where I(x, p) is an indicator function that determines whether a confidence interval covers p when X = x (see Vollset, 1993).
The binomial distribution with arguments size = n and prob = p has probability mass function
p(x) = {n \choose x} p^x (1-p)^{n-x}
for x = 0, 1, \ldots, n.
The algorithm for plotting the actual coverage begins by calculating all possible lower and upper bounds associated with the confidence interval procedure specified by the intervalType argument. These values are concatenated into a vector which is sorted. Negative values and values that exceed 1 are removed from this vector. These values are the breakpoints in the actual coverage function. The points argument gives the number of points plotted on each segment of the graph of the actual coverage.
The plo and phi arguments can be used to expand or compress the plots horizontally. The clo and chi arguments can be used to expand or compress the plots vertically.
By default, the showTrueCoverage argument plots a solid horizontal red line at the height of the stated coverage. The actual coverage is plotted with solid black lines for each segment of the actual coverage.
The gridCurves argument is assigned a logical value which indicates whether the acceptance curves giving all possible actual coverage values should be displayed as gray curves.
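The breakpoint construction can be mimicked in a few lines; this sketch (an illustration, not the package's plotting code) collects all interval bounds as breakpoints and evaluates the coverage at segment midpoints:
## breakpoints of the actual coverage function (illustrative sketch)
n <- 10; alpha <- 0.05
bounds <- unlist(lapply(0:n, function(x) binomTest(n, x, alpha = alpha)))
breaks <- sort(unique(c(0, bounds, 1)))           ## discontinuity points of c(p)
mids   <- (head(breaks, -1) + tail(breaks, -1)) / 2
covs   <- sapply(mids, function(p) binomTestCoverage(n, p, alpha = alpha))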
Author(s)
Hayeon Park (hpark031@gmail.com), Larry Leemis (leemis@math.wm.edu)
References
Vollset, S.E. (1993). Confidence Intervals for a Binomial Proportion. Statistics in Medicine, 12, 809–824.
Examples
binomTestCoveragePlot(6)
binomTestCoveragePlot(10, intervalType = "Wilson-Score", clo = 0.8)
binomTestCoveragePlot(n = 100, intervalType = "Wald", clo = 0, chi = 1, points = 30)
Ensemble Confidence Intervals for Binomial Proportions
Description
Generates lower and upper confidence interval limits for a binomial proportion using an ensemble of confidence intervals.
Usage
binomTestEnsemble(n, x,
alpha = 0.05,
CP = TRUE,
WS = TRUE,
JF = TRUE,
AC = TRUE,
AR = TRUE)
Arguments
n | sample size
x | number of successes
alpha | significance level for confidence interval
CP | logical; if TRUE (default), the Clopper-Pearson confidence interval is included in the ensemble
WS | logical; if TRUE (default), the Wilson-Score confidence interval is included in the ensemble
JF | logical; if TRUE (default), the Jeffreys confidence interval is included in the ensemble
AC | logical; if TRUE (default), the Agresti-Coull confidence interval is included in the ensemble
AR | logical; if TRUE (default), the Arcsine confidence interval is included in the ensemble
Details
Generates lower and upper confidence interval limits for a binomial proportion using
various sample sizes,
various numbers of successes, and
various combinations of confidence intervals.
When the binomTestEnsemble function is called, it returns a two-element vector in which
the first element is the lower bound of the Ensemble confidence interval, and
the second element is the upper bound of the Ensemble confidence interval.
To construct an Ensemble confidence interval that attains an actual coverage that is close to the stated coverage, the five constituent confidence interval procedures can be combined. Since these intervals vary in width, the lower limits and the actual coverage of the constituent confidence intervals at the maximum likelihood estimator are calculated. Likewise, the upper limits and the actual coverage of the constituent confidence intervals at the maximum likelihood estimator are calculated. The centroids of the lower and upper constituent confidence intervals for points falling below and above the stated coverage are connected with a line segment. The point of intersection of these line segments and the stated coverage gives the lower and upper bound of the Ensemble confidence interval. Special cases to this approach are given in the case of (a) the actual coverages all fall above or below the stated coverage, and (b) the slope of the line connecting the centroids is infinite.
If only one of the logical arguments is TRUE, the code returns a simple confidence interval of that one procedure.
The Wald confidence interval is omitted because it degenerates in actual coverage for x = 0 and x = n.
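A quick way to see the ensemble at work is to compare it against its constituents, for example:
## constituent intervals versus the ensemble interval at n = 10, x = 3
sapply(c("Clopper-Pearson", "Wilson-Score", "Jeffreys", "Agresti-Coull", "Arcsine"),
       function(type) binomTest(10, 3, intervalType = type))
binomTestEnsemble(10, 3)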
Author(s)
Hayeon Park (hpark031@gmail.com), Larry Leemis (leemis@math.wm.edu)
References
Park, H., Leemis, L. (2019), "Ensemble Confidence Intervals for Binomial Proportions", Statistics in Medicine, 38 (18), 3460-3475.
Examples
binomTestEnsemble(10, 3)
binomTestEnsemble(100, 82, CP = FALSE, AR = FALSE)
binomTestEnsemble(33, 1, CP = FALSE, JF = FALSE, AC = FALSE, AR = FALSE)
RMSE-Minimizing Confidence Intervals for Binomial Proportions
Description
Generates lower and upper confidence interval limits for a binomial proportion that minimizes the root mean square error (RMSE) of the actual coverage function.
Usage
binomTestMSE(n, x,
alpha = 0.05,
smooth = 1,
showRMSE = TRUE,
showAll = FALSE)
Arguments
n | sample size
x | number of successes
alpha | significance level for confidence interval
smooth | smoothness index
showRMSE | a logical variable indicating whether to show the value of the RMSE
showAll | a logical variable indicating whether to show confidence intervals for all possible numbers of successes
Details
Generates lower and upper confidence interval limits for a binomial proportion for
various sample sizes,
various numbers of successes.
When the binomTestMSE
function is called, it returns a two-element vector in which
the first element is the lower bound of the RMSE-minimizing confidence interval, and
the second element is the upper bound of the RMSE-minimizing confidence interval.
An RMSE-minimizing two-sided 100 * (1 - alpha) percent confidence interval
for p is constructed from a random sample of size n from a Bernoulli(p)
population. The parameter x
gives the number of successes in the n
mutually independent Bernoulli trials. For n <= 15, all possible jumps
between acceptance curves associated with the actual coverage function are
enumerated based on their one-to-one relationship with the symmetric Dyck
paths. For each sequence of jumps between acceptance curves, the confidence
interval bounds that are returned are associated with discontinuities in the
actual coverage function that together result in the lowest possible RMSE. A
set of smoothness constraints that build on four existing non-conservative
confidence intervals (Wilson-score, Jeffreys, Arcsine, and Agresti-Coull) is
used if the smoothness index smooth
is set to one. These constraints
ensure that the RMSE-confidence interval achieves smoothness, a preferable
property of the binomial confidence interval that is related to lower bound
differences for adjacent values of x
. There is a trade-off between
the RMSE and the smoothness. For n > 100, smoothness is required. The RMSE
usually increases if the smoothness constraints are used. For n > 15, only
the symmetric Dyck paths associated with the Wilson-Score, Jeffreys, Arcsine,
and Agresti-Coull confidence interval procedures are used instead of
enumerating because the computation time increases in a factorial fashion in
n. The minimal RMSE is not guaranteed for n > 15 because another symmetric
Dyck path other than those associated with the four existing confidence
interval procedures might prove to be optimal. However, this procedure does
ensure a lower RMSE than any of the four existing confidence intervals for
all n.
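The objective being minimized can be approximated on a grid of p values; the sketch below (a grid approximation of the RMSE of the actual coverage function about the stated coverage, shown here for the Wilson-Score interval) illustrates the quantity involved:
## grid approximation of the RMSE of c(p) about the stated coverage
n <- 10; alpha <- 0.05
pgrid <- seq(0.001, 0.999, by = 0.001)
cp <- sapply(pgrid, function(p)
  binomTestCoverage(n, p, alpha = alpha, intervalType = "Wilson-Score"))
sqrt(mean((cp - (1 - alpha))^2))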
Author(s)
Kexin Feng (kfeng@caltech.edu), Larry Leemis (leemis@math.wm.edu), Heather Sasinowska (hdsasinowska@wm.edu)
References
Feng, K., Sasinowska, H., Leemis, L. (2022), "RMSE-Minimizing Confidence Interval for the Binomial Parameter", Computational Statistics, 37 (4), 2022, 1855-1885.
Examples
binomTestMSE(10, 3)
Confidence Region Coverage
Description
Creates a confidence region and determines coverage results for a corresponding point of interest. Iterates through a user-specified number of trials. Each trial uses a random dataset with user-specified parameters (default) or a user-specified dataset matrix ('n' samples per column, 'iter' columns) and returns the corresponding actual coverage results. See the CRAN website https://CRAN.R-project.org/package=conf for a link to a coversim vignette.
Usage
coversim(alpha, distn,
n = NULL,
iter = NULL,
dataset = NULL,
point = NULL,
seed = NULL,
a = NULL,
b = NULL,
kappa = NULL,
lambda = NULL,
mu = NULL,
s = NULL,
sigma = NULL,
theta = NULL,
heuristic = 1,
maxdeg = 5,
ellipse_n = 4,
pts = FALSE,
mlelab = TRUE,
sf = c(5, 5),
mar = c(4, 4.5, 2, 1.5),
xlab = "",
ylab = "",
main = "",
xlas = 0,
ylas = 0,
origin = FALSE,
xlim = NULL,
ylim = NULL,
tol = .Machine$double.eps ^ 1,
info = FALSE,
returnsamp = FALSE,
returnquant = FALSE,
repair = TRUE,
exact = FALSE,
showplot = FALSE,
delay = 0 )
Arguments
alpha | significance level; scalar or vector; resulting plot illustrates a 100(1 - alpha)% confidence region.
distn | distribution to fit the dataset to; accepted values: "cauchy", "gamma", "invgauss", "llogis", "lnorm", "logis", "norm", "unif", "weibull".
n | trial sample size (producing each confidence region); scalar or vector; needed if a dataset is not given.
iter | iterations (or replications) of individual trials per parameterization; needed if a dataset is not given.
dataset | a n-row, iter-column matrix of dataset values ('n' samples per column, 'iter' columns).
point | coverage is assessed relative to this point.
seed | random number generator seed.
a | distribution parameter (when applicable).
b | distribution parameter (when applicable).
kappa | distribution parameter (when applicable).
lambda | distribution parameter (when applicable).
mu | distribution parameter (when applicable).
s | distribution parameter (when applicable).
sigma | distribution parameter (when applicable).
theta | distribution parameter (when applicable).
heuristic | numeric value selecting method for plotting: 0 for elliptic-oriented point distribution, and 1 for smoothing boundary search heuristic.
maxdeg | maximum angle tolerance between consecutive plot segments in degrees.
ellipse_n | number of roughly equidistant confidence region points to plot using the elliptic-oriented point distribution (must be a multiple of four because its algorithm exploits symmetry in the quadrants of an ellipse).
pts | displays confidence region boundary points if TRUE.
mlelab | logical argument to include the maximum likelihood estimate coordinate point (default is TRUE).
sf | significant figures in axes labels specified using sf = c(x, y), where x and y give the significant digits for the horizontal and vertical axes, respectively.
mar | specifies margin values for par(mar = ...).
xlab | string specifying the horizontal axis label (applies to confidence region plots when showplot = TRUE).
ylab | string specifying the vertical axis label (applies to confidence region plots when showplot = TRUE).
main | string specifying the plot title (applies to confidence region plots when showplot = TRUE).
xlas | numeric value of 0, 1, 2, or 3 specifying the style of axis labels (see las in par).
ylas | numeric value of 0, 1, 2, or 3 specifying the style of axis labels (see las in par).
origin | logical argument to include the plot origin (applies to confidence region plots when showplot = TRUE).
xlim | two-element vector containing horizontal axis minimum and maximum values (applies to confidence region plots when showplot = TRUE).
ylim | two-element vector containing vertical axis minimum and maximum values (applies to confidence region plots when showplot = TRUE).
tol | the tol argument passed to the uniroot function.
info | logical argument to return coverage information in a list; includes alpha value(s), n value(s), and coverage and error results per iteration.
returnsamp | logical argument; if TRUE, returns an n-row, iter-column matrix of sample values.
returnquant | logical argument; if TRUE, returns an n-row, iter-column matrix of sample cdf values.
repair | logical argument to repair regions inaccessible using a radial angle from its MLE (multiple root azimuths).
exact | logical argument specifying if alpha value is adjusted to compensate for negative coverage bias in order to achieve (1 - alpha) coverage probability using previously recorded Monte Carlo simulation results; available for limited values of alpha (roughly <= 0.2–0.3), n (typically n = 4, 5, ..., 50) and distributions (distn suffixes: weibull, llogis, norm).
showplot | logical argument specifying if each coverage trial produces a plot.
delay | numeric value of delay (in seconds) between trials so its plot can be seen (applies when showplot = TRUE).
Details
Parameterizations for supported distributions are given following
the default axes convention in use by crplot
and coversim
, which are:
Distribution | Horizontal Axis | Vertical Axis
Cauchy | a | s
gamma | \theta | \kappa
inverse Gaussian | \mu | \lambda
log logistic | \lambda | \kappa
log normal | \mu | \sigma
logistic | \mu | \sigma
normal | \mu | \sigma
uniform | a | b
Weibull | \kappa | \lambda
Each respective distribution is defined below.
The Cauchy distribution for the real-numbered location parameter a, scale parameter s, and real-numbered x, has the probability density function
1 / (s \pi (1 + ((x - a) / s) ^ 2)).
The gamma distribution for shape parameter \kappa > 0, scale parameter \theta > 0, and x > 0, has the probability density function
1 / (\Gamma(\kappa) \theta ^ \kappa) x ^ {\kappa - 1} \exp(-x / \theta).
The inverse Gaussian distribution for mean \mu > 0, shape parameter \lambda > 0, and x > 0, has the probability density function
\sqrt{\lambda / (2 \pi x ^ 3)} \exp(-\lambda (x - \mu) ^ 2 / (2 \mu ^ 2 x)).
The log logistic distribution for scale parameter \lambda > 0, shape parameter \kappa > 0, and x > 0, has the probability density function
(\kappa \lambda) (x \lambda) ^ {\kappa - 1} / (1 + (\lambda x) ^ \kappa) ^ 2.
The log normal distribution for the real-numbered mean \mu of the logarithm, standard deviation \sigma > 0 of the logarithm, and x > 0, has the probability density function
1 / (x \sigma \sqrt{2 \pi}) \exp(-(\log x - \mu) ^ 2 / (2 \sigma ^ 2)).
The logistic distribution for the real-numbered location parameter \mu, scale parameter \sigma, and real-numbered x, has the probability density function
(1 / \sigma) \exp((x - \mu) / \sigma) (1 + \exp((x - \mu) / \sigma)) ^ {-2}.
The normal distribution for the real-numbered mean \mu, standard deviation \sigma > 0, and real-numbered x, has the probability density function
1 / \sqrt{2 \pi \sigma ^ 2} \exp(-(x - \mu) ^ 2 / (2 \sigma ^ 2)).
The uniform distribution for real-valued parameters a and b where a < b and a \le x \le b, has the probability density function
1 / (b - a).
The Weibull distribution for scale parameter \lambda > 0, shape parameter \kappa > 0, and x > 0, has the probability density function
\kappa (\lambda ^ \kappa) x ^ {\kappa - 1} \exp(-(\lambda x) ^ \kappa).
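A single coverage trial reduces to a likelihood ratio test; the sketch below (a conceptual illustration for the normal case, not the coversim source) checks whether the population parameter pair is covered by the confidence region:
## one conceptual coverage trial for the normal distribution
set.seed(1)
mu0 <- 2; sigma0 <- 3; n <- 30; alpha <- 0.1
x <- rnorm(n, mu0, sigma0)
loglik <- function(mu, sigma) sum(dnorm(x, mu, sigma, log = TRUE))
muhat <- mean(x); sigmahat <- sqrt(mean((x - muhat) ^ 2))   ## MLEs
W <- -2 * (loglik(mu0, sigma0) - loglik(muhat, sigmahat))
W <= qchisq(1 - alpha, df = 2)    ## TRUE when (mu0, sigma0) is covered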
Value
If the optional argument info = TRUE is included, then a list of coverage results is returned. That list includes alpha value(s), n value(s), and coverage and error results per iteration. Additionally, returnsamp = TRUE and/or returnquant = TRUE will result in an n-row, iter-column matrix of sample and/or sample cdf values.
Author(s)
Christopher Weld (ceweld241@gmail.com)
Lawrence Leemis (leemis@math.wm.edu)
References
C. Weld, A. Loh, L. Leemis (2020), "Plotting Two-Dimensional Confidence Regions", The American Statistician, Volume 72, Number 2, 156–168.
Examples
## assess actual coverage at various alpha = {0.5, 0.1} given n = 30 samples, completing
## 10 trials per parameterization (iter) for a normal(mean = 2, sd = 3) rv
coversim(alpha = c(0.5, 0.1), "norm", n = 30, iter = 10, mu = 2, sigma = 3)
## show plots for 5 iterations of 30 samples each from a Weibull(lambda = 1.5, kappa = 0.5)
coversim(0.5, "weibull", n = 30, iter = 5, lambda = 1.5, kappa = 0.5, showplot = TRUE,
origin = TRUE)
Plotting Two-Dimensional Confidence Regions
Description
Plots a two-dimensional confidence region for probability distribution parameters (supported distribution suffixes: cauchy, gamma, invgauss, lnorm, llogis, logis, norm, unif, weibull) corresponding to a user-given complete or right-censored dataset and level of significance. See the CRAN website https://CRAN.R-project.org/package=conf for a link to two crplot vignettes.
Usage
crplot(dataset, alpha, distn,
cen = rep(1, length(dataset)),
heuristic = 1,
maxdeg = 5,
ellipse_n = 4,
pts = TRUE,
mlelab = TRUE,
sf = NULL,
mar = c(4, 4.5, 2, 1.5),
xyswap = FALSE,
xlab = "",
ylab = "",
main = "",
xlas = 0,
ylas = 0,
origin = FALSE,
xlim = NULL,
ylim = NULL,
tol = .Machine$double.eps ^ 1,
info = FALSE,
maxcount = 30,
repair = TRUE,
jumpshift = 0.5,
jumpuphill = min(alpha, 0.01),
jumpinfo = FALSE,
showjump = FALSE,
showplot = TRUE,
animate = FALSE,
delay = 0.5,
exact = FALSE,
silent = FALSE )
Arguments
dataset | a vector of n data values.
alpha | significance level; resulting plot illustrates a 100(1 - alpha)% confidence region.
distn | distribution to fit the dataset to; accepted values: "cauchy", "gamma", "invgauss", "llogis", "lnorm", "logis", "norm", "unif", "weibull".
cen | a vector of binary values specifying if the corresponding data values are right-censored (0), or observed (1, default); its length must match length(dataset).
heuristic | numeric value selecting method for plotting: 0 for elliptic-oriented point distribution, and 1 for smoothing boundary search heuristic.
maxdeg | maximum angle tolerance between consecutive plot segments in degrees.
ellipse_n | number of roughly equidistant confidence region points to plot using the elliptic-oriented point distribution (must be a multiple of four because its algorithm exploits symmetry in the quadrants of an ellipse).
pts | displays confidence region boundary points identified if TRUE (default).
mlelab | logical argument to include the maximum likelihood estimate coordinate point (default is TRUE).
sf | significant figures in axes labels specified using sf = c(x, y), where x and y give the significant digits for the horizontal and vertical axes, respectively.
mar | specifies margin values for par(mar = ...).
xyswap | logical argument to swap the axes on which the distribution parameters are shown.
xlab | string specifying the x axis label.
ylab | string specifying the y axis label.
main | string specifying the plot title.
xlas | numeric value of 0, 1, 2, or 3 specifying the style of axis labels (see las in par).
ylas | numeric value of 0, 1, 2, or 3 specifying the style of axis labels (see las in par).
origin | logical argument to include the plot origin (default is FALSE).
xlim | two-element vector containing horizontal axis minimum and maximum values.
ylim | two-element vector containing vertical axis minimum and maximum values.
tol | the tol argument passed to the uniroot function.
info | logical argument to return plot information: the MLE is returned as a list; (x, y) plot point coordinates and corresponding phi angles (with respect to the MLE) are returned as a list.
maxcount | integer value specifying the number of smoothing search iterations before terminating.
repair | logical argument to repair regions inaccessible using a radial angle from its MLE due to multiple roots at select azimuth angles.
jumpshift | see vignette "conf Advanced Options" for details; location (as a fractional value between 0 and 1) along the vertical or horizontal "gap" (near an uncharted region) to locate a jump-center toward; can be either a scalar value (uniformly applied to all jump-centers) or vector of length four (with unique values for its respective quadrants, relative to the MLE).
jumpuphill | see vignette "conf Advanced Options" for details; significance level increase to alpha when locating a jump-center.
jumpinfo | logical argument to return plot information (see info) together with jump-center repair information.
showjump | logical argument specifying if jump-center repair reference points appear on the confidence region plot.
showplot | logical argument specifying if a plot is output; its default is TRUE.
animate | logical argument specifying if an animated plot build will display; the animation sequence is given in successive plots.
delay | numeric value of delay (in seconds) between successive plots when animate = TRUE.
exact | logical argument specifying if alpha value is adjusted to compensate for negative coverage bias to achieve (1 - alpha) coverage probability using previously recorded Monte Carlo simulation results; available for limited values of alpha (roughly <= 0.2–0.3), n (typically n = 4, 5, ..., 50) and distributions (distn suffixes: weibull, llogis, norm).
silent | logical argument specifying if console output should be suppressed.
Details
This function plots a confidence region for a variety of two-parameter distributions. It requires:
a vector of dataset values,
the level of significance (alpha), and
a population distribution to fit the data to.
Plots display according to the probability density function parameterization given later in this section. Two heuristics (and their associated combination) are available to plot confidence regions. Along with their descriptions, they are:
- Smoothing Boundary Search Heuristic (default). This heuristic plots more points in areas of greater curvature to ensure a smooth appearance throughout the confidence region boundary. Its maxdeg parameter specifies the maximum tolerable angle between three successive points. Lower values of maxdeg result in smoother plots, and its default value of 5 degrees provides adequate smoothing in most circumstances. Values of maxdeg \le 3 are not recommended due to their complicating implications to trigonometric numerical approximations near 0 and 1; their use may result in plot errors.
- Elliptic-Oriented Point Distribution. This heuristic allows the user to specify a number of points to plot along the confidence region boundary at roughly uniform intervals. Its name is derived from the technique it uses to choose these points, an extension of the Steiner generation of a non-degenerate conic section (also known as the parallelogram method), which identifies points along an ellipse that are approximately equidistant. To exploit the computational benefits of ellipse symmetry over its four quadrants, the ellipse_n value must be divisible by four.
By default, crplot
implements the smoothing boundary search heuristic. Alternatively,
the user can plot using the elliptic-oriented point distribution algorithm, or a combination
of them both. Combining the two techniques initializes the plot using the elliptic-oriented point
distribution algorithm, and then subsequently populates additional points in areas of high curvature
(those outside of the maximum angle tolerance parameterization) in accordance with the smoothing
boundary search heuristic. This combination results when the smoothing boundary search heuristic
is specified in conjunction with an ellipse_n
value greater than four.
Both of the aforementioned heuristics use a radial profile log likelihood function to identify
points along the confidence region boundary. It cuts the log likelihood function in a directional
azimuth from its MLE, and locates the associated confidence region boundary point using the
asymptotic results associated with the ratio test statistic -2 [\log L(\theta) - \log L(\hat{\theta})]
which converges in distribution to the chi-square distribution with two degrees of freedom (for
a two parameter distribution).
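The following sketch (a conceptual illustration for a normal model, not the crplot source) shows how one boundary point is located along a single azimuth by root-finding on the radial profile log likelihood:
## one confidence region boundary point along azimuth phi (illustrative sketch)
x <- c(17.88, 28.92, 33.00, 41.52, 42.12)     ## small illustrative sample
loglik <- function(mu, sigma) sum(dnorm(x, mu, sigma, log = TRUE))
muhat <- mean(x); sigmahat <- sqrt(mean((x - muhat) ^ 2))
cutoff <- qchisq(0.95, df = 2) / 2
phi <- pi / 4                                 ## one radial direction
g <- function(r) loglik(muhat + r * cos(phi), sigmahat + r * sin(phi)) -
                 loglik(muhat, sigmahat) + cutoff
r <- uniroot(g, c(0, 100))$root               ## radius to the boundary
c(muhat + r * cos(phi), sigmahat + r * sin(phi))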
The default axes convention in use by crplot is
Distribution | Horizontal Axis | Vertical Axis
Cauchy | a | s
gamma | \theta | \kappa
inverse Gaussian | \mu | \lambda
log logistic | \lambda | \kappa
log normal | \mu | \sigma
logistic | \mu | \sigma
normal | \mu | \sigma
uniform | a | b
Weibull | \kappa | \lambda
where each respective distribution is defined below.
The Cauchy distribution for the real-numbered location parameter a, scale parameter s, and real-numbered x, has the probability density function
1 / (s \pi (1 + ((x - a) / s) ^ 2)).
The gamma distribution for shape parameter \kappa > 0, scale parameter \theta > 0, and x > 0, has the probability density function
1 / (\Gamma(\kappa) \theta ^ \kappa) x ^ {\kappa - 1} \exp(-x / \theta).
The inverse Gaussian distribution for mean \mu > 0, shape parameter \lambda > 0, and x > 0, has the probability density function
\sqrt{\lambda / (2 \pi x ^ 3)} \exp(-\lambda (x - \mu) ^ 2 / (2 \mu ^ 2 x)).
The log logistic distribution for scale parameter \lambda > 0, shape parameter \kappa > 0, and x \ge 0, has the probability density function
(\kappa \lambda) (x \lambda) ^ {\kappa - 1} / (1 + (\lambda x) ^ \kappa) ^ 2.
The log normal distribution for the real-numbered mean \mu of the logarithm, standard deviation \sigma > 0 of the logarithm, and x > 0, has the probability density function
1 / (x \sigma \sqrt{2 \pi}) \exp(-(\log x - \mu) ^ 2 / (2 \sigma ^ 2)).
The logistic distribution for the real-numbered location parameter \mu, scale parameter \sigma, and real-numbered x, has the probability density function
(1 / \sigma) \exp((x - \mu) / \sigma) (1 + \exp((x - \mu) / \sigma)) ^ {-2}.
The normal distribution for the real-numbered mean \mu, standard deviation \sigma > 0, and real-numbered x, has the probability density function
1 / \sqrt{2 \pi \sigma ^ 2} \exp(-(x - \mu) ^ 2 / (2 \sigma ^ 2)).
The uniform distribution for real-valued parameters a and b where a < b and a \le x \le b, has the probability density function
1 / (b - a).
The Weibull distribution for scale parameter \lambda > 0, shape parameter \kappa > 0, and x > 0, has the probability density function
\kappa (\lambda ^ \kappa) x ^ {\kappa - 1} \exp(-(\lambda x) ^ \kappa).
Value
If the optional argument info = TRUE
is included then a list is returned with:
parm1*: a vector containing the associated confidence region boundary values for parameter 1
parm2*: a vector containing the associated confidence region boundary values for parameter 2
phi: a vector containing the angles used
parm1hat*: the MLE for parameter 1
parm2hat*: the MLE for parameter 2
*Note: "parm1" and "parm2" are placeholders that are replaced with the appropriate parameter names based on the probability distribution.
Author(s)
Christopher Weld (ceweld241@gmail.com)
Lawrence Leemis (leemis@math.wm.edu)
References
A. Jaeger (2016), "Computation of Two- and Three-Dimensional Confidence Regions with the Likelihood Ratio", The American Statistician, 49, 48–53.
C. Weld, A. Loh, L. Leemis (2020), "Plotting Two-Dimensional Confidence Regions", The American Statistician, Volume 72, Number 2, 156–168.
Examples
## plot the 95% confidence region for Weibull shape and scale parameters
## corresponding to the given ballbearing dataset
ballbearing <- c(17.88, 28.92, 33.00, 41.52, 42.12, 45.60, 48.48, 51.84,
51.96, 54.12, 55.56, 67.80, 68.64, 68.64, 68.88, 84.12,
93.12, 98.64, 105.12, 105.84, 127.92, 128.04, 173.40)
crplot(dataset = ballbearing, distn = "weibull", alpha = 0.05)
## repeat this plot using the elliptic-oriented point distribution heuristic
crplot(dataset = ballbearing, distn = "weibull", alpha = 0.05,
heuristic = 0, ellipse_n = 80)
## combine the two heuristics, compensating any elliptic-oriented point vertices whose apparent
## angles > 6 degrees with additional points, and expand the plot area to include the origin
crplot(dataset = ballbearing, distn = "weibull", alpha = 0.05,
maxdeg = 6, ellipse_n = 80, origin = TRUE)
## next use the inverse Gaussian distribution and show no plot points
crplot(dataset = ballbearing, distn = "invgauss", alpha = 0.05,
pts = FALSE)
The Inverse Gaussian Distribution
Description
Density, distribution function, quantile function, and random generation for the inverse Gaussian distribution. The corresponding code for these functions as well as the manual information included here is attributed to Christophe Pouzat's STAR Package (archived 2022-05-23).
Usage
dinvgauss(x, mu = 1, sigma2 = 1, boundary = NULL, log = FALSE)
pinvgauss(q, mu = 1, sigma2 = 1, boundary = NULL, lower.tail = TRUE, log.p = FALSE)
qinvgauss(p, mu = 1, sigma2 = 1, boundary = NULL)
rinvgauss(n = 1, mu = 1, sigma2 = 1, boundary = NULL)
Arguments
x, q | vector of quantiles.
p | vector of probabilities.
n | number of observations. If length(n) > 1, the length is taken to be the number required.
mu | mean value of the distribution in the default ("sigma2") parameterization.
sigma2 | variance of the latent Brownian motion. When this parameterization is used (the default) the distance between the "starting" point and the boundary ("absorbing barrier") is set to 1.
boundary | distance between the starting point and the "absorbing barrier" of the latent Brownian motion. When this parameterization is used, the Brownian motion variance is set to 1.
lower.tail | logical; if TRUE (default), probabilities are P[X <= x], otherwise, P[X > x].
log, log.p | logical; if TRUE, probabilities/densities p are returned as log(p).
Details
With the default "sigma2" parameterization (mu = m, sigma2 = s^2), the inverse Gaussian distribution has density
f(x) = \frac{1}{\sqrt{2 \pi \sigma^2 x^3}} \exp\left(-\frac{1}{2} \frac{(x - \mu)^2}{x \sigma^2 \mu^2}\right)
with \sigma^2 > 0. The theoretical mean is \mu and the theoretical variance is \mu^3 \sigma^2.
With the "boundary" parameterization (mu = m, boundary = b), the inverse Gaussian distribution has density
f(x) = \frac{b}{\sqrt{2 \pi x^3}} \exp\left(-\frac{1}{2} \frac{(x - b \mu)^2}{x \mu^2}\right)
with \sigma^2 > 0. The theoretical mean is \mu b and the theoretical variance is \mu^3 \sigma^2.
The latent Brownian motion is described in Lindsey (2004) pp 209-213, Whitmore and Seshadri (1987), Aalen and Gjessing (2001) and Gerstein and Mandelbrot (1964).
The expression for the distribution function is given in Eq. 4 of Whitmore and Seshadri (1987).
Initial guesses for the inversion of the distribution function used in qinvgauss are obtained with the transformation of Whitmore and Yalovsky (1978).
Random variates are obtained with the method of Michael et al (1976) which is also described by Devroye (1986, p 148) and Gentle (2003, p 193).
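The closed-form density above provides a quick sanity check of dinvgauss, for example:
## check the "sigma2" parameterization density against the closed form
f <- function(x, mu, sigma2)
  1 / sqrt(2 * pi * sigma2 * x ^ 3) * exp(-(x - mu) ^ 2 / (2 * x * sigma2 * mu ^ 2))
x <- seq(0.01, 0.3, 0.01)
max(abs(f(x, 0.075, 3) - dinvgauss(x, mu = 0.075, sigma2 = 3)))   ## ~ 0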
Value
dinvgauss gives the density, pinvgauss gives the distribution function, qinvgauss gives the quantile function, and rinvgauss generates random deviates.
Author(s)
Christophe Pouzat christophe.pouzat@gmail.com
References
Gerstein, George L. and Mandelbrot, Benoit (1964) Random Walk Models for the Spike Activity of a Single Neuron. Biophys J. 4: 41–68.
Whitmore, G. A. and Yalovsky, M. (1978) A normalizing logarithmic transformation for inverse Gaussian random variables. Technometrics 20: 207–208.
Whitmore, G. A. and Seshadri, V. (1987) A Heuristic Derivation of the Inverse Gaussian Distribution. The American Statistician 41: 280–281.
Aalen, Odd O. and Gjessing, Hakon K. (2001) Understanding the Shape of the Hazard Rate: A Process Point of View. Statistical Science 16: 1–14.
Lindsey, J.K. (2004) Introduction to Applied Statistics: A Modelling Approach. OUP.
Michael, J. R., Schucany, W. R. and Haas, R. W. (1976) Generating random variates using transformations with multiple roots. The American Statistician 30: 88–90.
Devroye, L. (1986) Non-Uniform Random Variate Generation. Springer-Verlag.
Gentle, J. E. (2003) Random Number Generation and Monte Carlo Methods. Springer.
Examples
## Not run:
## Start with the inverse Gauss
## Define standard mu and sigma
mu.true <- 0.075 ## a mean ISI of 75 ms
sigma2.true <- 3
## Define a sequence of points on the time axis
X <- seq(0.001, 0.3, 0.001)
## look at the density
plot(X, dinvgauss(X, mu.true, sigma2.true), type="l", xlab = "ISI (s)",ylab = "Density")
## Generate a sample of 100 ISI from this distribution
sampleSize <- 100
sampIG <- rinvgauss(sampleSize, mu = mu.true, sigma2 = sigma2.true)
## check out the empirical survival function (obtained with the Kaplan-Meier
## estimator) against the true one
library(survival)
sampIG.KMfit <- survfit(Surv(sampIG, 1 + numeric(length(sampIG))) ~1)
plot(sampIG.KMfit, log = TRUE)
lines(X, pinvgauss(X, mu.true, sigma2.true, lower.tail = FALSE), col = 2)
## Get a ML fit
sampIGmleIG <- invgaussMLE(sampIG)
## compare true and estimated parameters
rbind(est = sampIGmleIG$estimate, se = sampIGmleIG$se, true = c(mu.true, sigma2.true))
## plot contours of the log relative likelihood function
Mu <- seq(sampIGmleIG$estimate[1] - 3 * sampIGmleIG$se[1],
sampIGmleIG$estimate[1] + 3 * sampIGmleIG$se[1],
sampIGmleIG$se[1] / 10)
Sigma2 <- seq(sampIGmleIG$estimate[2] - 7 * sampIGmleIG$se[2],
sampIGmleIG$estimate[2] + 7 * sampIGmleIG$se[2],
sampIGmleIG$se[2] / 10)
sampIGmleIGcontour <- sapply(Mu, function(mu) sapply(Sigma2,
function(s2) sampIGmleIG$r(mu, s2)))
contour(Mu, Sigma2, t(sampIGmleIGcontour),
levels=c(log(c(0.5, 0.1)), -0.5 * qchisq(c(0.95, 0.99), df = 2)),
labels=c("log(0.5)",
"log(0.1)",
"-1/2 * P(Chi2 = 0.95)",
"-1/2 * P(Chi2 = 0.99)"),
xlab = expression(mu), ylab = expression(sigma^2))
points(mu.true, sigma2.true, pch = 16,col = 2)
## We can see that the contours are more parabola like on a log scale
contour(log(Mu),log(Sigma2),t(sampIGmleIGcontour),
levels = c(log(c(0.5, 0.1)), -0.5 * qchisq(c(0.95, 0.99), df = 2)),
labels = c("log(0.5)",
"log(0.1)",
"-1/2 * P(Chi2 = 0.95)",
"-1/2 * P(Chi2 = 0.99)"),
xlab = expression(log(mu)), ylab = expression(log(sigma^2)))
points(log(mu.true), log(sigma2.true), pch = 16, col = 2)
## make a deviance test for the true parameters
pchisq(-2 * sampIGmleIG$r(mu.true, sigma2.true), df = 2)
## check fit with a QQ plot
qqDuration(sampIGmleIG, log = "xy")
## Generate a censored sample using an exponential distribution
sampEXP <- rexp(sampleSize, 1/(2 * mu.true))
sampIGtime <- pmin(sampIG,sampEXP)
sampIGstatus <- as.numeric(sampIG <= sampEXP)
## fit the censored sample
sampIG2mleIG <- invgaussMLE(sampIGtime, sampIGstatus)
## look at the results
rbind(est = sampIG2mleIG$estimate,
se = sampIG2mleIG$se,
true = c(mu.true,sigma2.true))
pchisq(-2 * sampIG2mleIG$r(mu.true, sigma2.true), df = 2)
## repeat the survival function estimation
sampIG2.KMfit <- survfit(Surv(sampIGtime, sampIGstatus) ~1)
plot(sampIG2.KMfit, log = TRUE)
lines(X, pinvgauss(X, sampIG2mleIG$estimate[1], sampIG2mleIG$estimate[2],
lower.tail = FALSE), col = 2)
## End(Not run)
The Log Logistic Distribution
Description
Density, distribution function, quantile function, and random generation for the log logistic distribution. The corresponding code for these functions as well as the manual information included here is attributed to Christophe Pouzat's STAR Package (archived 2022-05-23).
Usage
dllogis(x, location = 0, scale = 1, log = FALSE)
pllogis(q, location = 0, scale = 1, lower.tail = TRUE, log.p = FALSE)
qllogis(p, location = 0, scale = 1, lower.tail = TRUE, log.p = FALSE)
rllogis(n, location = 0, scale = 1)
Arguments
x, q | vector of quantiles.
p | vector of probabilities.
n | number of observations. If length(n) > 1, the length is taken to be the number required.
location, scale | location and scale parameters (scale must be positive).
lower.tail | logical; if TRUE (default), probabilities are P[X <= x], otherwise, P[X > x].
log, log.p | logical; if TRUE, probabilities p are given as log(p).
Details
If location or scale are omitted, they assume the default values of 0 and 1 respectively.
The log-logistic distribution with location = m and scale = s has distribution function
\mathrm{F}(x) = \frac{1}{1 + \exp(-\frac{\log(x) - m}{s})}
and density
f(x) = \frac{1}{s \, x} \, \frac{\exp(-\frac{\log(x) - m}{s})}{(1 + \exp(-\frac{\log(x) - m}{s}))^2}.
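Because the distribution function above is the logistic cdf evaluated at log(x), pllogis can be cross-checked against stats::plogis:
## cross-check of pllogis against plogis on the log scale
q <- seq(0.05, 0.5, 0.05)
max(abs(pllogis(q, location = -2.7, scale = 0.025) -
        plogis((log(q) - (-2.7)) / 0.025)))   ## ~ 0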
Value
dllogis gives the density, pllogis gives the distribution function, qllogis gives the quantile function, and rllogis generates random deviates.
Author(s)
Christophe Pouzat christophe.pouzat@gmail.com
References
Lindsey, J.K. (2004) Introduction to Applied Statistics: A Modelling Approach. OUP.
Lindsey, J.K. (2004) The Statistical Analysis of Stochastic Processes in Time. CUP.
Examples
## Not run:
tSeq <- seq(0.001,0.6,0.001)
location.true <- -2.7
scale.true <- 0.025
Yd <- dllogis(tSeq, location.true, scale.true)
Yh <- hllogis(tSeq, location.true, scale.true)
max.Yd <- max(Yd)
max.Yh <- max(Yh)
Yd <- Yd / max.Yd
Yh <- Yh / max.Yh
oldpar <- par(mar=c(5,4,4,4))
plot(tSeq, Yd, type="n", axes=FALSE, ann=FALSE,
xlim=c(0,0.6), ylim=c(0,1))
axis(2,at=seq(0,1,0.2),labels=round(seq(0,1,0.2)*max.Yd,digits=2))
mtext("Density (1/s)", side=2, line=3)
axis(1,at=pretty(c(0,0.6)))
mtext("Time (s)", side=1, line=3)
axis(4, at=seq(0,1,0.2), labels=round(seq(0,1,0.2)*max.Yh,digits=2))
mtext("Hazard (1/s)", side=4, line=3, col=2)
mtext("Log Logistic Density and Hazard Functions", side=3, line=2,cex=1.5)
lines(tSeq,Yd)
lines(tSeq,Yh,col=2)
par(oldpar)
## End(Not run)
Maximum Likelihood Parameter Estimation of a Gamma Model with Possibly Censored Data
Description
Estimate gamma model parameters by the maximum likelihood method using possibly censored data. Two different parameterizations of the gamma distribution can be used. The corresponding code for this function as well as the manual information included here is attributed to Christophe Pouzat's STAR Package (archived 2022-05-23).
Usage
gammaMLE(yi, ni = numeric(length(yi)) + 1,
si = numeric(length(yi)) + 1, scale = TRUE)
Arguments
yi | vector of (possibly binned) observations.
ni | vector of counts for each value of yi.
si | vector of counts of uncensored observations for each value of yi.
scale | logical; should the scale (TRUE, default) or the rate (FALSE) parameterization be used?
Details
There is no closed form expression for the MLE of a gamma distribution. The numerical method implemented here uses the profile likelihood described by Monahan (2001) pp 210-216.
In order to ensure good behavior of the numerical optimization routines, optimization is performed on the log of the parameters (shape and scale or rate).
Standard errors are obtained from the inverse of the observed information matrix at the MLE. They are transformed to go from the log scale used by the optimization routine to the parameterization requested.
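A minimal sketch of the profile likelihood idea (assuming complete, uncensored data; not the gammaMLE source, and profLoglik is a hypothetical helper name): for a fixed shape k, the scale MLE is mean(yi) / k, leaving a one-dimensional maximization over the shape:
## profile likelihood over the gamma shape parameter (illustrative sketch)
set.seed(1)
y <- rgamma(100, shape = 6, scale = 0.012)
profLoglik <- function(k) sum(dgamma(y, shape = k, scale = mean(y) / k, log = TRUE))
k.hat <- optimise(profLoglik, c(0.1, 100), maximum = TRUE)$maximum
c(shape = k.hat, scale = mean(y) / k.hat)   ## compare with gammaMLE(y)$estimate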
Value
A list of class durationFit with the following components:
estimate | the estimated parameters, a named vector.
se | the standard errors, a named vector.
logLik | the log likelihood at maximum.
r | a function returning the log of the relative likelihood function.
mll | a function returning the opposite of the log likelihood function using the log of the parameters.
call | the matched call.
Note
The returned standard errors (component se) are valid in the asymptotic limit. You should plot contours using function r in the returned list and check that the contours are reasonably close to ellipses.
Author(s)
Christophe Pouzat christophe.pouzat@gmail.com
References
Monahan, J. F. (2001) Numerical Methods of Statistics. CUP.
Lindsey, J.K. (2004) Introduction to Applied Statistics: A Modelling Approach. OUP.
Examples
## Not run:
## Simulate sample of size 100 from a gamma distribution
set.seed(1102006,"Mersenne-Twister")
sampleSize <- 100
shape.true <- 6
scale.true <- 0.012
sampGA <- rgamma(sampleSize,shape=shape.true,scale=scale.true)
sampGAmleGA <- gammaMLE(sampGA)
rbind(est = sampGAmleGA$estimate,se = sampGAmleGA$se,true = c(shape.true,scale.true))
## Estimate the log relative likelihood on a grid to plot contours
Shape <- seq(sampGAmleGA$estimate[1]-4*sampGAmleGA$se[1],
sampGAmleGA$estimate[1]+4*sampGAmleGA$se[1],
sampGAmleGA$se[1]/10)
Scale <- seq(sampGAmleGA$estimate[2]-4*sampGAmleGA$se[2],
sampGAmleGA$estimate[2]+4*sampGAmleGA$se[2],
sampGAmleGA$se[2]/10)
sampGAmleGAcontour <- sapply(Shape, function(sh) sapply(Scale, function(sc) sampGAmleGA$r(sh,sc)))
## plot contours using a linear scale for the parameters
## draw four contours corresponding to the following likelihood ratios:
## 0.5, 0.1, Chi2 with 2 df and p values of 0.95 and 0.99
X11(width=12,height=6)
layout(matrix(1:2,ncol=2))
contour(Shape,Scale,t(sampGAmleGAcontour),
levels=c(log(c(0.5,0.1)),-0.5*qchisq(c(0.95,0.99),df=2)),
labels=c("log(0.5)",
"log(0.1)",
"-1/2*P(Chi2=0.95)",
"-1/2*P(Chi2=0.99)"),
xlab="shape",ylab="scale",
main="Log Relative Likelihood Contours"
)
points(sampGAmleGA$estimate[1],sampGAmleGA$estimate[2],pch=3)
points(shape.true,scale.true,pch=16,col=2)
## The contours are not really symmetrical about the MLE we can try to
## replot them using a log scale for the parameters to see if that improves
## the situation
contour(log(Shape),log(Scale),t(sampGAmleGAcontour),
levels=c(log(c(0.5,0.1)),-0.5*qchisq(c(0.95,0.99),df=2)),
labels="",
xlab="log(shape)",ylab="log(scale)",
main="Log Relative Likelihood Contours",
sub="log scale for the parameters")
points(log(sampGAmleGA$estimate[1]),log(sampGAmleGA$estimate[2]),pch=3)
points(log(shape.true),log(scale.true),pch=16,col=2)
## make a parametric bootstrap to check the distribution of the deviance
nbReplicate <- 10000
sampleSize <- 100
system.time(
devianceGA100 <- replicate(nbReplicate,{
sampGA <- rgamma(sampleSize,shape=shape.true,scale=scale.true)
sampGAmleGA <- gammaMLE(sampGA)
-2*sampGAmleGA$r(shape.true,scale.true)
}
)
)[3]
## Get 95 and 99% confidence intervals for the QQ plot
ci <- sapply(1:nbReplicate,
function(idx) qchisq(qbeta(c(0.005,0.025,0.975,0.995),
idx,
nbReplicate-idx+1),
df=2)
)
## make QQ plot
X <- qchisq(ppoints(nbReplicate),df=2)
Y <- sort(devianceGA100)
X11()
plot(X,Y,type="n",
xlab=expression(paste(chi[2]^2," quantiles")),
ylab="MC quantiles",
main="Deviance with true parameters after ML fit of gamma data",
sub=paste("sample size:", sampleSize,"MC replicates:", nbReplicate)
)
abline(a=0,b=1)
lines(X,ci[1,],lty=2)
lines(X,ci[2,],lty=2)
lines(X,ci[3,],lty=2)
lines(X,ci[4,],lty=2)
lines(X,Y,col=2)
## End(Not run)
Maximum Likelihood Parameter Estimation of an Inverse Gaussian Model with Possibly Censored Data
Description
Estimate inverse Gaussian model parameters by the maximum likelihood method using possibly censored data. Two different parameterizations of the inverse Gaussian distribution can be used. The corresponding code for this function as well as the manual information included here is attributed to Christophe Pouzat's STAR Package (archived 2022-05-23).
Usage
invgaussMLE(yi, ni = numeric(length(yi)) + 1,
si = numeric(length(yi)) + 1,
parameterization = "sigma2")
Arguments
yi | vector of (possibly binned) observations.
ni | vector of counts for each value of yi.
si | vector of counts of uncensored observations for each value of yi.
parameterization | parameterization used, "sigma2" (default) or "boundary".
Details
The two different parameterizations of the inverse Gaussian distribution are discussed in the manual of dinvgauss.
In the absence of censored data the ML estimates are available in closed form (Lindsey, 2004, p 212) together with the Hessian matrix at the MLE. In the presence of censored data an initial guess for the parameters is obtained using the uncensored data before maximizing the likelihood function of the full data set using optim with the BFGS method. ML estimation is always performed with the "sigma2" parameterization. Parameters and variance-covariance matrix are transformed at the end if the "boundary" parameterization is requested.
In order to ensure good behavior of the numerical optimization routines, optimization is performed on the log of the parameters (mu and sigma2).
Standard errors are obtained from the inverse of the observed information matrix at the MLE. They are transformed to go from the log scale used by the optimization routine, when the latter is used (ie, for censored data), to the parameterization requested.
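For uncensored data, the closed-form MLEs under the "sigma2" parameterization can be written down directly; a minimal sketch (assuming complete data; the formula for sigma2.hat is the standard inverse Gaussian result with sigma2 = 1/lambda):
## closed-form MLEs for complete data: mu.hat = mean(y),
## sigma2.hat = mean(1 / y) - 1 / mean(y)
set.seed(1)
y <- rinvgauss(100, mu = 0.075, sigma2 = 3)
mu.hat <- mean(y)
sigma2.hat <- mean(1 / y) - 1 / mu.hat
c(mu.hat, sigma2.hat)                       ## compare with invgaussMLE(y)$estimate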
Value
A list of class durationFit with the following components:
estimate | the estimated parameters, a named vector.
se | the standard errors, a named vector.
logLik | the log likelihood at maximum.
r | a function returning the log of the relative likelihood function.
mll | a function returning the opposite of the log likelihood function using the log of the parameters.
call | the matched call.
Note
The returned standard errors (component se) are valid in the asymptotic limit. You should plot contours using function r in the returned list and check that the contours are reasonably close to ellipses.
Author(s)
Christophe Pouzat christophe.pouzat@gmail.com
References
Lindsey, J.K. (2004) Introduction to Applied Statistics: A Modelling Approach. OUP.
See Also
dinvgauss, gammaMLE, llogisMLE
Examples
## Simulate sample of size 100 from an inverse Gaussian
## distribution
set.seed(1102006,"Mersenne-Twister")
sampleSize <- 100
mu.true <- 0.075
sigma2.true <- 3
sampleSize <- 100
sampIG <- rinvgauss(sampleSize,mu=mu.true,sigma2=sigma2.true)
## Make a maximum likelihood fit
sampIGmleIG <- invgaussMLE(sampIG)
## Compare estimates with actual values
rbind(est = coef(sampIGmleIG),se = sampIGmleIG$se,true = c(mu.true,sigma2.true))
## In the absence of censoring the MLE of the inverse Gaussian is available in a
## closed form together with its variance (ie, the observed information matrix)
## we can check that we did not screw up at that stage by comparing the observed
## information matrix obtained numerically with the analytical one. To do that we
## use the MINUS log likelihood function returned by invgaussMLE to get a numerical
## estimate
detailedFit <- optim(par=as.vector(log(sampIGmleIG$estimate)),
fn=sampIGmleIG$mll,
method="BFGS",
hessian=TRUE)
## We should not forget that the "mll" function uses the log of the parameters, while
## the "se" component of the sampIGmleIG list is expressed on the linear scale; we must
## therefore transform one into the other as follows (Kalbfleisch, 1985, p71):
## if x = exp(u) and y = exp(v) and if we have the information matrix in term of
## u and v (that's the hessian component of list detailedFit above), we create matrix:
## du/dx du/dy
## Q =
## dv/dx dv/dy
## and we get I in term of x and y by the following matrix product:
## I(x,y) <- t(Q) %*% I(u,v) %*% Q
## In the present case:
## du/dx = 1/exp(u), du/dy = 0, dv/dx = 0, dv/dy = 1/exp(v)
## Therefore:
Q <- diag(1/exp(detailedFit$par))
numericalI <- t(Q) %*% detailedFit$hessian %*% Q
seComp <- rbind(sampIGmleIG$se, sqrt(diag(solve(numericalI))))
colnames(seComp) <- c("mu","sigma2")
rownames(seComp) <- c("analytical", "numerical")
seComp
## We can check the relative differences between the 2
apply(seComp,2,function(x) abs(diff(x))/x[1])
## Not run:
## Estimate the log relative likelihood on a grid to plot contours
Mu <- seq(coef(sampIGmleIG)[1]-4*sampIGmleIG$se[1],
coef(sampIGmleIG)[1]+4*sampIGmleIG$se[1],
sampIGmleIG$se[1]/10)
Sigma2 <- seq(coef(sampIGmleIG)[2]-4*sampIGmleIG$se[2],
coef(sampIGmleIG)[2]+4*sampIGmleIG$se[2],
sampIGmleIG$se[2]/10)
sampIGmleIGcontour <- sapply(Mu, function(mu) sapply(Sigma2, function(s2) sampIGmleIG$r(mu,s2)))
## plot contours using a linear scale for the parameters
## draw four contours corresponding to the following likelihood ratios:
## 0.5, 0.1, Chi2 with 2 df and p values of 0.95 and 0.99
X11(width=12,height=6)
layout(matrix(1:2,ncol=2))
contour(Mu,Sigma2,t(sampIGmleIGcontour),
levels=c(log(c(0.5,0.1)),-0.5*qchisq(c(0.95,0.99),df=2)),
labels=c("log(0.5)",
"log(0.1)",
"-1/2*P(Chi2=0.95)",
"-1/2*P(Chi2=0.99)"),
xlab=expression(mu),ylab=expression(sigma^2),
main="Log Relative Likelihood Contours"
)
points(coef(sampIGmleIG)[1],coef(sampIGmleIG)[2],pch=3)
points(mu.true,sigma2.true,pch=16,col=2)
## The contours are not really symmetrical about the MLE; we can try to
## replot them using a log scale for the parameters to see if that improves
## the situation
contour(log(Mu),log(Sigma2),t(sampIGmleIGcontour),
levels=c(log(c(0.5,0.1)),-0.5*qchisq(c(0.95,0.99),df=2)),
labels="",
xlab=expression(log(mu)),ylab=expression(log(sigma^2)),
main="Log Relative Likelihood Contours",
sub="log scale for the parameters")
points(log(coef(sampIGmleIG)[1]),log(coef(sampIGmleIG)[2]),pch=3)
points(log(mu.true),log(sigma2.true),pch=16,col=2)
## Even with the log scale the contours are not ellipsoidal, so let us compute profiles
## For that we are going to use the returned MINUS log likelihood function
logMuProfFct <- function(logMu,...) {
myOpt <- optimise(function(x) sampIGmleIG$mll(c(logMu,x))+logLik(sampIGmleIG),...)
as.vector(unlist(myOpt[c("objective","minimum")]))
}
logMuProfCI <- function(logMu,
CI,
a=logS2Seq[1],
b=logS2Seq[length(logS2Seq)]) logMuProfFct(logMu,c(a,b))[1] - qchisq(CI,1)/2
logS2ProfFct <- function(logS2,...) {
myOpt <- optimise(function(x) sampIGmleIG$mll(c(x,logS2))+logLik(sampIGmleIG),...)
as.vector(unlist(myOpt[c("objective","minimum")]))
}
logS2ProfCI <- function(logS2, CI,
a=logMuSeq[1],
b=logMuSeq[length(logMuSeq)]) logS2ProfFct(logS2,c(a,b))[1] - qchisq(CI,1)/2
## We compute profiles (on the log scale) exploring +/- 3 times
## the SE about the MLE
logMuSE <- sqrt(diag(solve(detailedFit$hessian)))[1]
logMuSeq <- seq(log(coef(sampIGmleIG)[1])-3*logMuSE,
log(coef(sampIGmleIG)[1])+3*logMuSE,
logMuSE/10)
logS2SE <- sqrt(diag(solve(detailedFit$hessian)))[2]
logS2Seq <- seq(log(coef(sampIGmleIG)[2])-3*logS2SE,
log(coef(sampIGmleIG)[2])+3*logS2SE,
logS2SE/10)
logMuProf <- sapply(logMuSeq,logMuProfFct,
lower=logS2Seq[1],
upper=logS2Seq[length(logS2Seq)])
## Get 95% and 99% confidence intervals for log(mu)
logMuCI95 <- c(uniroot(logMuProfCI,
interval=c(logMuSeq[1],log(coef(sampIGmleIG)[1])),
CI=0.95)$root,
uniroot(logMuProfCI,
interval=c(log(coef(sampIGmleIG)[1]),logMuSeq[length(logMuSeq)]),
CI=0.95)$root
)
logMuCI99 <- c(uniroot(logMuProfCI,
interval=c(logMuSeq[1],log(coef(sampIGmleIG)[1])),
CI=0.99)$root,
uniroot(logMuProfCI,
interval=c(log(coef(sampIGmleIG)[1]),logMuSeq[length(logMuSeq)]),
CI=0.99)$root
)
logS2Prof <- sapply(logS2Seq,logS2ProfFct,
lower=logMuSeq[1],
upper=logMuSeq[length(logMuSeq)])
## Get 95% and 99% confidence intervals for log(sigma2)
logS2CI95 <- c(uniroot(logS2ProfCI,
interval=c(logS2Seq[1],log(coef(sampIGmleIG)[2])),
CI=0.95)$root,
uniroot(logS2ProfCI,
interval=c(log(coef(sampIGmleIG)[2]),logS2Seq[length(logS2Seq)]),
CI=0.95)$root
)
logS2CI99 <- c(uniroot(logS2ProfCI,
interval=c(logS2Seq[1],log(coef(sampIGmleIG)[2])),
CI=0.99)$root,
uniroot(logS2ProfCI,
interval=c(log(coef(sampIGmleIG)[2]),logS2Seq[length(logS2Seq)]),
CI=0.99)$root
)
## Add profiles to the previous plot
lines(logMuSeq,logMuProf[2,],col=2,lty=2)
lines(logS2Prof[2,],logS2Seq,col=2,lty=2)
## We can now check the deviations of the (profiled) deviances
## from the asymptotic parabolic curves
X11()
layout(matrix(1:4,nrow=2))
oldpar <- par(mar=c(4,4,2,1))
logMuSeqOffset <- logMuSeq-log(coef(sampIGmleIG)[1])
logMuVar <- diag(solve(detailedFit$hessian))[1]
plot(logMuSeq,2*logMuProf[1,],type="l",xlab=expression(log(mu)),ylab="Deviance")
lines(logMuSeq,logMuSeqOffset^2/logMuVar,col=2)
points(log(coef(sampIGmleIG)[1]),0,pch=3)
abline(h=0)
abline(h=qchisq(0.95,1),lty=2)
abline(h=qchisq(0.99,1),lty=2)
lines(rep(logMuCI95[1],2),c(0,qchisq(0.95,1)),lty=2)
lines(rep(logMuCI95[2],2),c(0,qchisq(0.95,1)),lty=2)
lines(rep(logMuCI99[1],2),c(0,qchisq(0.99,1)),lty=2)
lines(rep(logMuCI99[2],2),c(0,qchisq(0.99,1)),lty=2)
## We can also "linearize" this last graph
plot(logMuSeq,
sqrt(2*logMuProf[1,])*sign(logMuSeqOffset),
type="l",
xlab=expression(log(mu)),
ylab=expression(paste("signed ",sqrt(Deviance)))
)
lines(logMuSeq,
sqrt(logMuSeqOffset^2/logMuVar)*sign(logMuSeqOffset),
col=2)
points(log(coef(sampIGmleIG)[1]),0,pch=3)
logS2SeqOffset <- logS2Seq-log(coef(sampIGmleIG)[2])
logS2Var <- diag(solve(detailedFit$hessian))[2]
plot(logS2Seq,2*logS2Prof[1,],type="l",xlab=expression(log(sigma^2)),ylab="Deviance")
lines(logS2Seq,logS2SeqOffset^2/logS2Var,col=2)
points(log(coef(sampIGmleIG)[2]),0,pch=3)
abline(h=0)
abline(h=qchisq(0.95,1),lty=2)
abline(h=qchisq(0.99,1),lty=2)
lines(rep(logS2CI95[1],2),c(0,qchisq(0.95,1)),lty=2)
lines(rep(logS2CI95[2],2),c(0,qchisq(0.95,1)),lty=2)
lines(rep(logS2CI99[1],2),c(0,qchisq(0.99,1)),lty=2)
lines(rep(logS2CI99[2],2),c(0,qchisq(0.99,1)),lty=2)
## We can also "linearize" this last graph
plot(logS2Seq,
sqrt(2*logS2Prof[1,])*sign(logS2SeqOffset),
type="l",
xlab=expression(log(sigma^2)),
ylab=expression(paste("signed ",sqrt(Deviance)))
)
lines(logS2Seq,
sqrt(logS2SeqOffset^2/logS2Var)*sign(logS2SeqOffset),
col=2)
points(log(coef(sampIGmleIG)[2]),0,pch=3)
par(oldpar)
## make a parametric bootstrap to check the distribution of the deviance
nbReplicate <- 1000 #10000
sampleSize <- 100
system.time(
devianceIG100 <- lapply(1:nbReplicate,
function(idx) {
if ((idx %% 100) == 0) cat(idx, "replicates done\n")  ## progress report; the cadence of 100 is assumed
sampIG <- rinvgauss(sampleSize,mu=mu.true,sigma2=sigma2.true)
sampIGmleIG <- invgaussMLE(sampIG)
Deviance <- -2*sampIGmleIG$r(mu.true,sigma2.true)
logPara <- log(coef(sampIGmleIG))
logParaSE <- sampIGmleIG$se/coef(sampIGmleIG)
intervalMu <- function(n) c(-n,n)*logParaSE[1]+logPara[1]
intervalS2 <- function(n) c(-n,n)*logParaSE[2]+logPara[2]
logMuProfFct <- function(logMu,...) {
optimise(function(x)
sampIGmleIG$mll(c(logMu,x))+logLik(sampIGmleIG),...)$objective
}
logMuProfCI <- function(logMu,
CI,
a=intervalS2(4)[1],
b=intervalS2(4)[2])
logMuProfFct(logMu,c(a,b)) - qchisq(CI,1)/2
logS2ProfFct <- function(logS2,...) {
optimise(function(x)
sampIGmleIG$mll(c(x,logS2))+logLik(sampIGmleIG),...)$objective
}
logS2ProfCI <- function(logS2, CI,
a=intervalMu(4)[1],
b=intervalMu(4)[2])
logS2ProfFct(logS2,c(a,b)) - qchisq(CI,1)/2
factor <- 4
while((logMuProfCI(intervalMu(factor)[2],0.99) *
logMuProfCI(logPara[1],0.99) >= 0) ||
(logMuProfCI(intervalMu(factor)[1],0.99) *
logMuProfCI(logPara[1],0.99) >= 0)
) factor <- factor+1
##browser()
logMuCI95 <- c(uniroot(logMuProfCI,
interval=c(intervalMu(factor)[1],logPara[1]),
CI=0.95)$root,
uniroot(logMuProfCI,
interval=c(logPara[1],intervalMu(factor)[2]),
CI=0.95)$root
)
logMuCI99 <- c(uniroot(logMuProfCI,
interval=c(intervalMu(factor)[1],logPara[1]),
CI=0.99)$root,
uniroot(logMuProfCI,
interval=c(logPara[1],intervalMu(factor)[2]),
CI=0.99)$root
)
factor <- 4
while((logS2ProfCI(intervalS2(factor)[2],0.99) *
logS2ProfCI(logPara[2],0.99) >= 0) ||
(logS2ProfCI(intervalS2(factor)[1],0.99) *
logS2ProfCI(logPara[2],0.99) >= 0)
) factor <- factor+1
logS2CI95 <- c(uniroot(logS2ProfCI,
interval=c(intervalS2(factor)[1],logPara[2]),
CI=0.95)$root,
uniroot(logS2ProfCI,
interval=c(logPara[2],intervalS2(factor)[2]),
CI=0.95)$root
)
logS2CI99 <- c(uniroot(logS2ProfCI,
interval=c(intervalS2(factor)[1],logPara[2]),
CI=0.99)$root,
uniroot(logS2ProfCI,
interval=c(logPara[2],intervalS2(factor)[2]),
CI=0.99)$root
)
list(deviance=Deviance,
logMuCI95=logMuCI95,
logMuNorm95=qnorm(c(0.025,0.975),logPara[1],logParaSE[1]),
logMuCI99=logMuCI99,
logMuNorm99=qnorm(c(0.005,0.995),logPara[1],logParaSE[1]),
logS2CI95=logS2CI95,
logS2Norm95=qnorm(c(0.025,0.975),logPara[2],logParaSE[2]),
logS2CI99=logS2CI99,
logS2Norm99=qnorm(c(0.005,0.995),logPara[2],logParaSE[2]))
}
)
)[3]
## Find out how many times the true parameters were within the computed CIs
nLogMuCI95 <- sum(sapply(devianceIG100,
function(l) l$logMuCI95[1] <= log(mu.true) &&
log(mu.true)<= l$logMuCI95[2]
)
)
nLogMuNorm95 <- sum(sapply(devianceIG100,
function(l) l$logMuNorm95[1] <= log(mu.true) &&
log(mu.true)<= l$logMuNorm95[2]
)
)
nLogMuCI99 <- sum(sapply(devianceIG100,
function(l) l$logMuCI99[1] <= log(mu.true) &&
log(mu.true)<= l$logMuCI99[2]
)
)
nLogMuNorm99 <- sum(sapply(devianceIG100,
function(l) l$logMuNorm99[1] <= log(mu.true) &&
log(mu.true)<= l$logMuNorm99[2]
)
)
## Check if these counts are compatible with the nominal CIs
c(prof95Mu=nLogMuCI95,norm95Mu=nLogMuNorm95)
qbinom(c(0.005,0.995),nbReplicate,0.95)
c(prof99Mu=nLogMuCI99,norm99Mu=nLogMuNorm99)
qbinom(c(0.005,0.995),nbReplicate,0.99)
nLogS2CI95 <- sum(sapply(devianceIG100,
function(l) l$logS2CI95[1] <= log(sigma2.true) &&
log(sigma2.true)<= l$logS2CI95[2]
)
)
nLogS2Norm95 <- sum(sapply(devianceIG100,
function(l) l$logS2Norm95[1] <= log(sigma2.true) &&
log(sigma2.true)<= l$logS2Norm95[2]
)
)
nLogS2CI99 <- sum(sapply(devianceIG100,
function(l) l$logS2CI99[1] <= log(sigma2.true) &&
log(sigma2.true)<= l$logS2CI99[2]
)
)
nLogS2Norm99 <- sum(sapply(devianceIG100,
function(l) l$logS2Norm99[1] <= log(sigma2.true) &&
log(sigma2.true)<= l$logS2Norm99[2]
)
)
## Check if these counts are compatible with the nominal CIs
c(prof95S2=nLogS2CI95,norm95S2=nLogS2Norm95)
qbinom(c(0.005,0.995),nbReplicate,0.95)
c(prof99S2=nLogS2CI99,norm99S2=nLogS2Norm99)
qbinom(c(0.005,0.995),nbReplicate,0.99)
## Get 95% and 99% confidence intervals for the QQ plot
ci <- sapply(1:nbReplicate,
function(idx) qchisq(qbeta(c(0.005,0.025,0.975,0.995),
idx,
nbReplicate-idx+1),
df=2)
)
## make QQ plot
X <- qchisq(ppoints(nbReplicate),df=2)
Y <- sort(sapply(devianceIG100,function(l) l$deviance))
X11()
plot(X,Y,type="n",
xlab=expression(paste(chi[2]^2," quantiles")),
ylab="MC quantiles",
main="Deviance with true parameters after ML fit of IG data",
sub=paste("sample size:", sampleSize,"MC replicates:", nbReplicate)
)
abline(a=0,b=1)
lines(X,ci[1,],lty=2)
lines(X,ci[2,],lty=2)
lines(X,ci[3,],lty=2)
lines(X,ci[4,],lty=2)
lines(X,Y,col=2)
## End(Not run)
Actual Coverage Plots for Several Confidence Interval Procedures for the Survivor Function for Randomly Right-Censored Data
Description
This function calculates and plots the actual coverage functions associated with five confidence interval procedures associated with the Kaplan–Meier product–limit estimator (KMPLE) for a randomly right-censored data set with exponential failure times and exponential censoring times.
Usage
km.coverage(n, lambdaT, lambdaC,
alpha = 0.1,
interval = c("Greenwood"),
show = TRUE, table = FALSE, value = 0)
Arguments
n |
Sample size; must be between 1 and 15. |
lambdaT |
Lambda for the exponential failure time distribution. |
lambdaC |
Lambda for the exponential censoring time distribution. |
alpha |
Significance level for the confidence interval; 0.1, for example, corresponds to a 90 percent confidence interval. |
interval |
Type of confidence interval used; supported types are "Greenwood", "Exp-Greenwood", "Log-Log", "Arcsine", and "Peto". This parameter can be a vector containing one or more interval types. |
show |
A logical value indicating whether to display additional potential Probability Mass Function (PMF) lines in gray. When set to TRUE, these additional lines are displayed. |
table |
Logical value. If TRUE, a table of outcomes is printed. The table aids in understanding how the plot is generated. |
value |
A specific value of p for generating the table. |
Details
The Kaplan–Meier product–limit estimator provides a point estimator for a data set of randomly
right-censored observations.
The km.coverage function computes the actual coverage function of up to five different types of confidence interval procedures for the survivor function for various values of its arguments when failure times are IID exponential random variables with failure rate lambdaT and censoring times are IID exponential random variables with censoring rate lambdaC. The km.coverage function provides options for plotting and tabular display of the outcomes.
Depending on the interval type, km.coverage calls internal plotting functions: plot_Greenwood, plot_ExpGreenwood, plot_LogLog, plot_Arcsine, and plot_Peto.
When km.coverage is called, it generates an actual coverage graph (and a table) based on:
- various types of confidence intervals,
- various sample sizes,
- various failure / censoring distributions (exponential).
If the "table" argument is set to TRUE, it also shows the corresponding contribution table at the specified value of p.
The ‘interval’ argument can be one of the character strings "Greenwood", "Exp-Greenwood", "Log-Log", "Arcsine", and "Peto", or it can be a vector of the confidence interval procedure names, or it can be the character string "all", which will plot all five actual coverage functions. The order in which the actual coverage functions fall in the vector controls the order in which the actual coverage functions are plotted. The order is significant because it is often the case that one of the actual coverage functions obscures part of another actual coverage function on the plot. For example, the first listed interval in the vector is plotted first. Therefore, it may be covered up by the plots of subsequent intervals in the vector.
This actual coverage plot is constructed by calculating lower and upper bounds associated with the confidence interval procedure at various permutations of the failure/censoring ordering for a particular n, the number of objects on test. We calculate the contribution to the actual coverage of each permutation case based on the probability mass function of the KMPLE. At each value of p corresponding to S(t), we sum up the contributions which correspond to lower bounds and upper bounds covering p. These sums give the actual coverage curves.
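The summation step can be sketched as follows; this is an illustration of the idea, not km.coverage internals, and the outcomes data frame (columns prob, lower, upper, one row per failure/censoring ordering) is hypothetical.
## Actual coverage at p: total probability of orderings whose CI covers p
coverage_at_p <- function(outcomes, p) {
  covers <- outcomes$lower <= p & p <= outcomes$upper
  sum(outcomes$prob[covers])
}
outcomes <- data.frame(prob  = c(0.4, 0.35, 0.25),
                       lower = c(0.10, 0.00, 0.20),
                       upper = c(0.90, 0.60, 0.50))
coverage_at_p(outcomes, p = 0.3)   # all three intervals cover 0.3, so 1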
Value
This function primarily generates plots illustrating the coverage function for various Kaplan-Meier product-limit confidence intervals. If the table
argument is set to TRUE
, it additionally prints a detailed table of outcomes corresponding to the plot. The table includes calculated coverage values for a specified value of p, providing a complete explanation of how plots are generated. For example, for a value of p along the horizontal axis, the rows of the table show all possible failure/censoring orderings, along with the associated contributions. Adding up the contributions from each row in which the interval covers p gives the point on the plot at p.
When there is more than one interval, the table
option is disabled.
Author(s)
Xingyu Wang (xingyu.wang@yahoo.com),
Larry Leemis (lmleem@wm.edu)
Heather Sasinowska (hdsasinowska@wm.edu)
Examples
km.coverage(5, 0.5, 1, 0.1, "Greenwood")
km.coverage(5, 0.5, 1, 0.1, "Greenwood", FALSE)
km.coverage(5, 0.5, 1, 0.1, "Arcsine", FALSE, TRUE, 0.3)
km.coverage(5, 0.5, 1, 0.1, "all")
km.coverage(5, 0.5, 1, 0.1, c("Arcsine", "Greenwood", "Exp-Greenwood"))
Outcomes for the Kaplan-Meier product-limit estimator
Description
Generates a matrix containing all possible outcomes (all possible sequences of failure times and right-censoring times) of the value of the Kaplan-Meier product-limit estimator for a particular sample size n.
Usage
km.outcomes(n)
Arguments
n |
sample size |
Details
The Kaplan-Meier product-limit estimator is used to estimate the survivor function for a data set of positive values in the presence of right censoring. The km.outcomes function generates a matrix with all possible combinations of observed failures and right-censored values and the resulting support values for the Kaplan-Meier product-limit estimator for a sample of size n.
The n argument must be a positive integer denoting the sample size. Allowable limits are from 1 to 24. Larger values of n are not allowed because of CPU and memory limitations.
In order to keep the support values as exact fractions, the numerators and denominators are stored separately in the a matrix in the columns named num and den. The support values are stored as numeric values in the column named S(t).
Value
The km.outcomes function returns a matrix with 2^(n+1) - 1 rows and n + 4 columns. The location l indicates the position where the time of interest falls within the observed events. The meaning of the columns is as follows.
- l: number of observed events (failure times or censoring times) between time 0 and the observation time;
- d1, d2, ..., dn: equals 0 if the event corresponds to a censored observation, equals 1 if the event corresponds to a failure;
- S(t): numeric value of the associated support value;
- num: numerator of the support value as a fraction;
- den: denominator of the support value as a fraction.
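For instance, assuming the column layout described above, the exact fraction for each outcome can be recovered from num and den and compared with the numeric S(t) column:
## Compare the numeric S(t) column with the exact fraction num/den for n = 3
a <- km.outcomes(3)
head(cbind(a[, c("l", "S(t)")], frac = a[, "num"] / a[, "den"]))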
Author(s)
Yuxin Qin (yqin08@wm.edu), Heather Sasinowska (hdsasinowska@wm.edu), Larry Leemis (leemis@math.wm.edu)
References
Qin, Y., Sasinowska, H., Leemis, L. (2023), "The Probability Mass
Function of the Kaplan-Meier Product-Limit Estimator",
The American Statistician
, Volume 77, Number 1, 102-110.
See Also
survfit
Examples
km.outcomes(3)
Probability Mass Function for the support of the Kaplan-Meier product-limit estimator
Description
Generates the probability mass function for the support values of the Kaplan-Meier product-limit estimator for a particular sample size n, probability of observing a failure h at the time of interest expressed as the cumulative probability perc associated with X = min(T, C), where T is the failure time and C is the censoring time under a random-censoring scheme.
Usage
km.pmf(n, h, perc, plot, sep, xfrac, cex.lollipop)
Arguments
n |
sample size |
h |
probability of observing a failure, in other words, |
perc |
cumulative probability associated with |
plot |
option to plot the probability mass function (default is |
sep |
option to show the breakdown of the probability for each support value (see function |
xfrac |
option to label support values on the x-axis as exact fractions (default is |
cex.lollipop |
size of the dots atop the spikes |
Details
The Kaplan-Meier product-limit estimator is used to estimate the survivor function for a data set of positive values in the presence of right censoring. The km.pmf function generates the probability mass function for the support values of the Kaplan-Meier product-limit estimator for a particular sample size n, probability of observing a failure h at the time of interest expressed as the cumulative probability perc associated with X = min(T, C), where T is the failure time and C is the censoring time under a random-censoring scheme.
The n argument must be a positive integer denoting the sample size. Allowable limits are from 1 to 23. Larger values of n are not allowed because of CPU and memory limitations.
For larger sample sizes n, it is recommended to set sep = FALSE, xfrac = FALSE, and cex.lollipop = 0.01 for a better visual effect.
Value
The km.pmf function returns a data frame with 2 columns. The column named S stores all the support values for the Kaplan-Meier product-limit estimator with sample size n, including NA. The column named P stores the associated probabilities.
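Since P holds a complete probability mass function (including the probability that the estimator is NA), its entries should sum to one. A quick sanity check, assuming plot = FALSE suppresses the figure:
## The probabilities, including the one attached to NA, sum to 1
pmf <- km.pmf(4, 1/3, 0.75, plot = FALSE)
sum(pmf$P)   # should be 1 up to floating-point error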
Author(s)
Yuxin Qin (yqin08@wm.edu), Heather Sasinowska (hdsasinowska@wm.edu), Larry Leemis (leemis@math.wm.edu)
References
Qin, Y., Sasinowska, H., Leemis, L. (2023), "The Probability Mass
Function of the Kaplan-Meier Product-Limit Estimator",
The American Statistician
, Volume 77, Number 1, 102-110.
See Also
survfit
Examples
km.pmf(4, 1/3, 0.75)
km.pmf(8, 1/2, 0.75, sep = FALSE, xfrac = FALSE, cex.lollipop = 0.01)
Support values for the Kaplan-Meier product-limit estimator
Description
Calculate the support values for the Kaplan-Meier product-limit estimator for a particular sample size n using an induction algorithm.
Usage
km.support(n)
Arguments
n |
sample size |
Details
The Kaplan-Meier product-limit estimator is used to estimate the survivor function for a data set of positive values in the presence of right censoring. The km.support function calculates the support values for the Kaplan-Meier product-limit estimator for a sample of size n using an induction algorithm described in Qin et al. (2023).
The n argument must be a positive integer denoting the sample size. Allowable limits are from 1 to 35. Larger values of n are not allowed because of CPU and memory limitations.
The numerators and denominators are temporarily converted to complex numbers within the km.support function in order to eliminate duplicate support values using the unique function.
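The de-duplication idea can be sketched in isolation with hypothetical fractions: packing each (numerator, denominator) pair into a single complex number lets unique drop repeated pairs in one pass.
## Encode each pair as num + den*1i so unique() compares pairs, not values
num <- c(1, 2, 1)
den <- c(2, 3, 2)
z <- unique(complex(real = num, imaginary = den))
data.frame(num = Re(z), den = Im(z))   # the duplicated 1/2 appears once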
Value
The km.support function returns a list with two components.
- num: a vector of integers containing the numerators of the support values;
- den: a vector of integers containing the associated denominators of the support values.
The support values are not returned in sorted order. Zero and one, which are always a part of the support, are given as 0 / 1 and 1 / 1.
Author(s)
Yuxin Qin (yqin08@wm.edu), Heather Sasinowska (hdsasinowska@wm.edu), Larry Leemis (leemis@math.wm.edu)
References
Qin, Y., Sasinowska, H., Leemis, L. (2023), "The Probability Mass
Function of the Kaplan-Meier Product-Limit Estimator",
The American Statistician
, Volume 77, Number 1,
102-110.
See Also
survfit
Examples
# display unsorted numerators and denominators of support values for n = 4
km.support(4)
# display sorted support values for n = 4 as exact fractions
n <- 4
s <- km.support(n)
i <- order(s$num / s$den)
m <- length(s$num)
f <- ""
for (j in i[2:(m - 1)]) f <- paste(f, s$num[j], "/", s$den[j], ", ", sep = "")
cat(paste("The ", m, " support values for n = ", n, " are: 0, ", f, "1.\n", sep = ""))
# print sorted support values for n = 4 as numerics
print(s$num[i] / s$den[i])
Probability Mass Functions for the support of the Kaplan-Meier product-limit estimator for various cumulative probabilities associated with X
Description
Plot the probability mass functions for the support values of the Kaplan-Meier product-limit estimator for a given sample size n with a probability of observing a failure h at various times of interest expressed as the cumulative probability perc associated with X = min(T, C), where T is the failure time and C is the censoring time, under a random-censoring scheme.
Usage
km.surv(n, h, lambda, ev, line, graydots, gray.cex,
gray.outline, xfrac)
Arguments
n |
sample size |
h |
probability of observing a failure |
lambda |
plotting frequency of the probability mass functions (default is 10) |
ev |
option to plot the expected values of the support
values (default is |
line |
option to connect the expected values with
lines (default is |
graydots |
option to express the weight of the support
values using grayscale (default is
|
gray.cex |
option to change the size of the gray dots (default is 1) |
gray.outline |
option to display outlines of the
gray dots (default is |
xfrac |
option to label support values on the y-axis
as exact fractions (default is |
Details
The Kaplan-Meier product-limit estimator is used to estimate the survivor function for a data set of positive values in the presence of right censoring. The km.surv function plots the probability mass functions for the support values of the Kaplan-Meier product-limit estimator for a given sample size n with a probability of observing a failure h at various times of interest expressed as the cumulative probability perc associated with X = min(T, C), where T is the failure time and C is the censoring time, under a random-censoring scheme.
The n argument must be a positive integer denoting the sample size. Allowable limits are from 1 to 23. Larger values of n are not allowed because of CPU and memory limitations.
The default method to plot the probability mass functions uses the area of a dot to indicate the relative probability of a support value. An alternative is to plot the probability mass functions using grayscales (by setting graydots = TRUE). One of the two approaches might work better in different scenarios.
The expected values are calculated by removing the probability of NA and normalizing the rest of the probabilities.
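That convention can be sketched directly with a hypothetical probability mass function containing an NA support value:
## Drop the NA support value, renormalize, then take the weighted mean
pmf <- data.frame(S = c(0, 1/2, 1, NA), P = c(0.2, 0.3, 0.4, 0.1))
ok  <- !is.na(pmf$S)
sum(pmf$S[ok] * pmf$P[ok] / sum(pmf$P[ok]))   # expected value given not NA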
Value
The km.surv function does not return a value.
Author(s)
Yuxin Qin (yqin08@wm.edu), Heather Sasinowska (hdsasinowska@wm.edu), Larry Leemis (leemis@math.wm.edu)
References
Qin, Y., Sasinowska, H., Leemis, L. (2023), "The Probability Mass
Function of the Kaplan-Meier Product-Limit Estimator",
The American Statistician
, Volume 77, Number 1,
102-110.
See Also
survfit
Examples
km.surv(n = 4, h = 2/3, lambda = 100, ev = TRUE, line = TRUE)
km.surv(n = 5, h = 3/4, lambda = 50, graydots = TRUE, gray.cex = 0.6, gray.outline = FALSE)
km.surv(n = 7, h = 1/5, lambda = 30, graydots = TRUE, gray.cex = 0.6, xfrac = FALSE)
Maximum Likelihood Parameter Estimation of a Log Logistic Model with Possibly Censored Data
Description
Estimate log logistic model parameters by the maximum likelihood method using possibly censored data. The corresponding code for this function as well as the manual information included here is attributed to Christophe Pouzat's STAR Package (archived 2022-05-23).
Usage
llogisMLE(yi, ni = numeric(length(yi)) + 1,
si = numeric(length(yi)) + 1)
Arguments
yi |
vector of (possibly binned) observations or a
|
ni |
vector of counts for each value of |
si |
vector of counts of uncensored observations for each
value of |
Details
The MLE for the log logistic is not available in closed form and is therefore obtained numerically by calling optim with the BFGS method.
In order to ensure good behavior of the numerical optimization routines, optimization is performed on the log of the parameter scale.
Standard errors are obtained from the inverse of the observed information matrix at the MLE. They are transformed to go from the log scale used by the optimization routine to the requested parameterization.
Value
A list of class durationFit with the following components:
estimate |
the estimated parameters, a named vector. |
se |
the standard errors, a named vector. |
logLik |
the log likelihood at maximum. |
r |
a function returning the log of the relative likelihood function. |
mll |
a function returning the opposite of the log likelihood function using the log of the parameter scale. |
call |
the matched call. |
Note
The returned standard errors (component se) are valid in the asymptotic limit. You should plot contours using function r in the returned list and check that the contours are reasonably close to ellipses.
Author(s)
Christophe Pouzat christophe.pouzat@gmail.com
References
Lindsey, J.K. (2004) Introduction to Applied Statistics: A Modelling Approach. OUP.
Lindsey, J.K. (2004) The Statistical Analysis of Stochastic Processes in Time. CUP.
See Also
dllogis, invgaussMLE, gammaMLE
Examples
## Not run:
## Simulate a sample of size 100 from a log logistic
## distribution
set.seed(1102006,"Mersenne-Twister")
sampleSize <- 100
location.true <- -2.7
scale.true <- 0.025
sampLL <- rllogis(sampleSize,location=location.true,scale=scale.true)
sampLLmleLL <- llogisMLE(sampLL)
rbind(est = sampLLmleLL$estimate,se = sampLLmleLL$se,true = c(location.true,scale.true))
## Estimate the log relative likelihood on a grid to plot contours
Loc <- seq(sampLLmleLL$estimate[1]-4*sampLLmleLL$se[1],
sampLLmleLL$estimate[1]+4*sampLLmleLL$se[1],
sampLLmleLL$se[1]/10)
Scale <- seq(sampLLmleLL$estimate[2]-4*sampLLmleLL$se[2],
sampLLmleLL$estimate[2]+4*sampLLmleLL$se[2],
sampLLmleLL$se[2]/10)
sampLLmleLLcontour <- sapply(Loc, function(m) sapply(Scale, function(s) sampLLmleLL$r(m,s)))
## plot contours using a linear scale for the parameters
## draw four contours corresponding to the following likelihood ratios:
## 0.5, 0.1, Chi2 with 2 df and p values of 0.95 and 0.99
X11(width=12,height=6)
layout(matrix(1:2,ncol=2))
contour(Loc,Scale,t(sampLLmleLLcontour),
levels=c(log(c(0.5,0.1)),-0.5*qchisq(c(0.95,0.99),df=2)),
labels=c("log(0.5)",
"log(0.1)",
"-1/2*P(Chi2=0.95)",
"-1/2*P(Chi2=0.99)"),
xlab="Location",ylab="Scale",
main="Log Relative Likelihood Contours"
)
points(sampLLmleLL$estimate[1],sampLLmleLL$estimate[2],pch=3)
points(location.true,scale.true,pch=16,col=2)
## The contours are not really symmetrical about the MLE; we can try to
## replot them using a log scale for the parameters to see if that improves
## the situation
contour(Loc,log(Scale),t(sampLLmleLLcontour),
levels=c(log(c(0.5,0.1)),-0.5*qchisq(c(0.95,0.99),df=2)),
labels="",
xlab="log(Location)",ylab="log(Scale)",
main="Log Relative Likelihood Contours",
sub="log scale for parameter: scale")
points(sampLLmleLL$estimate[1],log(sampLLmleLL$estimate[2]),pch=3)
points(location.true,log(scale.true),pch=16,col=2)
## make a parametric bootstrap to check the distribution of the deviance
nbReplicate <- 10000
sampleSize <- 100
system.time(
devianceLL100 <- replicate(nbReplicate,{
sampLL <- rllogis(sampleSize,location=location.true,scale=scale.true)
sampLLmleLL <- llogisMLE(sampLL)
-2*sampLLmleLL$r(location.true,scale.true)
}
)
)[3]
## Get 95% and 99% confidence intervals for the QQ plot
ci <- sapply(1:nbReplicate,
function(idx) qchisq(qbeta(c(0.005,0.025,0.975,0.995),
idx,
nbReplicate-idx+1),
df=2)
)
## make QQ plot
X <- qchisq(ppoints(nbReplicate),df=2)
Y <- sort(devianceLL100)
X11()
plot(X,Y,type="n",
xlab=expression(paste(chi[2]^2," quantiles")),
ylab="MC quantiles",
main="Deviance with true parameters after ML fit of log logistic data",
sub=paste("sample size:", sampleSize,"MC replicates:", nbReplicate)
)
abline(a=0,b=1)
lines(X,ci[1,],lty=2)
lines(X,ci[2,],lty=2)
lines(X,ci[3,],lty=2)
lines(X,ci[4,],lty=2)
lines(X,Y,col=2)
## End(Not run)