Help for package evidence

Version:

0.8.10

Date:

2018-04-15

Title:

Analysis of Scientific Evidence Using Bayesian and Likelihood Methods

Author:

Robert van Hulst

Maintainer:

Robert van Hulst <rvhulst@ubishops.ca>

BugReports:

https://github.com/rvhulst/evidence/

Depends:

rstan, rstanarm, loo, lattice, stats, utils, graphics, grDevices

Imports:

LearnBayes, LaplacesDemon

ByteCompile:

TRUE

Description:

Bayesian (and some likelihoodist) functions as alternatives to hypothesis-testing functions in R base using a user interface patterned after those of R's hypothesis testing functions. See McElreath (2016, ISBN: 978-1-4822-5344-3), Gelman and Hill (2007, ISBN: 0-521-68689-X) (new edition in preparation) and Albert (2009, ISBN: 978-0-387-71384-7) for good introductions to Bayesian analysis and Pawitan (2002, ISBN: 0-19-850765-8) for the Likelihood approach. The functions in the package also make extensive use of graphical displays for data exploration and model comparison.

License:

GPL-2 | GPL-3 [expanded from: GPL (≥ 2)]

NeedsCompilation:

Packaged:

2018-05-15 14:42:44 UTC; rvhulst

Repository:

CRAN

Date/Publication:

2018-05-15 15:19:39 UTC

evidence: Functions and Data for Bayesian and Likelihood Analysis

Description

The functions in this package include Bayesian and likelihood alternatives to the standard statistical hypothesis tests that form part of base R. Their aim is to provide a wider perspective on how statistical evidence can be analyzed than the usual hypothesis-testing one. In view of the increasing importance in science of Bayesian and likelihood inference a wider exposure to these alternatives has become overdue.

This package makes Bayesian and likelihood analyses of simple statistical problems as convenient as traditional frequentist ones are in R. In addition, it makes effective use of R's excellent plotting capabilities, and facilitates exploratory data analysis and an interactive approach to modeling. Both data exploration and model exploration are crucial in data analysis, and these are facilitated by an interactive and graphics-centered approach.

Details

Package:	evidence
Type:	Package
Version:	0.8.10
Date:	2018-04-15
License:	GPL

Author(s)

Robert van Hulst

Maintainer: Robert van Hulst <rvhulst@ubishops.ca>

References

van Hulst, R. 2018. Evaluating Scientific Evidence. ms.

Made-up data for a balanced one-way anova.

Description

Made-up data with easy numbers for practicing one-way anova by hand to understand how an anova works.

Usage

data(AOV1)

Format

A data frame with 15 observations on the following 2 variables.

y: response
i: predictor, a factor with 3 levels

Details

Note that the design is balanced.

Source

van Hulst, R. 2018. Evaluating Scientific Evidence. ms.

Examples

  data(AOV1)
  summary(aov(y ~ i, data=AOV1))

Made-up data for an unbalanced one-way anova.

Description

Made-up data with easy numbers for practicing one-way anova by hand to understand how an anova works.

Usage

data(AOV2)

Format

A data frame with 22 observations on the following 2 variables:

y: response
i: predictor: a factor with 4 levels

Details

Note that the design is unbalanced.

Source

van Hulst, R. 2018. Evaluating Scientific Evidence. ms.

Examples

  data(AOV2)
  summary(aov(y ~ i, data=AOV2))

A contingency table for heart attacks and aspirin use.

Description

The Physicians' Health Study data, cross-classified according to Infarct (heart attack or not) and Group (placebo or aspirin).

Usage

data(Aspirin)

Format

A 2 by 2 matrix of counts with row names:

Infarct:Yes and Infarct:No,

and column names:

Group:placebo and Group:aspirin.

Source

Steering Committee of the Physicians' Health Study Research Group. 1989. Final report of the aspirin component of the ongoing Physicians' Health Study. N Engl J Med, 321:129–135.

van Hulst, R. 2018. Evaluating Scientific Evidence. ms.

Bayesian analysis of one sample from a Normal distribution with imprecise priors.

Description

This function performs a standard Bayesian analysis of a single sample of a population presumably following a Normal distribution. Imprecise priors for the mean and the standard deviation are used.

Usage

B1Nmean(x, plotit = TRUE, hists = FALSE, pdf = FALSE)

Arguments

x

a vector of sample values

plotit

should the function produce plots? Defaults to TRUE.

hists

should histograms of the posterior distribution for the data with twenty posterior predictive histograms also be plotted? Defaults to FALSE.

pdf

should the histograms be saved as a pdf-file? Defaults to FALSE.

Value

none returned; the function produces text and graphical output

Author(s)

Robert van Hulst

References

van Hulst, R. 2018. Evaluating Scientific Evidence. ms.

Examples

## Not run: 
data(Fat)
B1Nmean(Fat$Height)

## End(Not run)

Bayesian analysis of a Normal sample using a SIR prior.

Description

This function performs a standard Bayesian analysis of a single sample of a population assumed to follow a Normal distribution. A Standard Improper Reference prior is assumed.

Usage

B1Nsir(x, r = 10000, alpha = 0.05)

Arguments

x

a vector of sample values

r

the number of samples to be taken from the posterior distribution (defaults to 10000)

alpha

1 - level of credibility, so that for alpha = 0.05 (the default) credible intervals will have 95% credibility

Value

none returned; the function produces a plot of the posterior distribution and prints some statistics.

Author(s)

Robert van Hulst

References

van Hulst, R. 2018. Evaluating Scientific Evidence. ms.

Examples

data(darwin)
B1Nsir(darwin$difference)
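
As a rough illustration of what sampling from this posterior involves, the sketch below uses the textbook result for the standard improper reference prior p(mu, sigma^2) proportional to 1/sigma^2: sigma^2 given the data is (n - 1) s^2 divided by a chi-square variate with n - 1 degrees of freedom, and mu given sigma^2 is Normal with mean mean(x) and variance sigma^2 / n. The helper name `sir_posterior` and the simulated sample are assumptions made for illustration; the internals of B1Nsir may differ.

```r
# Hypothetical helper (not part of the package): draws r samples from the
# posterior of (mu, sigma) under the prior p(mu, sigma^2) proportional to 1/sigma^2.
sir_posterior <- function(x, r = 10000) {
  n <- length(x)
  s2 <- var(x)
  sigma2 <- (n - 1) * s2 / rchisq(r, df = n - 1)   # scaled inverse chi-square draws
  mu <- rnorm(r, mean = mean(x), sd = sqrt(sigma2 / n))
  data.frame(mu = mu, sigma = sqrt(sigma2))
}

set.seed(1)
x <- rnorm(15, mean = 20, sd = 5)                  # made-up sample
post <- sir_posterior(x)
alpha <- 0.05
ci_mu <- quantile(post$mu, c(alpha / 2, 1 - alpha / 2))  # equal-tailed interval
print(ci_mu)
```

Under this prior the marginal posterior of the mean is a shifted and scaled t distribution, so the simulated interval should be numerically close to the classical interval from `t.test(x)$conf.int`, although its interpretation is different.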

Bayesian analysis of the binomial parameter for one sample.

Description

This function computes the posterior distribution of the binomial probability \pi when given the number of “successes” and the sample size, as well as one of a choice of priors. A plot of the posterior distribution is produced with the 95% credible interval of \pi.

Usage

B1prop(s, n, p = 0.5, alpha = 0.05, prior = c("uniform", "near_0.5",
  "not_near_0.5", "near_0", "near_1", "custom"), params = NULL)

Arguments

s

the number of sampling units with the feature

n

the number of sampling units examined

p

an optional hypothesized probability

alpha

1 - alpha is the desired level of credibility of a credible interval

prior

one of: "uniform", "near_0.5", "not_near_0.5", "near_0", "near_1", "custom", which are all beta distributions with appropriate parameter values. Note that if prior="custom" the following argument has to be supplied:

params

a vector with the a and b parameters of the custom beta prior

Value

the posterior probability

Author(s)

Robert van Hulst

References

van Hulst, R. 2018. Evaluating Scientific Evidence. ms.

Examples

B1prop(13, 100, .1, prior="near_0")
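
Because all the priors are beta distributions, the posterior can be written down directly: a Beta(a, b) prior combined with s successes in n trials gives a Beta(a + s, b + n - s) posterior. The sketch below shows this update for a uniform Beta(1, 1) prior. The helper `beta_update` is hypothetical, and the beta parameters that the package attaches to the named priors such as "near_0" are not documented here, so they are not reproduced.

```r
# Conjugate Beta-binomial update; beta_update() is a hypothetical helper,
# not the package's B1prop().
beta_update <- function(s, n, a = 1, b = 1, p = 0.5, alpha = 0.05) {
  a_post <- a + s
  b_post <- b + n - s
  list(
    params = c(a = a_post, b = b_post),
    mean = a_post / (a_post + b_post),
    interval = qbeta(c(alpha / 2, 1 - alpha / 2), a_post, b_post),
    prob_above_p = pbeta(p, a_post, b_post, lower.tail = FALSE)
  )
}

res <- beta_update(s = 13, n = 100, p = 0.1)   # uniform Beta(1, 1) prior
print(res)
```

The element `prob_above_p` is the posterior probability that the binomial parameter exceeds the hypothesized value p; whether B1prop reports exactly this quantity is an assumption.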

simulates Bayesian updating of the binomial parameter `\pi`.

Description

Provides a simple demonstration of how the posterior distribution improves as increasing amounts of data become available. A Binomial variable with a known parametric probability is sampled, and as increasing numbers of samples become available the posterior distribution is re-evaluated and plotted.

Usage

B1propSim(p, N = 100, prior = c("uniform", "near_0.5",
  "not_near_0.5", "near_0", "near_1"))

Arguments

p

the “real” binomial probability; if a number smaller than 0 or larger than 1 is entered, the function will choose an arbitrary probability

N

the number of observations to accumulate

prior

one of: "uniform", "near_0.5", "not_near_0.5", "near_0", or "near_1".

Value

none returned; the function is run for the plot it produces.

Author(s)

Robert van Hulst

References

van Hulst, R. 2018. Evaluating Scientific Evidence. ms.

Examples

B1propSim(p = 0.44, prior = "near_0.5")
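
The updating idea can be sketched in base R without any plotting: start from a prior, add one Bernoulli observation at a time, and watch the posterior standard deviation shrink. The code assumes a uniform Beta(1, 1) prior and conjugate updating; it is an illustration of the principle, not the package's implementation.

```r
set.seed(42)
p_true <- 0.44                       # the "real" probability, as in the example
N <- 100
y <- rbinom(N, size = 1, prob = p_true)

# Start from a uniform Beta(1, 1) prior and update one observation at a time.
a <- 1
b <- 1
post_mean <- numeric(N)
post_sd <- numeric(N)
for (k in seq_len(N)) {
  a <- a + y[k]
  b <- b + 1 - y[k]
  post_mean[k] <- a / (a + b)
  post_sd[k] <- sqrt(a * b / ((a + b)^2 * (a + b + 1)))
}

# The final posterior equals a single batch update with all the data.
print(c(a = a, b = b))
print(c(batch_a = 1 + sum(y), batch_b = 1 + N - sum(y)))
```

Updating sequentially or all at once leads to the same posterior, which is why the order in which the observations arrive does not matter here.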

Bayesian analysis of the means of two Normal samples using SIR priors.

Description

Produces exploratory plots of the two samples (boxplots and, if the sample sizes are equal, a quantile-quantile plot). Also produces Bayesian posterior densities of the two sample means and of the difference between the means. The priors used are standard improper reference priors.

Usage

B2Nsir(formula, data, var.equal = TRUE, alpha = 0.05, plotit = TRUE, r = 10000)

Arguments

formula

the standard formula interface: response ~ factor

data

a data.frame containing the response and the two-level factor

var.equal

if TRUE the group variances are assumed to be equal, if FALSE two separate group variances are estimated

alpha

1 - level of credibility, so that for alpha = 0.05 (the default) credible intervals will have 95% credibility

plotit

should plots be produced?

r

the number of samples from the posterior distribution; can usually be left at its default value of 10000

Details

Note that in the first plot the second sub-plot is NOT a normality plot but a quantile-quantile plot that compares the observations in the two groups.

Value

none returned; the function produces several plots and prints some statistics.

References

van Hulst, R. 2018. Evaluating Scientific Evidence. ms.

Examples

data(bodytemp)
B2Nsir(temperature ~ gender, bodytemp)

Bayesian analysis of the binomial parameters for two samples.

Description

This function computes the posterior distributions of the binomial parameters \pi[1] and \pi[2] when given the numbers of “successes” and the sample sizes for the two samples. It uses uniform priors. A plot of the posterior distributions of the two \pi's is produced, and a plot of the posterior distribution of \pi[1] - \pi[2] with its 95% credible interval.

Usage

B2props(s, n, alpha = 0.05)

Arguments

s

a vector containing the 2 numbers of sampling units with the feature ("success")

n

a vector containing the 2 numbers of sampling units examined

alpha

1 - level of credibility, so that for alpha = 0.05 (the default) credible intervals will have 95% credibility

Value

None, the inferred difference between the probabilities and its 95% credible interval is calculated and several plots are produced

Author(s)

Robert van Hulst

References

van Hulst, R. 2018. Evaluating Scientific Evidence. ms.

Examples

B2props(c(13, 22), c(78, 92))
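
One way to approximate the posterior of the difference between the two probabilities is by simulation: with uniform priors each posterior is a beta distribution, so independent draws from the two posteriors can simply be subtracted. The sketch below assumes this approach and a simulation size of 10000; B2props itself may compute the result differently.

```r
set.seed(123)
s <- c(13, 22)                        # successes, as in the example above
n <- c(78, 92)                        # sample sizes
r <- 10000                            # assumed number of posterior draws

# Uniform Beta(1, 1) priors give Beta(1 + s, 1 + n - s) posteriors.
pi1 <- rbeta(r, 1 + s[1], 1 + n[1] - s[1])
pi2 <- rbeta(r, 1 + s[2], 1 + n[2] - s[2])
d <- pi1 - pi2

ci <- quantile(d, c(0.025, 0.975))    # equal-tailed 95% credible interval
prob_neg <- mean(d < 0)               # posterior probability that pi1 < pi2
print(ci)
print(prob_neg)
```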

Bayesian analysis of a 2 x 2 contingency table.

Description

A 2 x 2 contingency table (in matrix form) is analyzed in a Bayesian way using uniform priors. The posterior probabilities of each of the two outcomes given the other factor levels are calculated. See MacKay (2003, p. 460).

Usage

Bft2x2(X, div = 100, plotit = TRUE)

Arguments

X

a contingency table in the form of a 2 x 2 matrix with row and column names

div

optional: the number of divisions for the row and column variables for use in calculations (can be left at 100)

plotit

should plots be produced? (defaults to TRUE)

Details

Note that the rows of the 2 x 2 matrix are assumed to represent the "outcomes" and the columns the "treatments"—where these expressions are applicable. Note also that to obtain properly labeled plots the matrix has to be supplied with dimnames.

Value

the matrix of div x div posterior probabilities that was plotted

Author(s)

Robert van Hulst

References

MacKay, D.J.C. 2003. Information Theory, Inference, and Learning Algorithms. Cambridge University Press, Cambridge.

van Hulst, R. 2018. Evaluating Scientific Evidence. ms.

Examples

data(Glasses)
Bft2x2(Glasses)

a simple example of the bias–variance trade-off.

Description

A total of eight models are fitted to a data set consisting of seven predictors. The response is the exact fit with a variable amount of zero-mean noise added. This is repeated a certain number of times (by default, 100 times). Plots of Bias^2 and variance vs. the number of parameters are produced.

Usage

BiasVarTO(times = 100)

Arguments

times

the number of repeats to average bias and variance over (default 100)

Value

none returned; the function produces two plots

Author(s)

Robert van Hulst

References

van Hulst, R. 2018. Evaluating Scientific Evidence. ms.
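
The trade-off described above can be reproduced in a few lines of base R. The sketch below is a simplified stand-in under stated assumptions: the predictors, the true coefficients, the noise level, and the sample size are all made up, and the eight nested models use the intercept plus the first 0 to 7 predictors. It does not reproduce the data or the plots of BiasVarTO.

```r
set.seed(2018)
n <- 50
times <- 100
X <- matrix(rnorm(n * 7), n, 7)                  # seven made-up predictors
beta <- c(2, 1.5, 1, 0.8, 0.6, 0.4, 0.2)         # assumed true coefficients
truth <- drop(1 + X %*% beta)                    # noise-free response

bias2 <- numeric(8)
variance <- numeric(8)
for (k in 0:7) {                                 # model k uses the first k predictors
  design <- cbind(rep(1, n), X[, seq_len(k), drop = FALSE])
  fits <- replicate(times, {
    y <- truth + rnorm(n, sd = 2)                # fresh zero-mean noise each time
    lm.fit(design, y)$fitted.values              # least-squares fitted values
  })
  bias2[k + 1] <- mean((rowMeans(fits) - truth)^2)
  variance[k + 1] <- mean(apply(fits, 1, var))
}
print(round(rbind(parameters = 1:8, bias2 = bias2, variance = variance), 3))
```

With these settings the squared bias falls and the variance rises as parameters are added, which is the pattern the two plots are meant to show.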

Simulated clutch size data for birds with different nesting locations.

Description

These made-up data do respect the average clutch sizes (number of eggs laid in a single brood) and incubation periods that were observed in different European bird species with four different types of nests, as reported in Case(2000).

Usage

data(BirdsCS)

Format

A data frame with 40 observations on the following 3 variables:

Nest: kind of nest, a factor with levels hole, roofed, niche, and open
Inc.Per: average duration of the incubation period (days)
ClutchSize: the typical number of eggs in a nest

Source

Case, T.J. 2000. An Illustrated Guide to Theoretical Ecology. Oxford University Press, New York.

van Hulst, R. 2018. Evaluating Scientific Evidence. ms.

Examples

data(BirdsCS)
library(graphics)
coplot(ClutchSize ~ Inc.Per | Nest, BirdsCS, panel=panel.smooth)

Bayesian analysis of n >= 2 Normal means with standard improper reference priors.

Description

Several exploratory plots are produced, after which this function calculates and plots the posterior densities of the treatment means and their differences. Pooled or separate variances can be specified. Note that this function uses Standard Improper Reference (SIR) priors.

Usage

BnNsir(formula, data, var.equal = TRUE, alpha = 0.05, plotit = TRUE,
 r = 10000)

Arguments

formula

the usual formula interface: response ~ factor

data

a data.frame containing the response and the factor variables

var.equal

should a pooled variance be used? Specify var.equal = FALSE if you want separate variances to be fitted

alpha

1 - level of credibility, so that for alpha = 0.05 (the default) credible intervals will have 95% credibility

plotit

are plots desired?

r

the number of samples of the posterior that should be taken

Value

none returned: the function is used for the plots and the printed information it produces

Author(s)

Robert van Hulst

References

van Hulst, R. 2018. Evaluating Scientific Evidence. ms.

Examples

data(PlantGrowth)
BnNsir(weight ~ group, PlantGrowth)

Bayesian regression model comparison with Bayes factors.

Description

This function compares different linear models on the basis of their Bayes factors and by graphically comparing posterior model probabilities.

Usage

Bregbf(form.list, data, l=length(form.list))

Arguments

form.list

a list of linear models, each expressed by a model formula, that should be compared; the models must all be applicable to the same data frame and use the same response variable

data

a data frame to be analyzed

l

the number of models to be compared; defaults to all models in the form.list

Details

Note that a list containing several appropriate models for the data frame should be prepared beforehand. See the example for how to do this.

Value

A list with model parameter probabilities is silently returned.

Author(s)

Robert van Hulst

References

van Hulst, R. 2018. Evaluating Scientific Evidence. ms.

Examples

## Not run: 
data(PlantGrowth)
frmlst <- list(
model0 = formula(weight ~ 1),
model1 = formula(weight ~ group) )
Bregbf(form.list=frmlst, data=PlantGrowth)
data(fev)
frmlst.fev <- list(
formula(FEV ~ Age),
formula(FEV ~ Smoke),
formula(FEV ~ Age + Smoke),
formula(FEV ~ Age * Smoke)
)
Bregbf(frmlst.fev, fev)

## End(Not run)

Bayesian t-test using reference priors.

Description

The Bayesian “t-test” developed by Bernardo and Perez (2007) that calculates the Bayes-factor against the null hypothesis of no difference.

Usage

Bt.test(formula, data, plotit = TRUE)

Arguments

formula

the usual formula interface: response ~ factor

data

a data.frame with the response values and the factor values for all samples; the factor can only have two factor levels

plotit

is plotted output required?

Value

none supplied: the function is used for the plotted and printed output it produces

Author(s)

Robert van Hulst

References

J. Bernardo and S. Perez. Comparing normal means: New methods for an old problem. Bayesian Analysis, 2:45–58, 2007.

van Hulst, R. 2018. Evaluating Scientific Evidence. ms.

Examples

data(bodytemp)
Bt.test(temperature ~ gender, bodytemp)

Bt.test(heart.rate ~ gender, bodytemp)

Contingency Table Analysis in different ways

Description

An n x n contingency table is analyzed in frequentist, information-theoretical, likelihood, and Bayesian ways. Note that for the Bayesian analysis package LearnBayes needs to be installed.

Usage

CTA(X, extBayes = FALSE)

Arguments

X

a matrix with non-negative integers representing the counts for the row-column levels

extBayes

should a Bayesian analysis with a near-independence prior (instead of only an independence prior) be done as well? Defaults to FALSE.

Value

none provided: the function is run for its graphical and numerical output

Author(s)

Robert van Hulst

References

van Hulst, R. 2018. Evaluating Scientific Evidence. ms.

Examples

data(Smoking)
CTA(Smoking)

Made-up data to illustrate Simpson's paradox.

Description

These made-up data illustrate the discrete form (contingency table form) of Simpson's paradox.

Usage

data(Clin)

Format

A three-dimensional array of frequencies with:

rows indicating "outcome" (either "death" or "cured"),

columns indicating "male" (either "Yes" or "No"), and

layers indicating "clinic" (either "A" or "B").

Source

van Hulst, R. 2018. Evaluating Scientific Evidence. ms.

Examples

data(Clin)
Clin[1,,]
prop.table(Clin[1,,], 2)

Human body fat and several covariates for calculating it.

Description

Data from Johnson (1996) on human body fat, as determined by underwater weighing, together with several covariates that can be used to estimate it statistically.

Usage

data(Fat)

Format

A data frame with 252 observations on the following 19 variables:

Case: case number
PBF.B: percentage body fat estimated using Brozek's equation
PBF.S: percentage body fat estimated using Siri's equation
Dens: Density (gm/cm^3)
Age: Age (yrs)
Weight: Weight (lbs)
Height: Height (inches)
AI: Adiposity index = Weight/Height^2 (kg/m^2)
FFWt: Fat Free Weight using Brozek's formula (lbs)
Neck: Neck circumference (cm)
Chest: Chest circumference (cm)
Abd: Abdomen circumference (cm)
Hip: Hip circumference (cm)
Thigh: Thigh circumference (cm)
Knee: Knee circumference (cm)
Ankle: Ankle circumference (cm)
Biceps: Extended biceps circumference (cm)
FArm: Forearm circumference (cm)
Wrist: Wrist circumference (cm)

Source

Johnson, R. 1996. Fitting percentage of body fat to simple body measurements. Journal of Statistics Education 2(1), 1–6.

van Hulst, R. 2018. Evaluating Scientific Evidence. ms.

Examples

data(Fat)
qqnorm(Fat$Height)
qqline(Fat$Height)

A contingency table of 16 British youths categorized as juvenile delinquents or not, and as wearing glasses or not.

Description

Data from Heiberger and Holland (2004) categorizing a random sample of 16 British juveniles on the basis of whether they were juvenile delinquents or not, and whether they wore glasses or not.

Usage

data(Glasses)

Format

A matrix with 16 counts cross-classified on Juvenile delinquency (rows) and the wearing of glasses (columns).

Source

Heiberger, R.M. and Holland, B.(2004) Statistical Analysis and Data Display: An Intermediate Course with Examples in S-PLUS, R, and SAS. Springer, New York.

References

van Hulst, R. 2018. Evaluating Scientific Evidence. ms.

Examples

data(Glasses)
Bft2x2(Glasses)

generates the 100 * (1 - alpha)% most probable interval of a distribution of empirical values

Description

Function used to produce a Bayesian credible interval of a unimodal distribution of empirical values using the Highest Posterior Density (HPD) approach.

Usage

HPDcrd(x, alpha = 0.05)

Arguments

x

a vector of empirical values

alpha

1 - alpha is the desired level of credibility

Value

a vector of the lower and upper limits of the 100 * (1 - alpha)% credible interval (95% by default), calculated using a standard algorithm

Author(s)

Robert van Hulst

References

van Hulst, R. 2018. Evaluating Scientific Evidence. ms.

Examples

HPDcrd(rnorm(1000))
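
A common way to obtain such an interval from a vector of draws is to sort the values and take the shortest window that contains the required fraction of them. The sketch below implements that idea; `hpd_interval` is a hypothetical name, and whether HPDcrd uses exactly this algorithm is an assumption.

```r
# Shortest interval containing a fraction (1 - alpha) of the sorted values.
# Hypothetical helper; appropriate for unimodal distributions only.
hpd_interval <- function(x, alpha = 0.05) {
  x <- sort(x)
  n <- length(x)
  m <- ceiling((1 - alpha) * n)          # number of points the interval must cover
  starts <- seq_len(n - m + 1)
  widths <- x[starts + m - 1] - x[starts]
  i <- which.min(widths)
  c(lower = x[i], upper = x[i + m - 1])
}

set.seed(7)
draws <- rgamma(5000, shape = 2, rate = 1)   # a skewed, unimodal example
hpd <- hpd_interval(draws)
eq <- quantile(draws, c(0.025, 0.975))       # equal-tailed interval for comparison
print(rbind(hpd = hpd, equal_tailed = eq))
```

For a symmetric distribution the two intervals nearly coincide; for a skewed one, as here, the shortest interval shifts toward the mode and is narrower than the equal-tailed interval.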

Morphology of horseshoe crabs.

Description

Data on horseshoe crab morphology collected by Brockman(1996) and used by Agresti(2012).

Usage

data(HSCrab)

Format

A data frame with 173 observations on the following 5 variables:

Col: an indicator variable for the carapace color
spineW: coded width of the spine
Width: maximal width of the carapace (cm)
Satell: number of satellite males
Weight: weight in g

Source

Brockman, H.J. (1996) Satellite male groups in horseshoe crabs, Limulus polyphemus. Ethology 102(1), 1–21.

References

Agresti, A.(2012) Categorical Data Analysis (3rd ed.) Wiley, New York.

van Hulst, R. 2018. Evaluating Scientific Evidence. ms.

Examples

data(HSCrab)
plot(Weight ~ Width, col = Col, data = HSCrab)

Likelihood analysis of the binomial parameter for one sample.

Description

When given the number of “successes” and the sample size this function plots the normed likelihood of values of the binomial parameter \pi and calculates the likelihood ratio for a hypothesized value and the maximum likelihood value for the sample, as well as an approximate frequentist p-value.

Usage

L1prop(x, n, p.hypoth, pLset=0.05)

Arguments

x

the number of sampling units with the feature

n

the number of sampling units examined

p.hypoth

the hypothesized probability

pLset

the desired likelihood for the likelihood interval

Value

none, the normed likelihood for different values of the binomial probability is plotted with the likelihood interval, and some information is printed

Author(s)

Robert van Hulst

References

van Hulst, R. 2018. Evaluating Scientific Evidence. ms.

Pawitan, Y. 2001. In All Likelihood. Oxford University Press, Oxford.

Examples

L1prop(13, 78, 0.02)
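
The normed likelihood is the binomial likelihood divided by its maximum, which is attained at the sample proportion x/n. The sketch below computes the likelihood ratio of the hypothesized value against the maximum likelihood value and finds the likelihood interval by root finding. It assumes that the pLset argument acts as a cutoff on the normed likelihood; the variable names are illustrative only.

```r
x <- 13
n <- 78
p_hyp <- 0.02                                   # hypothesized probability
p_hat <- x / n                                  # maximum likelihood estimate

# Likelihood relative to its maximum, so that norm_lik(p_hat) is 1.
norm_lik <- function(p) dbinom(x, n, p) / dbinom(x, n, p_hat)

# Likelihood ratio of the hypothesized value against the MLE
lr <- norm_lik(p_hyp)

# Likelihood interval: all p whose normed likelihood is at least the cutoff
cutoff <- 0.05
f <- function(p) norm_lik(p) - cutoff
lower <- uniroot(f, c(1e-6, p_hat), tol = 1e-9)$root
upper <- uniroot(f, c(p_hat, 1 - 1e-6), tol = 1e-9)$root
print(c(p_hat = p_hat, LR = lr, lower = lower, upper = upper))
```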

Likelihood analysis of the binomial parameters for two samples.

Description

When given the numbers of “successes” and the sample sizes for the two samples, this function plots the normed likelihoods of the two samples and calculates the likelihood ratio for two different models, one fitting two binomial parameters, and one fitting only one.

Usage

L2prop(x, n)

Arguments

x

a vector containing the 2 numbers of sampling units with the feature

n

a vector containing the 2 numbers of sampling units examined

Value

none, the inferred difference between the probabilities and its 95% credible interval are calculated and a plot is produced

Author(s)

Robert van Hulst

References

van Hulst, R. 2018. Evaluating Scientific Evidence. ms.

Examples

L2prop(c(13, 22), c(78, 92))
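
The comparison of the two models can be sketched by maximizing the binomial likelihood under each: once with a separate probability for each sample, and once with a single common probability estimated from the pooled data. The code below is a minimal base R illustration of that likelihood ratio, not the package's implementation.

```r
x <- c(13, 22)
n <- c(78, 92)

# Joint log-likelihood of both samples; a scalar p is recycled to both.
loglik <- function(p) sum(dbinom(x, n, p, log = TRUE))

ll_two <- loglik(x / n)               # model with two binomial parameters
ll_one <- loglik(sum(x) / sum(n))     # model with one common parameter
lr <- exp(ll_one - ll_two)            # likelihood ratio, one-parameter vs two
print(c(logLik_two = ll_two, logLik_one = ll_one, LR = lr))
```

Because the one-parameter model is nested in the two-parameter model, this ratio can never exceed 1; values close to 1 indicate that the data barely favor separate probabilities.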

Computes the posterior probability of having a certain disease from prevalence, sensitivity, and specificity data.

Description

If experimental data on the sensitivity and the specificity of a diagnostic test are available, and the prevalence of the condition is known together with its raw data, then this function estimates the posterior probability of having the condition, with its 95% credible interval.

Usage

MedDiagn(x0, n0, x1, n1, x2, n2, N = 10000,
  alpha = 0.05, pdf = FALSE)

Arguments

x0

prevalence raw data: number of people with a certain condition

n0

number of people examined for that condition

x1

sensitivity data: number of people with the disease for whom this test was positive

n1

total number of people in the sensitivity sample

x2

specificity raw data: number of people who did not have the disease who tested negative

n2

total number of people in the specificity sample

N

number of cases to be simulated (best left at 10000 or greater)

alpha

1 - alpha is the required level of credibility (default alpha = 0.05, i.e. 95%)

pdf

set this to TRUE only if you want to keep a pdf-file of the posterior probability plot

Value

none returned: a plot and printed information are produced

Author(s)

Robert van Hulst

References

van Hulst, R. 2018. Evaluating Scientific Evidence. ms.

Examples

MedDiagn(105, 35000, 72, 80, 640, 800)
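
A simulation of this kind can be sketched by drawing the prevalence, the sensitivity, and the specificity from their beta posteriors and applying Bayes' rule to each draw. The code below assumes uniform priors on all three quantities and assumes that the target is the probability of having the condition given a positive test; both are assumptions about how MedDiagn works, not documented facts.

```r
set.seed(99)
N <- 10000
alpha <- 0.05

# Beta posteriors under assumed uniform Beta(1, 1) priors
prev <- rbeta(N, 1 + 105, 1 + 35000 - 105)   # prevalence: 105 of 35000
sens <- rbeta(N, 1 + 72, 1 + 80 - 72)        # sensitivity: 72 of 80
spec <- rbeta(N, 1 + 640, 1 + 800 - 640)     # specificity: 640 of 800

# Bayes' rule applied draw by draw: P(condition | positive test)
post <- sens * prev / (sens * prev + (1 - spec) * (1 - prev))

summary_post <- c(median = median(post),
                  quantile(post, c(alpha / 2, 1 - alpha / 2)))
print(summary_post)
```

Propagating the uncertainty in all three inputs, rather than plugging in point estimates, is what gives the result a credible interval instead of a single number.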

computes the Negative Predictive Value.

Description

The negative predictive value (NPV) of a diagnostic test is the probability that someone with a negative diagnostic test for a condition does not have the condition. The NPV can easily be calculated from the prevalence, the sensitivity, and the specificity, but this function automates the procedure.

Usage

NPV(sens, spec, prev)

Arguments

sens

the sensitivity of the test

spec

the specificity of the test

prev

the prevalence of the disease

Value

the negative predictive value

Author(s)

Robert van Hulst

References

van Hulst, R. 2018. Evaluating Scientific Evidence. ms.

Examples

NPV(0.9, 0.8, 0.003)
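
The calculation follows from Bayes' rule: among all negative tests, the true negatives have probability spec * (1 - prev) and the false negatives have probability (1 - sens) * prev. A minimal sketch, using the hypothetical name `npv_sketch` so as not to mask the package function:

```r
# npv_sketch() is a hypothetical stand-in, not the package's NPV().
npv_sketch <- function(sens, spec, prev) {
  true_neg <- spec * (1 - prev)       # P(negative test and no condition)
  false_neg <- (1 - sens) * prev      # P(negative test and condition)
  true_neg / (true_neg + false_neg)
}

npv_value <- npv_sketch(0.9, 0.8, 0.003)   # same inputs as the example above
print(npv_value)
```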

calculates the positive predictive value (PPV) of a diagnostic test.

Description

The positive predictive value (PPV) of a diagnostic test is the probability that someone with a positive diagnostic test for a condition does have the condition. The PPV can easily be calculated from the prevalence, the sensitivity, and the specificity, but this function automates the procedure.

Usage

PPV(sens, spec, prev)

Arguments

sens

the sensitivity of the test

spec

the specificity of the test

prev

the prevalence of the disease

Value

the positive predictive value of the test

Author(s)

Robert van Hulst

References

van Hulst, R. 2018. Evaluating Scientific Evidence. ms.

Examples


PPV(0.9, 0.8, 0.003)
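
By Bayes' rule the PPV is the probability of a true positive divided by the probability of any positive test. The sketch below, with the hypothetical name `ppv_sketch`, also evaluates the formula over a few made-up prevalences to show how strongly the PPV depends on how common the condition is.

```r
# ppv_sketch() is a hypothetical stand-in, not the package's PPV().
ppv_sketch <- function(sens, spec, prev) {
  true_pos <- sens * prev                 # P(positive test and condition)
  false_pos <- (1 - spec) * (1 - prev)    # P(positive test and no condition)
  true_pos / (true_pos + false_pos)
}

ppv_value <- ppv_sketch(0.9, 0.8, 0.003)  # same inputs as the example above
prevalences <- c(0.003, 0.03, 0.3)        # made-up prevalences for comparison
ppv_by_prev <- ppv_sketch(0.9, 0.8, prevalences)
print(ppv_value)
print(round(ppv_by_prev, 3))
```

With a rare condition most positive tests are false positives even when the test itself is fairly accurate, so the PPV stays low until the prevalence becomes substantial.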

Data of the growth of tissue cultures on five different media.

Description

These data came from a designed experiment reported in Sokal and Rohlf(1995), box 9.4. The growth (in arbitrary units) of pea sections grown in tissue culture on five different sugars was replicated ten times.

Usage

data(SRb94)

Format

A data frame with 50 observations on the following 2 variables:

L: length difference in mm
Treatm: a factor with levels "Contr", "fruct.", "gluc.", "gluc&fruct.", and "sucr."

Source

Sokal, R.R., and Rohlf, F.J. 1995. Biometry. Freeman, New York.

References

van Hulst, R. 2018. Evaluating Scientific Evidence. ms.

Examples

data(SRb94)
with(SRb94, meansplot(L, Treatm))

A support function that calculates the sum of squares of a data vector.

Description

The sum of squares of the input vector is returned.

Usage

SSQ(x)

Arguments

x

a vector of numbers without missing values

Value

the sum of squares of x

Author(s)

Robert van Hulst

Examples

SSQ(x = rnorm(n=100))

Mortality due to heart infarct in smokers and non-smokers.

Description

The data are from a retrospective study that compared mortality due to a heart infarct in people who smoked and sex-matched controls who did not.

Usage

data(Smoking)

Format

A matrix with 781 observations cross-classified on the following 2 factors: “Infarct” (“Yes” or “Control”, rows), and “EverSmoked” (“Yes” or “No”, columns).

Source

unknown

References

van Hulst, R. 2018. Evaluating Scientific Evidence. ms.

Data on the incidence of hypertension and three indicator variables.

Description

A total of 433 persons were tested for hypertension and checked for whether they were smokers, obese, or snored. The data are in Altman(1991).

Usage

data(Snoring)

Format

A data frame with 8 observations on the following 5 variables:

smoking: did the person smoke (1) or not (0)?
obese: was the person obese (1) or not (0)?
snoring: did the person snore (1) or not (0)?
n: the number of persons observed with these covariates
hypert: the number of those n persons who suffered from hypertension

Source

Altman, D.G. 1991. Practical Statistics for Medical Research. Chapman & Hall, London.

References

van Hulst, R. 2018. Evaluating Scientific Evidence. ms.

Examples

data(Snoring)
fit <- glm(cbind(hypert, n - hypert) ~ smoking + obese + snoring,
  family=binomial, data=Snoring)
summary(fit)

function to plot diverse Beta distributions for use as Binomial priors

Description

This function just plots some Beta distributions with commonly used parameters

Usage

binPriorsPlot()

Value

none produced, the function just produces one (compound) plot

Author(s)

Robert van Hulst

References

van Hulst, R. 2018. Evaluating Scientific Evidence. ms.

Examples

binPriorsPlot()

Data on body temperature, heart rate, and gender of 130 human subjects.

Description

These data were collected by Mackowiak, Wasserman, and Levine(1992), and have been used, among others, by Ntzoufras(2009).

Usage

data(bodytemp)

Format

A data frame with 130 observations on the following 3 variables:

temperature: body temperature in degrees Fahrenheit
gender: a factor with levels 'female' and 'male'
heart.rate: heart rate in beats per minute

Source

Mackowiak, P.A., Wasserman, S.S., and Levine, M.M. (1992) A critical appraisal of 98.6 degrees F, the upper limit of the normal body temperature, and other legacies of Carl Reinhold August Wunderlich. JAMA 268, 1578–1580.

Ntzoufras, I. (2009) Bayesian Modeling Using WinBUGS. Wiley, Hoboken, N.J.

van Hulst, R. 2018. Evaluating Scientific Evidence. ms.

Examples

data(bodytemp)
B2Nsir(temperature ~ gender, bodytemp)

Mortality data of moth larvae due to increasing doses of insecticide.

Description

Batches of twenty larvae were exposed to increasing doses of insecticide, and the number of survivors and their sexes were noted. These data were reported by Collett (1991) and used by Venables and Ripley (1994 and later editions). They resulted from an experiment on the toxicity of different doses of a pyrethroid insecticide to the tobacco budworm Heliothis virescens.

Usage

data(budworm)

Format

A data frame with 12 observations on the following 3 variables:

ldose: the log of the dose of the insecticide
dead: the number of budworms that were dead a day later
sex: a factor with two levels: “F” and “M”

Source

Collett, D. 1991. Modelling Binary Data. Chapman and Hall, London.

References

van Hulst, R. 2018. Evaluating Scientific Evidence. ms.

Venables, W.N. and Ripley, B.D. 1994. Modern Applied Statistics with S-PLUS. Springer Verlag, New York.

Examples

data(budworm)
fit <- glm(cbind(dead, 20 - dead) ~ ldose, data=budworm,
family=binomial)
summary(fit)

Made-up data that are not unlike the actual data collected by Nespolo et al.(2003).

Description

Nespolo et al.(2003) collected data on the metabolic rates (as measured by oxygen consumption) of crickets kept and acclimated at three different temperatures. Since the original data were not available and only a statistical summary was published, we simulated these data to approximately agree with the statistical summary.

Usage

data(crickets)

Format

A data frame with 292 observations on the following 3 variables:

VO2: oxygen consumption in µl/h (a measure of basal metabolic rate)
mass: weight of the cricket in mg
temp: temperature in degrees C.

Source

Nespolo et al., 2003.

References

Nespolo, R.F., Lardies, M.A., and Bozinovic, F. 2003. Intrapopulational variation in the standard metabolic rate of insects: repeatability, thermal dependence and sensitivity of (Q[10]) on oxygen consumption in a cricket. Journal of Experimental Biology 206, 4309–4315.

van Hulst, R. 2018. Evaluating Scientific Evidence. ms.

Examples

data(crickets)
crickets7 <- subset(crickets, temp == 7)
with(crickets7, scatter.smooth(mass, VO2))

Charles Darwin's (1876) data on the fecundity of selfed and crossed corn plants.

Description

Charles Darwin(1876) provided data on the difference in the heights attained by selfed and crossed mother plants.

Usage

data(darwin)

Format

A data frame with 15 observations on the following variable:

difference: the difference in height in inches between the members of each matched pair of offspring of a selfed and a crossed mother plant

Source

Darwin, C.R. 1876. The effects of cross and self fertilisation in the vegetable kingdom. John Murray, London.

van Hulst, R. 2018. Evaluating Scientific Evidence. ms.

Examples

data(darwin)
with(darwin, qqnorm(difference) )
with(darwin, qqline(difference) )

Data on lung capacity of 654 children and adolescents.

Description

These data come from Rosner (2006), and represent forced expiratory volume (FEV) in l/s and several covariates.

Usage

data(fev)

Format

A data frame with 654 observations on the following 6 variables:

Id: an identification code
Age: age in years
FEV: forced expiratory volume in l/s
Hgt: height in inches
Sex: gender: 0 for female, 1 for male
Smoke: smokes (1) or not (0)

Source

Rosner, B. 2006. Fundamentals of Biostatistics. 6th ed. Duxbury Press.

Examples

data(fev)
splom(fev[c(3, 2, 4, 5, 6)], main="fev data")

Simon Newcomb's measurements of the speed of light

Description

In the late 1800s Simon Newcomb measured the time it took light to cover a certain distance. The data are reported in Stigler (1977) and have been widely used since to illustrate statistical inference.

Usage

data(lightspeed)

Format

A vector with 66 observations of the travel time of light.

Source

Stigler, S.M. (1977) Do robust estimators work with real data? Annals of Statistics 5, 1055–1098.

References

van Hulst, R. 2018. Evaluating Scientific Evidence. ms.

Examples

data(lightspeed)
qqnorm(lightspeed)
qqline(lightspeed)

A dot plot is produced for several related models showing for each model its LOOIC-value with its credible interval.

Description

The LOOIC-value (like the non-Bayesian AIC-value) is a useful measure of model performance for model prediction.

Usage

looicplot(looiclist, modnames, perc = 90)

Arguments

looiclist

a list of character-valued names of rstanarm model objects

modnames

a character-valued vector of model names for each of the models

perc

the percentage credibility for the credible intervals (defaults to 90%)

Value

None provided, but a printed list of looic-values, their standard errors, and credible intervals, and a dot plot with the same information are produced.

Author(s)

Robert van Hulst

References

van Hulst, R. 2018. Evaluating Scientific Evidence. ms.

Examples

## Not run: 
data(budworm)
Mbudworm1 <- stan_glm(formula = cbind(dead, 20 - dead) ~ ldose,
                      family = binomial, data = budworm,
                      prior = student_t(df = 7),
                      prior_intercept = student_t(df = 7))
Mbudworm2 <- stan_glm(formula = cbind(dead, 20 - dead) ~ ldose * sex,
                      family = binomial, data = budworm,
                      prior = student_t(df = 7),
                      prior_intercept = student_t(df = 7))
Mbudworm3 <- stan_glm(formula = cbind(dead, 20 - dead) ~ ldose + sex,
                      family = binomial, data = budworm,
                      prior = student_t(df = 7),
                      prior_intercept = student_t(df = 7))
looicplot(looiclist = list("Mbudworm1", "Mbudworm2", "Mbudworm3"),
          modnames = c("~ ldose", "~ ldose * sex", "~ ldose + sex") )

## End(Not run)
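
The interval shown for each model can be illustrated without fitting anything. The sketch below assumes a Normal approximation, LOOIC plus or minus a Normal quantile times the standard error; whether looicplot computes its intervals in exactly this way is an assumption, and the LOOIC values and standard errors are made-up numbers, not results for the budworm models.

```r
# Hypothetical LOOIC estimates and standard errors (made-up values, illustration only)
looic_tab <- data.frame(
  modnames = c("model A", "model B", "model C"),
  looic    = c(120.4, 115.2, 117.9),
  se       = c(6.1, 5.8, 6.0)
)

perc <- 90                               # same default as the perc argument above
z <- qnorm(1 - (1 - perc / 100) / 2)     # two-sided Normal quantile for a perc% interval

# Assumed Normal approximation: estimate plus or minus z standard errors
looic_tab$lwr <- looic_tab$looic - z * looic_tab$se
looic_tab$upr <- looic_tab$looic + z * looic_tab$se
print(looic_tab)
```

Models whose intervals overlap heavily are hard to tell apart on predictive performance alone.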

Plots a simple strip chart of the observations with group means and grand mean.

Description

A strip chart of the first argument grouped by the second argument is produced. This function is useful for looking at experimental data with a numeric response and a factorial predictor.

Usage

meansplot(y, grp)

Arguments

y

a vector of observed values

grp

a factor of the same length as the observation vector indicating the treatment under which each observation was obtained

Value

none returned: the function is used for the plot it produces

Author(s)

Robert van Hulst

References

van Hulst, R. 2018. Evaluating Scientific Evidence. ms.

Examples

data(PlantGrowth)
with(PlantGrowth, meansplot(weight, group))
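
For readers who want to see what such a chart contains, here is a rough base-graphics equivalent. It is a sketch only, not the code of meansplot; the jittering, the bar widths, and the line types are arbitrary choices.

```r
data(PlantGrowth)

# Group means and the grand mean that such a chart displays
group_means <- with(PlantGrowth, tapply(weight, group, mean))
grand_mean  <- mean(PlantGrowth$weight)

# Strip chart of the observations, one vertical strip per group
stripchart(weight ~ group, data = PlantGrowth, vertical = TRUE,
           method = "jitter", pch = 1, ylab = "weight")
# Short horizontal bar at each group mean
segments(x0 = seq_along(group_means) - 0.2, y0 = group_means,
         x1 = seq_along(group_means) + 0.2, y1 = group_means, lwd = 2)
# Dashed line at the grand mean
abline(h = grand_mean, lty = 2)
```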

Produces a Normality plot for the argument surrounded by eight other Normality plots for Normal distributions having the same mean and standard deviation as the argument.

Description

Normality plots can be hard to judge if one is not experienced. This function plots a Normality plot for the data surrounded by eight other Normality plots for samples with the same mean and standard deviation that were randomly generated. The eight plots provide an idea of the variability to be expected in Normally distributed data.

Usage

nineplot(x)

Arguments

x

a vector of observations to be examined for Normality

Value

none produced: the function is used for the plot it produces

Author(s)

Robert van Hulst

References

van Hulst, R. 2018. Evaluating Scientific Evidence. ms.

Examples

nineplot(rt(100, 2))
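
The idea can be reproduced with base graphics, as in the sketch below. This is not the code of nineplot: placing the data in the centre panel, the panel titles, and the seed are assumptions made for illustration.

```r
set.seed(1)                      # arbitrary seed, only to make the sketch reproducible
x <- rt(100, 2)                  # heavy-tailed data, as in the example above

old_par <- par(mfrow = c(3, 3), mar = c(2, 2, 2, 1))
for (i in 1:9) {
  if (i == 5) {
    # Centre panel: the actual data
    qqnorm(x, main = "data")
    qqline(x)
  } else {
    # Surrounding panels: Normal samples with the mean and sd of the data
    sim <- rnorm(length(x), mean = mean(x), sd = sd(x))
    qqnorm(sim, main = "simulated")
    qqline(sim)
  }
}
par(old_par)                     # restore the previous graphics settings
```

If the centre panel stands out clearly from its eight neighbours, the departure from Normality is larger than sampling variability alone would suggest.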

A robust comparison of the location and the scale of the input vector.

Description

Following the recommendation in Gelman et al. (2014), a large sample of supposedly Normally distributed data is taken to show signs of overdispersion when more than 10% of the observations lie further than 1.5 times the IQR from the median.

Usage

overdispersionCheck(x)

Arguments

x

an input vector of reals without missing values

Value

The function prints the approximate percentage of observations that are further from the median than would be expected in a normal distribution.

Author(s)

Robert van Hulst

References

Gelman, A., Carlin, J.B., Stern, H.S., Dunson, D.B., Vehtari, A., and Rubin, D.B. 2014. Bayesian Data Analysis. Third Ed. CRC Press.

Examples

overdispersionCheck(rt(100, 1))
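
The comparison described above can be sketched in a few lines of base R. This is an illustration of the rule, not the code of overdispersionCheck; the seed is arbitrary. Under a Normal distribution a little over 4% of observations are expected beyond the cut-off, which is why a share above 10% is a warning sign.

```r
set.seed(1)                          # arbitrary seed for reproducibility
x <- rt(100, 1)                      # Cauchy-like data, as in the example above

# Observed share of points further than 1.5 * IQR from the median
observed <- mean(abs(x - median(x)) > 1.5 * IQR(x))

# Share expected under a Normal distribution: the IQR spans
# 2 * qnorm(0.75) standard deviations, so the cut-off is 3 * qnorm(0.75) sd
expected <- 2 * pnorm(-3 * qnorm(0.75))

round(100 * c(observed = observed, expected = expected), 1)
```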

Conversion of a frequentist p-value to the lower bound of the Bayes factor of the null hypothesis against the alternative, assuming equal odds of the null and the alternative

Description

This function computes the approximate lower bound to the Bayes factor of the null hypothesis against the alternative, assuming equal odds of the null and the alternative.

Usage

p2BF(p)

Arguments

p

the frequentist p-value (which has to be less than 1/e, approximately 0.37)

Value

the approximate lower bound of the Bayes factor of the null hypothesis against the alternative

Note

the p-value should be less than 1/e (approximately 0.37).

Author(s)

Robert van Hulst

References

Sellke, T., Bayarri, M.J., and Berger, J.O. 2001. Calibration of p Values for Testing Precise Hypotheses. Am. Statistician 55(1) pp 62–71.

van Hulst, R. 2018. Evaluating Scientific Evidence. ms.

Examples

p2BF(p = 0.05)
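
The calibration given by Sellke et al. (2001) is -e p log(p), valid for p < 1/e. The sketch below evaluates it directly; the helper name bf_bound is hypothetical, and it is an assumption that p2BF uses exactly this expression.

```r
# Hypothetical helper (not the package's p2BF): the Sellke et al. (2001) calibration
bf_bound <- function(p) {
  stopifnot(p > 0, p < exp(-1))    # the bound only applies for p < 1/e
  -exp(1) * p * log(p)
}

bf_bound(0.05)   # about 0.41
```

A p-value of 0.05 therefore corresponds to odds of at best roughly 2.5 to 1 against the null hypothesis, far weaker evidence than "1 in 20" suggests.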

Conversion of a frequentist p-value to a lower bound of the posterior probability that the null hypothesis is true assuming equal odds of the null and the alternative

Description

This function computes the approximate lower bound to the posterior probability of the null hypothesis assuming equal odds of the null and the alternative. See Sellke et al. (2001) for the derivation, and note that the posterior probability of the null hypothesis is what many incorrectly assume the p-value is measuring.

Usage

p2minpp(p)

Arguments

p

the frequentist p-value (which has to be less than 1/e, approximately 0.37)

Value

the approximate lower bound of the posterior probability of the null hypothesis

Note

the p-value should be less than 1/e (approximately 0.37).

Author(s)

Robert van Hulst

References

Sellke, T., Bayarri, M.J., and Berger, J.O. 2001. Calibration of p Values for Testing Precise Hypotheses. Am. Statistician 55(1) pp 62–71.

van Hulst, R. 2018. Evaluating Scientific Evidence. ms.

Examples

p2minpp(p=0.05)
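
With prior odds of 1, a Bayes factor B for the null hypothesis gives a posterior probability of 1 / (1 + 1/B). The sketch below combines this with the Sellke et al. (2001) bound -e p log(p); the helper name min_pp is hypothetical, and it is an assumption that p2minpp is implemented in exactly this way.

```r
# Hypothetical helper (not the package's p2minpp)
min_pp <- function(p) {
  stopifnot(p > 0, p < exp(-1))    # the calibration only applies for p < 1/e
  bf <- -exp(1) * p * log(p)       # lower bound on the Bayes factor for H0
  1 / (1 + 1 / bf)                 # posterior probability of H0 under prior odds of 1
}

min_pp(0.05)   # about 0.29
```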

Carcinogenesis data on rats exposed to a carcinogen.

Description

Up to forty-eight rats were exposed to the carcinogen retinyl acetate or to a placebo in their diet, after which the number of tumors they developed was evaluated.

Usage

data(rats)

Format

A data frame with 71 observations on the following 2 variables:

y: number of rats that developed tumors
N: number of rats in group

Source

Gail, M.H., Santner, T.J., and Brown, C.C. (1980) An analysis of comparative carcinogenesis experiments based on multiple times to tumor. Biometrics 36, 255–266.

References

van Hulst, R. 2018. Evaluating Scientific Evidence. ms.

Universal Fisherian significance test with confidence interval.

Description

Given a significance level p, this function performs a Fisherian significance test of the null hypothesis at that level and reports the result of the test together with the lower and upper limits of the corresponding confidence interval. See Kadane (2016) for the idea behind this.

Usage

sigtestCI(p)

Arguments

p

the desired significance level

Details

Note that this function does not require any data: if a rare (as long as p is sufficiently small) event occurs, H[0] is deemed to be implausible, and rejected. If such an event does not occur, we can simply try to do the experiment again. A Neyman-Pearson hypothesis test does require data and also an alternative hypothesis. For a NP hypothesis test we can (and should) consider the power of the test (the probability of rejecting H[0] when H[a] is true).

Value

A message informing the user if H0 was rejected or not and the lower and upper boundaries of the corresponding confidence interval.

Author(s)

Robert van Hulst

References

Kadane, J.B. 2016. Beyond hypothesis testing. Entropy 18, 199.

van Hulst, R. 2018. Evaluating Scientific Evidence. ms.

Examples

sigtestCI(p=0.05)
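
The Details above can be read as describing a test that rejects H0 whenever an independent event of probability p happens. The sketch below follows that reading, assuming the rare event is a uniform random draw falling below p. The helper name data_free_test is hypothetical, and the confidence-interval part of sigtestCI is not reproduced here.

```r
# Hypothetical helper: reject H0 when a Uniform(0, 1) draw falls below p.
# No data enter the decision, so the rejection probability is p
# whether or not H0 is true.
data_free_test <- function(p) runif(1) < p

set.seed(1)                               # arbitrary seed for reproducibility
rejections <- replicate(10000, data_free_test(0.05))
mean(rejections)                          # long-run rejection rate, close to 0.05
```

The test has the advertised significance level, yet its power equals p as well, which is the point of the comparison with a Neyman-Pearson test.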

Conversion of 2 props input to 2x2 contingency table

Description

This function converts the successes and totals vectors required as input for function B2props to a 2x2 contingency table for input to CTA or Bft2x2.

Usage

sn2ft2x2(s, n)

Arguments

s

a vector of length 2 of successes

n

a vector of length 2 of numbers of trials

Value

a 2 x 2 contingency table equivalent to the two arguments

Author(s)

Robert van Hulst

References

van Hulst, R. 2018. Evaluating Scientific Evidence. ms.

Examples

sn2ft2x2(c(47, 59), c(120, 125))
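
The conversion amounts to adding the failures, n - s, next to the successes. The sketch below uses one row per sample and one column per outcome; this orientation and the dimension names are assumptions, and the table returned by sn2ft2x2 may be laid out differently.

```r
s <- c(47, 59)      # successes in each sample, as in the example above
n <- c(120, 125)    # trials in each sample

# Assumed layout: one row per sample, columns for successes and failures
tab <- cbind(success = s, failure = n - s)
rownames(tab) <- c("sample 1", "sample 2")
tab
```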

Plotting routine for dataframes of looic values.

Description

Produces a dotchart with error bars as summary of a dataframe with model names (‘modnames’), LOOIC-values (‘looic’), standard errors (‘se’), lower values (‘lwr’), and upper values (‘upr’).

Usage

sumchart(df, rownames, groups, perc)

Arguments

df

data.frame name

rownames

model names

groups

row names

perc

the percentage of credibility desired

Value

A plot is produced.

Author(s)

Robert van Hulst

References

van Hulst, R. 2018. Evaluating Scientific Evidence. ms.
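
A comparable chart can be drawn with base graphics, as in the sketch below. It does not call sumchart; the data frame holds made-up values in the column layout named in the Description, and the 90% Normal-approximation limits are an assumption.

```r
# Made-up summary table with the columns named in the Description
df <- data.frame(
  modnames = c("model A", "model B", "model C"),
  looic    = c(120.4, 115.2, 117.9),
  se       = c(6.1, 5.8, 6.0)
)
df$lwr <- df$looic - 1.645 * df$se   # assumed 90% Normal-approximation limits
df$upr <- df$looic + 1.645 * df$se

# Dot chart of the estimates, with the x axis wide enough for the error bars
dotchart(df$looic, labels = as.character(df$modnames),
         xlim = range(df$lwr, df$upr), xlab = "LOOIC", pch = 19)
# Horizontal error bars from lwr to upr, one per model
segments(x0 = df$lwr, y0 = seq_len(nrow(df)),
         x1 = df$upr, y1 = seq_len(nrow(df)))
```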

Weight gain in rats

Description

Rats were fed diets with different quantities of protein from either animal or plant sources. The weight gained at the end of the experiment was the response variable.

Usage

data("weightgain")

Format

A data frame with 40 observations on the following 3 variables:

source: source of protein given, a factor with levels Beef and Cereal
type: amount of protein given, a factor with levels High and Low
weightgain: weight gain in grams

Source

Hand, D.J., Daly, F., Lunn, A.D., McConway, K.J. and Ostrowski, E. 1994. A Handbook of Small Datasets, Chapman and Hall, London.

Examples

  data("weightgain")
  with(weightgain, table(source, type))
