Title: Multivariate Asymptotic Non-Parametric Test of Association
Version: 1.0.1
Date: 2023-10-09
Description: The Multivariate Asymptotic Non-parametric Test of Association (MANTA) enables non-parametric, asymptotic P-value computation for multivariate linear models. MANTA relies on the asymptotic null distribution of the PERMANOVA test statistic. P-values are computed using a highly accurate approximation of the corresponding cumulative distribution function. Garrido-Martín et al. (2022) <doi:10.1101/2022.06.06.493041>.
License: GPL-3
Encoding: UTF-8
Depends: R (>= 3.3.2)
Suggests: testthat
LazyData: true
URL: https://github.com/dgarrimar/manta
BugReports: https://github.com/dgarrimar/manta/issues
RoxygenNote: 7.2.3
NeedsCompilation: yes
Packaged: 2023-10-09 16:21:35 UTC; dgarrido
Author: Diego Garrido-Martín [aut, cre], Ferran Reverter [aut], Miquel Calvo [aut], Roderic Guigó [aut]
Maintainer: Diego Garrido-Martín <dgarrido@ub.edu>
Repository: CRAN
Date/Publication: 2023-10-10 10:30:02 UTC

Algorithm AS 204

Description

Distribution of a positive linear combination of \chi^2 random variables.

Usage

AS204(
  c,
  lambda,
  mult = rep(1, length(lambda)),
  delta = rep(0, length(lambda)),
  maxit = 1e+05,
  eps = 1e-14,
  mode = 1
)

Arguments

c

point at which the distribution is to be evaluated.

lambda

the weights \lambda_j.

mult

the multiplicities m_j.

delta

the non-centrality parameters \delta^2_j.

maxit

the maximum number of terms K (see Details).

eps

the desired level of accuracy.

mode

if "mode" > 0 then \beta = mode \cdot \lambda_{min}, otherwise \beta = 2/(1/\lambda_{min} + 1/\lambda_{max}).
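The rule for \beta can be sketched in a few lines of R. The helper name choose_beta is hypothetical and is not part of the package; it only restates the rule given for "mode".

```r
# Hypothetical helper (not exported by the package): beta as described for "mode".
choose_beta <- function(lambda, mode = 1) {
  stopifnot(all(lambda > 0))
  if (mode > 0) {
    mode * min(lambda)                        # beta = mode * lambda_min
  } else {
    2 / (1 / min(lambda) + 1 / max(lambda))   # beta = 2 / (1/lambda_min + 1/lambda_max)
  }
}

choose_beta(c(1, 4), mode = 1)  # 1
choose_beta(c(1, 4), mode = 0)  # 1.6
```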

Details

Algorithm AS 204 evaluates the expression

P [X < c] = P [ \sum_{j=1}^n \lambda_j \chi^2(m_j, \delta^2_j) < c ]

where \lambda_j and c are positive constants and \chi^2(m_j, \delta^2_j) represents an independent \chi^2 random variable with m_j degrees of freedom and non-centrality parameter \delta^2_j. This can be approximated by the truncated series

\sum_{k=0}^{K-1} a_k P [\chi^2(m+2k) < c/\beta]

where m = \sum_{j=1}^n m_j and \beta is an arbitrary constant (as given by argument "mode").
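As a rough cross-check of the quantity being evaluated, P[X < c] can be estimated by simulation in base R. This is a Monte Carlo sketch, not the AS 204 series itself; the function name mc_lincomb_cdf and its nsim and seed arguments are assumptions made for illustration.

```r
# Monte Carlo estimate of P[ sum_j lambda_j * chisq(m_j, delta2_j) < c ].
# Illustrative only: far less accurate than the truncated series of AS 204.
mc_lincomb_cdf <- function(c, lambda,
                           mult = rep(1, length(lambda)),
                           delta = rep(0, length(lambda)),
                           nsim = 1e5, seed = 1) {
  stopifnot(length(mult) == length(lambda), length(delta) == length(lambda))
  set.seed(seed)
  x <- numeric(nsim)
  for (j in seq_along(lambda)) {
    # independent (non-central) chi-squared draws, weighted by lambda_j
    x <- x + lambda[j] * rchisq(nsim, df = mult[j], ncp = delta[j])
  }
  mean(x < c)
}

# With a single term the result can be compared with pchisq:
mc_lincomb_cdf(4, lambda = 2, mult = 3)
pchisq(4 / 2, df = 3)
```

With several distinct weights no closed form is available in general, which is the case the series approximation addresses.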

The C++ implementation of algorithm AS 204 used here is the one employed by the farebrother method in the CompQuadForm package, with minor modifications.

Value

The function returns the probability P[X > c] = 1 - P[X < c] if the AS 204 fault indicator is 0 (see Note below), and NULL if the fault indicator is 4, 5 or 9, as the corresponding faults can be corrected by increasing "eps". Other faults raise an error.

Note

Algorithm AS 204 defines the following fault indicators:

-j) one or more of the constraints \lambda_j > 0, m_j > 0 and \delta^2_j \ge 0 is not satisfied.
1) non-fatal underflow of a_0.
2) one or more of the constraints n > 0, c > 0, maxit > 0 and eps > 0 is not satisfied.
3) the current estimate of the probability is < -1.
4) the required accuracy could not be obtained in maxit iterations.
5) the value returned by the procedure does not satisfy 0 \le P [X < c] \le 1.
6) the density of the linear form is negative.
9) faults 4 and 5.
10) faults 4 and 6.
0) otherwise.
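The return rule stated under Value can be summarised as a small dispatch on the fault indicator. The function handle_fault and its arguments are hypothetical and only mirror the documented behaviour; the error message is also an assumption.

```r
# Hypothetical sketch of the documented return rule (not the package code).
# ifault:  AS 204 fault indicator
# p_lower: value of P[X < c] produced by the algorithm
handle_fault <- function(ifault, p_lower) {
  if (ifault == 0) return(1 - p_lower)          # report the upper tail P[X > c]
  if (ifault %in% c(4, 5, 9)) return(NULL)      # recoverable: caller may retry with larger eps
  stop("AS 204 fault indicator: ", ifault)      # any other fault is an error
}

handle_fault(0, 0.25)
is.null(handle_fault(9, 0.25))
```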

Author(s)

Diego Garrido-Martín

References

Duchesne P., Lafaye de Micheaux P., Computing the distribution of quadratic forms: Further comparisons between the Liu-Tang-Zhang approximation and exact methods, Computational Statistics and Data Analysis, Vol. 54 (2010), 858-862

Farebrother R.W., Algorithm AS 204: The distribution of a positive linear combination of chi-squared random variables, Journal of the Royal Statistical Society, Series C (Applied Statistics), Vol. 33, No. 3 (1984), 332-339

See Also

farebrother


Simulated Measurements of Five Disease Biomarkers

Description

A simulated dataset containing the levels of 5 biomarkers, measured on different scales in 100 individuals. Missing observations appear as NA.

Usage

data(biomarkers)

Format

A matrix with 100 rows and 5 numerical variables:

biomarker1

levels of biomarker1

biomarker2

levels of biomarker2

...

Author(s)

Diego Garrido-Martín


Non-parametric, Asymptotic P-values for Multivariate Linear Models

Description

Fits a multivariate linear model and computes test statistics and asymptotic P-values for predictors in a non-parametric manner.

Usage

manta(
  formula,
  data,
  transform = "none",
  type = "II",
  contrasts = NULL,
  subset = NULL,
  fit = FALSE
)

Arguments

formula

object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted.

data

an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which manta is called.

transform

transformation of the response variables: "none", "sqrt" or "log". Default is "none".

type

type of sum of squares: "I", "II" or "III". Default is "II".

contrasts

an optional list. See contrasts.arg in model.matrix.default. Default is "contr.sum" for unordered factors and "contr.poly" for ordered factors. Note that this is different from the default setting in options("contrasts").
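A minimal sketch of an explicit contrasts list, shown here with base R's model.matrix rather than with manta itself. The toy data frame is made up for illustration; it uses sum-to-zero coding for an unordered factor and polynomial coding for an ordered factor.

```r
# Toy data; values are invented for illustration only.
d <- data.frame(
  gender = factor(c("male", "female", "female", "male", "female", "male")),
  status = factor(c("healthy", "mild", "severe", "mild", "healthy", "severe"),
                  levels = c("healthy", "mild", "severe"), ordered = TRUE)
)

# Named list: one entry per factor, in the form accepted by contrasts.arg.
contr <- list(gender = "contr.sum", status = "contr.poly")

X <- model.matrix(~ gender + status, data = d, contrasts.arg = contr)
colnames(X)
```

A list of the same form could be supplied through the contrasts argument.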

subset

subset of predictors for which summary statistics will be reported. Note that this is different from the "subset" argument in lm.

fit

logical. If TRUE the multivariate fit on transformed and centered responses is returned.

Details

A Y matrix is obtained after transforming (optionally) and centering the original response variables. Then, the multivariate fit obtained by lm is used to compute sums of squares (type-I, type-II or type-III), pseudo-F statistics and asymptotic P-values for the terms specified by the formula in a non-parametric manner. The designations "type-II" and "type-III" correspond exactly to those used in Anova (car package). "type-I" refers to sequential sums of squares.
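The first two steps (transform and center the responses, then fit with lm) can be sketched in base R. The helper prep_response and the simulated data are assumptions made for illustration; this is not the package's internal code.

```r
set.seed(1)
n <- 50
x <- rnorm(n)
# Positive toy responses, so that "sqrt" and "log" are defined.
Y_raw <- cbind(y1 = rexp(n), y2 = rexp(n) + 0.5 * abs(x), y3 = rexp(n))

# Hypothetical helper: optional transformation followed by column centering.
prep_response <- function(Y, transform = c("none", "sqrt", "log")) {
  transform <- match.arg(transform)
  Y <- switch(transform, none = Y, sqrt = sqrt(Y), log = log(Y))
  scale(Y, center = TRUE, scale = FALSE)  # center each column, keep its scale
}

Y <- prep_response(Y_raw, "sqrt")
fit <- lm(Y ~ x)   # a matrix response yields a multivariate ("mlm") fit
class(fit)
```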

Value

manta returns an object of class "manta", a list containing:

call

the matched call.

aov.tab

ANOVA table with Df, Sum Sq, Mean Sq, F values, partial R-squared and P-values.

type

the type of sum of squares ("I", "II" or "III").

precision

the precision in P-value computation.

transform

the transformation applied to the response variables.

na.omit

incomplete cases removed (see na.omit).

fit

if fit = TRUE the multivariate fit done on the transformed and centered response variables is also returned.
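A hedged usage sketch, assuming the package is installed and that the biomarkers and patients datasets documented in this manual are available after attaching it. The flag has_manta is introduced here only so that the snippet is skipped when the package is absent.

```r
has_manta <- requireNamespace("manta", quietly = TRUE)

if (has_manta) {
  library(manta)
  # All variables in 'patients' as predictors of the biomarker matrix.
  res <- manta(biomarkers ~ ., data = patients, type = "II")
  print(res$aov.tab)   # ANOVA table described under Value
}
```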

Author(s)

Diego Garrido-Martín

See Also

lm, Anova


Sums of Squares and Pseudo-F Statistics from a Multivariate Fit

Description

Computes the sum of squares, degrees of freedom, pseudo-F statistics and partial R-squared for each predictor from a multivariate fit. It also returns the eigenvalues of the residual covariance matrix.

Usage

manta.ss(fit, X, type = "II", subset = NULL, tol = 0.001)

Arguments

fit

multivariate fit obtained by lm.

X

design matrix obtained by model.matrix.

type

type of sum of squares ("I", "II" or "III"). Default is "II".

subset

subset of predictors for which summary statistics will be reported. Note that this is different from the "subset" argument in lm.

tol

threshold on the relative size of the eigenvalues: only those selected by e[e/sum(e) > tol] are kept, where e is the vector of eigenvalues of the residual covariance matrix. Required to prevent long running times of algorithm AS 204. Default is 0.001 to ensure minimal loss of accuracy.
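The effect of the filter e[e/sum(e) > tol] on a vector of eigenvalues can be seen with a small example. The eigenvalues below are made up for illustration.

```r
e   <- c(5, 3, 1.5, 0.005)   # invented eigenvalues
tol <- 0.001

e / sum(e)                    # relative contribution of each eigenvalue
e_kept <- e[e / sum(e) > tol] # the last one falls below tol and is dropped
e_kept
```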

Details

Different types of sums of squares (i.e. "I", "II" and "III") are available.

Value

A list containing:

SS

sums of squares for all predictors (and residuals).

df

degrees of freedom for all predictors (and residuals).

f.tilde

pseudo-F statistics for all predictors.

r2

partial R-squared for all predictors.

e

eigenvalues of the residual covariance matrix.
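For a single predictor, the returned quantities can be sketched in base R. The formulas below follow the usual PERMANOVA construction with Euclidean distances and are an assumption about how the values relate to one another, not a copy of the package code; the data are simulated.

```r
set.seed(2)
n <- 60
x <- rnorm(n)
# Centered toy responses; only the first depends on x.
Y <- scale(cbind(rnorm(n) + 0.8 * x, rnorm(n), rnorm(n)), scale = FALSE)

fit <- lm(Y ~ x)
R   <- residuals(fit)

df_x   <- 1
df_res <- n - 2
ss_res <- sum(R^2)              # residual sum of squares (trace of residual SSCP)
ss_x   <- sum(Y^2) - ss_res     # sum of squares explained by x

f_tilde <- (ss_x / df_x) / (ss_res / df_res)   # pseudo-F statistic
r2      <- ss_x / (ss_x + ss_res)              # partial R-squared

# Eigenvalues of the residual covariance matrix (divisor df_res assumed).
e <- eigen(cov(R) * (n - 1) / df_res, symmetric = TRUE, only.values = TRUE)$values
```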

Author(s)

Diego Garrido-Martín

See Also

AS204


Asymptotic P-values

Description

Computes asymptotic P-values given the numerator of the pseudo-F statistic, its degrees of freedom and the eigenvalues of the residual covariance matrix.

Usage

p.asympt(ss, df, lambda, eps = 1e-14, eps.updt = 2, eps.stop = 1e-10)

Arguments

ss

numerator of the pseudo-F statistic.

df

degrees of freedom of the numerator of the pseudo-F statistic.

lambda

eigenvalues of the residual covariance matrix.

eps

the desired level of accuracy.

eps.updt

factor by which eps is updated to retry execution of algorithm AS 204 when it fails with fault indicator 4, 5 or 9.

eps.stop

if eps > eps.stop, execution of algorithm AS 204 is not retried and the function raises an error. Default is 1e-10.
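The interplay of eps, eps.updt and eps.stop can be sketched as a retry loop. The function retry_eps and the stand-in solver are hypothetical: solver(eps) represents a call that returns NULL on a recoverable fault (indicator 4, 5 or 9), and the error message is an assumption.

```r
# Hypothetical sketch of the retry logic described by the arguments above.
retry_eps <- function(solver, eps = 1e-14, eps.updt = 2, eps.stop = 1e-10) {
  repeat {
    p <- solver(eps)
    if (!is.null(p)) return(c(p = p, eps = eps))  # P-value and accuracy reached
    eps <- eps * eps.updt                         # relax the accuracy and retry
    if (eps > eps.stop) stop("required accuracy exceeds eps.stop")
  }
}

# Stand-in solver: fails (NULL) until eps is at least 1e-12.
toy_solver <- function(eps) if (eps < 1e-12) NULL else 0.05
retry_eps(toy_solver)
```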

Value

A vector containing the P-value and the level of accuracy.

Author(s)

Diego Garrido-Martín

See Also

AS204


Simulated Metadata for 100 Patients

Description

A simulated dataset containing the age, gender and disease status of 100 individuals. Missing observations appear as NA.

Usage

data(patients)

Format

A data frame with 100 rows and 3 variables:

age

Age of the patient (numerical)

gender

Gender of the patient (factor with levels: "male" and "female")

status

Disease status of the patient (ordered factor with levels: "healthy", "mild" and "severe")

Author(s)

Diego Garrido-Martín


Print Coefficient Matrices (Multiple P-value Precision Limits)

Description

Modification of printCoefmat to use multiple P-value precision limits.

Usage

printCoefmat.mp(
  x,
  digits = max(3L, getOption("digits") - 2L),
  signif.stars = getOption("show.signif.stars"),
  signif.legend = signif.stars,
  dig.tst = max(1L, min(5L, digits - 1L)),
  cs.ind = 1:k,
  tst.ind = k + 1,
  zap.ind = integer(),
  P.values = NULL,
  has.Pvalue = nc >= 4 && substr(colnames(x)[nc], 1, 3) == "Pr(",
  eps.Pvalue = .Machine$double.eps,
  na.print = "NA",
  ...
)
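The idea of per-row precision limits can be illustrated with base R's format.pval, applied with a different eps for each P-value. This is a sketch of the concept only and does not reproduce printCoefmat.mp; the P-values and limits are made up.

```r
pv  <- c(3.2e-16, 0.0123, 4.1e-12)   # invented P-values
eps <- c(1e-14, 1e-14, 1.6e-13)      # invented precision limit for each row

# Values below their own limit are shown as "<limit" instead of a number.
out <- mapply(function(p, e) format.pval(p, digits = 3, eps = e), pv, eps)
out
```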

Author(s)

Diego Garrido-Martín

See Also

printCoefmat