Date: | 2025-07-03 |
Title: | Simple Multivariate Statistical Estimation and Tests |
Version: | 1.2.3 |
Description: | A collection of simple parameter estimation and tests for the comparison of multivariate means and variation, to accompany Chapters 4 and 5 of the book Multivariate Statistical Methods. A Primer (5th edition), by Manly BFJ, Navarro Alberto JA & Gerow K (2024) <doi:10.1201/9781003453482>. |
License: | MIT + file LICENSE |
Encoding: | UTF-8 |
RoxygenNote: | 7.3.2 |
Depends: | R (≥ 3.6) |
LazyData: | true |
Imports: | stringr, data.table, Hotelling, biotools |
Suggests: | car, knitr, rmarkdown, testthat (≥ 3.0.0) |
Config/testthat/edition: | 3 |
VignetteBuilder: | knitr |
URL: | https://github.com/ganava4/smsets |
BugReports: | https://github.com/ganava4/smsets/issues |
NeedsCompilation: | no |
Packaged: | 2025-07-04 17:58:55 UTC; jorgen |
Author: | Jorge Navarro Alberto |
Maintainer: | Jorge Navarro Alberto <ganava4@gmail.com> |
Repository: | CRAN |
Date/Publication: | 2025-07-08 09:50:05 UTC |
smsets
Description
A collection of simple parameter estimation and tests for the comparison of multivariate means and variation, to accompany Chapters 4 and 5 of the book Multivariate Statistical Methods. A Primer (5th edition), by Manly BFJ, Navarro Alberto JA & Gerow K (2024) doi:10.1201/9781003453482.
Details
The goal of smsets is to produce simple multivariate statistical tests for
means and variances / covariances for one single factor with two or more
levels, including multiple two-sample t- and Levene’s tests, Hotelling’s
T^2 test, extended two-sample Levene’s tests for multivariate data,
one-way MANOVA, van Valen’s test and Box’s M test. A Penrose's distance
calculator is also implemented.
Author(s)
Maintainer: Jorge Navarro Alberto ganava4@gmail.com (ORCID) [copyright holder]
References
Manly, B.F.J., Navarro Alberto, J.A. and Gerow, K. (2024) Multivariate Statistical Methods. A Primer. 5th Edn. CRC Press.
See Also
Useful links:
https://github.com/ganava4/smsets
Report bugs at https://github.com/ganava4/smsets/issues
F approximation of Box's M test
Description
An R function which implements an F approximation for testing the homogeneity of covariance matrices by Box's M. This is an alternative approach to the chi square approximation which requires group sample-sizes to be at least 20.
Usage
BoxM.F(x, group)
Arguments
x |
A data frame with |
group |
The classification factor defining m samples or groups.
It must be one of the columns in |
Details
For m samples, the M statistic is given by the equation
M = \frac{\prod_{j=1}^{m} |\mathbf{C}_j|^{(n_{j}-1)/2}}{|\mathbf{C}|^{(n-m)/2}}
where n_{j} is the sample size of the j-th sample, |\mathbf{C}_j| is the
determinant of the covariance matrix for the j-th sample, |\mathbf{C}| is
the determinant of the pooled covariance matrix, and n is the total number
of observations.
Large values of M provide evidence that the samples are not from
populations with the same covariance matrix. In addition to the observed
M-value itself, the F-approximation involves the sample sizes and the
number of variables analyzed. See the reference for details. Box's test is
sensitive to deviations from normality in the distribution of the variables.
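The equation for M can be evaluated directly in base R. The following is a minimal sketch, not the code used by BoxM.F: it assumes the built-in iris data as a stand-in for a data frame with one grouping factor, and it works on the logarithmic scale because the determinants raised to large powers can easily under- or overflow. The F approximation itself is not reproduced here.

```r
# Sketch: log of the M statistic for the three species in the built-in iris data.
X <- iris[, 1:4]
g <- iris$Species

n_j <- as.numeric(table(g))          # sample sizes n_j
n   <- sum(n_j)                      # total number of observations
m   <- nlevels(g)                    # number of samples

# Sample covariance matrices C_j, one per group
C_j <- lapply(split(X, g), cov)

# Pooled covariance matrix C = sum((n_j - 1) * C_j) / (n - m)
C <- Reduce(`+`, Map(function(S, nj) (nj - 1) * S, C_j, n_j)) / (n - m)

# log M = sum((n_j - 1)/2 * log|C_j|) - (n - m)/2 * log|C|
logdet <- function(S) as.numeric(determinant(S, logarithm = TRUE)$modulus)
logM <- sum((n_j - 1) / 2 * sapply(C_j, logdet)) - (n - m) / 2 * logdet(C)
logM
```

The names X, g, logdet and logM are illustrative only and do not correspond to components of the "BoxM.F" object.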
Value
Returns an object of class "BoxM.F", a list containing the following
components:
name | A character string describing the function. |
Cov.Mat | A list containing the m sample covariance matrices |
Cov.pooled | The pooled covariance matrix |
BoxM.stat | The value of Box's M statistic |
F.BoxM | The calculated F-statistic |
df.v1 | Numerator degrees of freedom for the F statistic |
df.v2 | Denominator degrees of freedom for the F statistic |
Pvalue | P-value for the F statistic |
group | a character string specifying the name of the classification factor defining groups. |
levels.group | a vector of length m, showing the levels in factor group. |
data.name | a character string giving the name of the data. |
variables | a character string vector containing the variable names. |
data | the data frame analyzed. |
Author(s)
Jorge Navarro Alberto, ganava4@gmail.com
References
Manly, B.F.J., Navarro Alberto, J.A. and Gerow, K. (2024) Multivariate Statistical Methods. A Primer. 5th Edn. Chapman and Hall/CRC.
Examples
data(skulls)
resBoxM.F <- BoxM.F(skulls, Period)
# Brief output
resBoxM.F
Hotelling's T^2 test with extra information
Description
An R function which implements Hotelling's T^2 test assuming equal
covariance matrices, with extra information.
Usage
Hotelling.mat(x, group, level1)
Arguments
x |
a data frame with one two-level factor and p response variables. |
group |
two-level factor defining groups. It must be one of the columns
in |
level1 |
a character string identifying Sample 1. The string must be one
of the factor levels in |
Details
This function is a simplified version of the function hotelling.test
implemented in the Hotelling package for the comparison of mean values of
two multivariate samples, under the assumption that covariance matrices are
equal. The summary methods in Hotelling.mat give more detailed information
on the calculations behind the T^2 test.
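The calculations behind the equal-covariance T^2 test can be sketched in base R. This is an illustration under stated assumptions rather than the internals of Hotelling.mat: it uses two species from the built-in iris data as the two samples, and the standard relationship F = (n1 + n2 - p - 1) T^2 / ((n1 + n2 - 2) p) with p and n1 + n2 - p - 1 degrees of freedom.

```r
# Two samples taken from the built-in iris data (illustrative choice)
X  <- iris[iris$Species != "setosa", 1:4]
g  <- droplevels(iris$Species[iris$Species != "setosa"])
X1 <- X[g == "versicolor", ]
X2 <- X[g == "virginica", ]
n1 <- nrow(X1); n2 <- nrow(X2); p <- ncol(X)

# Difference of mean vectors and pooled covariance matrix
d <- colMeans(X1) - colMeans(X2)
C <- ((n1 - 1) * cov(X1) + (n2 - 1) * cov(X2)) / (n1 + n2 - 2)

# Hotelling's T^2 and its exact F transformation
T2    <- as.numeric(n1 * n2 / (n1 + n2) * t(d) %*% solve(C) %*% d)
Fstat <- (n1 + n2 - p - 1) * T2 / ((n1 + n2 - 2) * p)
pval  <- pf(Fstat, p, n1 + n2 - p - 1, lower.tail = FALSE)

c(T2 = T2, F = Fstat, df1 = p, df2 = n1 + n2 - p - 1, p.value = pval)
```

With only two groups, the F-statistic obtained this way should agree with the Wilks test from a one-way MANOVA on the same data, which offers a quick cross-check.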
Value
Returns an object of class "Hotelling.mat", a list containing the following
components:
name | A character string describing the function. |
T2.list | A list containing two data frames with the mean vector
for the two samples, two covariance matrices (one matrix per sample),
the pooled covariance matrix, the inverse of the pooled covariance matrix,
the Hotelling's T^2 statistic, the F-statistic, the degrees of
freedom for the F-statistic and the P-value. |
group | a character string specifying the name of the two-level factor defining groups. |
levels.group | a vector of length two, showing the two levels in
factor group. |
data.name | a character string giving the name of the data. |
data | the data frame analyzed. |
Author(s)
Jorge Navarro Alberto, ganava4@gmail.com
References
Manly, B.F.J., Navarro Alberto, J.A. and Gerow, K. (2024) Multivariate Statistical Methods. A Primer. 5th Edn. Chapman and Hall/CRC.
Examples
data(sparrows)
results.T2 <- Hotelling.mat(sparrows, group = Survivorship, level1 = "S")
# Brief output
results.T2
Levene's test for two multivariate samples based on Hotelling's T^2 test with extra information
Description
An R function for the comparison of multivariate variation in two samples,
which implements Levene's test based on Hotelling's T^2.
Usage
LeveneT2(x, group, level1, var.equal = TRUE)
Arguments
x |
A data frame with one two-level factor and p response variables. |
group |
Two-level factor defining groups. It must be one of the columns
in |
level1 |
A character string identifying Sample 1. The string must be one
of the factor levels in |
var.equal |
A logical variable indicating whether to treat the
within-sample covariance matrices of absolute deviations around medians for
samples 1 and 2 as equal or not. The default is |
Details
LeveneT2 makes use of Hotelling's T^2 to test the variation in two
multivariate samples. This test is an alternative procedure that should be
more robust than Box's test, which is known to be rather sensitive to the
assumption that the samples are from multivariate normal distributions. In
LeveneT2 the data values are transformed into absolute deviations from their
respective sample medians
ADM_{ijk} = |x_{ijk}-M_{jk}|
where x_{ijk} is the value of variable X_{k} for the i-th individual in
sample j, and M_{jk} is the median of X_{k} in sample j. The question of
unequal variation between samples j = 1 and j = 2 becomes a T^2-test for
the difference of the mean ADM vectors.
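The transformation into absolute deviations from medians can be sketched in a few lines of base R. The example below is illustrative only: it assumes two species from the built-in iris data as the two samples, and the object names (adm, M_j) are hypothetical rather than components of the "LeveneT2" object.

```r
X <- iris[iris$Species != "setosa", 1:4]
g <- droplevels(iris$Species[iris$Species != "setosa"])

# Absolute deviations from the medians of each sample, variable by variable
adm <- lapply(split(X, g), function(Xj) {
  M_j <- apply(Xj, 2, median)           # M_jk, k = 1, ..., p
  abs(sweep(as.matrix(Xj), 2, M_j))     # |x_ijk - M_jk|
})

# Mean ADM vectors, one column per sample; these are what the T^2 test compares
sapply(adm, colMeans)
```

The two matrices in adm can then be passed to any two-sample Hotelling's T^2 routine in place of the original responses.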
Value
Returns an object of class "LeveneT2", a list containing the following
components:
name | A character string describing the function. |
medians | A list containing two vectors. The first vector,
medians1, contains the medians for all variables in sample 1 as
declared in parameter level1, and the second vector holds the
corresponding medians for the other sample. |
bygroup.data | A list with two data frames, matlevel1 and
matlevel2, containing the original variables for samples 1 and 2,
respectively. |
absdev.median | A list with two data frames,
abs.dev.median1 and abs.dev.median2, containing the absolute
deviations from sample medians for samples 1 and 2, respectively. |
LeveneT2.test | A list of class hotelling.test containing
the list stats and the scalar pval, produced by function
hotelling.test implemented in package Hotelling. |
var.equal | a logical variable indicating whether the two
covariance matrices were treated as being equal (TRUE) or not (FALSE). |
group | a character string specifying the name of the two-level factor defining groups. |
levels.group | a vector of length two, showing the two levels in
factor group. |
data.name | a character string giving the name of the data. |
variables | a character string vector containing the variable names. |
data | the data frame analyzed. |
The extractor function print.LeveneT2 returns an annotated output of the
test.
Author(s)
Jorge Navarro Alberto, ganava4@gmail.com
References
Curran, J. and Hersh, T. (2021). Hotelling: Hotelling's T^2 Test and Variants. R package version 1.0-8, https://CRAN.R-project.org/package=Hotelling.
Manly, B.F.J., Navarro Alberto, J.A. and Gerow, K. (2024) Multivariate Statistical Methods. A Primer. 5th Edn. Chapman and Hall/CRC.
Nel, D.G. and van der Merwe, C.A. (1986). A solution to the multivariate Behrens-Fisher problem. Comm. Statist. Theor. Meth., A15, 12, 3719-3736.
Examples
data(sparrows)
LeveneT2.sparrows <- LeveneT2(sparrows, group = Survivorship, level1 = "S",
var.equal = TRUE)
# Brief output
LeveneT2.sparrows
Multiple two-sample Levene tests for the comparison of variation in multivariate data
Description
Performs multiple two-sample Levene tests, based on two-sample t-tests
applied to absolute deviations around medians for more than one response
vector, with corrected significance levels using any of the adjustment
methods for multiple comparisons offered by p.adjust.
This function includes the argument alternative =, useful to specify the
type of alternative, either one-sided (lower- or upper-tail) or two-sided.
Effect sizes are also computed with respect to the two-sample t-tests.
Usage
Levenetests2s.mv(
x,
group,
level1,
alternative = "two.sided",
var.equal = FALSE,
P.adjust = "none",
unit = "units"
)
Arguments
x |
a data frame with one two-level factor and p response variables. |
group |
two-level factor defining groups. It must be one of the columns
in |
level1 |
a character string identifying Sample 1. The string must be one
of the factor levels in |
alternative |
a character string specifying the alternative hypothesis,
must be one of |
var.equal |
a logical variable indicating whether to treat the two
variances as being equal. If |
P.adjust |
p-value correction method, a character string. Can be abbreviated. See 'Details'. |
unit |
Physical units of the response variable useful to fully characterize raw effect sizes |
Details
This function focuses on the univariate Levene test for the comparison of
variation in two samples, when more than one variable is involved in the
data analysis, so that type I error rates ("false significances") in the
series of Levene tests are adjusted according to the number of response
variables analyzed. The pairwise comparisons between the two levels in
group, with corrections for multiple testing, are made over more than one
response vector.
The methods implemented in P.adjust are the same as those contained in
p.adjust.methods: "bonferroni", "holm", "hochberg", "hommel", "BH"
(Benjamini-Hochberg) or its alias "fdr" (False Discovery Rate), and "BY"
(Benjamini & Yekutieli). The default pass-through option ("none") is also
included.
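The combination of per-variable Levene tests and multiplicity adjustment can be sketched with base R alone. This is a hedged illustration, not the internals of Levenetests2s.mv: it assumes two species from the built-in iris data as the two samples, runs an equal-variance t-test on the absolute deviations from group medians for each variable, and then passes the raw p-values to p.adjust.

```r
X <- iris[iris$Species != "setosa", 1:4]
g <- droplevels(iris$Species[iris$Species != "setosa"])

# Two-sample t-test on absolute deviations from group medians, per variable
levene_p <- sapply(X, function(x) {
  adm <- abs(x - ave(x, g, FUN = median))
  t.test(adm ~ g, var.equal = TRUE)$p.value
})

# Raw and adjusted p-values side by side
cbind(raw        = levene_p,
      bonferroni = p.adjust(levene_p, method = "bonferroni"),
      holm       = p.adjust(levene_p, method = "holm"))
```

Any entry of p.adjust.methods could be used in place of "bonferroni" or "holm"; with method = "none" the raw p-values are returned unchanged.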
Value
Returns an object of class "Levenetests2s.mv", a list containing the
following components:
name | A character string describing the function. |
medians | A list containing two vectors of length p,
p being the number of response variables. medians1 and
medians2 store the medians for samples 1 (corresponding to
level1) and 2, respectively. |
absdev.median | A list containing two data frames,
abs.dev.median1 and abs.dev.median2, corresponding to the
absolute deviations around sample medians 1 and 2, respectively. |
means.absdev | A list containing two vectors of length p
(means.absdev1 and means.absdev2), corresponding to the
mean absolute deviations around medians for variables 1,...,p, in
samples 1 and 2, respectively. |
vars.absdev | A list containing two vectors of length p
(vars.absdev1 and vars.absdev2), corresponding to the
variances of absolute deviations around medians for variables 1,...,p,
in samples 1 and 2, respectively. |
t.list | A list containing p vectors of length 5, each
vector containing, in this order, the t-statistic, the degrees of freedom,
the adjusted p-value for the test, the raw effect size estimator
\bar{x}_1 - \bar{x}_2, and the post hoc effect size estimator
recommended by Hedges (1981), analogous to Cohen's d, given by
|\bar{x}_1 - \bar{x}_2| / \hat{\sigma}. Here
\hat{\sigma} = \sqrt{MSE}, where MSE is the mean squared error,
the estimator of the variance for the difference of means
\bar{x}_1 - \bar{x}_2. |
alternative | A character string specifying the alternative hypothesis chosen. |
var.equal | A logical variable indicating whether the two
variances were treated as being equal (TRUE) or not (FALSE). |
P.adjust | A character string indicating the correction method chosen. |
group | A character string specifying the name of the two-level factor defining groups. |
levels.group | a vector of length two showing the two levels in
factor group. |
data.name | a character string giving the name of the data. |
data | the data frame analyzed. |
The extractor function print.Levenetests2s.mv returns an annotated output
of the Levene tests (or, equivalently, the two-sample t-tests applied to
the absolute deviations around medians).
Author(s)
Jorge Navarro Alberto, ganava4@gmail.com
References
Manly, B.F.J., Navarro Alberto, J.A. and Gerow, K. (2024) Multivariate Statistical Methods. A Primer. 5th Edn. Chapman and Hall/CRC.
Hedges, L. V. 1981. Distribution theory for Glass’s estimator of effect size and related estimators. Journal of Educational Statistics 6(2): 107–128.
Examples
data(sparrows)
res.Levene2s.mv <- Levenetests2s.mv(sparrows, Survivorship, "S",
alternative = "less", var.equal = TRUE,
P.adjust = "bonferroni", unit = "mm")
res.Levene2s.mv
Tests in One-way MANOVA with extra information
Description
An R function to test the difference of mean vectors among the levels of a
single factor with respect to p response variables. Sum of squares and
cross-products matrices involved in the MANOVA can be optionally displayed.
Test statistics produced are the same as those implemented in
summary.manova.
Usage
OnewayMANOVA(x, group)
Arguments
x |
A data frame with one factor and p response variables. |
group |
Factor defining groups. It must be one of the columns
in |
Details
This function is a simplified version of manova, focusing on multivariate
analysis of variance for one single factor with respect to p responses. The
print method in OnewayMANOVA is similar to that in summary.manova,
producing the same approximate F tests in the one-way MANOVA. A simplified
printout of the sums of squares and cross-products matrices involved in the
analysis can optionally be chosen.
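Since the tests mirror summary.manova, the underlying quantities can be reproduced with the stats package alone. The sketch below is an illustration rather than the body of OnewayMANOVA: it assumes the built-in iris data, with Species as the single factor, and extracts the between-sample and within-sample matrices from the SS component returned by summary.manova.

```r
Y   <- as.matrix(iris[, 1:4])
fit <- manova(Y ~ Species, data = iris)

# Approximate F test; test may be "Pillai", "Wilks", "Hotelling-Lawley" or "Roy"
summary(fit, test = "Wilks")

# Sums of squares and cross-products matrices
SS   <- summary(fit)$SS
B    <- SS$Species       # between-sample matrix
W    <- SS$Residuals     # within-sample (residual) matrix
Tmat <- B + W            # total matrix, T = B + W
Tmat
```

The total matrix obtained as B + W should match the cross-product matrix of the column-centered responses, crossprod(scale(Y, scale = FALSE)).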
Value
Returns an object of class "OnewayMANOVA", a list containing the following
components:
name | A character string describing the function. |
T | The total sum of squares and cross-product matrix, defined
as \mathbf{T} = \mathbf{B} + \mathbf{W}, with \mathbf{B} and
\mathbf{W} described below. |
W | The within-sample or residual sum of squares and cross-product matrix. |
B | The between-sample sum of squares and cross-product matrix. |
x.mnv | An object of class "manova" (and some other classes)
produced by function manova, to be passed as argument in
summary.OnewayMANOVA in order to produce the approximate F-tests. |
group | A character string specifying the name of the factor defining groups. |
levels.group | A vector showing the levels in factor group. |
data.name | A character string giving the name of the data. |
variables | A character string vector containing the variable names. |
data | The data frame analyzed. |
Author(s)
Jorge Navarro Alberto, ganava4@gmail.com
References
Manly, B.F.J., Navarro Alberto, J.A. and Gerow, K. (2024) Multivariate Statistical Methods. A Primer. 5th Edn. Chapman and Hall/CRC.
Examples
data(skulls)
res.MANOVA <- OnewayMANOVA(skulls, group = Period)
# Brief output
res.MANOVA
Penrose's distance calculator
Description
Computes Penrose's distance between m multivariate populations or samples, when information is available on the means and variances.
Usage
Penrose.dist(x, group)
Arguments
x |
A data frame with |
group |
The classification factor defining m samples or groups.
It must be one of the variables in |
Details
Let the mean of X_k in population i be \mu_{ki}, k=1,...,p; i=1,...,m,
and assume that the variance of variable X_k is V_k. The Penrose (1953)
distance P_{ij} between population i and population j is given by
P_{ij} = \sum_{k = 1}^{p} \frac{(\mu_{ki} - \mu_{kj})^2}{pV_k}
Penrose's distances between multivariate samples are computed using this
expression, with \mu_{ki}, \mu_{kj} and V_k replaced by their
corresponding sample estimates.
A disadvantage of Penrose's measure is that it does not consider the correlations between the p variables.
The function requires package biotools (da Silva, 2017, 2021).
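The distance formula can be illustrated with base R. The sketch below is not the code of Penrose.dist and does not use biotools: it assumes the built-in iris data with Species as the grouping factor, and it assumes that V_k is estimated by the k-th diagonal element of the pooled covariance matrix.

```r
X   <- iris[, 1:4]
g   <- iris$Species
p   <- ncol(X)
m   <- nlevels(g)
n_j <- as.numeric(table(g))

means  <- sapply(split(X, g), colMeans)   # p x m matrix of sample means
covs   <- lapply(split(X, g), cov)        # m sample covariance matrices
pooled <- Reduce(`+`, Map(function(S, nj) (nj - 1) * S, covs, n_j)) /
  (sum(n_j) - m)
V <- diag(pooled)                         # V_k (assumption: pooled variances)

# P_ij = sum_k (mean_ki - mean_kj)^2 / (p * V_k)
P <- matrix(0, m, m, dimnames = list(levels(g), levels(g)))
for (i in seq_len(m)) {
  for (j in seq_len(m)) {
    P[i, j] <- sum((means[, i] - means[, j])^2 / (p * V))
  }
}
as.dist(P)                                # lower triangular display
```

Because only the diagonal of the pooled covariance matrix enters the calculation, the correlations between variables play no role, which is the disadvantage noted above.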
Value
Returns an object of class "Penrose.dist", a list containing the following
components:
name | A character string describing the function. |
means.vec | A numeric matrix with p rows and m columns giving the mean of each variable per group. |
covs.list | A list containing the m sample covariance matrices. |
Samp.sizes | A table showing the number of observations used in the calculation of the covariance matrix for each group. |
PooledCov | The pooled covariance matrix. This matrix can be accessed and used as an input argument for the calculation of Mahalanobis distance in packages biotools (da Silva, 2017, 2021) and ecodist (Goslee and Urban 2007). |
Penrose.mat | The Penrose distances given as a "matrix" object. |
Penros.dist | The Penrose distances given as a "dist" object. |
group | A character string specifying the name of the classification factor defining groups. |
levels.group | a vector of length m, showing the levels in factor group. |
data.name | a character string giving the name of the data. |
variables | a character string vector containing the variable names. |
data | the data frame analyzed. |
Author(s)
Jorge Navarro Alberto, ganava4@gmail.com
References
da Silva, A.R. (2021). biotools: Tools for Biometry and Applied Statistics in Agricultural Science. R package version 4.2. https://cran.r-project.org/package=biotools.
da Silva, A.R., Malafaia, G., and Menezes, I.P.P. (2017). biotools: an R function to predict spatial gene diversity via an individual-based approach. Genetics and Molecular Research 16. https://doi.org/10.4238/gmr16029655.
Goslee, S.C. and Urban, D.L. (2007). The ecodist package for dissimilarity-based analysis of ecological data. Journal of Statistical Software 22(7):1-19. DOI:10.18637/jss.v022.i07
Manly, B.F.J., Navarro Alberto, J.A. and Gerow, K. (2024) Multivariate Statistical Methods. A Primer. 5th Edn. Chapman and Hall/CRC.
Penrose, L.W. (1953). Distance, size and shape. Annals of Eugenics 18: 337-43.
Examples
data(skulls)
res.Penrose <- Penrose.dist(x = skulls, group = Period)
# Brief output
res.Penrose
van Valen's test
Description
Computes van Valen's test for the comparison of the variation in two multivariate samples. The comparison is made in terms of distances between all standardized variables from their corresponding standardized medians, thus producing two sets of pooled distances, one per sample, whose means are then compared by a two-sample t-test.
Usage
VanValen(x, group, level1, alternative = "two.sided", var.equal = FALSE)
Arguments
x |
a data frame with one two-level factor and p response variables. |
group |
two-level factor defining groups. It must be one of the columns
in |
level1 |
a character string identifying Sample 1. The string must be one
of the factor levels in |
alternative |
a character string specifying the alternative hypothesis
in the t-test for the comparison of mean pooled distances. Must be one of
|
var.equal |
a logical variable indicating whether to treat the two
variances of pooled distances as being equal. If |
Details
To ensure that all variables are given equal weight, each variable is first standardized in van Valen's test, so that the mean is zero and variance is one for all samples combined before the calculation of the pooled distances. These are given by
d_{ij} = \sqrt{\sum_{k = 1}^{p}{(x_{ijk}-M_{jk})^2}}
where x_{ijk} is the value of the standardized variable X_{k} for the i-th
individual in sample j, and M_{jk} is the median of the same standardized
variable in the j-th sample.
The sample means of the d_{ij} values are compared with a t-test. If one
sample is more variable than another, then the mean d_{ij} values will tend
to be higher in that sample. The expression for d_{ij} in van Valen's test
is based on an implicit assumption that if the two samples being tested
differ, then one sample will be more variable than the other for all
variables. A significant result cannot be expected in a case where, for
example, X_1 and X_2 are more variable in sample 1, but X_3 and X_4 are
more variable in sample 2. The effect of the differing variances would then
tend to cancel out in the calculation of d_{ij}. Thus, van Valen's test is
not appropriate for situations where changes in the level of variation are
not expected to be consistent for all variables.
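The steps of the test (standardize, compute distances from sample medians, compare mean distances) can be sketched in base R. This is an illustration under assumptions, not the body of VanValen: it takes two species from the built-in iris data as the two samples and standardizes with scale, which uses the n - 1 divisor for the variance of the combined samples.

```r
X <- iris[iris$Species != "setosa", 1:4]
g <- droplevels(iris$Species[iris$Species != "setosa"])

# Standardize each variable over both samples combined (mean 0, variance 1)
Z <- scale(X)

# d_ij: distance of each standardized observation from its own sample medians
d <- numeric(nrow(Z))
for (lev in levels(g)) {
  Zj  <- Z[g == lev, , drop = FALSE]
  M_j <- apply(Zj, 2, median)                     # M_jk for sample j
  d[g == lev] <- sqrt(rowSums(sweep(Zj, 2, M_j)^2))
}

tapply(d, g, mean)                 # mean pooled distances per sample
t.test(d ~ g, var.equal = TRUE)    # two-sample t-test on the d_ij values
```

The alternative and var.equal arguments of t.test play the same role here as the arguments of the same names in the Usage section.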
Value
Returns an object of class "VanValen", a list containing the following
components:
name | A character string describing the function. |
std.data | A list with two data frames, matlevel1 and
matlevel2, containing the values of the standardized variables for
samples 1 and 2, respectively. |
medians.std | A list containing two vectors. The first vector,
medians.std1, contains the medians for all standardized variables in
sample 1 as declared in parameter level1, and the second vector,
medians.std2, holds the corresponding medians for the other sample. |
dev.median | A list with two data frames, dev.median1 and
dev.median2, containing the deviations from sample medians for
samples 1 and 2, respectively. |
d.list | A list with two data frames, d.level1 and
d.level2, containing the pooled distances of standardized variables
from their corresponding medians for samples 1 and 2, respectively. |
means.d | A named numeric vector carrying the mean pooled distances for samples 1 and 2, respectively. |
vars.d | A named numeric vector carrying the variance of pooled distances for samples 1 and 2, respectively. |
t.vec | A named numeric vector containing the t-statistic, the degrees of freedom and the p-value for the test, respectively. |
alternative | a character string specifying the alternative hypothesis chosen. |
var.equal | A logical variable indicating whether the two
variances were treated as being equal (TRUE) or not (FALSE). |
group | A character string specifying the name of the two-level factor defining groups. |
levels.group | A vector of length two, showing the two levels in
factor group. |
data.name | A character string giving the name of the data. |
variables | A character string vector containing the variable names. |
data | The data frame analyzed. |
Author(s)
Jorge Navarro Alberto, ganava4@gmail.com
References
Manly, B.F.J., Navarro Alberto, J.A. and Gerow, K. (2024) Multivariate Statistical Methods. A Primer. 5th Edn. CRC Press.
van Valen, L. (1978) The statistics of variation. Evolutionary Theory 4: 33-43. (Erratum Evolutionary Theory 4: 202.)
Examples
data(sparrows)
res.VanValen <- VanValen(sparrows, "Survivorship", "S",
alternative = "less", var.equal = TRUE)
# Brief output
res.VanValen
Prints Box's M test based on an F-statistic
Description
Prints the results produced by the BoxM.F function, with the option to display the matrices involved in the calculations.
Usage
## S3 method for class 'BoxM.F'
print(x, long = FALSE, ...)
Arguments
x |
an object of class BoxM.F |
long |
a logical variable indicating whether a long output is desired
( |
... |
further arguments passed to or from other methods. |
Value
Displays the results of Box's M test for homogeneity of covariance
matrices, based on the F-approximation computed by the BoxM.F function.
The method returns its argument x invisibly, as for all print methods; x is
a list of class "BoxM.F". This print method provides two sorts of output
depending on whether the long argument is TRUE or FALSE (the default).
The "short" output displays:
A heading describing the analysis.
The data frame analyzed.
The variables used for the test.
The factor defining the populations or samples and their levels.
The value of the Box's M statistic, the corresponding approximate F-statistic, the degrees of freedom for the numerator and the denominator of the F-statistic, and the p-value.
In addition to the above information, the "long" output lists:
The covariance matrix for each sample.
The pooled covariance matrix.
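The short/long pattern shared by the print methods in this package can be sketched with a toy S3 class. Everything in the example is hypothetical: the class "demoTest" and its components stat and mat are invented for illustration and are not part of smsets; the sketch only shows a print method that takes a long argument and returns its argument invisibly.

```r
# Hypothetical class "demoTest"; not part of smsets
print.demoTest <- function(x, long = FALSE, ...) {
  cat("Statistic:", format(x$stat), "\n")      # always shown (short output)
  if (long) {
    cat("Matrix used in the calculation:\n")   # shown only in the long output
    print(x$mat, ...)
  }
  invisible(x)                                 # print methods return x invisibly
}

obj <- structure(list(stat = 1.5, mat = diag(2)), class = "demoTest")
obj                        # short output (auto-printing dispatches to the method)
print(obj, long = TRUE)    # long output
```

Returning x invisibly lets the result of print be assigned or piped onward without being printed a second time.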
Examples
data(skulls)
resBoxM.F <- BoxM.F(skulls, Period)
# Long output
print(resBoxM.F, long = TRUE)
Prints Hotelling's T^2 test
Description
Prints the results produced by the Hotelling.mat function.
Usage
## S3 method for class 'Hotelling.mat'
print(x, long = FALSE, ...)
Arguments
x |
an object of class |
long |
a logical variable indicating whether a long output is desired
( |
... |
further arguments passed to or from other methods. |
Value
Displays the results of the comparison of mean values of two multivariate
samples, under the assumption that covariance matrices are equal, using
Hotelling's T² test. The method returns its argument x invisibly, as for
all print methods; x is a list of class "Hotelling.mat". This print method
provides two sorts of output depending on whether the long argument is TRUE
or FALSE (the default). The "short" output displays:
A description of the analysis.
The data frame analyzed.
The labels of the two-level group factor (samples), with an order determined by the user in the Hotelling.mat argument level1.
The value of Hotelling's T²-statistic.
The value of the F-statistic with its corresponding degrees of freedom for numerator and denominator.
The P-value.
In addition to this summary, the "long" output shows:
The mean vectors and covariance matrices for each sample.
The pooled covariance matrix.
The inverse of the pooled covariance matrix.
Examples
data(sparrows)
results.T2 <- Hotelling.mat(sparrows, group = Survivorship, level1 = "S")
# Long output
print(results.T2, long = TRUE)
Prints Levene's test based on Hotelling's T^2 test
Description
Prints the results produced by LeveneT2, consisting of a Levene's test for
two multivariate samples based on Hotelling's T^2 test.
Usage
## S3 method for class 'LeveneT2'
print(x, long = FALSE, ...)
Arguments
x |
an object of class |
long |
a logical variable indicating whether a long output is desired
( |
... |
further arguments passed to or from other methods. |
Value
Displays the results of the comparison of multivariate variation in two
samples in which data values are transformed into absolute deviations from
their respective sample medians, and mean vectors of absolute deviations are
compared using Hotelling's T^2 test. The method returns its argument x
invisibly, as for all print methods; x is a list of class "LeveneT2". This
print method provides two sorts of output depending on whether the long
argument is TRUE or FALSE (the default). The "short" output displays:
A description of the analysis.
The data frame analyzed.
The names of responses in the data frame.
The labels of the two-level group factor (samples), with an order determined by the argument level1 in LeveneT2.
The value of Hotelling's T²-statistic.
The value of the F-statistic with its corresponding degrees of freedom for numerator and denominator. When the within-sample covariance matrices of absolute deviations around medians are not assumed equal (var.equal = FALSE), these degrees of freedom are approximated using Nel and van der Merwe's (1986) solution to the multivariate Behrens-Fisher problem, as implemented in the Hotelling package (Curran and Hersh, 2021).
The P-value.
In addition to the above information, the "long" output lists:
Sub-data frames containing the original responses and medians, separately for each sample.
The absolute deviations from sample medians for samples 1 and 2.
Vectors of mean absolute deviations around medians for samples 1 and 2, used in Hotelling's T² test.
References
Curran, J. and Hersh, T. (2021). Hotelling: Hotelling's T^2 Test and Variants. R package version 1.0-8, https://CRAN.R-project.org/package=Hotelling.
Nel, D.G. and van der Merwe, C.A. (1986). A solution to the multivariate Behrens-Fisher problem. Comm. Statist. Theor. Meth., A15, 12, 3719-3736.
Examples
data(sparrows)
LeveneT2.sparrows <- LeveneT2(sparrows, group = Survivorship, level1 = "S",
var.equal = TRUE)
# Long output
print(LeveneT2.sparrows, long = TRUE)
Prints multiple two-sample Levene tests for the comparison of variation in multivariate data
Description
Prints the results produced by Levenetests2s.mv, consisting of two-sample
Levene's tests computed from two-sample t-tests applied to absolute
deviations around medians for more than one response vector.
Usage
## S3 method for class 'Levenetests2s.mv'
print(x, ...)
Arguments
x |
an object of class "Levenetests2s.mv" |
... |
further arguments passed to or from other methods. |
Value
An annotated output of two-sample Levene's tests computed from two-sample
t-tests applied to absolute deviations around medians for more than one
response vector, with (optionally) corrected significance levels. The
method returns its argument x invisibly, as for all print methods; x is a
list of class "Levenetests2s.mv". This print method provides a
user-friendly display of particular elements in x:
A description of the analysis.
The data frame analyzed.
The labels of the two-level group factor (samples), with an order determined by the user in the Levenetests2s.mv argument level1.
The t-based Levene's test results for each response variable; these include:
The variable name.
Sample medians classified by group levels.
Means and variances of sample absolute deviations from the median classified by group levels.
The value of the t-statistic, the degrees of freedom and the p-value.
Effect sizes: raw and Hedges' (1981). The units of raw effect sizes are shown according to the argument unit = in Levenetests2s.mv.
The type of alternative hypothesis for all tests.
The method of significance level adjustment for multiple comparisons used.
References
Hedges, L. V. 1981. Distribution theory for Glass’s estimator of effect size and related estimators. Journal of Educational Statistics 6(2): 107–128.
Examples
data(sparrows)
res.Levene2s.mv <- Levenetests2s.mv(sparrows, Survivorship, "S",
alternative = "less", var.equal = TRUE,
P.adjust = "bonferroni", unit = "mm")
print(res.Levene2s.mv)
Prints a one-way MANOVA with extra information
Description
Prints the results produced by the OnewayMANOVA function.
Usage
## S3 method for class 'OnewayMANOVA'
print(
x,
test = c("Pillai", "Wilks", "Hotelling-Lawley", "Roy"),
long = FALSE,
...
)
Arguments
x |
An object of class |
test |
The name of the test statistic to be used (the four tests
implemented in |
long |
A logical variable indicating whether a long output is desired
( |
... |
further arguments passed to or from other methods. |
Value
Displays the results of a one-way MANOVA, i.e., the test of the difference of mean vectors among the levels of a single factor with respect to p response variables. The argument x, a list of class "OnewayMANOVA", is returned invisibly, as for all print methods. This print method provides two sorts of output depending on whether the long argument is TRUE or FALSE (the default). The "short" output displays:
A heading describing the function.
The data frame analyzed.
The variables involved in the calculation of distances.
The factor defining the populations or samples and their levels.
The One-way MANOVA table specifying the test chosen for the F-test approximation, like in summary.manova.
In addition to the above information, the "long" output lists:
The Between-Sample Sum of Squares and Crossed Products matrix, B.
The Within-Sample Total Sum of Squares and Crossed Products matrix, W.
The Total Sample Sum of Squares and Crossed Products matrix, T.
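The three matrices in the "long" output are linked by the identity T = B + W. A minimal base-R sketch of how they can be computed is given below; it is an illustration under stated assumptions (the built-in iris data with Species as the single factor, and the hypothetical helper centre), not the internal code of OnewayMANOVA.

```r
# Sketch: sums of squares and cross-products (SSCP) matrices for a one-way layout.
# Assumptions: 'iris' stands in for the data; 'centre' is a hypothetical helper.
Y <- as.matrix(iris[, 1:4])
g <- iris$Species

centre <- function(m) sweep(m, 2, colMeans(m))   # subtract column means
parts  <- split(as.data.frame(Y), g)             # one sub-data frame per level
grand  <- colMeans(Y)                            # overall mean vector

# Total SSCP: deviations from the overall mean vector
T_mat <- crossprod(centre(Y))

# Within-sample SSCP: deviations from each group's own mean vector, summed
W_mat <- Reduce(`+`, lapply(parts, function(d) crossprod(centre(as.matrix(d)))))

# Between-sample SSCP: group-size-weighted outer products of mean differences
B_mat <- Reduce(`+`, lapply(parts, function(d) {
  dev <- colMeans(d) - grand
  nrow(d) * outer(dev, dev)
}))

# The same test statistics are available from stats::manova
fit <- manova(Y ~ g)
summary(fit, test = "Wilks")
```

Computing B directly, rather than as T - W, makes the identity T = B + W a useful numerical check on the three matrices.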
Examples
data(skulls)
res.MANOVA <- OnewayMANOVA(skulls, group = Period)
# Long output, Wilks' test
print(res.MANOVA, test = "Wilks", long = TRUE)
Prints Penrose's distance matrix
Description
Prints the results produced by Penrose.dist, the Penrose distance calculator.
Usage
## S3 method for class 'Penrose.dist'
print(x, long = FALSE, ...)
Arguments
x |
an object of class Penrose.dist |
long |
a logical variable indicating whether a long output is desired
( |
... |
further arguments passed to or from other methods. |
Value
Displays Penrose's distances between m multivariate populations or samples.
The argument x, a list of class "Penrose.dist", is returned invisibly, as for all print methods. This print method provides two sorts of output depending on whether the long argument is TRUE or FALSE (the default).
The "short" output displays:
A heading describing the function.
The data frame analyzed.
The variables involved in the calculation of distances.
The factor defining the populations or samples and their levels.
The Penrose distance matrix (lower triangular form).
In addition to the above information, the "long" output lists:
The population or sample sizes.
The mean vector for each population / sample.
The covariance matrix for each population / sample.
The pooled covariance matrix.
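The elements of the "long" output (sample sizes, mean vectors, covariance matrices and the pooled covariance matrix) are the ingredients of the distance itself. The sketch below assumes the definition of Penrose's distance given by Manly et al. (2024), in which the squared mean differences are divided by p times the pooled variance of each variable and summed over the p variables. It uses the built-in iris data as a stand-in and is not the package's own implementation.

```r
# Sketch: Penrose distances between m groups from means and pooled variances.
# Assumptions: 'iris' stands in for the data; the definition is
#   P_ij = sum_k (mean_ki - mean_kj)^2 / (p * V_k),
# with V_k the k-th diagonal element of the pooled covariance matrix.
Y     <- as.matrix(iris[, 1:4])
g     <- iris$Species
parts <- split(as.data.frame(Y), g)

n_j   <- vapply(parts, nrow, integer(1))                # sample sizes
means <- t(vapply(parts, colMeans, numeric(ncol(Y))))   # m x p matrix of mean vectors
covs  <- lapply(parts, cov)                             # covariance matrix per group

# Pooled covariance matrix: weighted by degrees of freedom
pooled <- Reduce(`+`, Map(function(S, n) (n - 1) * S, covs, n_j)) /
  (sum(n_j) - length(parts))

p <- ncol(Y)
V <- diag(pooled)

# Scaling each mean by sqrt(p * V_k) turns P_ij into a squared Euclidean distance
scaled  <- sweep(means, 2, sqrt(p * V), "/")
penrose <- dist(scaled)^2     # 'dist' object, printed in lower triangular form
penrose
```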
Examples
data(skulls)
res.Penrose <- Penrose.dist(x = skulls, group = Period)
# Long output
print(res.Penrose, long = TRUE)
Prints van Valen's test
Description
Displays the results of van Valen's test produced by the VanValen function and, optionally, the matrices involved in the calculations.
Usage
## S3 method for class 'VanValen'
print(x, long = FALSE, ...)
Arguments
x |
an object of class |
long |
a logical variable indicating whether a long output is desired
( |
... |
further arguments passed to or from other methods. |
Value
Displays the results of van Valen's test produced by the VanValen function. The argument x, a list of class "VanValen", is returned invisibly, as for all print methods. This print method provides two sorts of output depending on whether the long argument is TRUE or FALSE (the default). The "short" output displays:
A two-line heading describing the analysis.
The data frame analyzed.
The variables used for the comparison of samples.
The labels of the two-level group factor (samples), with an order determined by the user in the argument level1 of VanValen.
The value of the t-statistic, the degrees of freedom and the p-value.
The type of alternative hypothesis for the t-test.
In addition to the above information, the "long" output lists:
Sub-data frames containing the standardized data, separately for each sample.
The sample medians for the standardized data, samples 1 and 2.
Sub-data frames containing the deviations from sample medians for the standardized values, separately for each sample.
Sub-data frames containing the pooled distances (d's), separately for each sample. These two samples of d-values are compared by a t-test.
The means and variances for each sample of d-values.
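The steps listed in the "long" output (standardized data, sample medians, deviations from the medians, d-values, t-test) can be traced with base R. The sketch below is an illustration only. It assumes that the variables are standardized to unit variance over the combined data and that each d-value is the Euclidean norm of an observation's deviations from its own sample's medians. It uses two species of the built-in iris data as stand-in samples and is not the code of VanValen.

```r
# Sketch of the steps behind van Valen's test (base R only).
# Assumptions: two iris species stand in for the two samples; variables are
# standardized over the combined data before medians are taken.
dat <- iris[iris$Species != "setosa", ]
g   <- droplevels(dat$Species)
Y   <- scale(as.matrix(dat[, 1:4]))           # standardized data

d <- numeric(nrow(Y))
for (lev in levels(g)) {
  idx    <- g == lev
  Yj     <- Y[idx, , drop = FALSE]            # standardized sub-data for one sample
  med    <- apply(Yj, 2, median)              # sample medians
  dev    <- sweep(Yj, 2, med)                 # deviations from the sample medians
  d[idx] <- sqrt(rowSums(dev^2))              # one d-value per observation
}

tapply(d, g, mean)                            # means of the d-values
tapply(d, g, var)                             # variances of the d-values

# The two samples of d-values are compared by a t-test
vv <- t.test(d ~ g, alternative = "less", var.equal = TRUE)
vv
```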
Examples
data(sparrows)
res.VanValen <- VanValen(sparrows, "Survivorship", "S",
                         alternative = "less", var.equal = TRUE)
# Long output
print(res.VanValen, long = TRUE)
Prints multiple two-sample t-tests for a multivariate data set
Description
Prints the results produced by ttests2s.mv, consisting of two-sample t-tests on more than one response vector with corrected significance levels for multiple comparisons, as offered by p.adjust. Effect sizes are also displayed.
Usage
## S3 method for class 'ttests2s.mv'
print(x, ...)
Arguments
x |
an object of class |
... |
further arguments passed to or from other methods. |
Value
An annotated output of multiple two-sample t-tests on more than one response vector with (optionally) corrected significance levels. The argument x, a list of class "ttests2s.mv", is returned invisibly, as for all print methods. This print method provides a user-friendly display of particular elements in x:
A description of the analysis.
The data frame analyzed.
The labels of the two-level group factor (samples), with an order determined by the user in the ttests2s.mv argument level1.
The t-test results for each response variable; these include:
The variable name.
Sample means and variances classified by group levels.
The value of the t-statistic, the degrees of freedom and the p-value.
Effect sizes: raw and Hedges' (1981). The units of raw effect sizes are shown according to the argument unit = in ttests2s.mv.
The type of alternative hypothesis for all tests.
The method of significance level adjustment for multiple comparisons used.
References
Hedges, L. V. 1981. Distribution theory for Glass’s estimator of effect size and related estimators. Journal of Educational Statistics 6(2): 107–128.
Examples
data(sparrows)
ttests.sparrows <- ttests2s.mv(sparrows, group = Survivorship, level1 = "S",
                               var.equal = TRUE, P.adjust = "holm",
                               unit = "mm")
print(ttests.sparrows)
Egyptian male skulls
Description
Measurements made on male skulls from the area of Thebes in Egypt. There are samples of 30 skulls from each of five periods: the Early Predynastic period (circa 4000 BC), the Late Predynastic period (circa 3300 BC), the 12th and 13th Dynasties (circa 1850 BC), the Ptolemaic period (circa 200 BC), and the Roman period (circa AD 150). Four measurements (mm) are available on each skull.
Usage
data(skulls)
Format
A data frame with 150 rows and 5 variables:
Period
A factor with five levels
Maximum_breadth
a numeric vector
Basibregmatic_height
a numeric vector
Basialveolar_length
a numeric vector
Nasal_height
a numeric vector
References
Thomson, A. and Randall-Maciver, P. (1905). Ancient Races of the Thebaid, Oxford University Press, Oxford, London.
Manly, B.F.J., Navarro Alberto, J.A. and Gerow, K. (2024) Multivariate Statistical Methods. A Primer. 5th Edition. Boca Raton, CRC Press.
Examples
data(skulls)
str(skulls)
Body measurements of female sparrows
Description
Data extracted from the classic report by Hermon Bumpus (1898), who measured morphological variables in sparrows after a severe storm. This data subset consists of five body measurements of 49 female sparrows, classified according to their survival status (21 survived, 28 did not survive).
Usage
data(sparrows)
Format
A data frame with 49 rows and 6 variables:
Survivorship
A factor with two levels ("S" = Survived, "NS" = Did not survive)
Total_length
Total length (mm), a numeric vector
Alar_extent
Alar extent (mm), a numeric vector
L_beak_head
Length of beak and head (mm), a numeric vector
L_humerus
Length of humerus (mm), a numeric vector
L_keel_sternum
Length of keel of sternum (mm), a numeric vector
References
Bumpus, H.C. (1898). The elimination of the unfit as illustrated by the introduced sparrow, Passer domesticus. Biological Lectures, 11th Lecture. Marine Biology Laboratory, Woods Hole, MA, 209–26.
Manly, B.F.J., Navarro Alberto, J.A. and Gerow, K. (2024) Multivariate Statistical Methods. A Primer. 5th Edition. Boca Raton, CRC Press.
Examples
data(sparrows)
str(sparrows)
Multiple two-sample t-tests for multivariate data
Description
Performs multiple two-sample t-tests on more than one response vector with corrected significance levels using any of the adjustment methods for multiple comparisons offered by p.adjust. Effect sizes are also computed.
Usage
ttests2s.mv(
x,
group,
level1,
alternative = "two.sided",
var.equal = FALSE,
P.adjust = "none",
unit = "units"
)
Arguments
x |
A data frame with one two-level factor and p response variables. |
group |
Two-level factor defining groups. It must be one of the columns
in |
level1 |
A character string identifying Sample 1. The string must be one
of the factor levels in |
alternative |
a character string specifying the alternative hypothesis,
must be one of |
var.equal |
a logical variable indicating whether to treat the two
variances as being equal. If |
P.adjust |
p-value correction method, a character string. Can be abbreviated. |
unit |
A character string in cases in which all response variables are
measured using the same physical units. Useful to fully characterize raw
effect sizes. The default value is the character string |
Details
This function extends the univariate t.test for the comparison of mean values for two samples to the case in which more than one variable is involved in the data analysis, so that Type I error rates ("false significances") in a series of univariate t-tests are adjusted according to the number of response variables analyzed. The pairwise comparisons between the two levels in group, with corrections for multiple testing, are made over more than one response vector; thus, the function is a variation of pairwise.t.test.
The methods implemented are the same as those contained in the p.adjust.methods for p.adjust: "bonferroni", "holm", "hochberg", "hommel", "BH" (Benjamini-Hochberg) or its alias "fdr" (False Discovery Rate), and "BY" (Benjamini & Yekutieli). The default pass-through option ("none") is also included.
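To see how the adjustment methods differ, the unadjusted p-values from a series of univariate t-tests can be passed through every method in p.adjust.methods. The sketch below uses only base R, with the built-in mtcars data and the factor am as stand-ins; it illustrates the idea and is not the code of ttests2s.mv.

```r
# Sketch: one two-sample t-test per response variable, then all p.adjust methods.
# Assumptions: 'mtcars' stands in for the data and 'am' is the two-level factor.
dat   <- mtcars[, c("mpg", "hp", "wt")]
group <- factor(mtcars$am, levels = c(1, 0))   # first level plays the role of level1

tests <- lapply(dat, function(y) t.test(y ~ group, var.equal = TRUE))
p_raw <- vapply(tests, function(tt) tt$p.value, numeric(1))

# Rows: response variables; columns: adjustment methods (including "none")
adj <- sapply(p.adjust.methods, function(m) p.adjust(p_raw, method = m))
adj
```

The "none" column reproduces the raw p-values, and the "BH" and "fdr" columns coincide because "fdr" is an alias.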
Value
Returns an object of class "ttests2s.mv", a list containing the following components:
name | A character string describing the function |
t.list | A list containing p vectors of length 5, each vector holding the computed t-statistic, the degrees of freedom for the t-statistic, the adjusted p-value for the test, the raw effect size estimator \bar{x}_1 - \bar{x}_2, and the post hoc effect size estimator recommended by Hedges (1981), analogous to Cohen's d, given by |\bar{x}_1 - \bar{x}_2| / \hat{\sigma}. Here \hat{\sigma} = \sqrt{MSE}, where MSE is the mean squared error, the estimator of the variance for the difference of means \bar{x}_1 - \bar{x}_2. |
alternative | A character string specifying the alternative hypothesis chosen. |
var.equal | A logical variable indicating whether the two variances were treated as being equal (TRUE) or not (FALSE). |
P.adjust | A character string indicating the correction method chosen |
raw.ES | The raw effect size (scalar) expressed in the pre-specified units |
unit | A character string indicating the units chosen |
Hedges.d | The post hoc effect size Hedges' estimator (scalar) |
group | A character string specifying the name of the two-level factor defining groups. |
levels.group | A vector of length two showing the two levels in factor group. |
data.name | A character string giving the name of the data. |
data | the data frame analyzed. |
The extractor function print.ttests2s.mv returns an annotated output of each t-test and effect size estimation.
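The two effect sizes described for t.list can be illustrated for a single response variable. The sketch below assumes that MSE is the pooled within-group variance (the equal-variance case) and applies the formula as written above, without any small-sample correction. The built-in mtcars data stand in for the real data, and this is not the package's own code.

```r
# Sketch: raw and standardized effect sizes for one response variable.
# Assumptions: 'mtcars$mpg' is the response, 'am' the two-level factor, and
# MSE is taken to be the pooled within-group variance.
y <- mtcars$mpg
g <- factor(mtcars$am, levels = c(1, 0))   # first level plays the role of level1

m <- tapply(y, g, mean)                    # sample means
v <- tapply(y, g, var)                     # sample variances
n <- tapply(y, g, length)                  # sample sizes

raw_es <- unname(m[1] - m[2])              # raw effect size, in the units of y
mse    <- sum((n - 1) * v) / (sum(n) - 2)  # pooled within-group variance
std_es <- abs(raw_es) / sqrt(mse)          # |difference of means| / sigma-hat

c(raw = raw_es, standardized = std_es)
```

Under this assumption, sqrt(mse) equals the residual standard error of lm(y ~ g), which gives a quick cross-check.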
Author(s)
Jorge Navarro Alberto, ganava4@gmail.com
References
Hedges, L. V. 1981. Distribution theory for Glass’s estimator of effect size and related estimators. Journal of Educational Statistics 6(2): 107–128.
Examples
data(sparrows)
ttests.sparrows <- ttests2s.mv(sparrows, group = Survivorship, level1 = "S",
                               var.equal = TRUE, P.adjust = "bonferroni",
                               unit = "mm")
ttests.sparrows