Title: | Simultaneous Inference for Multiple Linear Contrasts in GEE Models |
Version: | 1.20 |
Imports: | stats, methods, Matrix, mvtnorm |
Suggests: | geepack, testthat |
Description: | Provides global hypothesis tests, multiple testing procedures and simultaneous confidence intervals for multiple linear contrasts of regression coefficients in a single generalized estimating equation (GEE) model or across multiple GEE models. GEE models are fit by a modified version of the 'geeM' package. |
Date: | 2019-06-18 |
Author: | Robin Ristl [aut, cre], Lee McDaniel [ctb] (Author of 'geeM' package), Nick Henderson [ctb] (Author of 'geeM' package), Melanie Prague [ctb] (Contributor to 'geeM' package) |
Maintainer: | Robin Ristl <robin.ristl@meduniwien.ac.at> |
Depends: | R (≥ 3.0.0) |
License: | GPL-3 |
Encoding: | UTF-8 |
LazyData: | true |
RoxygenNote: | 6.1.0 |
NeedsCompilation: | no |
Packaged: | 2019-06-21 13:36:22 UTC; Robin |
Repository: | CRAN |
Date/Publication: | 2019-06-21 15:30:03 UTC |
Simulated Data Set With Three Endpoints
Description
A data set was simulated with repeated observations of a continous outcome variable Y.lin, a count data outcome Y.poi and a binary outcome Y.bin. In the simulation, the mean of an outcome variable depends on the binary grouping variable gr.lang and one of the continuos predictors x1, x2 and x3. Observations of all outcomes in the same subject, indicated by the variable id, are correlated. Also x1, x2 and x3 are correlated within subjects. Data are independent between subjects. The example code shows how to fit a marginal GEE model for each outcome variable and how to test, with different methods, the null hypothesis that the grouping variable has no effect on any of the three endpoints.
Usage
data(datasim)
Format
A data frame.
Examples
data(datasim)
head(datasim)
mod1<-geem2(Y.lin~gr.lang+x1,id=id,data=datasim,family="gaussian",corstr="exchangeable")
mod2<-geem2(Y.poi~gr.lang+x2,id=id,data=datasim,family="poisson",corstr="exchangeable")
mod3<-geem2(Y.bin~gr.lang+x3,id=id,data=datasim,family="binomial",corstr="exchangeable")
L1<-L2<-L3<-matrix(c(0,1,0),nrow=1)
mmmgee.test(list(mod1,mod2,mod3),L=list(L1,L2,L3),statistic="Wald",type="maximum",
biascorr=TRUE,asymptotic=FALSE,closed.test=TRUE)
## Not run:
mmmgee.test(list(mod1,mod2,mod3),L=list(L1,L2,L3),statistic="score",closed.test=TRUE)
mmmgee.test(list(mod1,mod2,mod3),L=list(L1,L2,L3),statistic="score",type="quadratic",
closed.test=TRUE)
mmmgee.test(list(mod1,mod2,mod3),L=list(L1,L2,L3),statistic="Wald",type="quadratic",
biascorr=TRUE,asymptotic=FALSE,closed.test=TRUE)
mmmgee.test(list(mod1,mod2,mod3),L=list(L1,L2,L3),statistic="Wald",type="quadratic",
scaled=TRUE,biascorr=TRUE,asymptotic=FALSE,closed.test=TRUE)
## End(Not run)
Fit Generalized Estimating Equation Models
Description
geem2
is a modified version of geem
to fit generalized estimating equation models and to provide objects that can be used
for simultaneous inference across multiple marginal models using mmmgee
and mmmgee.test
.
Like geem
, geem2
estimates coefficients and nuisance parameters using generalized estimating equations. The link and variance functions can be
specified by the user and the syntax is similar to glm
.
Usage
geem2(formula, id, waves = NULL, data = parent.frame(),
family = gaussian, corstr = "independence", Mv = 1,
weights = NULL, corr.mat = NULL, init.beta = NULL,
init.alpha = NULL, init.phi = 1, scale.fix = FALSE,
nodummy = FALSE, sandwich = TRUE, useP = TRUE, maxit = 20,
tol = 1e-05, restriction = NULL, conv.criterion = c("ratio",
"difference"))
Arguments
formula |
a formula expression similar to that for |
id |
a vector identifying the clusters. By default, data are assumed to be sorted such that observations in a cluster are in consecutive rows and higher
numbered rows in a cluster are assumed to be later. If |
waves |
a non-negative integer vector identifying components of a cluster. For example, this could be a time ordering. If integers are skipped within a cluster,
then dummy rows with weight 0 are added in an attempt to preserve the correlation structure (except if |
data |
an optional data frame containing the variables in the model. |
family |
will determine the link and variance functions. The argument can be one of three options: a family object, a character string, or a list of functions.
For more information on how to use family objects, see |
corstr |
a character string specifying the correlation structure. Allowed structures are: |
Mv |
for |
weights |
A vector of weights for the inverse of the scale factor each observation.
If an observation is assigned weight 0, it is excluded from the calculations of any parameters.
Observations with a |
corr.mat |
the correlation matrix for |
init.beta |
an optional vector with the initial values of beta. If not specified, then the intercept will be set to |
init.alpha |
an optional scalar or vector giving the initial values for the correlation. If provided along with |
init.phi |
an optional initial scale parameter. If not supplied, initialized to 1. |
scale.fix |
if set to |
nodummy |
if set to |
sandwich |
if |
useP |
if set to |
maxit |
maximum number of iterations. |
tol |
tolerance in calculation of coefficients. |
restriction |
either a contrast matrix or a list of a contrast matrix and a right hand side vector, defining a restriction on the regression coefficients. See details. |
conv.criterion |
convergence criterion, either |
Details
The function is a modification of geem
from the geeM package, such that
additional output is returned that is required for the calculation of covariance
matrix across multiple marginal models. In particular the contributions of each subject to the estimating equation are made available in the output. Internal functions regarding
the calculation of matrix inverses were modified to improve the handling of missing data.
In geem2
, weights are defined as scale weights, similar to most othe software. Note that, in contrast, the current version of geem
(version 0.10.1) uses
residual weights.
The scale parameter phi
is used in estimating the residual working correlation parameters and in estimating the model based (naiv) covariance matrix of the regression coefficients.
Similar as in most other software, requesting scale.fix=TRUE
only has an impact on the latter, while the working correlation is still estimated using an empirical
scale factor for the residuals. In contrast, geem
uses the fixed scale factor also when estimating the working correlation.
geem2
allows for estimation of regression coefficients under linear restrictions L\beta=r
, where L
is a contrast matrix, \beta
the vector of regression coefficients and r
a real valued right hand side vector. Using the argument restriction
, L
and r
can be specified.
If only L
is specified, r
is assumed as null vector. The functionality is in particular required to calculate the generalized score test for linear hypotheses about
\beta
. Use conv.criterion="difference"
if any regression coefficient is restricted to 0.
Value
A list with class geem2
, similar to the output
of geem
from the geeM package. The additional slot sandwich.args
contains components to calculate the sandwich variance estimator for the
fitted model and across models if
applied in the multiple marginal model framework.
Note
The option to fit a model with linear restrictions concerning the coefficients is implemented to enable the calculation of a generalized score test.
It may also be used to obtain estimates of the coefficients under restrictions. The model based and robust variance estimates of the restricted
coefficient estimates are found in the slots restricted.naiv.var
and restricted.var
, respectively. Note that the variance of estimates restricted to a single value
is supposed to be zero, however the calculated variance estimate may deviate from zero within machine accuracy.
Author(s)
The geem
function was written by Lee McDaniel and Nick Henderson, modifications for geem2
are by Robin Ristl robin.ristl@meduniwien.ac.at
References
Lee S. McDaniel, Nicholas C. Henderson, Paul J. Rathouz. Fast pure R implementation of GEE: application of the matrix package. The R journal 5.1 (2013): 181.
See Also
Examples
data(keratosis)
m1<-geem2(clearance~trt,id=id,data=keratosis,family=binomial,corstr="independence")
summary(m1)
m2<-geem2(pain~trt,id=id,data=keratosis[keratosis$lesion==1,],family=gaussian,corstr="independence")
summary(m2)
geem2(pain~trt,id=id,data=keratosis[keratosis$lesion==1,],family=gaussian,corstr="exchangeable")
#
data(datasim)
mod1<-geem2(Y.lin~gr.lang+x1,id=id,data=datasim,family="gaussian",corstr="exchangeable")
summary(mod1)
mod2<-geem2(Y.poi~gr.lang+x2,id=id,data=datasim,family="poisson",corstr="unstructured")
summary(mod2)
mod3<-geem2(Y.bin~gr.lang+x3,id=id,data=datasim,family="binomial",corstr="user",
corr.mat=matrix(c(1,2,3,0, 2,1,2,3, 3,2,1,2, 0,3,2,1),4,4))
summary(mod3)
Simulated Data Set for a Study of Actinic Keratosis Treatments
Description
A data set simulated under the planning assumptions for a study comparing four radiation regimens for a photodynamic treatment of actinic keratosis. Each patient receives each treatment in a different skin patch and each patch contains four lesions. Variables are patient identifier (id), treatment (trt), lesion identifier within a patient (lesion), the binary outcome clearance success (1=success, 0=no success) reported for each lesion and the metric outcome pain (larger values indicating more pain) reported for each skin patch. The aim of the study is to compare the treatments B, C and D to the reference treatment A in terms of both outcomes.
Usage
data(keratosis)
Format
A data frame.
Examples
data(keratosis)
head(keratosis)
Covariance Matrix Estimation for Multiple Marginal GEE Models
Description
Calculate the covariance matrix for a stacked vector of regression coefficients
from multiple marginal GEE models fitted with geem2
.
Usage
mmmgee(x, biascorr = FALSE)
Arguments
x |
a list of |
biascorr |
logical, if |
Value
A list with class mmmgee
containing the following components:
beta
The stacked vector of regression coefficient estimates from the models in
x
.V
The estimated covariance matrix of the regression coefficient estimates.
A
The outer component of
V=ABA
.B
The inner component of
V=ABA
.biascorr
The value of the input argument
biascorr
(logical).n
A vector with the number of clusters in each model in
x
.p
A vector with number of regression coefficients in each model in
x
.
Author(s)
Robin Ristl, robin.ristl@meduniwien.ac.at
References
Lloyd A. Mancl, Timothy A. DeRouen. A covariance estimator for GEE with improved small sample properties. Biometrics, 2001, 57(1):126-134.
See Also
Examples
data(keratosis)
m1<-geem2(clearance~trt,id=id,data=keratosis,family=binomial,corstr="independence")
m2<-geem2(pain~trt,id=id,data=keratosis[keratosis$lesion==1,],family=gaussian,corstr="independence")
mmmgee(x=list(m1,m2),biascorr=TRUE)
Hypothesis Tests for Linear Contrasts in Multiple Marginal GEE Models
Description
Global hypothesis tests, multiple testing procedures and simultaneous confidence intervals for multiple linear contrasts of regression coefficients in a single generalized estimating equation (GEE) model or across multiple GEE models.
Usage
mmmgee.test(x, L, r = NULL, statistic = c("wald", "score"),
type = c("maximum", "quadratic"), asymptotic = TRUE,
biascorr = FALSE, closed.test = FALSE, conf.int = FALSE,
conf.level = 0.95, alternative = c("undirected", "greater", "less"),
denomDF = NULL, scaled.F = FALSE, maxit = 20, tol = 10^(-8), ...)
Arguments
x |
a |
L |
a contrast matrix defining a contrast for the stacked vector of regression coefficients of the marginal models, or a list of contrast matrices.
In the latter case, the list must contain one matrix for each model listed in |
r |
right hand side vector of the null hypothesis or a list of vectors resembling the right hand side of the null hypothesis. If not specified, |
statistic |
either |
type |
either |
asymptotic |
logical, if |
biascorr |
logical indicating whether the Mancl and DeRouen Bias correction should be used when estimating the joint covariance matrix via |
closed.test |
logical, if |
conf.int |
logical. If |
conf.level |
the nominal simultaneous coverage probability of the confidence intervals. |
alternative |
one of |
denomDF |
Defaults to |
scaled.F |
logical. If |
maxit |
maximal number of iterations to be passed to |
tol |
tolerance limit for the convergence criterion to be passed to |
... |
additional arguments that are passed to |
Details
The null hypothesis is H0:L\beta=r
where L
is a contrast matrix, \beta
the stacked vector of regression coefficients rom the marginal models and r
a real values right hand side vector.
L
can be specified as matrix or, if it is a block diagonal matrix with each block corresponding to
a contrast for one marginal GEE model, as list of the matrices on the diagonal. The right hand side r
can be speficied as vector or as list of
vectors each corresponding to the part of the right hand side vector for one model.
When choosing statistic="wald"
and type="maximum"
, the maximum of the standardized entries of L\hat{\beta}
is used as test statistic and the
p-value is calculated from a multivariate normal or t-distribution (depending on asymptotic
being TRUE
or FALSE
) with correlation matrix
estimated for L\hat{\beta}
. For the t-distribution, denominator degrees of freedom are used as specified in denomDF
.
When choosing statistic="wald"
and type="quadratic"
, a quadratic form of L\hat{\beta}
and the inverse of the estimated covariance matrix
of L\hat{\beta}
is used as test statistic and the p-value is calculated from a
chi-squared distribution or an F-distribution (depending on asymptotic
being TRUE
or FALSE
).
With statistic="score"
, generalized score tests are calculated by replacing L\hat{\beta}
by the first order approximation LAU
where U
is the stacked estimating equation (the score) and A
is the negative inverse of the matrix
of first derivatives of U
, both evaluated at the location of constrained estimates for \beta
under the null hypothesis.
Analogous to the Wald statistic, a maximum-type and a quadratic form score test are available. For the score test the option asymptotic
is ignored
and the reference distribution is multivariate normal or chi-squared.
Value
A list with class mmmgeetest
containing the following components, if required:
test
Contains a data frame with the test statistic, degrees of freedom (depending in the type of test) and the p-value. If closed test was required, a further data frame is reported with estimates, right hand side vector, unadjusted p-values and adjusted p-values for each line of
H0:L\beta-r=0
.hypothesis
A list containing the contrast matrix
L
and the right hand side vectorr
.conf.int
The simultaneous confidence intervals.
denomDF
The type and value of the denominator degrees of freedom used in the procedure.
mmmgee
The
mmmgee
object containing in particular the estimated covariance matrix for the coefficents of the models inx
. Seemmmgee
.
Note
A single value for the denominator degrees of freedom is calculated for the covariance matrix estimate across all contrasts. In the closed testing procedure, this value is used for the degrees of freedom associated with the covariance matrix of any subset of contrasts.
Usual linear models or generalized linear models can be regarded as special case of GEE models and can be included in the analysis framework.
Note however that mmmgee.test
always uses the robust sandwich covariance matrix estimate, even if the calculation of the sandwich covariance was suppressed
in the model objects in x
.
Author(s)
Robin Ristl, robin.ristl@meduniwien.ac.at
References
Dennis D. Boos. On generalized score tests. The American Statistician, 1992, 46(4):327-333.
Lloyd A. Mancl, Timothy A. DeRouen. A covariance estimator for GEE with improved small sample properties. Biometrics, 2001, 57(1):126-134.
See Also
Examples
data(keratosis)
m1<-geem2(clearance~trt,id=id,data=keratosis,family=binomial,corstr="independence")
m2<-geem2(pain~trt,id=id,data=keratosis[keratosis$lesion==1,],family=gaussian,corstr="independence")
L1<-L2<-diag(1,4)[-1,]
mmmgee.test(x=m1,L=list(L1),statistic="wald",type="maximum")
mmmgee.test(x=m1,L=list(L1),statistic="score",type="maximum")
mmmgee.test(x=list(m1,m2),L=list(L1,L2),type="maximum",asymptotic=FALSE,biascorr=TRUE)
mmmgee.test(x=list(m1,m2),L=list(L1,L2),type="maximum",closed.test=TRUE)
mmmgee.test(x=list(m1,m2),L=list(L1,L2),type="maximum",asymptotic=FALSE,
alternative="less",conf.int=TRUE,denomDF=40)
mmmgee.test(x=list(m1,m2),L=list(L1,L2),type="quadratic",asymptotic=TRUE)
mmmgee.test(x=list(m1,m2),L=list(L1,L2),statistic="score",type="quadratic")
mmmgee.test(x=list(m1,m2),L=list(L1,L2),statistic="score",type="maximum")
#
## Not run:
data(datasim)
mod1<-geem2(Y.lin~gr.lang+x1,id=id,data=datasim,family="gaussian",corstr="exchangeable")
mod2<-geem2(Y.poi~gr.lang+x2,id=id,data=datasim,family="poisson",corstr="exchangeable")
mod3<-geem2(Y.bin~gr.lang+x3,id=id,data=datasim,family="binomial",corstr="exchangeable")
Li<-matrix(c(0,1,0),nrow=1)
mmmgee.test(list(mod1,mod2,mod3),L=list(Li,Li,Li),statistic="Wald",type="maximum",
biascorr=TRUE,asymptotic=FALSE,closed.test=TRUE)
mmmgee.test(list(mod1,mod2,mod3),L=list(Li,Li,Li),statistic="score",closed.test=TRUE)
## End(Not run)