Type: | Package |
Title: | Robust Regression with Compositional Covariates |
Version: | 1.1 |
Date: | 2020-10-10 |
Maintainer: | Aditya Mishra <amishra@flatironinstitute.org> |
Description: | We implement the algorithm estimating the parameters of the robust regression model with compositional covariates. The model simultaneously treats outliers and provides reliable parameter estimates. Publication reference: Mishra, A., Mueller, C.,(2019) <doi:10.48550/arXiv.1909.04990>. |
URL: | https://arxiv.org/abs/1909.04990, https://github.com/amishra-stats/robregcc |
Depends: | R (≥ 3.5.0), stats, utils |
License: | GPL (≥ 3.0) |
LazyData: | true |
Imports: | Rcpp (≥ 0.12.0), MASS, magrittr, graphics |
LinkingTo: | Rcpp, RcppArmadillo |
NeedsCompilation: | yes |
RoxygenNote: | 7.1.1 |
Encoding: | UTF-8 |
Packaged: | 2020-07-25 05:05:50 UTC; amishra |
Author: | Aditya Mishra [aut, cre], Christian Muller [ctb] |
Repository: | CRAN |
Date/Publication: | 2020-07-25 20:20:03 UTC |
Estimate parameters of linear regression model with compositional covariates using method suggested by Pixu shi.
Description
The model uses scaled lasoo approach for model selection.
Usage
classo(Xt, y, C, we = NULL, type = 1, control = list())
Arguments
Xt |
CLR transformed predictor matrix. |
y |
model response vector |
C |
sub-compositional matrix |
we |
specify weight of model parameter |
type |
1/2 for l1 / l2 loss in the model |
control |
a list of internal parameters controlling the model fitting |
Value
beta |
model parameter estimate |
References
Shi, P., Zhang, A. and Li, H., 2016. Regression analysis for microbiome compositional data. The Annals of Applied Statistics, 10(2), pp.1019-1040.
Examples
library(robregcc)
library(magrittr)
data(simulate_robregcc)
X <- simulate_robregcc$X;
y <- simulate_robregcc$y
C <- simulate_robregcc$C
n <- nrow(X); p <- ncol(X); k <- nrow(C)
# Predictor transformation due to compositional constraint:
Xt <- cbind(1,X) # accounting for intercept in predictor
C <- cbind(0,C) # accounting for intercept in constraint
bw <- c(0,rep(1,p)) # weight matrix to not penalize intercept
# Non-robust regression, [Pixu Shi 2016]
control <- robregcc_option(maxiter = 5000, tol = 1e-7, lminfac = 1e-12)
fit.nr <- classo(Xt, y, C, we = bw, type = 1, control = control)
Compute solution path of constrained lasso.
Description
The model uses scaled lasoo approach for model selection.
Usage
classo_path(Xt, y, C, we = NULL, control = list())
Arguments
Xt |
CLR transformed predictor matrix. |
y |
model response vector |
C |
sub-compositional matrix |
we |
specify weight of model parameter |
control |
a list of internal parameters controlling the model fitting |
Value
betapath |
solution path estimate |
beta |
model parameter estimate |
Examples
library(robregcc)
library(magrittr)
data(simulate_robregcc)
X <- simulate_robregcc$X;
y <- simulate_robregcc$y
C <- simulate_robregcc$C
n <- nrow(X); p <- ncol(X); k <- nrow(C)
#
Xt <- cbind(1,X) # accounting for intercept in predictor
C <- cbind(0,C) # accounting for intercept in constraint
bw <- c(0,rep(1,p)) # weight matrix to not penalize intercept
# Non-robust regression
control <- robregcc_option(maxiter = 5000, tol = 1e-7, lminfac = 1e-12)
fit.path <- classo_path(Xt, y, C, we = bw, control = control)
Extract coefficients estimate from the sparse version of the robregcc fitted object.
Description
S3 methods extracting estimated coefficients for objects generated by
robregcc
. Robust coeffcient estimate.
Usage
coef_cc(object, type = 0, s = 0)
Arguments
object |
Object generated by |
type |
0/1 residual estimate before/after sanity check |
s |
0/1 no/yes 1se rule |
Value
coefficient estimate
Examples
library(magrittr)
library(robregcc)
data(simulate_robregcc)
X <- simulate_robregcc$X;
y <- simulate_robregcc$y
C <- simulate_robregcc$C
n <- nrow(X); p <- ncol(X); k <- nrow(C)
Xt <- cbind(1,X) # accounting for intercept in predictor
C <- cbind(0,C) # accounting for intercept in constraint
bw <- c(0,rep(1,p)) # weight matrix to not penalize intercept
example_seed <- 2*p+1
set.seed(example_seed)
# Breakdown point for tukey Bisquare loss function
b1 = 0.5 # 50% breakdown point
cc1 = 1.567 # corresponding model parameter
b1 = 0.25; cc1 = 2.937
# Initialization [PSC analysis for compositional data]
control <- robregcc_option(maxiter=1000,tol = 1e-4,lminfac = 1e-7)
fit.init <- cpsc_sp(Xt, y,alp = 0.4, cfac = 2, b1 = b1,
cc1 = cc1,C,bw,1,control)
## Robust model fitting
# control parameters
control <- robregcc_option()
beta.wt <- fit.init$betaR # Set weight for model parameter beta
beta.wt[1] <- 0
control$gamma = 1 # gamma for constructing weighted penalty
control$spb = 40/p # fraction of maximum non-zero model parameter beta
control$outMiter = 1000 # Outer loop iteration
control$inMiter = 3000 # Inner loop iteration
control$nlam = 50 # Number of tuning parameter lambda to be explored
control$lmaxfac = 1 # Parameter for constructing sequence of lambda
control$lminfac = 1e-8 # Parameter for constructing sequence of lambda
control$tol = 1e-20; # tolrence parameter for converging [inner loop]
control$out.tol = 1e-16 # tolerence parameter for convergence [outer loop]
control$kfold = 10 # number of fold of crossvalidation
control$sigmafac = 2#1.345
# Robust regression using adaptive lasso penalty
fit.ada <- robregcc_sp(Xt,y,C,
beta.init = beta.wt, cindex = 1,
gamma.init = fit.init$residuals,
control = control,
penalty.index = 1, alpha = 0.95)
# Robust regression using lasso penalty [Huber equivalent]
fit.soft <- robregcc_sp(Xt,y,C, cindex = 1,
control = control, penalty.index = 2,
alpha = 0.95)
# Robust regression using hard thresholding penalty
control$lmaxfac = 1e2 # Parameter for constructing sequence of lambda
control$lminfac = 1e-3 # Parameter for constructing sequence of lambda
control$sigmafac = 2#1.345
fit.hard <- robregcc_sp(Xt,y,C, beta.init = fit.init$betaf,
gamma.init = fit.init$residuals,
cindex = 1,
control = control, penalty.index = 3,
alpha = 0.95)
coef_cc(fit.ada)
coef_cc(fit.soft)
coef_cc(fit.hard)
Principal sensitivity component analysis with compositional covariates in sparse setting.
Description
Produce model and its residual estimate based on PCS analysis.
Usage
cpsc_sp(
X0,
y0,
alp = 0.4,
cfac = 2,
b1 = 0.25,
cc1 = 2.937,
C = NULL,
we,
type,
control = list()
)
Arguments
X0 |
CLR transformed predictor matrix. |
y0 |
model response vector |
alp |
(0,0.5) fraction of data sample to be removed to generate subsample |
cfac |
initial value of shift parameter for weight construction/initialization |
b1 |
tukey bisquare function parameter producing desired breakdown point |
cc1 |
tukey bisquare function parameter producing desired breakdown point |
C |
sub-compositional matrix |
we |
penalization index for model parameters beta |
type |
1/2 for l1 / l2 loss in the model |
control |
a list of internal parameters controlling the model fitting |
Value
betaf |
TModel parameter estimate |
residuals |
residual estimate |
References
Mishra, A., Mueller, C.,(2019) Robust regression with compositional covariates. In prepration. arXiv:1909.04990.
Examples
library(robregcc)
library(magrittr)
data(simulate_robregcc)
X <- simulate_robregcc$X;
y <- simulate_robregcc$y
C <- simulate_robregcc$C
n <- nrow(X); p <- ncol(X); k <- nrow(C)
Xt <- cbind(1,X) # include intercept in predictor
C <- cbind(0,C) # include intercept in constraint
bw <- c(0,rep(1,p)) # weights not penalize intercept
example_seed <- 2*p+1
set.seed(example_seed)
# Breakdown point for tukey Bisquare loss function
b1 = 0.5 # 50% breakdown point
cc1 = 1.567 # corresponding model parameter
b1 = 0.25; cc1 = 2.937
# Initialization [PSC analysis for compositional data]
control <- robregcc_option(maxiter = 1000,
tol = 1e-4,lminfac = 1e-7)
fit.init <- cpsc_sp(Xt, y,alp = 0.4, cfac = 2, b1 = b1,
cc1 = cc1,C,bw,1,control)
Plot cross-validation error plot
Description
S3 methods plotting crossvalidation error using the object obtained from robregcc
.
Usage
plot_cv(object)
Arguments
object |
robregcc fitted onject |
Value
generate cv error plot
Examples
library(magrittr)
library(robregcc)
data(simulate_robregcc)
X <- simulate_robregcc$X;
y <- simulate_robregcc$y
C <- simulate_robregcc$C
n <- nrow(X); p <- ncol(X); k <- nrow(C)
Xt <- cbind(1,X) # accounting for intercept in predictor
C <- cbind(0,C) # accounting for intercept in constraint
bw <- c(0,rep(1,p)) # weight matrix to not penalize intercept
example_seed <- 2*p+1
set.seed(example_seed)
# Breakdown point for tukey Bisquare loss function
b1 = 0.5 # 50% breakdown point
cc1 = 1.567 # corresponding model parameter
b1 = 0.25; cc1 = 2.937
# Initialization [PSC analysis for compositional data]
control <- robregcc_option(maxiter=1000,tol = 1e-4,lminfac = 1e-7)
fit.init <- cpsc_sp(Xt, y,alp = 0.4, cfac = 2, b1 = b1,
cc1 = cc1,C,bw,1,control)
## Robust model fitting
# control parameters
control <- robregcc_option()
beta.wt <- fit.init$betaR # Set weight for model parameter beta
beta.wt[1] <- 0
control$gamma = 1 # gamma for constructing weighted penalty
control$spb = 40/p # fraction of maximum non-zero model parameter beta
control$outMiter = 1000 # Outer loop iteration
control$inMiter = 3000 # Inner loop iteration
control$nlam = 50 # Number of tuning parameter lambda to be explored
control$lmaxfac = 1 # Parameter for constructing sequence of lambda
control$lminfac = 1e-8 # Parameter for constructing sequence of lambda
control$tol = 1e-20; # tolrence parameter for converging [inner loop]
control$out.tol = 1e-16 # tolerence parameter for convergence [outer loop]
control$kfold = 10 # number of fold of crossvalidation
control$sigmafac = 2#1.345
# Robust regression using adaptive lasso penalty
fit.ada <- robregcc_sp(Xt,y,C,
beta.init = beta.wt, cindex = 1,
gamma.init = fit.init$residuals,
control = control,
penalty.index = 1, alpha = 0.95)
# Robust regression using lasso penalty [Huber equivalent]
fit.soft <- robregcc_sp(Xt,y,C, cindex = 1,
control = control, penalty.index = 2,
alpha = 0.95)
# Robust regression using hard thresholding penalty
control$lmaxfac = 1e2 # Parameter for constructing sequence of lambda
control$lminfac = 1e-3 # Parameter for constructing sequence of lambda
control$sigmafac = 2#1.345
fit.hard <- robregcc_sp(Xt,y,C, beta.init = fit.init$betaf,
gamma.init = fit.init$residuals,
cindex = 1,
control = control, penalty.index = 3,
alpha = 0.95)
plot_cv(fit.ada)
plot_cv(fit.soft)
plot_cv(fit.hard)
Plot solution path at different value of lambda
Description
S3 methods plotting solution path of model parameter and mean shift using the object obtained from robregcc
.
Usage
plot_path(object, ptype = 0)
Arguments
object |
Object generated by |
ptype |
path type 0/1 for Gamma/Beta path respectvely |
Value
plot solution path
Examples
library(magrittr)
library(robregcc)
data(simulate_robregcc)
X <- simulate_robregcc$X;
y <- simulate_robregcc$y
C <- simulate_robregcc$C
n <- nrow(X); p <- ncol(X); k <- nrow(C)
Xt <- cbind(1,X) # accounting for intercept in predictor
C <- cbind(0,C) # accounting for intercept in constraint
bw <- c(0,rep(1,p)) # weight matrix to not penalize intercept
example_seed <- 2*p+1
set.seed(example_seed)
# Breakdown point for tukey Bisquare loss function
b1 = 0.5 # 50% breakdown point
cc1 = 1.567 # corresponding model parameter
b1 = 0.25; cc1 = 2.937
# Initialization [PSC analysis for compositional data]
control <- robregcc_option(maxiter=1000,tol = 1e-4,lminfac = 1e-7)
fit.init <- cpsc_sp(Xt, y,alp = 0.4, cfac = 2, b1 = b1,
cc1 = cc1,C,bw,1,control)
## Robust model fitting
# control parameters
control <- robregcc_option()
beta.wt <- fit.init$betaR # Set weight for model parameter beta
beta.wt[1] <- 0
control$gamma = 1 # gamma for constructing weighted penalty
control$spb = 40/p # fraction of maximum non-zero model parameter beta
control$outMiter = 1000 # Outer loop iteration
control$inMiter = 3000 # Inner loop iteration
control$nlam = 50 # Number of tuning parameter lambda to be explored
control$lmaxfac = 1 # Parameter for constructing sequence of lambda
control$lminfac = 1e-8 # Parameter for constructing sequence of lambda
control$tol = 1e-20; # tolrence parameter for converging [inner loop]
control$out.tol = 1e-16 # tolerence parameter for convergence [outer loop]
control$kfold = 10 # number of fold of crossvalidation
control$sigmafac = 2#1.345
# Robust regression using adaptive lasso penalty
fit.ada <- robregcc_sp(Xt,y,C,
beta.init = beta.wt, cindex = 1,
gamma.init = fit.init$residuals,
control = control,
penalty.index = 1, alpha = 0.95)
# Robust regression using lasso penalty [Huber equivalent]
fit.soft <- robregcc_sp(Xt,y,C, cindex = 1,
control = control, penalty.index = 2,
alpha = 0.95)
# Robust regression using hard thresholding penalty
control$lmaxfac = 1e2 # Parameter for constructing sequence of lambda
control$lminfac = 1e-3 # Parameter for constructing sequence of lambda
control$sigmafac = 2#1.345
fit.hard <- robregcc_sp(Xt,y,C, beta.init = fit.init$betaf,
gamma.init = fit.init$residuals,
cindex = 1,
control = control, penalty.index = 3,
alpha = 0.95)
plot_path(fit.ada)
plot_path(fit.soft)
plot_path(fit.hard)
Plot residuals estimate from robregcc object
Description
S3 methods extracting residuals from the objects generated by
robregcc
.
Usage
plot_resid(object, type = 0, s = 0)
Arguments
object |
Object generated by |
type |
0/1 residual estimate before/after sanity check |
s |
0/1 no/yes 1se rule |
Value
plot estimated residual
Examples
library(magrittr)
library(robregcc)
data(simulate_robregcc)
X <- simulate_robregcc$X;
y <- simulate_robregcc$y
C <- simulate_robregcc$C
n <- nrow(X); p <- ncol(X); k <- nrow(C)
Xt <- cbind(1,X) # accounting for intercept in predictor
C <- cbind(0,C) # accounting for intercept in constraint
bw <- c(0,rep(1,p)) # weight matrix to not penalize intercept
example_seed <- 2*p+1
set.seed(example_seed)
# Breakdown point for tukey Bisquare loss function
b1 = 0.5 # 50% breakdown point
cc1 = 1.567 # corresponding model parameter
b1 = 0.25; cc1 = 2.937
# Initialization [PSC analysis for compositional data]
control <- robregcc_option(maxiter=1000,tol = 1e-4,lminfac = 1e-7)
fit.init <- cpsc_sp(Xt, y,alp = 0.4, cfac = 2, b1 = b1,
cc1 = cc1,C,bw,1,control)
## Robust model fitting
# control parameters
control <- robregcc_option()
beta.wt <- fit.init$betaR # Set weight for model parameter beta
beta.wt[1] <- 0
control$gamma = 1 # gamma for constructing weighted penalty
control$spb = 40/p # fraction of maximum non-zero model parameter beta
control$outMiter = 1000 # Outer loop iteration
control$inMiter = 3000 # Inner loop iteration
control$nlam = 50 # Number of tuning parameter lambda to be explored
control$lmaxfac = 1 # Parameter for constructing sequence of lambda
control$lminfac = 1e-8 # Parameter for constructing sequence of lambda
control$tol = 1e-20; # tolrence parameter for converging [inner loop]
control$out.tol = 1e-16 # tolerence parameter for convergence [outer loop]
control$kfold = 10 # number of fold of crossvalidation
control$sigmafac = 2#1.345
# Robust regression using adaptive lasso penalty
fit.ada <- robregcc_sp(Xt,y,C,
beta.init = beta.wt, cindex = 1,
gamma.init = fit.init$residuals,
control = control,
penalty.index = 1, alpha = 0.95)
# Robust regression using lasso penalty [Huber equivalent]
fit.soft <- robregcc_sp(Xt,y,C, cindex = 1,
control = control, penalty.index = 2,
alpha = 0.95)
# Robust regression using hard thresholding penalty
control$lmaxfac = 1e2 # Parameter for constructing sequence of lambda
control$lminfac = 1e-3 # Parameter for constructing sequence of lambda
control$sigmafac = 2#1.345
fit.hard <- robregcc_sp(Xt,y,C, beta.init = fit.init$betaf,
gamma.init = fit.init$residuals,
cindex = 1,
control = control, penalty.index = 3,
alpha = 0.95)
plot_resid(fit.ada)
plot_resid(fit.soft)
plot_resid(fit.hard)
Extract residuals estimate from the sparse version of the robregcc fitted object.
Description
Robust residuals estimate
Usage
## S3 method for class 'robregcc'
residuals(object, ...)
Arguments
object |
robregcc fitted onject |
... |
Other argumnts for future usage. |
Value
residuals estimate
Examples
library(magrittr)
library(robregcc)
data(simulate_robregcc)
X <- simulate_robregcc$X;
y <- simulate_robregcc$y
C <- simulate_robregcc$C
n <- nrow(X); p <- ncol(X); k <- nrow(C)
Xt <- cbind(1,X) # accounting for intercept in predictor
C <- cbind(0,C) # accounting for intercept in constraint
bw <- c(0,rep(1,p)) # weight matrix to not penalize intercept
example_seed <- 2*p+1
set.seed(example_seed)
# Breakdown point for tukey Bisquare loss function
b1 = 0.5 # 50% breakdown point
cc1 = 1.567 # corresponding model parameter
b1 = 0.25; cc1 = 2.937
# Initialization [PSC analysis for compositional data]
control <- robregcc_option(maxiter=1000,tol = 1e-4,lminfac = 1e-7)
fit.init <- cpsc_sp(Xt, y,alp = 0.4, cfac = 2, b1 = b1,
cc1 = cc1,C,bw,1,control)
## Robust model fitting
# control parameters
control <- robregcc_option()
beta.wt <- fit.init$betaR # Set weight for model parameter beta
beta.wt[1] <- 0
control$gamma = 1 # gamma for constructing weighted penalty
control$spb = 40/p # fraction of maximum non-zero model parameter beta
control$outMiter = 1000 # Outer loop iteration
control$inMiter = 3000 # Inner loop iteration
control$nlam = 50 # Number of tuning parameter lambda to be explored
control$lmaxfac = 1 # Parameter for constructing sequence of lambda
control$lminfac = 1e-8 # Parameter for constructing sequence of lambda
control$tol = 1e-20; # tolrence parameter for converging [inner loop]
control$out.tol = 1e-16 # tolerence parameter for convergence [outer loop]
control$kfold = 10 # number of fold of crossvalidation
control$sigmafac = 2#1.345
# Robust regression using adaptive lasso penalty
fit.ada <- robregcc_sp(Xt,y,C,
beta.init = beta.wt, cindex = 1,
gamma.init = fit.init$residuals,
control = control,
penalty.index = 1, alpha = 0.95)
# Robust regression using lasso penalty [Huber equivalent]
fit.soft <- robregcc_sp(Xt,y,C, cindex = 1,
control = control, penalty.index = 2,
alpha = 0.95)
# Robust regression using hard thresholding penalty
control$lmaxfac = 1e2 # Parameter for constructing sequence of lambda
control$lminfac = 1e-3 # Parameter for constructing sequence of lambda
control$sigmafac = 2#1.345
fit.hard <- robregcc_sp(Xt,y,C, beta.init = fit.init$betaf,
gamma.init = fit.init$residuals,
cindex = 1,
control = control, penalty.index = 3,
alpha = 0.95)
residuals(fit.ada)
residuals(fit.soft)
residuals(fit.hard)
Control parameter for model estimation:
Description
The model approach use scaled lasoo approach for model selection.
Usage
robregcc_option(
maxiter = 10000,
tol = 1e-10,
nlam = 100,
out.tol = 1e-08,
lminfac = 1e-08,
lmaxfac = 10,
mu = 1,
nu = 1.05,
sp = 0.3,
gamma = 2,
outMiter = 3000,
inMiter = 500,
kmaxS = 500,
tolS = 1e-04,
nlamx = 20,
nlamy = 20,
spb = 0.3,
spy = 0.3,
lminfacX = 1e-06,
lminfacY = 0.01,
kfold = 10,
fullpath = 0,
sigmafac = 2
)
Arguments
maxiter |
maximum number of iteration for convergence |
tol |
tolerance value set for convergence |
nlam |
number of lambda to be genrated to obtain solution path |
out.tol |
tolernce value set for convergence of outer loop |
lminfac |
a multiplier of determing lambda_min as a fraction of lambda_max |
lmaxfac |
a multiplier of lambda_max |
mu |
penalty parameter used in enforcing orthogonality |
nu |
penalty parameter used in enforcing orthogonality (incremental rate of mu) |
sp |
maximum proportion of nonzero elements in shift parameter |
gamma |
adaptive penalty weight exponential factor |
outMiter |
maximum number of outer loop iteration |
inMiter |
maximum number of inner loop iteration |
kmaxS |
maximum number of iteration for fast S estimator for convergence |
tolS |
tolerance value set for convergence in case of fast S estimator |
nlamx |
number of x lambda |
nlamy |
number of y lambda |
spb |
sparsity in beta |
spy |
sparsity in shift gamma |
lminfacX |
a multiplier of determing lambda_min as a fraction of lambda_max for sparsity in X |
lminfacY |
a multiplier of determing lambda_min as a fraction of lambda_max for sparsity in shift parameter |
kfold |
nummber of folds for crossvalidation |
fullpath |
1/0 to get full path yes/no |
sigmafac |
multiplying factor for the range of standard deviation |
Value
a list of controling parameter.
Examples
# default options
library(robregcc)
control_default = robregcc_option()
# manual options
control_manual <- robregcc_option(maxiter=1000,tol = 1e-4,lminfac = 1e-7)
Simulation data
Description
Simulate data for the robust regression with compositional covariates
Usage
robregcc_sim(n, betacc, beta0, O, Sigma, levg, snr, shft, m, C, out = list())
Arguments
n |
sample size |
betacc |
model parameter satisfying compositional covariates |
beta0 |
intercept |
O |
number of outlier |
Sigma |
covariance matrix of simulated predictors |
levg |
1/0 whether to include leveraged observation or not |
snr |
noise to signal ratio |
shft |
multiplying factor to model variance for creating outlier |
m |
test sample size |
C |
subcompositional matrix |
out |
list for obtaining output with simulated data structure |
Value
a list containing simulated output.
References
Mishra, A., Mueller, C.,(2019) Robust regression with compositional covariates. In prepration. arXiv:1909.04990.
Examples
## Simulation example:
library(robregcc)
library(magrittr)
## n: sample size
## p: number of predictors
## o: fraction of observations as outliers
## L: {0,1} => leveraged {no, yes},
## s: multiplicative factor
## ngrp: number of subgroup in the model
## snr: noise to signal ratio for computing true std_err
## Define parameters to simulate example
p <- 80 # number of predictors
n <- 300 # number of sample
O <- 0.10*n # number of outlier
L <- 1
s <- 8
ngrp <- 4 # number of sub-composition
snr <- 3 # Signal to noise ratio
example_seed <- 2*p+1 # example seed
set.seed(example_seed)
# Simulate subcomposition matrix
C1 <- matrix(0,ngrp,23)
tind <- c(0,10,16,20,23)
for (ii in 1:ngrp)
C1[ii,(tind[ii] + 1):tind[ii + 1]] <- 1
C <- matrix(0,ngrp,p)
C[,1:ncol(C1)] <- C1
# model parameter beta
beta0 <- 0.5
beta <- c(1, -0.8, 0.4, 0, 0, -0.6, 0, 0, 0, 0, -1.5, 0, 1.2, 0, 0, 0.3)
beta <- c(beta,rep(0,p - length(beta)))
# Simulate response and predictor, i.e., X, y
Sigma <- 1:p %>% outer(.,.,'-') %>% abs(); Sigma <- 0.5^Sigma
data.case <- vector("list",1)
set.seed(example_seed)
data.case <- robregcc_sim(n,beta,beta0, O = O,
Sigma,levg = L, snr,shft = s,0, C,out = data.case)
data.case$C <- C
# We have saved a copy of simulated data in the package
# with name simulate_robregcc
# simulate_robregcc = data.case;
# save(simulate_robregcc, file ='data/simulate_robregcc.rda')
X <- data.case$X # predictor matrix
y <- data.case$y # model response
Robust model estimation approach for regression with compositional covariates.
Description
Fit regression model with compositional covariates for a range of tuning parameter lambda. Model parameters is assumed to be sparse.
Usage
robregcc_sp(
X,
y,
C,
beta.init = NULL,
gamma.init = NULL,
cindex = 1,
control = list(),
penalty.index = 3,
alpha = 1,
verbose = TRUE
)
Arguments
X |
predictor matrix |
y |
phenotype/response vector |
C |
conformable sub-compositional matrix |
beta.init |
initial value of model parameter beta |
gamma.init |
inital value of shift parameter gamma |
cindex |
index of control (not penalized) variable in the model |
control |
a list of internal parameters controlling the model fitting |
penalty.index |
a vector of length 2 specifying type of penalty for model parameter and shift parameter respectively. 1, 2, 3 corresponding to adaptive, soft and hard penalty |
alpha |
elastic net penalty |
verbose |
TRUE/FALSE for showing progress of the cross validation |
Value
Method |
Type of penalty used |
betapath |
model parameter estimate along solution path |
gammapath |
shift parameter estimate along solution path |
lampath |
sequence of fitted lambda) |
k0 |
scaling factor |
cver |
error from k fold cross validation |
selInd |
selected index from minimum and 1se rule cross validation error |
beta0 |
beta estimate corresponding to selected index |
gamma0 |
mean shift estimate corresponding to selected index |
residual0 |
residual estimate corresponding to selected index |
inlier0 |
inlier index corresponding to selected index |
betaE |
Post selection estimate corresponding to selected index |
residualE |
post selection residual corresponding to selected index |
inlierE |
post selection inlier index corresponding to selected index |
References
Mishra, A., Mueller, C.,(2019) Robust regression with compositional covariates. In prepration. arXiv:1909.04990.
Examples
library(magrittr)
library(robregcc)
data(simulate_robregcc)
X <- simulate_robregcc$X;
y <- simulate_robregcc$y
C <- simulate_robregcc$C
n <- nrow(X); p <- ncol(X); k <- nrow(C)
Xt <- cbind(1,X) # accounting for intercept in predictor
C <- cbind(0,C) # accounting for intercept in constraint
bw <- c(0,rep(1,p)) # weight matrix to not penalize intercept
example_seed <- 2*p+1
set.seed(example_seed)
# Breakdown point for tukey Bisquare loss function
b1 = 0.5 # 50% breakdown point
cc1 = 1.567 # corresponding model parameter
b1 = 0.25; cc1 = 2.937
# Initialization [PSC analysis for compositional data]
control <- robregcc_option(maxiter=1000,tol = 1e-4,lminfac = 1e-7)
fit.init <- cpsc_sp(Xt, y,alp = 0.4, cfac = 2, b1 = b1,
cc1 = cc1,C,bw,1,control)
## Robust model fitting
# control parameters
control <- robregcc_option()
beta.wt <- fit.init$betaR # Set weight for model parameter beta
beta.wt[1] <- 0
control$gamma = 1 # gamma for constructing weighted penalty
control$spb = 40/p # fraction of maximum non-zero model parameter beta
control$outMiter = 1000 # Outer loop iteration
control$inMiter = 3000 # Inner loop iteration
control$nlam = 50 # Number of tuning parameter lambda to be explored
control$lmaxfac = 1 # Parameter for constructing sequence of lambda
control$lminfac = 1e-8 # Parameter for constructing sequence of lambda
control$tol = 1e-20; # tolrence parameter for converging [inner loop]
control$out.tol = 1e-16 # tolerence parameter for convergence [outer loop]
control$kfold = 10 # number of fold of crossvalidation
control$sigmafac = 2#1.345
# Robust regression using adaptive lasso penalty
fit.ada <- robregcc_sp(Xt,y,C,
beta.init = beta.wt, cindex = 1,
gamma.init = fit.init$residuals,
control = control,
penalty.index = 1, alpha = 0.95)
# Robust regression using lasso penalty [Huber equivalent]
fit.soft <- robregcc_sp(Xt,y,C, cindex = 1,
control = control, penalty.index = 2,
alpha = 0.95)
# Robust regression using hard thresholding penalty
control$lmaxfac = 1e2 # Parameter for constructing sequence of lambda
control$lminfac = 1e-3 # Parameter for constructing sequence of lambda
control$sigmafac = 2#1.345
fit.hard <- robregcc_sp(Xt,y,C, beta.init = fit.init$betaf,
gamma.init = fit.init$residuals,
cindex = 1,
control = control, penalty.index = 3,
alpha = 0.95)
Simulated date for testing functions in the robregcc package (sparse setting).
Description
A list of response (y), predictors (X) and sub-cpmposition matrix (C).
Usage
data(simulate_robregcc)
Format
A list with three components:
- X
Compositional predictors.
- y
Outcome with outliers.
- C
Sub-cmposition matrix.
Details
Vector y, response with a certain percentage of observations as outliers.
Matrix X, Compositional predictors.
Source
Similated data
Examples
library(robregcc)
data(simulate_robregcc)
X <- simulate_robregcc$X;
y <- simulate_robregcc$y
C <- simulate_robregcc$C
n <- nrow(X); p <- ncol(X); k <- nrow(C)