Type: | Package |
Title: | A Robust Integrated Mean Variance Correlation |
Version: | 0.1.0 |
Description: | Measure the dependence structure between two random variables with a new correlation coefficient, and extend it to hypothesis testing, feature screening, and false discovery rate control. |
License: | GPL-3 |
Encoding: | UTF-8 |
Imports: | splines, quantreg, expm, CompQuadForm, GGMridge, limma, stats |
RoxygenNote: | 7.2.3 |
Suggests: | knitr, mvtnorm, rmarkdown, testthat (≥ 3.0.0) |
VignetteBuilder: | knitr |
Config/testthat/edition: | 3 |
NeedsCompilation: | no |
Packaged: | 2024-04-14 04:03:04 UTC; Surface |
Author: | Wei Xiong [aut], Han Pan [aut, cre], Hengjian Cui [aut] |
Maintainer: | Han Pan <scott_pan@163.com> |
Repository: | CRAN |
Date/Publication: | 2024-04-16 14:50:13 UTC |
Integrated Mean Variance Correlation
Description
Calculates the integrated mean variance correlation (IMVC) between two numeric vectors.
Usage
IMVC(y, x, K, NN = 3, type)
Arguments
y | a numeric vector |
x | a numeric vector |
K | the number of quantile levels |
NN | the number of B-spline basis functions; the default is 3 |
type | the kind of correlation to measure: "linear" measures linear correlation; "nonlinear" measures linear or nonlinear correlation using B-splines |
Value
The value of the sample statistic for the chosen type.
Examples
# Quadratic signal with heavy-tailed t(2) noise
n <- 200
x <- rnorm(n)
y <- x^2 + rt(n, 2)
IMVC(y, x, K = 10, type = "nonlinear")
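To see why the "nonlinear" option matters for a signal such as y = x^2, the following base-R sketch (it does not call IMVC) shows that the Pearson correlation vanishes for a symmetric quadratic relationship, while a regression on a B-spline basis captures it. Building the basis with `splines::bs(x, df = NN)` is an assumption made for illustration only: the documentation states that NN is the number of B-spline basis functions and that the package imports splines, but not how the basis is constructed internally.

```r
library(splines)

# Deterministic, symmetric design so the result does not depend on a seed
x <- seq(-3, 3, length.out = 201)
y <- x^2                      # purely nonlinear dependence, no noise

# Pearson correlation misses the relationship entirely
pearson <- cor(x, y)

# Assumption: a cubic B-spline basis with NN = 3 columns (no interior knots)
NN <- 3
basis <- bs(x, df = NN)
fit <- lm(y ~ basis)

# Share of the variance of y explained by the spline fit
r2 <- 1 - sum(residuals(fit)^2) / sum((y - mean(y))^2)

print(c(pearson = pearson, spline_r2 = r2))
```

Because a cubic basis plus an intercept spans all cubic polynomials, the spline fit reproduces x^2 essentially exactly, whereas a purely linear measure reports no association.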
Integrated Mean Variance Correlation Based FDR Control
Description
Performs false discovery rate (FDR) control using the integrated mean variance correlation.
Usage
IMVCFDR(y, x, K, NN = 3, numboot, timeboot, true_signal, null_method, alpha)
Arguments
y | the response vector |
x | the covariate matrix |
K | the number of quantile levels |
NN | the number of B-spline basis functions; the default is 3 |
numboot | the size of the bootstrap samples |
timeboot | the number of bootstrap replications used to compute the standard deviation of the IMVC |
true_signal | the true active set |
null_method | the method used to estimate the proportion of true null hypotheses; one of "lfdr", "mean", "hist" or "convest" |
alpha | the nominal FDR level |
Value
A list containing the false discovery proportion (FDP), the power, and the selected variables.
Examples
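library(mvtnorm)  # provides rmvnorm()
n <- 200
p <- 20
pho1 <- 0.5
mean_x <- rep(0, p)
# Covariance with entries sigma_x[i, j] = pho1^|i - j|
sigma_x <- matrix(NA, nrow = p, ncol = p)
for (i in 1:p) {
  for (j in 1:p) {
    sigma_x[i, j] <- pho1^(abs(i - j))
  }
}
x <- rmvnorm(n, mean = mean_x, sigma = sigma_x, method = "chol")
x1 <- x[, 1]
x2 <- x[, 2]
x3 <- x[, 3]
# Only the first three covariates enter the response
y <- 2 * x1 + 2 * x2 + 2 * x3 + rnorm(n)
IMVCFDR(y, x, K = 5, numboot = 100, timeboot = 20, true_signal = c(1, 2, 3),
        null_method = "hist", alpha = 0.2)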
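The FDP and power reported by an FDR procedure compare the selected variables with the true active set. The sketch below computes both from their standard definitions in base R. The helper `fdp_power` is hypothetical and is not part of the package; whether IMVCFDR computes its returned values in exactly this way is an assumption.

```r
# Hypothetical helper: FDP and power of a selection given the true active set
fdp_power <- function(selected, true_signal) {
  selected <- unique(selected)
  true_pos <- length(intersect(selected, true_signal))
  false_pos <- length(setdiff(selected, true_signal))
  list(
    FDP = false_pos / max(length(selected), 1),  # 0 when nothing is selected
    power = true_pos / length(true_signal)
  )
}

# Suppose a procedure selected covariates 1, 2 and 7 when 1, 2 and 3 are active
res <- fdp_power(selected = c(1, 2, 7), true_signal = c(1, 2, 3))
print(res)
```

Here one of the three selections is false (FDP = 1/3) and two of the three active covariates are recovered (power = 2/3). A procedure that controls the FDR at level alpha aims to keep the expected FDP at or below alpha.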
Integrated Mean Variance Correlation Based Screening
Description
Selects important features using the integrated mean variance correlation.
Usage
IMVCS(y, x, K, d, NN = 3, type)
Arguments
y | the response vector |
x | the covariate matrix |
K | the number of quantile levels |
d | the number of variables to select |
NN | the number of B-spline basis functions; the default is 3 |
type | the kind of correlation to measure: "linear" measures linear correlation; "nonlinear" measures linear or nonlinear correlation using B-splines |
Value
The labels of the d predictors with the largest IMVC values (the selected active set).
Examples
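library(mvtnorm)  # provides rmvnorm()
n <- 200
p <- 500
pho1 <- 0.8
mean_x <- rep(0, p)
# Covariance with entries sigma_x[i, j] = pho1^|i - j|
sigma_x <- matrix(NA, nrow = p, ncol = p)
for (i in 1:p) {
  for (j in 1:p) {
    sigma_x[i, j] <- pho1^(abs(i - j))
  }
}
x <- rmvnorm(n, mean = mean_x, sigma = sigma_x, method = "chol")
x1 <- x[, 1]
x2 <- x[, 2]
x3 <- x[, 12]
x4 <- x[, 22]
# Active covariates are columns 1, 2, 12 and 22; column 12 acts only when negative
y <- 2 * x1 + 0.5 * x2 + 3 * x3 * ifelse(x3 < 0, 1, 0) + 2 * x4 + rnorm(n)
IMVCS(y, x, K = 5, d = round(n / log(n)), type = "nonlinear")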
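The screening step can be pictured as ranking every predictor by a marginal dependence score and keeping the top d. The base-R sketch below illustrates that idea with the absolute Pearson correlation as a stand-in score. It does not call IMVCS, the function `screen_top_d` is hypothetical, and the Pearson score is a deliberate substitute for the IMVC statistic, whose computation is not described here.

```r
# Hypothetical stand-in for IMVCS: rank predictors by |Pearson correlation|
screen_top_d <- function(y, x, d) {
  score <- abs(as.vector(cor(x, y)))       # one marginal score per column of x
  order(score, decreasing = TRUE)[seq_len(d)]
}

set.seed(1)
n <- 200
p <- 50
x <- matrix(rnorm(n * p), nrow = n, ncol = p)
y <- 2 * x[, 1] + 2 * x[, 22] + rnorm(n)   # columns 1 and 22 are active

d <- round(n / log(n))   # model size used in the example above; 38 when n = 200
selected <- screen_top_d(y, x, d)
print(all(c(1, 22) %in% selected))
```

Screening keeps a generous number of variables (d = 38 here) so that the truly active ones are unlikely to be dropped; a finer selection step can then be applied to the reduced set.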
Integrated Mean Variance Correlation Based Hypothesis Test
Description
Tests the significance of linear or nonlinear correlation using the integrated mean variance correlation.
Usage
IMVCT(x, y, K, num_per, NN = 3, type)
Arguments
x | the univariate covariate vector |
y | the response vector |
K | the number of quantile levels |
num_per | the number of permutations |
NN | the number of B-spline basis functions; the default is 3 |
type | the kind of correlation to measure: "linear" measures linear correlation; "nonlinear" measures linear or nonlinear correlation using B-splines |
Value
The p-value of the corresponding hypothesis test.
Examples
# Linear model with heavy-tailed t(2) noise
n <- 100
x <- rnorm(n)
y <- 2 * x + rt(n, 2)
IMVCT(x, y, K = 5, type = "linear")
# Nonlinear model with heavy-tailed t(2) noise
n <- 100
x <- rnorm(n)
y <- 2 * cos(x) + rt(n, 2)
IMVCT(x, y, K = 5, type = "nonlinear", num_per = 100)
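The role of num_per can be illustrated with a generic permutation test in base R. The sketch below uses the absolute Pearson correlation as a stand-in test statistic. It does not call IMVCT, the function `perm_test` is hypothetical, and how IMVCT forms its own null distribution is not specified here, so treat this only as an illustration of the general permutation idea.

```r
# Hypothetical stand-in for IMVCT: permutation p-value for |Pearson correlation|
perm_test <- function(x, y, num_per) {
  observed <- abs(cor(x, y))
  # Permuting y breaks any dependence on x while keeping both marginals
  permuted <- replicate(num_per, abs(cor(x, sample(y))))
  # The +1 terms count the observed statistic as one of the permutations
  (1 + sum(permuted >= observed)) / (1 + num_per)
}

set.seed(1)
n <- 100
x <- rnorm(n)
p_dependent <- perm_test(x, 2 * x + rnorm(n), num_per = 100)
p_independent <- perm_test(x, rnorm(n), num_per = 100)
print(c(dependent = p_dependent, independent = p_independent))
```

With this construction the smallest attainable p-value is 1 / (num_per + 1), so a larger num_per gives finer resolution at the cost of more computation.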