Type: Package
Title: A Robust Integrated Mean Variance Correlation
Version: 0.1.0
Description: Measure the dependence structure between two random variables with a new correlation coefficient and extend it to hypothesis testing, feature screening and false discovery rate control.
License: GPL-3
Encoding: UTF-8
Imports: splines, quantreg, expm, CompQuadForm, GGMridge, limma, stats
RoxygenNote: 7.2.3
Suggests: knitr, mvtnorm, rmarkdown, testthat (>= 3.0.0)
VignetteBuilder: knitr
Config/testthat/edition: 3
NeedsCompilation: no
Packaged: 2024-04-14 04:03:04 UTC; Surface
Author: Wei Xiong [aut], Han Pan [aut, cre], Hengjian Cui [aut]
Maintainer: Han Pan <scott_pan@163.com>
Repository: CRAN
Date/Publication: 2024-04-16 14:50:13 UTC

Integrated Mean Variance Correlation

Description

This function calculates the integrated mean variance correlation between two numeric vectors.

Usage

IMVC(y, x, K, NN = 3, type)

Arguments

y

is a numeric vector

x

is a numeric vector

K

is the number of quantile levels

NN

is the number of B-spline basis functions; the default is 3

type

is an indicator for measuring linear or nonlinear correlation: "linear" measures linear correlation, and "nonlinear" measures linear or nonlinear correlation using B-splines

Value

The value of the corresponding sample statistic

Examples

n=200
x=rnorm(n)
y=x^2+rt(n,2)

IMVC(y,x,K=10,type="nonlinear")
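
As a hedged illustration of the type argument, the sketch below evaluates the statistic on the same simulated data under both settings. It assumes the package is installed under the name IMVC (the name is not stated in this section) and that the returned value is numeric. The set.seed call and the variable names stat_linear and stat_nonlinear are illustrative additions, and no particular output is claimed.

```r
# Assumption: the package is installed under the name "IMVC".
if (requireNamespace("IMVC", quietly = TRUE)) {
  set.seed(1)                 # illustrative seed, for reproducibility only
  n <- 200
  x <- rnorm(n)
  y <- x^2 + rt(n, 2)         # quadratic signal with heavy-tailed noise

  # Same data, two settings of the documented `type` argument.
  stat_linear    <- IMVC::IMVC(y, x, K = 10, type = "linear")
  stat_nonlinear <- IMVC::IMVC(y, x, K = 10, type = "nonlinear")

  print(stat_linear)
  print(stat_nonlinear)
}
```

Since the dependence in this simulation is quadratic, comparing the two values is one way to see how the choice of type affects the statistic.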

Integrated Mean Variance Correlation Based FDR Control

Description

This function performs false discovery rate (FDR) control based on the integrated mean variance correlation.

Usage

IMVCFDR(y, x, K, NN = 3, numboot, timeboot, true_signal, null_method, alpha)

Arguments

y

is the response vector

x

is the covariate matrix

K

is the number of quantile levels

NN

is the number of B-spline basis functions; the default is 3

numboot

is the size of each bootstrap sample

timeboot

is the number of bootstrap replications used to compute the standard deviation of the IMVC

true_signal

is the true active set

null_method

is the method for estimating the proportion of true null hypotheses. Choices are "lfdr", "mean", "hist" or "convest"

alpha

is the nominal FDR level

Value

A list containing the false discovery proportion (FDP), the power and the selected variables
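
To make the FDP and power quantities concrete, here is a minimal base R sketch using their usual definitions: the FDP is the share of selected variables that are not in the true active set, and the power is the share of the true active set that is selected. The helper fdp_power and its inputs are hypothetical and written for illustration only; the package may compute or name these quantities differently.

```r
# Hypothetical helper (not part of the package): textbook FDP and power
# for a selected index set versus a known true active set.
fdp_power <- function(selected, true_signal) {
  false_sel <- setdiff(selected, true_signal)    # false discoveries
  true_sel  <- intersect(selected, true_signal)  # true discoveries
  list(
    FDP   = length(false_sel) / max(length(selected), 1),  # avoid 0/0
    power = length(true_sel) / length(true_signal)
  )
}

# Three selections, one of them (index 7) outside the true set {1, 2, 3}.
res <- fdp_power(selected = c(1, 2, 7), true_signal = c(1, 2, 3))
print(res$FDP)    # 1/3
print(res$power)  # 2/3
```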

Examples

require("mvtnorm")
n=200
p=20
pho1=0.5
mean_x=rep(0,p)
sigma_x=matrix(NA,nrow = p,ncol = p)
for (i in 1:p) {
 for (j in 1:p) {
   sigma_x[i,j]=pho1^(abs(i-j))
 }
}
x=rmvnorm(n, mean = mean_x, sigma = sigma_x,method = "chol")
x1=x[,1]
x2=x[,2]
x3=x[,3]
y=2*x1+2*x2+2*x3+rnorm(n)

IMVCFDR(y,x,K=5,numboot=100,timeboot=20,true_signal=c(1,2,3),null_method="hist",alpha=0.2)
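
The nested loop in the example above fills an autoregressive-type covariance matrix whose (i, j) entry is pho1^|i - j|. As a stylistic alternative, the same matrix can be built in one vectorized step with outer. The sketch below uses base R only and checks that both constructions agree; the names sigma_loop and sigma_vec are illustrative.

```r
p <- 20
pho1 <- 0.5

# Loop construction, as in the example above.
sigma_loop <- matrix(NA, nrow = p, ncol = p)
for (i in 1:p) {
  for (j in 1:p) {
    sigma_loop[i, j] <- pho1^(abs(i - j))
  }
}

# Vectorized construction: outer() forms the matrix of differences i - j.
sigma_vec <- pho1^abs(outer(1:p, 1:p, "-"))

# Both constructions give the same matrix.
print(isTRUE(all.equal(sigma_loop, sigma_vec)))
```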

Integrated Mean Variance Correlation Based Screening

Description

This function selects important features using the integrated mean variance correlation.

Usage

IMVCS(y, x, K, d, NN = 3, type)

Arguments

y

is the response vector

x

is the covariate matrix

K

is the number of quantile levels

d

is the number of variables to select

NN

is the number of B-spline basis functions; the default is 3

type

is an indicator for measuring linear or nonlinear correlation: "linear" measures linear correlation, and "nonlinear" measures linear or nonlinear correlation using B-splines

Value

The labels of the d top-ranked predictors, which form the selected active set

Examples

require("mvtnorm")
n=200
p=500
pho1=0.8
mean_x=rep(0,p)
sigma_x=matrix(NA,nrow = p,ncol = p)
for (i in 1:p) {
 for (j in 1:p) {
   sigma_x[i,j]=pho1^(abs(i-j))
 }
}
x=rmvnorm(n, mean = mean_x, sigma = sigma_x,method = "chol")
x1=x[,1]
x2=x[,2]
x3=x[,12]
x4=x[,22]
y=2*x1+0.5*x2+3*x3*ifelse(x3<0,1,0)+2*x4+rnorm(n)

IMVCS(y,x,K=5,d=round(n/log(n)),type="nonlinear")
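
One way to judge a screening run on simulated data is to check which truly active covariates survive in the selected set. The sketch below repeats the simulation above in compact form and does so for columns 1, 2, 12 and 22. It assumes the package is installed under the name IMVC and that the returned labels are column indices of x; neither is stated explicitly in this section. The seed and the names selected and active are illustrative.

```r
# Assumptions: package name "IMVC"; returned labels are column indices of x.
if (requireNamespace("IMVC", quietly = TRUE) &&
    requireNamespace("mvtnorm", quietly = TRUE)) {
  set.seed(1)                                # illustrative seed
  n <- 200
  p <- 500
  sigma_x <- 0.8^abs(outer(1:p, 1:p, "-"))   # same covariance as above
  x <- mvtnorm::rmvnorm(n, sigma = sigma_x, method = "chol")
  y <- 2 * x[, 1] + 0.5 * x[, 2] +
    3 * x[, 12] * (x[, 12] < 0) + 2 * x[, 22] + rnorm(n)

  selected <- IMVC::IMVCS(y, x, K = 5, d = round(n / log(n)),
                          type = "nonlinear")

  active <- c(1, 2, 12, 22)                  # truly active columns
  print(active %in% selected)                # which of them were retained
}
```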

Integrated Mean Variance Correlation Based Hypothesis Test

Description

This function tests the significance of linear or nonlinear correlation using the integrated mean variance correlation.

Usage

IMVCT(x, y, K, num_per, NN = 3, type)

Arguments

x

is the univariate covariate vector

y

is the response vector

K

is the number of quantile levels

num_per

is the number of permutations

NN

is the number of B-spline basis functions; the default is 3

type

is an indicator for measuring linear or nonlinear correlation: "linear" measures linear correlation, and "nonlinear" measures linear or nonlinear correlation using B-splines

Value

The p-value of the corresponding hypothesis test

Examples

# linear model
n=100
x=rnorm(n)
y=2*x+rt(n,2)

IMVCT(x,y,K=5,type = "linear")
# nonlinear model
n=100
x=rnorm(n)
y=2*cos(x)+rt(n,2)

IMVCT(x,y,K=5,type = "nonlinear",num_per = 100)
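
To clarify the role of num_per, here is a generic permutation test sketch in base R. It is not the package's implementation: the absolute Pearson correlation stands in for the IMVC statistic, the noise is Gaussian rather than t-distributed, and the add-one p-value formula is a common convention that IMVCT may or may not use. The names stat_fun, obs_stat and perm_stats are hypothetical.

```r
set.seed(1)                        # illustrative seed
n <- 100
x <- rnorm(n)
y <- 2 * x + rnorm(n)

# Stand-in statistic (NOT the IMVC): absolute Pearson correlation.
stat_fun <- function(x, y) abs(cor(x, y))

num_per <- 200                     # number of permutations
obs_stat <- stat_fun(x, y)

# Permuting y breaks any dependence on x, which mimics the null hypothesis.
perm_stats <- replicate(num_per, stat_fun(x, sample(y)))

# Add-one correction keeps the p-value strictly positive.
p_value <- (1 + sum(perm_stats >= obs_stat)) / (1 + num_per)
print(p_value)
```

A larger num_per gives a finer p-value resolution, since the smallest attainable value under this formula is 1 / (1 + num_per), at the cost of more computation.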