Version: | 1.0.0 |
Title: | Penalised Regression for Dichotomised Outcomes |
Description: | Implements lasso and ridge regression for dichotomised outcomes (<doi:10.1080/02664763.2023.2233057>), i.e., numerical outcomes that were transformed to binary outcomes. Such artificial binary outcomes indicate whether an underlying measurement is greater than a threshold. |
Depends: | R (≥ 3.0.0) |
Imports: | glmnet, palasso |
Suggests: | knitr, testthat, rmarkdown, RColorBrewer, MASS, mvtnorm, randomForest, xgboost, MLmetrics |
License: | GPL-3 |
Encoding: | UTF-8 |
VignetteBuilder: | knitr |
RoxygenNote: | 7.3.2 |
URL: | https://github.com/rauschenberger/cornet, https://rauschenberger.github.io/cornet/ |
BugReports: | https://github.com/rauschenberger/cornet/issues |
NeedsCompilation: | no |
Packaged: | 2024-09-26 12:43:29 UTC; armin.rauschenberger |
Author: | Armin Rauschenberger |
Maintainer: | Armin Rauschenberger <armin.rauschenberger@uni.lu> |
Repository: | CRAN |
Date/Publication: | 2024-09-26 22:50:07 UTC |
Combined regression
Description
Implements lasso and ridge regression for dichotomised outcomes. Such outcomes are not naturally but artificially binary. They indicate whether an underlying measurement is greater than a threshold.
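As a minimal sketch of what dichotomisation means here, the numeric outcome is compared with the cut-off to obtain the artificial binary outcome. The name `z` and the simulated values are illustrative assumptions, not part of the package interface.

```r
# Illustrative only: simulate a numeric outcome and dichotomise it at a cut-off.
set.seed(1)
y <- rnorm(100)              # underlying numeric measurement
cutoff <- 0                  # threshold chosen by the analyst
z <- as.integer(y > cutoff)  # 1 if the measurement exceeds the threshold, else 0
table(z)
```

The numeric vector `y` retains information (the distance from the threshold) that the binary vector `z` discards.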
Usage
cornet(
y,
cutoff,
X,
alpha = 1,
npi = 101,
pi = NULL,
nsigma = 99,
sigma = NULL,
nfolds = 10,
foldid = NULL,
type.measure = "deviance",
...
)
Arguments
y |
continuous outcome:
vector of length |
cutoff |
cut-off point for dichotomising outcome into classes:
meaningful value between |
X |
features:
numeric matrix with |
alpha |
elastic net mixing parameter:
numeric between |
npi |
number of |
pi |
pi sequence:
vector of increasing values in the unit interval;
or |
nsigma |
number of |
sigma |
sigma sequence:
vector of increasing positive values;
or |
nfolds |
number of folds:
integer between |
foldid |
fold identifiers:
vector with entries between |
type.measure |
loss function for binary classification:
character |
... |
further arguments passed to |
Details
The argument family is unavailable, because this function fits a gaussian model for the numeric response, and a binomial model for the binary response.
Linear regression uses the loss function "deviance" (or "mse"), but the loss is incomparable between linear and logistic regression.
The loss function "auc" is unavailable for internal cross-validation. If at all, use "auc" for external cross-validation only.
Value
Returns an object of class cornet, a list with multiple slots:
- gaussian: fitted linear model, class glmnet
- binomial: fitted logistic model, class glmnet
- sigma: scaling parameters sigma, vector of length nsigma
- pi: weighting parameters pi, vector of length npi
- cvm: evaluation loss, matrix with nsigma rows and npi columns
- sigma.min: optimal scaling parameter, positive scalar
- pi.min: optimal weighting parameter, scalar in unit interval
- cutoff: threshold for dichotomisation
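The relation between the loss matrix and the optimal parameters can be sketched in base R. This assumes that the optimum is simply the cell of `cvm` with the smallest loss; the names `sigma_seq` and `pi_seq` and all numbers below are made up for illustration and do not come from a fitted object.

```r
# Hypothetical parameter grids and loss matrix (rows: sigma, columns: pi).
sigma_seq <- c(0.5, 1, 2)
pi_seq <- c(0, 0.5, 1)
cvm <- matrix(c(0.70, 0.66, 0.69,
                0.68, 0.61, 0.67,
                0.71, 0.65, 0.70),
              nrow = length(sigma_seq), ncol = length(pi_seq), byrow = TRUE)

# Convert the position of the smallest loss into a (row, column) pair.
idx <- arrayInd(which.min(cvm), .dim = dim(cvm))
sigma_min <- sigma_seq[idx[1, 1]]  # scaling parameter at the minimum
pi_min <- pi_seq[idx[1, 2]]        # weighting parameter at the minimum
c(sigma_min = sigma_min, pi_min = pi_min)
```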
References
Armin Rauschenberger and Enrico Glaab (2024). "Predicting dichotomised outcomes from high-dimensional data in biomedicine". Journal of Applied Statistics 51(9):1756-1771. doi:10.1080/02664763.2023.2233057. (Contact: armin.rauschenberger@uni.lu.)
See Also
Methods for objects of class cornet include coef and predict.
Examples
n <- 100; p <- 200
y <- rnorm(n)
X <- matrix(rnorm(n*p),nrow=n,ncol=p)
net <- cornet(y=y,cutoff=0,X=X)
net
Arguments
Description
Verifies whether an argument matches formal requirements.
Usage
.check(
x,
type,
dim = NULL,
miss = FALSE,
min = NULL,
max = NULL,
values = NULL,
inf = FALSE,
null = FALSE
)
Arguments
x |
argument |
type |
character |
dim |
vector/matrix dimensionality: integer scalar/vector |
miss |
accept missing values: logical |
min |
lower limit: numeric |
max |
upper limit: numeric |
values |
only accept specific values: vector |
inf |
accept infinite ( |
null |
accept |
Examples
cornet:::.check(0.5,type="scalar",min=0,max=1)
Equality
Description
Verifies whether two or more arguments are identical.
Usage
.equal(..., na.rm = FALSE)
Arguments
... |
scalars, vectors, or matrices of equal dimensions |
na.rm |
remove missing values: logical |
Examples
cornet:::.equal(1,1,1)
Data simulation
Description
Simulates data for unit tests.
Usage
.simulate(n, p, cor = 0, prob = 0.1, sd = 1, exp = 1, frac = 1)
Arguments
n |
sample size: positive integer |
p |
covariate space: positive integer |
cor |
correlation coefficient:
numeric between |
prob |
effect proportion:
numeric between |
sd |
standard deviation: positive numeric |
exp |
exponent: positive numeric |
frac |
class proportion:
numeric between |
Details
For simulating correlated features (cor > 0), this function requires the R package MASS (see mvrnorm).
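A minimal sketch of drawing correlated features with `MASS::mvrnorm` follows. The equicorrelated covariance matrix `Sigma` is an assumption made for illustration; the internal construction used by `.simulate` may differ.

```r
n <- 50; p <- 5; cor <- 0.5  # illustrative sample size, dimension, and correlation

# Assumed structure: unit variances and one common pairwise correlation.
Sigma <- matrix(cor, nrow = p, ncol = p)
diag(Sigma) <- 1

# MASS is only needed for correlated features, so check that it is installed.
if (requireNamespace("MASS", quietly = TRUE)) {
  set.seed(1)
  X <- MASS::mvrnorm(n = n, mu = rep(0, p), Sigma = Sigma)
  dim(X)  # n rows (samples) and p columns (features)
}
```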
Value
Returns invisible list with elements y and X.
Examples
data <- cornet:::.simulate(n=10,p=20)
names(data)
Single-split test
Description
Compares models for a continuous response with a cut-off value.
Usage
.test(y, cutoff, X, alpha = 1, type.measure = "deviance")
Arguments
y |
continuous outcome:
vector of length |
cutoff |
cut-off point for dichotomising outcome into classes:
meaningful value between |
X |
features:
numeric matrix with |
alpha |
elastic net mixing parameter:
numeric between |
type.measure |
loss function for binary classification:
character |
Details
Splits samples into 80 percent for training and 20 percent for testing, calculates squared deviance residuals of logistic and combined regression, conducts the paired one-sided Wilcoxon signed rank test, and returns the p-value.
For the multi-split test, use the median p-value from 50 single-split tests (van de Wiel 2009).
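The testing step can be sketched in base R under stated assumptions: the predicted probabilities below are random placeholders rather than model output, the helper `sq_dev` is hypothetical, and the direction of the one-sided alternative (combined regression having smaller residuals) is assumed.

```r
# Squared deviance residual for binary outcome z and predicted probability p.
sq_dev <- function(z, p) -2 * (z * log(p) + (1 - z) * log(1 - p))

set.seed(1)
z <- rbinom(20, size = 1, prob = 0.5)          # hypothetical test-set classes
p_logistic <- runif(20, min = 0.3, max = 0.7)  # placeholder probabilities
p_combined <- runif(20, min = 0.3, max = 0.7)  # placeholder probabilities

# Paired one-sided Wilcoxon signed rank test on the squared residuals.
res <- wilcox.test(x = sq_dev(z, p_combined), y = sq_dev(z, p_logistic),
                   paired = TRUE, alternative = "less")
res$p.value

# Multi-split idea: summarise p-values from repeated splits by their median.
p_values <- c(0.20, 0.03, 0.08)  # placeholders, not real results
median(p_values)
```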
Examples
n <- 100; p <- 200
y <- rnorm(n)
X <- matrix(rnorm(n*p),nrow=n,ncol=p)
cornet:::.test(y=y,cutoff=0,X=X)
Extract estimated coefficients
Description
Extracts estimated coefficients from linear and logistic regression, under the penalty parameter that minimises the cross-validated loss.
Usage
## S3 method for class 'cornet'
coef(object, ...)
Arguments
object |
cornet object |
... |
further arguments (not applicable) |
Value
This function returns a matrix with one row per estimated coefficient and two columns. It includes the estimated coefficients from linear regression (1st column: "beta") and logistic regression (2nd column: "gamma").
Examples
n <- 100; p <- 200
y <- rnorm(n)
X <- matrix(rnorm(n*p),nrow=n,ncol=p)
net <- cornet(y=y,cutoff=0,X=X)
coef(net)
Performance measurement
Description
Compares models for a continuous response with a cut-off value.
Usage
cv.cornet(
y,
cutoff,
X,
alpha = 1,
nfolds.ext = 5,
nfolds.int = 10,
foldid.ext = NULL,
foldid.int = NULL,
type.measure = "deviance",
rf = FALSE,
xgboost = FALSE,
...
)
Arguments
y |
continuous outcome:
vector of length |
cutoff |
cut-off point for dichotomising outcome into classes:
meaningful value between |
X |
features:
numeric matrix with |
alpha |
elastic net mixing parameter:
numeric between |
nfolds.ext |
number of external folds |
nfolds.int |
number of internal folds |
foldid.ext |
external fold identifiers:
vector of length |
foldid.int |
internal fold identifiers:
vector of length |
type.measure |
loss function for binary classification:
character |
rf |
comparison with random forest: logical |
xgboost |
comparison with extreme gradient boosting: logical |
... |
Details
Computes the cross-validated loss of logistic and combined regression.
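If fold identifiers are supplied manually, one common way to build them is sketched below. It assumes that folds are labelled with integers from 1 to the number of folds, following the convention of glmnet; the sizes are illustrative.

```r
set.seed(1)
n <- 100          # illustrative sample size
nfolds.ext <- 5   # number of external folds

# Repeat the labels 1..nfolds.ext to length n, then shuffle them.
foldid.ext <- sample(rep(seq_len(nfolds.ext), length.out = n))
table(foldid.ext)  # fold sizes are balanced
```

Fixing the fold identifiers in this way makes repeated runs comparable, because every method is evaluated on the same splits.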
Examples
## Not run:
n <- 100; p <- 200
y <- rnorm(n)
X <- matrix(rnorm(n*p),nrow=n,ncol=p)
start <- Sys.time()
loss <- cv.cornet(y=y,cutoff=0,X=X)
end <- Sys.time()
end - start
loss
## End(Not run)
Plot loss matrix
Description
Plots the loss for different combinations of scaling (sigma) and weighting (pi) parameters.
Usage
## S3 method for class 'cornet'
plot(x, ...)
Arguments
x |
cornet object |
... |
further arguments (not applicable) |
Value
This function plots the evaluation loss (cvm).
Whereas the matrix has sigma in the rows, and pi in the columns, the plot has sigma on the x-axis, and pi on the y-axis.
For all combinations of sigma and pi, the colour indicates the loss.
If the R package RColorBrewer is installed, blue represents low. Otherwise, red represents low.
White always represents high.
Examples
n <- 100; p <- 200
y <- rnorm(n)
X <- matrix(rnorm(n*p),nrow=n,ncol=p)
net <- cornet(y=y,cutoff=0,X=X)
plot(net)
Predict binary outcome
Description
Predicts the binary outcome with linear, logistic, and combined regression.
Usage
## S3 method for class 'cornet'
predict(object, newx, type = "probability", ...)
Arguments
object |
cornet object |
newx |
covariates:
numeric matrix with |
type |
|
... |
further arguments (not applicable) |
Details
For linear regression, this function tentatively transforms the predicted values to predicted probabilities, using a Gaussian distribution with a fixed mean (threshold) and a fixed variance (estimated variance of the numeric outcome).
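The transformation described above can be sketched with the Gaussian distribution function in base R. The predicted values `y_hat` are hypothetical, and using the sample standard deviation of `y` as the fixed scale is an assumption about how the variance is estimated.

```r
set.seed(1)
y <- rnorm(100)              # numeric outcome used for fitting
cutoff <- 0                  # threshold for dichotomisation
y_hat <- c(-1.5, 0, 0.3, 2)  # hypothetical predictions from linear regression

# Gaussian distribution function with mean = cutoff and sd = sd(y),
# evaluated at the predicted values.
prob <- pnorm(q = y_hat, mean = cutoff, sd = sd(y))
round(prob, 3)
```

A predicted value equal to the threshold maps to a probability of one half, and larger predicted values map to larger probabilities.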
Examples
n <- 100; p <- 200
y <- rnorm(n)
X <- matrix(rnorm(n*p),nrow=n,ncol=p)
net <- cornet(y=y,cutoff=0,X=X)
predict(net,newx=X)
Combined regression
Description
Prints summary of cornet object.
Usage
## S3 method for class 'cornet'
print(x, ...)
Arguments
x |
cornet object |
... |
further arguments (not applicable) |
Value
Returns sample size n, number of covariates p, information on dichotomisation, tuned scaling parameter (sigma), tuned weighting parameter (pi), and corresponding loss.
Examples
n <- 100; p <- 200
y <- rnorm(n)
X <- matrix(rnorm(n*p),nrow=n,ncol=p)
net <- cornet(y=y,cutoff=0,X=X)
print(net)