Type: | Package |
Title: | Multivariate Inverse Gaussian Distribution |
Version: | 2.0 |
Description: | Provides utilities for estimation for the multivariate inverse Gaussian distribution of Minami (2003) <doi:10.1081/STA-120025379>, including random vector generation and explicit estimators of the location vector and scale matrix. The package implements kernel density estimators discussed in Belzile, Desgagnes, Genest and Ouimet (2024) <doi:10.48550/arXiv.2209.04757> for smoothing multivariate data on half-spaces. |
BugReports: | https://github.com/lbelzile/mig/issues |
Imports: | statmod, TruncatedNormal (≥ 2.3), Rcpp (≥ 1.0.12) |
Depends: | R (≥ 2.10) |
Suggests: | numDeriv, tinytest, knitr, rmarkdown, minqa |
LinkingTo: | Rcpp, RcppArmadillo |
Encoding: | UTF-8 |
LazyData: | true |
License: | MIT + file LICENSE |
VignetteBuilder: | knitr |
RoxygenNote: | 7.3.2 |
NeedsCompilation: | yes |
Packaged: | 2025-04-08 22:13:41 UTC; lbelzile |
Author: | Leo Belzile |
Maintainer: | Leo Belzile <belzilel@gmail.com> |
Repository: | CRAN |
Date/Publication: | 2025-04-08 22:50:02 UTC |
Log of sum of terms
Description
Computes the log of a sum of positive components, given on the log scale (lx
), avoiding numerical overflow.
Usage
.lsum(lx, loff)
Arguments
lx |
vector or matrix of log of terms to add |
loff |
log of offset |
Value
a vector of log sums
Author(s)
Marius Hofert, Martin Maechler (package copula
)
Maximum likelihood estimation of multivariate inverse Gaussian vectors
Description
The maximum likelihood estimators are obtained for fixed shift vector \boldsymbol{a}
and half-space vector \boldsymbol{\beta}
.
Usage
.mig_mle(x, beta, shift)
Arguments
x |
|
beta |
|
shift |
|
Value
a list with components:
-
xi
: MLE of the expectation or location vector -
Omega
: MLE of the scale matrix
Method of moments estimators for multivariate inverse Gaussian vectors
Description
These estimators are based on the sample mean and covariance.
Usage
.mig_mom(x, beta, shift)
Arguments
x |
|
beta |
|
shift |
|
Value
a list with components:
-
xi
: MOM estimate of the expectation or location vector -
Omega
: MOM estimate of the scale matrix
Threshold selection for bandwidth
Description
Automated thresholds selection for the robust likelihood cross validation. The cutoff is based on the covariance matrix of the sample data.
Usage
an(x)
Arguments
x |
matrix of observations |
Value
cutoff for robust selection
Multivariate inverse Gaussian distribution
Description
The density of the MIG model is
f(\boldsymbol{x}+\boldsymbol{a}) =(2\pi)^{-d/2}\boldsymbol{\beta}^{\top}\boldsymbol{\xi}|\boldsymbol{\Omega}|^{-1/2}(\boldsymbol{\beta}^{\top}\boldsymbol{x})^{-(1+d/2)}\exp\left\{-\frac{(\boldsymbol{x}-\boldsymbol{\xi})^{\top}\boldsymbol{\Omega}^{-1}(\boldsymbol{x}-\boldsymbol{\xi})}{2\boldsymbol{\beta}^{\top}\boldsymbol{x}}\right\}
for points in the d
-dimensional half-space \{\boldsymbol{x} \in \mathbb{R}^d: \boldsymbol{\beta}^{\top}(\boldsymbol{x}-\boldsymbol{a}) \geq 0\}
Usage
dmig(x, xi, Omega, beta, shift, log = FALSE)
rmig(n, xi, Omega, beta, shift, method = c("invsim", "bm"), timeinc = 0.001)
pmig(q, xi, Omega, beta, log = FALSE, method = c("sov", "mc"), B = 10000L)
Arguments
x |
|
xi |
|
Omega |
|
beta |
|
shift |
|
log |
logical; if |
n |
number of observations |
method |
string; one of inverse system ( |
timeinc |
time increment for multivariate simulation algorithm based on the hitting time of Brownian motion, default to |
q |
|
B |
number of Monte Carlo replications for the SOV estimator |
Details
Observations are generated using the representation as the first hitting time of a hyperplane of a correlated Brownian motion.
Value
for dmig
, the (log)-density
for rmig
, an n
vector if d=1
(univariate) or an n
by d
matrix if d > 1
an n
vector of (log) probabilities
Author(s)
Frederic Ouimet (bm
), Leo Belzile (invsim
)
Leo Belzile
Examples
# Density evaluation
x <- rbind(c(1, 2), c(2,3), c(0,-1))
beta <- c(1, 0)
xi <- c(1, 1)
Omega <- matrix(c(2, -1, -1, 2), nrow = 2, ncol = 2)
dmig(x, xi = xi, Omega = Omega, beta = beta)
# Random number generation
d <- 5L
beta <- rexp(d)
xi <- rexp(d)
Omega <- matrix(0.5, d, d) + diag(d)
samp <- rmig(n = 1000, beta = beta, xi = xi, Omega = Omega)
stopifnot(isTRUE(all(samp %*% beta > 0)))
mle <- fit_mig(samp, beta = beta, method = "mle")
set.seed(1234)
d <- 2L
beta <- runif(d)
Omega <- rWishart(n = 1, df = 2*d, Sigma = matrix(0.5, d, d) + diag(d))[,,1]
xi <- rexp(d)
q <- mig::rmig(n = 10, beta = beta, Omega = Omega, xi = xi)
pmig(q, xi = xi, beta = beta, Omega = Omega)
Laplacian of the MIG density with respect to the data
Description
Computes the sum of second derivatives of the multivariate
inverse Gaussian density with respect to the data argument
x
. The function is vectorized for more efficiency.
Usage
dmig_laplacian(x, xi, Omega, beta, scale = TRUE)
Arguments
x |
|
xi |
|
Omega |
|
beta |
|
Value
an n
vector
Density of elliptical vectors subject to a linear constraint
Description
Compute the density of multivariate Student-t or Gaussian \boldsymbol{x}
with location vector mu
, scale matrix sigma
and df
(integer) degrees of freedom
subject to the linear constraint \boldsymbol{\beta}^\top\boldsymbol{x} > \delta
.
Negative degrees of freedom or values larger than 1000 imply Gaussian vectors are generated instead.
Usage
dtellipt(x, beta, mu, sigma, df, delta = 0, log = FALSE)
Arguments
beta |
|
mu |
location vector |
sigma |
scale matrix |
df |
degrees of freedom argument |
delta |
buffer; default to zero |
log |
logical; if |
Value
an n
by d
matrix of random vectors
Fit multivariate inverse Gaussian distribution
Description
Fit multivariate inverse Gaussian distribution
Usage
fit_mig(x, beta, method = c("mle", "mom"), shift)
Arguments
x |
|
beta |
|
method |
string, one of |
shift |
|
Value
a list with components:
-
xi
: estimate of the expectation or location vector -
Omega
: estimate of the scale matrix
Gaussian kernel density estimator
Description
Given a data matrix over a half-space defined by beta
,
compute the log density of the asymmetric truncated Gaussian kernel density estimator,
taking in turn an observation as location vector.
Usage
gauss_kdens_arma(x, newdata, Sigma, logweights, logd)
Arguments
x |
matrix of observations |
newdata |
matrix of new observations at which to evaluated the kernel density |
Sigma |
covariance matrix |
logd |
logical; if |
Value
log density estimator
Likelihood cross-validation for Gaussian kernel density estimation
Description
Likelihood cross-validation criterion function.
Usage
gauss_lcv(x, Sigma, logweights)
Arguments
x |
|
Sigma |
smoothing positive-definite matrix |
logweights |
log vector of weights |
Value
LCV criterion value
Leave-one-out cross-validation for Gaussian kernel density estimation
Description
Given a data matrix, compute the log density using leave-one-out cross validation, taking in turn an observation as location vector and computing the density of the resulting mixture.
Usage
gauss_loo(x, Sigma, logweights)
Arguments
x |
|
Sigma |
smoothing positive-definite matrix |
logweights |
vector of log weights |
Value
a vector of values for the weighted leave-one-out likelihood
Least squares cross-validation for Gaussian kernel density estimation
Description
Least squares cross-validation for weighted Gaussian samples.
Usage
gauss_lscv(x, Sigma, logweights, xsamp, dxsamp, mckern = TRUE)
Arguments
x |
|
Sigma |
smoothing positive-definite matrix |
logweights |
log vector of weights |
xsamp |
|
dxsamp |
|
mckern |
logical; if |
Value
least square criterion value
Robust likelihood cross-validation for Gaussian kernel density estimation
Description
Robust likelihood cross-validation criterion function of Wu.
Usage
gauss_rlcv(x, Sigma, logweights, an, xsamp, dxsamp, mckern = TRUE)
Arguments
x |
|
Sigma |
smoothing positive-definite matrix |
logweights |
log vector of weights |
an |
threshold for linear approximation |
xsamp |
|
dxsamp |
|
mckern |
logical; if |
Value
RLCV criterion value
Magnetic storms
Description
Absolute magnitude of 373 geomagnetic storms lasting more than 48h with absolute magnitude (dst) larger than 100 in 1957-2014.
Format
a vector of size 373
Note
For a detailed article presenting the derivation of the Dst index, see http://wdc.kugi.kyoto-u.ac.jp/dstdir/dst2/onDstindex.html
Source
Aki Vehtari
References
World Data Center for Geomagnetism, Kyoto, M. Nose, T. Iyemori, M. Sugiura, T. Kamei (2015), Geomagnetic Dst index, doi:10.17593/14515-74000.
Gaussian kernel density estimator on half-space
Description
Given a data matrix over a half-space defined by beta
, compute an homeomorphism to
\mathbb{R}^d
and perform kernel smoothing based on a Gaussian kernel density estimator,
taking each turn an observation as location vector.
Usage
hsgauss_kdens(x, newdata, Sigma, beta, log = TRUE, ...)
Arguments
x |
|
newdata |
matrix of new observations at which to evaluated the kernel density |
Sigma |
scale matrix |
beta |
|
log |
logical; if |
... |
additional arguments, currently ignored |
Value
a vector containing the value of the kernel density at each of the newdata
points
Optimal scale matrix for kernel density estimation
Description
Given an n
sample from a multivariate distribution on the half-space defined by
\{\boldsymbol{x} \in \mathbb{R}^d: \boldsymbol{\beta}^\top\boldsymbol{x}>0\}
,
the function computes the bandwidth (type="isotropic"
) or scale
matrix that minimizes the asymptotic mean integrated squared error away from the boundary.
The latter depend on the true unknown density, which is replaced by the kernel density or
a MIG distribution evaluated at the maximum likelihood estimator. The integral or the integrated
squared error are obtained by Monte Carlo integration with N
simulations
Usage
kdens_bandwidth(
x,
beta,
shift,
family = c("mig", "hsgauss", "tnorm"),
method = c("amise", "lcv", "lscv", "rlcv"),
type = c("isotropic", "diag", "full"),
approx = c("kernel", "mig", "tnorm"),
transformation = c("none", "scaling", "spherical"),
N = 10000L,
buffer = 0,
maxiter = 2000L,
...
)
Arguments
x |
an |
beta |
|
shift |
location vector for translating the half-space. If missing, defaults to zero |
family |
distribution for smoothing, either |
method |
estimation criterion, either |
type |
string indicating whether to compute an isotropic model or estimate the optimal scale matrix via optimization |
approx |
string; distribution to approximate the true density function |
transformation |
string for optional scaling of the data before computing the bandwidth. Either standardization to unit variance |
N |
integer number of simulations for Monte Carlo integration |
buffer |
double indicating the buffer from the half-space |
maxiter |
integer; max number of iterations in the call to |
... |
additional parameters, currently ignored |
Value
a d
by d
scale matrix
References
Wu, X. (2019). Robust likelihood cross-validation for kernel density estimation. Journal of Business & Economic Statistics, 37(4), 761–770. doi:10.1080/07350015.2018.1424633 Bowman, A.W. (1984). An alternative method of cross-validation for the smoothing of density estimates, Biometrika, 71(2), 353–360. doi:10.1093/biomet/71.2.353 Rudemo, M. (1982). Empirical choice of histograms and kernel density estimators. Scandinavian Journal of Statistics, 9(2), 65–78. http://www.jstor.org/stable/4615859
Multivariate inverse Gaussian kernel density estimator
Description
Given a matrix of new observations, compute the density of the multivariate
inverse Gaussian mixture defined by assigning equal weight to each component where
\boldsymbol{\xi}
is the location parameter.
Usage
mig_kdens(x, newdata, Omega, beta, log = FALSE)
Arguments
x |
|
newdata |
matrix of new observations at which to evaluated the kernel density |
Omega |
|
beta |
|
log |
logical; if |
Value
value of the (log)-density at newdata
MIG kernel density estimator
Description
Given a data matrix over a half-space defined by beta
,
compute the log density taking in turn an observation in newdata
as location vector and computing the kernel density estimate.
Usage
mig_kdens_arma(x, newdata, Omega, beta, logd)
Arguments
x |
|
newdata |
matrix of new observations at which to evaluated the kernel density |
Omega |
|
beta |
|
Value
the value of the likelihood cross-validation criterion
Likelihood cross-validation for MIG density estimation
Description
Likelihood cross-validation criterion function.
Usage
mig_lcv(x, Omega, beta)
Arguments
x |
|
Omega |
|
beta |
|
Value
LCV criterion value
Gradient of the MIG log likelihood with respect to data
Description
This function returns the gradient vector of the log likelihood with respect to the
argument x
.
Usage
mig_loglik_grad(x, xi, Omega, beta)
Arguments
x |
|
xi |
|
Omega |
|
beta |
|
Value
an n
by d
matrix of first derivatives for the gradient, observation by observation, or a d
vector if x
is a vector.
Hessian of the MIG log likelihood with respect to data
Description
This function returns the hessian, i.e., the matrix of
second derivatives of the log likelihood with respect to the
argument x
.
Usage
mig_loglik_hessian(x, beta, xi, Omega)
Arguments
x |
|
beta |
|
xi |
|
Omega |
|
Value
a d
by d
matrix of second derivatives if x
has length d
,
else an n
by d
by d
array if x
is an n
by d
matrix
Laplacian of the MIG log likelihood with respect to the data
Description
Computes the sum of second derivatives of the multivariate
inverse Gaussian likelihood with respect to the data argument
x
. The function is vectorized for more efficiency.
Usage
mig_loglik_laplacian(x, beta, xi, Omega)
Arguments
x |
|
beta |
|
xi |
|
Omega |
|
Value
an n
vector
Leave-one-out cross-validation for kernel density estimation with MIG
Description
Given a data matrix over a half-space defined by beta
,
compute the log density using leave-one-out cross validation,
taking in turn an observation as location vector and computing the
density of the resulting mixture.
Usage
mig_loo(x, Omega, beta)
Arguments
x |
|
Omega |
|
beta |
|
Value
the value of the likelihood cross-validation criterion
Least squares cross-validation for MIG density estimation
Description
Given a data matrix over a half-space defined by beta
,
compute the average using leave-one-out cross validation density minus
half the squared density.
Usage
mig_lscv(x, beta, Omega, xsamp, dxsamp, mckern = TRUE)
Arguments
x |
|
beta |
|
Omega |
|
xsamp |
matrix of points at which to evaluate the integral |
dxsamp |
density of points |
Value
the value of the least square cross-validation criterion
Robust likelihood cross-validation for kernel density estimation for MIG
Description
Given a data matrix over a half-space defined by beta
,
compute the log density using leave-one-out cross validation,
taking in turn an observation as location vector and computing the
density of the resulting mixture.
Usage
mig_rlcv(x, beta, Omega, an, xsamp, dxsamp, mckern = TRUE)
Arguments
x |
|
beta |
|
Omega |
|
xsamp |
matrix of points at which to evaluate the integral |
dxsamp |
density of points |
Value
the value of the likelihood cross-validation criterion
Maximum likelihood estimation of truncated Gaussian on half space
Description
Given a data matrix and a vector of linear constraints for the half-space, we compute the maximum likelihood estimator of the location and scale matrix
Usage
mle_truncgauss(xdat, beta)
Arguments
xdat |
matrix of observations |
beta |
vector of constraints defining the half-space |
Value
a list with location vector loc
and scale matrix scale
Normal bandwidth rule
Description
Given an n
by d
matrix of observations, compute the
bandwidth according to Scott's rule.
Usage
normalrule_bandwidth(x)
Arguments
x |
|
Value
a d
by d
diagonal bandwidth matrix
Orthogonal projection matrix onto the half-space
Description
The orthogonal projection matrix P
has unit determinant and
transforms an n
by d
matrix by taking x * P
.
The components of the first column vector of the resulting matrix are strictly positive.
Usage
proj_hs(beta, inv = FALSE)
Arguments
beta |
vector defining the half-space |
inv |
logical; if |
Value
a d
by d
orthogonal projection matrix
Simulate elliptical vector subject to a linear constraint
Description
Simulate multivariate Student-t \boldsymbol{x}
with location vector mu
, scale matrix sigma
and df
(integer) degrees of freedom
subject to the linear constraint \boldsymbol{\beta}^\top\boldsymbol{x} > 0
.
Negative degrees of freedom or values larger than 1000 imply Gaussian vectors are generated instead.
Usage
rtellipt(n, beta, mu, sigma, df, delta = 0)
Arguments
n |
number of simulations |
beta |
|
mu |
location vector |
sigma |
scale matrix |
df |
degrees of freedom argument |
delta |
buffer; default to zero |
Value
an n
by d
matrix of random vectors
Truncated Gaussian kernel density estimator
Description
Given a data matrix over a half-space defined by beta
,
compute the log density of the asymmetric truncated Gaussian kernel density estimator,
taking in turn an observation as location vector.
Usage
tnorm_kdens(x, newdata, Sigma, beta, log = TRUE, ...)
Arguments
x |
|
newdata |
matrix of new observations at which to evaluated the kernel density |
Sigma |
scale matrix |
beta |
|
log |
logical; if |
... |
additional arguments, currently ignored |
Value
a vector containing the value of the kernel density at each of the newdata
points
Truncated Gaussian kernel density estimator
Description
Given a data matrix over a half-space defined by beta
,
compute the log density of the asymmetric truncated Gaussian kernel density estimator,
taking in turn an observation as location vector.
Usage
tnorm_kdens_arma(x, newdata, Omega, beta, logd)
Arguments
x |
|
newdata |
matrix of new observations at which to evaluated the kernel density |
Omega |
|
beta |
|
Value
the value of the likelihood cross-validation criterion
Likelihood cross-validation for truncated normal kernel density estimation
Description
Likelihood cross-validation criterion function.
Usage
tnorm_lcv(x, Omega, beta)
Arguments
x |
|
Omega |
smoothing positive-definite matrix |
beta |
vector of constraints for the half-space |
Value
LCV criterion value
Leave-one-out cross-validation for truncated Gaussian kernel density estimation
Description
Given a data matrix, compute the log density using leave-one-out cross validation, taking in turn an observation as location vector and computing the density of the resulting mixture.
Usage
tnorm_loo(x, Omega, beta)
Arguments
x |
|
Omega |
smoothing positive-definite matrix |
beta |
vector of constraints for the half-space |
Value
a vector of values for the weighted leave-one-out likelihood
Least squares cross-validation for truncated Gaussian kernel density estimation
Description
Least squares cross-validation for truncated Gaussian samples.
Usage
tnorm_lscv(x, Omega, beta, xsamp, dxsamp, mckern = TRUE)
Arguments
x |
|
Omega |
smoothing positive-definite matrix |
beta |
vector of constraints for the half-space |
xsamp |
|
dxsamp |
|
mckern |
logical; if |
Value
least square criterion value
Likelihood cross-validation for truncated normal kernel density estimation
Description
Robust likelihood cross-validation criterion function.
Usage
tnorm_rlcv(x, Omega, beta, an, xsamp, dxsamp, mckern = TRUE)
Arguments
x |
|
Omega |
smoothing positive-definite matrix |
beta |
vector of constraints for the half-space |
an |
threshold for linear approximation |
xsamp |
|
dxsamp |
|
mckern |
logical; if |
Value
RLCV criterion value