| Type: | Package |
| Title: | Multivariate Inverse Gaussian Distribution |
| Version: | 2.0 |
| Description: | Provides utilities for estimation for the multivariate inverse Gaussian distribution of Minami (2003) <doi:10.1081/STA-120025379>, including random vector generation and explicit estimators of the location vector and scale matrix. The package implements kernel density estimators discussed in Belzile, Desgagnes, Genest and Ouimet (2024) <doi:10.48550/arXiv.2209.04757> for smoothing multivariate data on half-spaces. |
| BugReports: | https://github.com/lbelzile/mig/issues |
| Imports: | statmod, TruncatedNormal (≥ 2.3), Rcpp (≥ 1.0.12) |
| Depends: | R (≥ 2.10) |
| Suggests: | numDeriv, tinytest, knitr, rmarkdown, minqa |
| LinkingTo: | Rcpp, RcppArmadillo |
| Encoding: | UTF-8 |
| LazyData: | true |
| License: | MIT + file LICENSE |
| VignetteBuilder: | knitr |
| RoxygenNote: | 7.3.2 |
| NeedsCompilation: | yes |
| Packaged: | 2025-04-08 22:13:41 UTC; lbelzile |
| Author: | Leo Belzile |
| Maintainer: | Leo Belzile <belzilel@gmail.com> |
| Repository: | CRAN |
| Date/Publication: | 2025-04-08 22:50:02 UTC |
Log of sum of terms
Description
Computes the log of a sum of positive components, given on the log scale (lx), avoiding numerical overflow.
Usage
.lsum(lx, loff)Arguments
lx | vector or matrix of log of terms to add |
loff | log of offset |
Value
a vector of log sums
Author(s)
Marius Hofert, Martin Maechler (packagecopula)
Maximum likelihood estimation of multivariate inverse Gaussian vectors
Description
The maximum likelihood estimators are obtained for fixed shift vector\boldsymbol{a}and half-space vector\boldsymbol{\beta}.
Usage
.mig_mle(x, beta, shift)Arguments
x |
|
beta |
|
shift |
|
Value
a list with components:
xi: MLE of the expectation or location vectorOmega: MLE of the scale matrix
Method of moments estimators for multivariate inverse Gaussian vectors
Description
These estimators are based on the sample mean and covariance.
Usage
.mig_mom(x, beta, shift)Arguments
x |
|
beta |
|
shift |
|
Value
a list with components:
xi: MOM estimate of the expectation or location vectorOmega: MOM estimate of the scale matrix
Threshold selection for bandwidth
Description
Automated thresholds selection for the robust likelihood cross validation.The cutoff is based on the covariance matrix of the sample data.
Usage
an(x)Arguments
x | matrix of observations |
Value
cutoff for robust selection
Multivariate inverse Gaussian distribution
Description
The density of the MIG model is
f(\boldsymbol{x}+\boldsymbol{a}) =(2\pi)^{-d/2}\boldsymbol{\beta}^{\top}\boldsymbol{\xi}|\boldsymbol{\Omega}|^{-1/2}(\boldsymbol{\beta}^{\top}\boldsymbol{x})^{-(1+d/2)}\exp\left\{-\frac{(\boldsymbol{x}-\boldsymbol{\xi})^{\top}\boldsymbol{\Omega}^{-1}(\boldsymbol{x}-\boldsymbol{\xi})}{2\boldsymbol{\beta}^{\top}\boldsymbol{x}}\right\}
for points in thed-dimensional half-space\{\boldsymbol{x} \in \mathbb{R}^d: \boldsymbol{\beta}^{\top}(\boldsymbol{x}-\boldsymbol{a}) \geq 0\}
Usage
dmig(x, xi, Omega, beta, shift, log = FALSE)rmig(n, xi, Omega, beta, shift, method = c("invsim", "bm"), timeinc = 0.001)pmig(q, xi, Omega, beta, log = FALSE, method = c("sov", "mc"), B = 10000L)Arguments
x |
|
xi |
|
Omega |
|
beta |
|
shift |
|
log | logical; if |
n | number of observations |
method | string; one of inverse system ( |
timeinc | time increment for multivariate simulation algorithm based on the hitting time of Brownian motion, default to |
q |
|
B | number of Monte Carlo replications for the SOV estimator |
Details
Observations are generated using the representation as the first hitting time of a hyperplane of a correlated Brownian motion.
Value
fordmig, the (log)-density
forrmig, ann vector ifd=1 (univariate) or ann byd matrix ifd > 1
ann vector of (log) probabilities
Author(s)
Frederic Ouimet (bm), Leo Belzile (invsim)
Leo Belzile
Examples
# Density evaluationx <- rbind(c(1, 2), c(2,3), c(0,-1))beta <- c(1, 0)xi <- c(1, 1)Omega <- matrix(c(2, -1, -1, 2), nrow = 2, ncol = 2)dmig(x, xi = xi, Omega = Omega, beta = beta)# Random number generationd <- 5Lbeta <- rexp(d)xi <- rexp(d)Omega <- matrix(0.5, d, d) + diag(d)samp <- rmig(n = 1000, beta = beta, xi = xi, Omega = Omega)stopifnot(isTRUE(all(samp %*% beta > 0)))mle <- fit_mig(samp, beta = beta, method = "mle")set.seed(1234)d <- 2Lbeta <- runif(d)Omega <- rWishart(n = 1, df = 2*d, Sigma = matrix(0.5, d, d) + diag(d))[,,1]xi <- rexp(d)q <- mig::rmig(n = 10, beta = beta, Omega = Omega, xi = xi)pmig(q, xi = xi, beta = beta, Omega = Omega)Laplacian of the MIG density with respect to the data
Description
Computes the sum of second derivatives of the multivariateinverse Gaussian density with respect to the data argumentx. The function is vectorized for more efficiency.
Usage
dmig_laplacian(x, xi, Omega, beta, scale = TRUE)Arguments
x |
|
xi |
|
Omega |
|
beta |
|
Value
ann vector
Density of elliptical vectors subject to a linear constraint
Description
Compute the density of multivariate Student-t or Gaussian\boldsymbol{x}with location vectormu, scale matrixsigma anddf (integer) degrees of freedomsubject to the linear constraint\boldsymbol{\beta}^\top\boldsymbol{x} > \delta.Negative degrees of freedom or values larger than 1000 imply Gaussian vectors are generated instead.
Usage
dtellipt(x, beta, mu, sigma, df, delta = 0, log = FALSE)Arguments
beta |
|
mu | location vector |
sigma | scale matrix |
df | degrees of freedom argument |
delta | buffer; default to zero |
log | logical; if |
Value
ann byd matrix of random vectors
Fit multivariate inverse Gaussian distribution
Description
Fit multivariate inverse Gaussian distribution
Usage
fit_mig(x, beta, method = c("mle", "mom"), shift)Arguments
x |
|
beta |
|
method | string, one of |
shift |
|
Value
a list with components:
xi: estimate of the expectation or location vectorOmega: estimate of the scale matrix
Gaussian kernel density estimator
Description
Given a data matrix over a half-space defined bybeta,compute the log density of the asymmetric truncated Gaussian kernel density estimator,taking in turn an observation as location vector.
Usage
gauss_kdens_arma(x, newdata, Sigma, logweights, logd)Arguments
x | matrix of observations |
newdata | matrix of new observations at which to evaluated the kernel density |
Sigma | covariance matrix |
logd | logical; if |
Value
log density estimator
Likelihood cross-validation for Gaussian kernel density estimation
Description
Likelihood cross-validation criterion function.
Usage
gauss_lcv(x, Sigma, logweights)Arguments
x |
|
Sigma | smoothing positive-definite matrix |
logweights | log vector of weights |
Value
LCV criterion value
Leave-one-out cross-validation for Gaussian kernel density estimation
Description
Given a data matrix, compute the log density using leave-one-outcross validation, taking in turn an observation as location vectorand computing the density of the resulting mixture.
Usage
gauss_loo(x, Sigma, logweights)Arguments
x |
|
Sigma | smoothing positive-definite matrix |
logweights | vector of log weights |
Value
a vector of values for the weighted leave-one-out likelihood
Least squares cross-validation for Gaussian kernel density estimation
Description
Least squares cross-validation for weighted Gaussian samples.
Usage
gauss_lscv(x, Sigma, logweights, xsamp, dxsamp, mckern = TRUE)Arguments
x |
|
Sigma | smoothing positive-definite matrix |
logweights | log vector of weights |
xsamp |
|
dxsamp |
|
mckern | logical; if |
Value
least square criterion value
Robust likelihood cross-validation for Gaussian kernel density estimation
Description
Robust likelihood cross-validation criterion function of Wu.
Usage
gauss_rlcv(x, Sigma, logweights, an, xsamp, dxsamp, mckern = TRUE)Arguments
x |
|
Sigma | smoothing positive-definite matrix |
logweights | log vector of weights |
an | threshold for linear approximation |
xsamp |
|
dxsamp |
|
mckern | logical; if |
Value
RLCV criterion value
Magnetic storms
Description
Absolute magnitude of 373 geomagnetic storms lasting more than 48h with absolute magnitude (dst) larger than 100 in 1957-2014.
Format
a vector of size 373
Note
For a detailed article presenting the derivation of the Dst index, seehttp://wdc.kugi.kyoto-u.ac.jp/dstdir/dst2/onDstindex.html
Source
Aki Vehtari
References
World Data Center for Geomagnetism, Kyoto, M. Nose, T. Iyemori, M. Sugiura, T. Kamei (2015),Geomagnetic Dst index, doi:10.17593/14515-74000.
Gaussian kernel density estimator on half-space
Description
Given a data matrix over a half-space defined bybeta, compute an homeomorphism to\mathbb{R}^d and perform kernel smoothing based on a Gaussian kernel density estimator,taking each turn an observation as location vector.
Usage
hsgauss_kdens(x, newdata, Sigma, beta, log = TRUE, ...)Arguments
x |
|
newdata | matrix of new observations at which to evaluated the kernel density |
Sigma | scale matrix |
beta |
|
log | logical; if |
... | additional arguments, currently ignored |
Value
a vector containing the value of the kernel density at each of thenewdata points
Optimal scale matrix for kernel density estimation
Description
Given ann sample from a multivariate distribution on the half-space defined by\{\boldsymbol{x} \in \mathbb{R}^d: \boldsymbol{\beta}^\top\boldsymbol{x}>0\},the function computes the bandwidth (type="isotropic") or scalematrix that minimizes the asymptotic mean integrated squared error away from the boundary.The latter depend on the true unknown density, which is replaced by the kernel density ora MIG distribution evaluated at the maximum likelihood estimator. The integral or the integratedsquared error are obtained by Monte Carlo integration withN simulations
Usage
kdens_bandwidth( x, beta, shift, family = c("mig", "hsgauss", "tnorm"), method = c("amise", "lcv", "lscv", "rlcv"), type = c("isotropic", "diag", "full"), approx = c("kernel", "mig", "tnorm"), transformation = c("none", "scaling", "spherical"), N = 10000L, buffer = 0, maxiter = 2000L, ...)Arguments
x | an |
beta |
|
shift | location vector for translating the half-space. If missing, defaults to zero |
family | distribution for smoothing, either |
method | estimation criterion, either |
type | string indicating whether to compute an isotropic model or estimate the optimal scale matrix via optimization |
approx | string; distribution to approximate the true density function |
transformation | string for optional scaling of the data before computing the bandwidth. Either standardization to unit variance |
N | integer number of simulations for Monte Carlo integration |
buffer | double indicating the buffer from the half-space |
maxiter | integer; max number of iterations in the call to |
... | additional parameters, currently ignored |
Value
ad byd scale matrix
References
Wu, X. (2019). Robust likelihood cross-validation for kernel density estimation.Journal of Business & Economic Statistics, 37(4), 761–770.doi:10.1080/07350015.2018.1424633Bowman, A.W. (1984). An alternative method of cross-validation for the smoothing of density estimates,Biometrika, 71(2), 353–360.doi:10.1093/biomet/71.2.353Rudemo, M. (1982). Empirical choice of histograms and kernel density estimators.Scandinavian Journal of Statistics, 9(2), 65–78. http://www.jstor.org/stable/4615859
Multivariate inverse Gaussian kernel density estimator
Description
Given a matrix of new observations, compute the density of the multivariateinverse Gaussian mixture defined by assigning equal weight to each component where\boldsymbol{\xi} is the location parameter.
Usage
mig_kdens(x, newdata, Omega, beta, log = FALSE)Arguments
x |
|
newdata | matrix of new observations at which to evaluated the kernel density |
Omega |
|
beta |
|
log | logical; if |
Value
value of the (log)-density atnewdata
MIG kernel density estimator
Description
Given a data matrix over a half-space defined bybeta,compute the log density taking in turn an observation innewdataas location vector and computing the kernel density estimate.
Usage
mig_kdens_arma(x, newdata, Omega, beta, logd)Arguments
x |
|
newdata | matrix of new observations at which to evaluated the kernel density |
Omega |
|
beta |
|
Value
the value of the likelihood cross-validation criterion
Likelihood cross-validation for MIG density estimation
Description
Likelihood cross-validation criterion function.
Usage
mig_lcv(x, Omega, beta)Arguments
x |
|
Omega |
|
beta |
|
Value
LCV criterion value
Gradient of the MIG log likelihood with respect to data
Description
This function returns the gradient vector of the log likelihood with respect to theargumentx.
Usage
mig_loglik_grad(x, xi, Omega, beta)Arguments
x |
|
xi |
|
Omega |
|
beta |
|
Value
ann byd matrix of first derivatives for the gradient, observation by observation, or ad vector ifx is a vector.
Hessian of the MIG log likelihood with respect to data
Description
This function returns the hessian, i.e., the matrix ofsecond derivatives of the log likelihood with respect to theargumentx.
Usage
mig_loglik_hessian(x, beta, xi, Omega)Arguments
x |
|
beta |
|
xi |
|
Omega |
|
Value
ad byd matrix of second derivatives ifx has lengthd,else ann byd byd array ifx is ann byd matrix
Laplacian of the MIG log likelihood with respect to the data
Description
Computes the sum of second derivatives of the multivariateinverse Gaussian likelihood with respect to the data argumentx. The function is vectorized for more efficiency.
Usage
mig_loglik_laplacian(x, beta, xi, Omega)Arguments
x |
|
beta |
|
xi |
|
Omega |
|
Value
ann vector
Leave-one-out cross-validation for kernel density estimation with MIG
Description
Given a data matrix over a half-space defined bybeta,compute the log density using leave-one-out cross validation,taking in turn an observation as location vector and computing thedensity of the resulting mixture.
Usage
mig_loo(x, Omega, beta)Arguments
x |
|
Omega |
|
beta |
|
Value
the value of the likelihood cross-validation criterion
Least squares cross-validation for MIG density estimation
Description
Given a data matrix over a half-space defined bybeta,compute the average using leave-one-out cross validation density minushalf the squared density.
Usage
mig_lscv(x, beta, Omega, xsamp, dxsamp, mckern = TRUE)Arguments
x |
|
beta |
|
Omega |
|
xsamp | matrix of points at which to evaluate the integral |
dxsamp | density of points |
Value
the value of the least square cross-validation criterion
Robust likelihood cross-validation for kernel density estimation for MIG
Description
Given a data matrix over a half-space defined bybeta,compute the log density using leave-one-out cross validation,taking in turn an observation as location vector and computing thedensity of the resulting mixture.
Usage
mig_rlcv(x, beta, Omega, an, xsamp, dxsamp, mckern = TRUE)Arguments
x |
|
beta |
|
Omega |
|
xsamp | matrix of points at which to evaluate the integral |
dxsamp | density of points |
Value
the value of the likelihood cross-validation criterion
Maximum likelihood estimation of truncated Gaussian on half space
Description
Given a data matrix and a vector of linear constraints for the half-space,we compute the maximum likelihood estimator of the location and scale matrix
Usage
mle_truncgauss(xdat, beta)Arguments
xdat | matrix of observations |
beta | vector of constraints defining the half-space |
Value
a list with location vectorloc and scale matrixscale
Normal bandwidth rule
Description
Given ann byd matrix of observations, compute thebandwidth according to Scott's rule.
Usage
normalrule_bandwidth(x)Arguments
x |
|
Value
ad byd diagonal bandwidth matrix
Orthogonal projection matrix onto the half-space
Description
The orthogonal projection matrixP has unit determinant andtransforms ann byd matrix by takingx * P.The components of the first column vector of the resulting matrix are strictly positive.
Usage
proj_hs(beta, inv = FALSE)Arguments
beta | vector defining the half-space |
inv | logical; if |
Value
ad byd orthogonal projection matrix
Simulate elliptical vector subject to a linear constraint
Description
Simulate multivariate Student-t\boldsymbol{x}with location vectormu, scale matrixsigma anddf (integer) degrees of freedomsubject to the linear constraint\boldsymbol{\beta}^\top\boldsymbol{x} > 0.Negative degrees of freedom or values larger than 1000 imply Gaussian vectors are generated instead.
Usage
rtellipt(n, beta, mu, sigma, df, delta = 0)Arguments
n | number of simulations |
beta |
|
mu | location vector |
sigma | scale matrix |
df | degrees of freedom argument |
delta | buffer; default to zero |
Value
ann byd matrix of random vectors
Truncated Gaussian kernel density estimator
Description
Given a data matrix over a half-space defined bybeta,compute the log density of the asymmetric truncated Gaussian kernel density estimator,taking in turn an observation as location vector.
Usage
tnorm_kdens(x, newdata, Sigma, beta, log = TRUE, ...)Arguments
x |
|
newdata | matrix of new observations at which to evaluated the kernel density |
Sigma | scale matrix |
beta |
|
log | logical; if |
... | additional arguments, currently ignored |
Value
a vector containing the value of the kernel density at each of thenewdata points
Truncated Gaussian kernel density estimator
Description
Given a data matrix over a half-space defined bybeta,compute the log density of the asymmetric truncated Gaussian kernel density estimator,taking in turn an observation as location vector.
Usage
tnorm_kdens_arma(x, newdata, Omega, beta, logd)Arguments
x |
|
newdata | matrix of new observations at which to evaluated the kernel density |
Omega |
|
beta |
|
Value
the value of the likelihood cross-validation criterion
Likelihood cross-validation for truncated normal kernel density estimation
Description
Likelihood cross-validation criterion function.
Usage
tnorm_lcv(x, Omega, beta)Arguments
x |
|
Omega | smoothing positive-definite matrix |
beta | vector of constraints for the half-space |
Value
LCV criterion value
Leave-one-out cross-validation for truncated Gaussian kernel density estimation
Description
Given a data matrix, compute the log density using leave-one-outcross validation, taking in turn an observation as location vectorand computing the density of the resulting mixture.
Usage
tnorm_loo(x, Omega, beta)Arguments
x |
|
Omega | smoothing positive-definite matrix |
beta | vector of constraints for the half-space |
Value
a vector of values for the weighted leave-one-out likelihood
Least squares cross-validation for truncated Gaussian kernel density estimation
Description
Least squares cross-validation for truncated Gaussian samples.
Usage
tnorm_lscv(x, Omega, beta, xsamp, dxsamp, mckern = TRUE)Arguments
x |
|
Omega | smoothing positive-definite matrix |
beta | vector of constraints for the half-space |
xsamp |
|
dxsamp |
|
mckern | logical; if |
Value
least square criterion value
Likelihood cross-validation for truncated normal kernel density estimation
Description
Robust likelihood cross-validation criterion function.
Usage
tnorm_rlcv(x, Omega, beta, an, xsamp, dxsamp, mckern = TRUE)Arguments
x |
|
Omega | smoothing positive-definite matrix |
beta | vector of constraints for the half-space |
an | threshold for linear approximation |
xsamp |
|
dxsamp |
|
mckern | logical; if |
Value
RLCV criterion value