Movatterモバイル変換

Type:

Package

Title:

Multivariate Inverse Gaussian Distribution

Version:

2.0

Description:

Provides utilities for estimation for the multivariate inverse Gaussian distribution of Minami (2003) <doi:10.1081/STA-120025379>, including random vector generation and explicit estimators of the location vector and scale matrix. The package implements kernel density estimators discussed in Belzile, Desgagnes, Genest and Ouimet (2024) <doi:10.48550/arXiv.2209.04757> for smoothing multivariate data on half-spaces.

BugReports:

https://github.com/lbelzile/mig/issues

Imports:

statmod, TruncatedNormal (≥ 2.3), Rcpp (≥ 1.0.12)

Depends:

R (≥ 2.10)

Suggests:

numDeriv, tinytest, knitr, rmarkdown, minqa

LinkingTo:

Rcpp, RcppArmadillo

Encoding:

UTF-8

LazyData:

true

License:

MIT + file LICENSE

VignetteBuilder:

knitr

RoxygenNote:

7.3.2

NeedsCompilation:

yes

Packaged:

2025-04-08 22:13:41 UTC; lbelzile

Author:

Leo Belzile

[aut, cre], Frederic Ouimet

[aut]

Maintainer:

Leo Belzile <belzilel@gmail.com>

Repository:

CRAN

Date/Publication:

2025-04-08 22:50:02 UTC

Log of sum of terms

Description

Computes the log of a sum of positive components, given on the log scale (lx), avoiding numerical overflow.

Usage

.lsum(lx, loff)

Arguments

lx

vector or matrix of log of terms to add

loff

log of offset

Value

a vector of log sums

Author(s)

Marius Hofert, Martin Maechler (packagecopula)

Maximum likelihood estimation of multivariate inverse Gaussian vectors

Description

The maximum likelihood estimators are obtained for fixed shift vector\boldsymbol{a}and half-space vector\boldsymbol{\beta}.

Usage

.mig_mle(x, beta, shift)

Arguments

x

n byd matrix of quantiles

beta

d vector\boldsymbol{\beta} defining the half-space through\boldsymbol{\beta}^{\top}\boldsymbol{\xi}>0

shift

d translation for the half-space\boldsymbol{a}

Value

a list with components:

xi: MLE of the expectation or location vector
Omega: MLE of the scale matrix

Method of moments estimators for multivariate inverse Gaussian vectors

Description

These estimators are based on the sample mean and covariance.

Usage

.mig_mom(x, beta, shift)

Arguments

x

n byd matrix of quantiles

beta

d vector\boldsymbol{\beta} defining the half-space through\boldsymbol{\beta}^{\top}\boldsymbol{\xi}>0

shift

d translation for the half-space\boldsymbol{a}

Value

a list with components:

xi: MOM estimate of the expectation or location vector
Omega: MOM estimate of the scale matrix

Threshold selection for bandwidth

Description

Automated thresholds selection for the robust likelihood cross validation.The cutoff is based on the covariance matrix of the sample data.

Usage

an(x)

Arguments

x

matrix of observations

Value

cutoff for robust selection

Multivariate inverse Gaussian distribution

Description

The density of the MIG model is

f(\boldsymbol{x}+\boldsymbol{a}) =(2\pi)^{-d/2}\boldsymbol{\beta}^{\top}\boldsymbol{\xi}|\boldsymbol{\Omega}|^{-1/2}(\boldsymbol{\beta}^{\top}\boldsymbol{x})^{-(1+d/2)}\exp\left\{-\frac{(\boldsymbol{x}-\boldsymbol{\xi})^{\top}\boldsymbol{\Omega}^{-1}(\boldsymbol{x}-\boldsymbol{\xi})}{2\boldsymbol{\beta}^{\top}\boldsymbol{x}}\right\}

for points in thed-dimensional half-space\{\boldsymbol{x} \in \mathbb{R}^d: \boldsymbol{\beta}^{\top}(\boldsymbol{x}-\boldsymbol{a}) \geq 0\}

Usage

dmig(x, xi, Omega, beta, shift, log = FALSE)rmig(n, xi, Omega, beta, shift, method = c("invsim", "bm"), timeinc = 0.001)pmig(q, xi, Omega, beta, log = FALSE, method = c("sov", "mc"), B = 10000L)

Arguments

x

n byd matrix of quantiles

xi

d vector of location parameters\boldsymbol{\xi}, giving the expected value

Omega

d byd positive definite scale matrix\boldsymbol{\Omega}

beta

d vector\boldsymbol{\beta} defining the half-space through\boldsymbol{\beta}^{\top}\boldsymbol{\xi}>0

shift

d translation for the half-space\boldsymbol{a}

log

logical; ifTRUE, returns log probabilities

n

number of observations

method

string; one of inverse system (invsim, default), Brownian motion (bm)

timeinc

time increment for multivariate simulation algorithm based on the hitting time of Brownian motion, default to1e-3.

q

n byd matrix of quantiles

B

number of Monte Carlo replications for the SOV estimator

Details

Observations are generated using the representation as the first hitting time of a hyperplane of a correlated Brownian motion.

Value

fordmig, the (log)-density

forrmig, ann vector ifd=1 (univariate) or ann byd matrix ifd > 1

ann vector of (log) probabilities

Author(s)

Frederic Ouimet (bm), Leo Belzile (invsim)

Leo Belzile

Examples

# Density evaluationx <- rbind(c(1, 2), c(2,3), c(0,-1))beta <- c(1, 0)xi <- c(1, 1)Omega <- matrix(c(2, -1, -1, 2), nrow = 2, ncol = 2)dmig(x, xi = xi, Omega = Omega, beta = beta)# Random number generationd <- 5Lbeta <- rexp(d)xi <- rexp(d)Omega <- matrix(0.5, d, d) + diag(d)samp <- rmig(n = 1000, beta = beta, xi = xi, Omega = Omega)stopifnot(isTRUE(all(samp %*% beta > 0)))mle <- fit_mig(samp, beta = beta, method = "mle")set.seed(1234)d <- 2Lbeta <- runif(d)Omega <- rWishart(n = 1, df = 2*d, Sigma = matrix(0.5, d, d) + diag(d))[,,1]xi <- rexp(d)q <- mig::rmig(n = 10, beta = beta, Omega = Omega, xi = xi)pmig(q, xi = xi, beta = beta, Omega = Omega)

Laplacian of the MIG density with respect to the data

Description

Computes the sum of second derivatives of the multivariateinverse Gaussian density with respect to the data argumentx. The function is vectorized for more efficiency.

Usage

dmig_laplacian(x, xi, Omega, beta, scale = TRUE)

Arguments

x

n byd matrix of quantiles

xi

d vector of location parameters\boldsymbol{\xi}, giving the expected value

Omega

d byd positive definite scale matrix\boldsymbol{\Omega}

beta

d vector\boldsymbol{\beta} defining the half-space through\boldsymbol{\beta}^{\top}\boldsymbol{\xi}>0

Value

ann vector

Density of elliptical vectors subject to a linear constraint

Description

Compute the density of multivariate Student-t or Gaussian\boldsymbol{x}with location vectormu, scale matrixsigma anddf (integer) degrees of freedomsubject to the linear constraint\boldsymbol{\beta}^\top\boldsymbol{x} > \delta.Negative degrees of freedom or values larger than 1000 imply Gaussian vectors are generated instead.

Usage

dtellipt(x, beta, mu, sigma, df, delta = 0, log = FALSE)

Arguments

beta

d vector of linear constraints

mu

location vector

sigma

scale matrix

df

degrees of freedom argument

delta

buffer; default to zero

log

logical; ifTRUE, return the log density

Value

ann byd matrix of random vectors

Fit multivariate inverse Gaussian distribution

Description

Fit multivariate inverse Gaussian distribution

Usage

fit_mig(x, beta, method = c("mle", "mom"), shift)

Arguments

x

n byd matrix of quantiles

beta

d vector\boldsymbol{\beta} defining the half-space through\boldsymbol{\beta}^{\top}\boldsymbol{\xi}>0

method

string, one ofmle for maximum likelihood estimation, ormom for method of moments.

shift

d translation for the half-space\boldsymbol{a}

Value

a list with components:

xi: estimate of the expectation or location vector
Omega: estimate of the scale matrix

Gaussian kernel density estimator

Description

Given a data matrix over a half-space defined bybeta,compute the log density of the asymmetric truncated Gaussian kernel density estimator,taking in turn an observation as location vector.

Usage

gauss_kdens_arma(x, newdata, Sigma, logweights, logd)

Arguments

x

matrix of observations

newdata

matrix of new observations at which to evaluated the kernel density

Sigma

covariance matrix

logd

logical; ifTRUE, return the log density

Value

log density estimator

Likelihood cross-validation for Gaussian kernel density estimation

Description

Likelihood cross-validation criterion function.

Usage

gauss_lcv(x, Sigma, logweights)

Arguments

x

n byd matrix of observations

Sigma

smoothing positive-definite matrix

logweights

log vector of weights

Value

LCV criterion value

Leave-one-out cross-validation for Gaussian kernel density estimation

Description

Given a data matrix, compute the log density using leave-one-outcross validation, taking in turn an observation as location vectorand computing the density of the resulting mixture.

Usage

gauss_loo(x, Sigma, logweights)

Arguments

x

n byd matrix of observations

Sigma

smoothing positive-definite matrix

logweights

vector of log weights

Value

a vector of values for the weighted leave-one-out likelihood

Least squares cross-validation for Gaussian kernel density estimation

Description

Least squares cross-validation for weighted Gaussian samples.

Usage

gauss_lscv(x, Sigma, logweights, xsamp, dxsamp, mckern = TRUE)

Arguments

x

n byd matrix of observations

Sigma

smoothing positive-definite matrix

logweights

log vector of weights

xsamp

n byd random sample for Monte Carlo estimation of bias

dxsamp

n vector of density for the points fromxsamp

mckern

logical; ifTRUE, uses the kernel as sampler for Monte Carlo estimation

Value

least square criterion value

Robust likelihood cross-validation for Gaussian kernel density estimation

Description

Robust likelihood cross-validation criterion function of Wu.

Usage

gauss_rlcv(x, Sigma, logweights, an, xsamp, dxsamp, mckern = TRUE)

Arguments

x

n byd matrix of observations

Sigma

smoothing positive-definite matrix

logweights

log vector of weights

an

threshold for linear approximation

xsamp

n byd random sample for Monte Carlo estimation of bias

dxsamp

n vector of density for the points fromxsamp

mckern

logical; ifTRUE, uses the kernel as sampler for Monte Carlo estimation

Value

RLCV criterion value

Magnetic storms

Description

Absolute magnitude of 373 geomagnetic storms lasting more than 48h with absolute magnitude (dst) larger than 100 in 1957-2014.

Format

a vector of size 373

Note

For a detailed article presenting the derivation of the Dst index, seehttp://wdc.kugi.kyoto-u.ac.jp/dstdir/dst2/onDstindex.html

Source

Aki Vehtari

References

World Data Center for Geomagnetism, Kyoto, M. Nose, T. Iyemori, M. Sugiura, T. Kamei (2015),Geomagnetic Dst index, doi:10.17593/14515-74000.

Gaussian kernel density estimator on half-space

Description

Given a data matrix over a half-space defined bybeta, compute an homeomorphism to\mathbb{R}^d and perform kernel smoothing based on a Gaussian kernel density estimator,taking each turn an observation as location vector.

Usage

hsgauss_kdens(x, newdata, Sigma, beta, log = TRUE, ...)

Arguments

x

n byd matrix of quantiles

newdata

matrix of new observations at which to evaluated the kernel density

Sigma

scale matrix

beta

d vector\boldsymbol{\beta} defining the half-space through\boldsymbol{\beta}^{\top}\boldsymbol{\xi}>0

log

logical; ifTRUE, returns log probabilities

...

additional arguments, currently ignored

Value

a vector containing the value of the kernel density at each of thenewdata points

Optimal scale matrix for kernel density estimation

Description

Given ann sample from a multivariate distribution on the half-space defined by\{\boldsymbol{x} \in \mathbb{R}^d: \boldsymbol{\beta}^\top\boldsymbol{x}>0\},the function computes the bandwidth (type="isotropic") or scalematrix that minimizes the asymptotic mean integrated squared error away from the boundary.The latter depend on the true unknown density, which is replaced by the kernel density ora MIG distribution evaluated at the maximum likelihood estimator. The integral or the integratedsquared error are obtained by Monte Carlo integration withN simulations

Usage

kdens_bandwidth(  x,  beta,  shift,  family = c("mig", "hsgauss", "tnorm"),  method = c("amise", "lcv", "lscv", "rlcv"),  type = c("isotropic", "diag", "full"),  approx = c("kernel", "mig", "tnorm"),  transformation = c("none", "scaling", "spherical"),  N = 10000L,  buffer = 0,  maxiter = 2000L,  ...)

Arguments

x

ann byd matrix of observations

beta

d vector defining the half-space

shift

location vector for translating the half-space. If missing, defaults to zero

family

distribution for smoothing, eithermig for multivariate inverse Gaussian,tnorm for truncated normal on the half-space andhsgauss for the Gaussian smoothing after suitable transformation.

method

estimation criterion, eitheramise for the expression that minimizes the asymptotic integrated squared error,lcv for likelihood (leave-one-out) cross-validation,lscv for least-square cross-validation orrlcv for robust cross validation of Wu (2019)

type

string indicating whether to compute an isotropic model or estimate the optimal scale matrix via optimization

approx

string; distribution to approximate the true density functionf(x); eitherkernel for the kernel estimator evaluated at the sample points (except formethod="amise", which isn't supported),mig for multivariate inverse Gaussian with the method of moments ortnorm for the multivariate truncated Gaussian evaluated by maximum likelihood.

transformation

string for optional scaling of the data before computing the bandwidth. Either standardization to unit variancescaling, spherical transformation to unit variance and zero correlation (spherical), ornone (default).

N

integer number of simulations for Monte Carlo integration

buffer

double indicating the buffer from the half-space

maxiter

integer; max number of iterations in the call tooptim.

...

additional parameters, currently ignored

Value

ad byd scale matrix

References

Wu, X. (2019). Robust likelihood cross-validation for kernel density estimation.Journal of Business & Economic Statistics, 37(4), 761–770.doi:10.1080/07350015.2018.1424633Bowman, A.W. (1984). An alternative method of cross-validation for the smoothing of density estimates,Biometrika, 71(2), 353–360.doi:10.1093/biomet/71.2.353Rudemo, M. (1982). Empirical choice of histograms and kernel density estimators.Scandinavian Journal of Statistics, 9(2), 65–78. http://www.jstor.org/stable/4615859

Multivariate inverse Gaussian kernel density estimator

Description

Given a matrix of new observations, compute the density of the multivariateinverse Gaussian mixture defined by assigning equal weight to each component where\boldsymbol{\xi} is the location parameter.

Usage

mig_kdens(x, newdata, Omega, beta, log = FALSE)

Arguments

x

n byd matrix of quantiles

newdata

matrix of new observations at which to evaluated the kernel density

Omega

d byd positive definite scale matrix\boldsymbol{\Omega}

beta

d vector\boldsymbol{\beta} defining the half-space through\boldsymbol{\beta}^{\top}\boldsymbol{\xi}>0

log

logical; ifTRUE, returns log probabilities

Value

value of the (log)-density atnewdata

MIG kernel density estimator

Description

Given a data matrix over a half-space defined bybeta,compute the log density taking in turn an observation innewdataas location vector and computing the kernel density estimate.

Usage

mig_kdens_arma(x, newdata, Omega, beta, logd)

Arguments

x

n byd matrix of quantiles

newdata

matrix of new observations at which to evaluated the kernel density

Omega

d byd positive definite scale matrix\boldsymbol{\Omega}

beta

d vector\boldsymbol{\beta} defining the half-space through\boldsymbol{\beta}^{\top}\boldsymbol{\xi}>0

Value

the value of the likelihood cross-validation criterion

Likelihood cross-validation for MIG density estimation

Description

Likelihood cross-validation criterion function.

Usage

mig_lcv(x, Omega, beta)

Arguments

x

n byd matrix of observations

Omega

d byd positive definite scale matrix\boldsymbol{\Omega}

beta

d vector\boldsymbol{\beta} defining the half-space through\boldsymbol{\beta}^{\top}\boldsymbol{\xi}>0

Value

LCV criterion value

Gradient of the MIG log likelihood with respect to data

Description

This function returns the gradient vector of the log likelihood with respect to theargumentx.

Usage

mig_loglik_grad(x, xi, Omega, beta)

Arguments

x

n byd matrix of quantiles

xi

d vector of location parameters\boldsymbol{\xi}, giving the expected value

Omega

d byd positive definite scale matrix\boldsymbol{\Omega}

beta

d vector\boldsymbol{\beta} defining the half-space through\boldsymbol{\beta}^{\top}\boldsymbol{\xi}>0

Value

ann byd matrix of first derivatives for the gradient, observation by observation, or ad vector ifx is a vector.

Hessian of the MIG log likelihood with respect to data

Description

This function returns the hessian, i.e., the matrix ofsecond derivatives of the log likelihood with respect to theargumentx.

Usage

mig_loglik_hessian(x, beta, xi, Omega)

Arguments

x

n byd matrix of quantiles

beta

d vector\boldsymbol{\beta} defining the half-space through\boldsymbol{\beta}^{\top}\boldsymbol{\xi}>0

xi

d vector of location parameters\boldsymbol{\xi}, giving the expected value

Omega

d byd positive definite scale matrix\boldsymbol{\Omega}

Value

ad byd matrix of second derivatives ifx has lengthd,else ann byd byd array ifx is ann byd matrix

Laplacian of the MIG log likelihood with respect to the data

Description

Computes the sum of second derivatives of the multivariateinverse Gaussian likelihood with respect to the data argumentx. The function is vectorized for more efficiency.

Usage

mig_loglik_laplacian(x, beta, xi, Omega)

Arguments

x

n byd matrix of quantiles

beta

d vector\boldsymbol{\beta} defining the half-space through\boldsymbol{\beta}^{\top}\boldsymbol{\xi}>0

xi

d vector of location parameters\boldsymbol{\xi}, giving the expected value

Omega

d byd positive definite scale matrix\boldsymbol{\Omega}

Value

ann vector

Leave-one-out cross-validation for kernel density estimation with MIG

Description

Given a data matrix over a half-space defined bybeta,compute the log density using leave-one-out cross validation,taking in turn an observation as location vector and computing thedensity of the resulting mixture.

Usage

mig_loo(x, Omega, beta)

Arguments

x

n byd matrix of quantiles

Omega

d byd positive definite scale matrix\boldsymbol{\Omega}

beta

d vector\boldsymbol{\beta} defining the half-space through\boldsymbol{\beta}^{\top}\boldsymbol{\xi}>0

Value

the value of the likelihood cross-validation criterion

Least squares cross-validation for MIG density estimation

Description

Given a data matrix over a half-space defined bybeta,compute the average using leave-one-out cross validation density minushalf the squared density.

Usage

mig_lscv(x, beta, Omega, xsamp, dxsamp, mckern = TRUE)

Arguments

x

n byd matrix of quantiles

beta

d vector\boldsymbol{\beta} defining the half-space through\boldsymbol{\beta}^{\top}\boldsymbol{\xi}>0

Omega

d byd positive definite scale matrix\boldsymbol{\Omega}

xsamp

matrix of points at which to evaluate the integral

dxsamp

density of points

Value

the value of the least square cross-validation criterion

Robust likelihood cross-validation for kernel density estimation for MIG

Description

Usage

mig_rlcv(x, beta, Omega, an, xsamp, dxsamp, mckern = TRUE)

Arguments

x

n byd matrix of quantiles

beta

d vector\boldsymbol{\beta} defining the half-space through\boldsymbol{\beta}^{\top}\boldsymbol{\xi}>0

Omega

d byd positive definite scale matrix\boldsymbol{\Omega}

xsamp

matrix of points at which to evaluate the integral

dxsamp

density of points

Value

the value of the likelihood cross-validation criterion

Maximum likelihood estimation of truncated Gaussian on half space

Description

Given a data matrix and a vector of linear constraints for the half-space,we compute the maximum likelihood estimator of the location and scale matrix

Usage

mle_truncgauss(xdat, beta)

Arguments

xdat

matrix of observations

beta

vector of constraints defining the half-space

Value

a list with location vectorloc and scale matrixscale

Normal bandwidth rule

Description

Given ann byd matrix of observations, compute thebandwidth according to Scott's rule.

Usage

normalrule_bandwidth(x)

Arguments

x

n byd matrix of observations

Value

ad byd diagonal bandwidth matrix

Orthogonal projection matrix onto the half-space

Description

The orthogonal projection matrixP has unit determinant andtransforms ann byd matrix by takingx * P.The components of the first column vector of the resulting matrix are strictly positive.

Usage

proj_hs(beta, inv = FALSE)

Arguments

beta

vector defining the half-space

inv

logical; ifTRUE, return the inverse matrix

Value

ad byd orthogonal projection matrix

Simulate elliptical vector subject to a linear constraint

Description

Simulate multivariate Student-t\boldsymbol{x}with location vectormu, scale matrixsigma anddf (integer) degrees of freedomsubject to the linear constraint\boldsymbol{\beta}^\top\boldsymbol{x} > 0.Negative degrees of freedom or values larger than 1000 imply Gaussian vectors are generated instead.

Usage

rtellipt(n, beta, mu, sigma, df, delta = 0)

Arguments

n

number of simulations

beta

d vector of linear constraints

mu

location vector

sigma

scale matrix

df

degrees of freedom argument

delta

buffer; default to zero

Value

ann byd matrix of random vectors

Truncated Gaussian kernel density estimator

Description

Given a data matrix over a half-space defined bybeta,compute the log density of the asymmetric truncated Gaussian kernel density estimator,taking in turn an observation as location vector.

Usage

tnorm_kdens(x, newdata, Sigma, beta, log = TRUE, ...)

Arguments

x

n byd matrix of quantiles

newdata

matrix of new observations at which to evaluated the kernel density

Sigma

scale matrix

beta

d vector\boldsymbol{\beta} defining the half-space through\boldsymbol{\beta}^{\top}\boldsymbol{\xi}>0

log

logical; ifTRUE, returns log probabilities

...

additional arguments, currently ignored

Value

a vector containing the value of the kernel density at each of thenewdata points

Truncated Gaussian kernel density estimator

Description

Given a data matrix over a half-space defined bybeta,compute the log density of the asymmetric truncated Gaussian kernel density estimator,taking in turn an observation as location vector.

Usage

tnorm_kdens_arma(x, newdata, Omega, beta, logd)

Arguments

x

n byd matrix of quantiles

newdata

matrix of new observations at which to evaluated the kernel density

Omega

d byd positive definite scale matrix\boldsymbol{\Omega}

beta

d vector\boldsymbol{\beta} defining the half-space through\boldsymbol{\beta}^{\top}\boldsymbol{\xi}>0

Value

the value of the likelihood cross-validation criterion

Likelihood cross-validation for truncated normal kernel density estimation

Description

Likelihood cross-validation criterion function.

Usage

tnorm_lcv(x, Omega, beta)

Arguments

x

n byd matrix of observations

Omega

smoothing positive-definite matrix

beta

vector of constraints for the half-space

Value

LCV criterion value

Leave-one-out cross-validation for truncated Gaussian kernel density estimation

Description

Given a data matrix, compute the log density using leave-one-outcross validation, taking in turn an observation as location vectorand computing the density of the resulting mixture.

Usage

tnorm_loo(x, Omega, beta)

Arguments

x

n byd matrix of observations

Omega

smoothing positive-definite matrix

beta

vector of constraints for the half-space

Value

a vector of values for the weighted leave-one-out likelihood

Least squares cross-validation for truncated Gaussian kernel density estimation

Description

Least squares cross-validation for truncated Gaussian samples.

Usage

tnorm_lscv(x, Omega, beta, xsamp, dxsamp, mckern = TRUE)

Arguments

x

n byd matrix of observations

Omega

smoothing positive-definite matrix

beta

vector of constraints for the half-space

xsamp

n byd random sample for Monte Carlo estimation of bias

dxsamp

n vector of density for the points fromxsamp

mckern

logical; ifTRUE, uses the kernel as sampler for Monte Carlo estimation

Value

least square criterion value

Likelihood cross-validation for truncated normal kernel density estimation

Description

Robust likelihood cross-validation criterion function.

Usage

tnorm_rlcv(x, Omega, beta, an, xsamp, dxsamp, mckern = TRUE)

Arguments

x

n byd matrix of observations

Omega

smoothing positive-definite matrix

beta

vector of constraints for the half-space

an

threshold for linear approximation

xsamp

n byd random sample for Monte Carlo estimation of bias

dxsamp

n vector of density for the points fromxsamp

mckern

logical; ifTRUE, uses the kernel as sampler for Monte Carlo estimation

Value

RLCV criterion value