Type: Package
Title: Distributions Hermite Polynomial Approximation
Version: 1.3.3
Date: 2023-11-29
Author: Potanin Bogdan
Maintainer: Potanin Bogdan <bogdanpotanin@gmail.com>
Description: Multivariate conditional and marginal densities, moments, cumulative distribution functions as well as binary choice and sample selection models based on Hermite polynomial approximation which was proposed and described by A. Gallant and D. W. Nychka (1987) <doi:10.2307/1913241>.
License: GPL-3
Imports: Rcpp (≥ 1.0.10), RcppParallel (≥ 5.0.0)
LinkingTo: Rcpp, RcppArmadillo, RcppParallel
RoxygenNote: 7.2.3
Encoding: UTF-8
Suggests: ggplot2, mvtnorm, titanic, sampleSelection, GA (≥ 3.2)
NeedsCompilation: yes
SystemRequirements: GNU make
Packaged: 2023-11-29 05:10:28 UTC; Bogdan
Repository: CRAN
Date/Publication: 2023-11-29 07:00:10 UTC

B-splines generation, estimation and combination

Description

Function bsplineGenerate generates a list of all basis splines with the appropriate knots vector and degree. Function bsplineComb computes linear combinations of these b-splines with particular weights. Function bsplineEstimate estimates the spline at points x. The structure of this spline should be provided via the m and knots arguments.

Usage

bsplineGenerate(knots, degree, is_names = TRUE)

bsplineEstimate(x, m, knots)

bsplineComb(splines, weights)

Arguments

knots

numeric vector of knots of the spline, sorted in ascending order.

degree

positive integer representing degree of the spline.

is_names

logical; if TRUE (default) then rows and columns of the spline matrices will have names. Set it to FALSE in order to get a notable speed boost.

x

numeric vector representing the points at which the spline should be estimated.

m

numeric matrix whose rows correspond to spline intervals while columns represent variable powers. Therefore the element in the i-th row and j-th column is the coefficient associated with the variable that 1) belongs to the i-th interval, i.e. between the i-th and (i + 1)-th knots, and 2) is raised to the power of (j - 1).

splines

list returned by the bsplineGenerate function or a manually constructed list with b-spline knots and matrices entries.

weights

numeric vector of the same length as splines.

Details

In contrast to the bs function, bsplineGenerate generates a spline basis in the form of a list containing information concerning these b-splines' structure. In order to evaluate one of these b-splines at particular points the bsplineEstimate function should be applied.
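To make the m and knots convention concrete, here is a minimal hand-constructed sketch; the knot values, matrix and evaluation points are illustrative assumptions rather than part of the package examples.

# A hypothetical piecewise-linear spline on knots (0, 1, 2):
# equal to x on [0, 1) and to (2 - x) on [1, 2).
# Rows of m correspond to the two knot intervals,
# columns to powers 0 and 1 of the variable.
knots <- c(0, 1, 2)
m <- rbind(c(0,  1),   # interval [0, 1): 0 + 1 * x
           c(2, -1))   # interval [1, 2): 2 - 1 * x
# Evaluate this spline at a few points
bsplineEstimate(x = c(0.5, 1, 1.5), m = m, knots = knots)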

Value

Function bsplineGenerate returns a list. Each element of this list is a list containing the following information concerning the b-spline structure:

Function bsplineComb returns a list with the following arguments:

Function bsplineEstimate returns a numeric vector of values calculated at points x via the spline with knots vector and matrix m.

Examples

# Let's generate all b-splines of degree 3 with knots
# vector (-2.1, 1.5, 1.5, 2.2, 3.7, 4.2, 5)
b <- bsplineGenerate(knots = c(-2.1, 1.5, 1.5, 2.2, 3.7, 4.2, 5),
                     degree = 3)

# Get the first of these b-splines
b[[1]]

# Take a linear combination of these splines with
# weights 1.6, -1.2 and 3.2.
b_comb <- bsplineComb(splines = b, weights = c(1.6, -1.2, 3.2))

# Estimate this spline value at points (-3, 0.7, 2.5, 3.8, 10)
b_values <- bsplineEstimate(x = c(-3, 0.7, 2.5, 3.8, 10),
                            knots = b_comb$knots,
                            m = b_comb$m)

# Visualize the spline
s <- seq(from = 0, to = 5, length = 1000)
b_values_s <- bsplineEstimate(x = s,
                              knots = b_comb$knots,
                              m = b_comb$m)
plot(s, b_values_s)

Extract coefficients from hpaBinary object

Description

Extract coefficients from hpaBinary object

Usage

## S3 method for class 'hpaBinary'
coef(object, ...)

Arguments

object

Object of class "hpaBinary"

...

further arguments (currently ignored)


Extract coefficients from hpaML object

Description

Extract coefficients from hpaML object

Usage

## S3 method for class 'hpaML'
coef(object, ...)

Arguments

object

Object of class "hpaML"

...

further arguments (currently ignored)


Extract coefficients from hpaSelection object

Description

Extract coefficients from hpaSelection object

Usage

## S3 method for class 'hpaSelection'
coef(object, ..., type = "all")

Arguments

object

Object of class "hpaSelection"

...

further arguments (currently ignored)

type

character; if "all" (default) then all estimated parametersvalues will be returned. If "selection" then selection equation coefficientsestimates will be provided. If "outcome" then outcome equation coefficientsestimates will be returned.


Calculate normal pdf in parallel

Description

Calculate in parallel, for each value from vector x, the density function of the normal distribution with mean equal to mean and standard deviation equal to sd.

Usage

dnorm_parallel(x, mean = 0, sd = 1, is_parallel = FALSE)

Arguments

x

numeric vector of quantiles.

mean

double value.

sd

double positive value.

is_parallel

if TRUE then multiple cores will be used for some calculations. It usually provides a speed advantage for large enough samples (roughly more than 1000 observations).

Examples

## Consider normal distribution with mean 3 and standard deviation 5.
## Calculate its density function at points 2 and 3.

# Create vector of points
my_points <- c(2, 3)

# Calculate pdf at these points
# (set is_parallel = TRUE in order
# to turn on parallel computations)
dnorm_parallel(my_points, 3, 5,
               is_parallel = FALSE)

Semi-nonparametric single index binary choice model estimation

Description

This function performs semi-nonparametric (SNP) maximum likelihood estimation of the single index binary choice model using the Hermite polynomial based approximating function proposed by Gallant and Nychka in 1987. Please see the dhpa 'Details' section to get more information concerning this approximating function.

Usage

hpaBinary(
  formula,
  data,
  K = 1L,
  mean_fixed = NA_real_,
  sd_fixed = NA_real_,
  constant_fixed = 0,
  coef_fixed = TRUE,
  is_x0_probit = TRUE,
  is_sequence = FALSE,
  x0 = numeric(0),
  cov_type = "sandwich",
  boot_iter = 100L,
  is_parallel = FALSE,
  opt_type = "optim",
  opt_control = NULL,
  is_validation = TRUE
)

Arguments

formula

an object of class "formula" (or one that can be coerced to that class):a symbolic description of the model to be fitted.All variables informula should be numeric vectors of the same length.

data

data frame containing the variables in the model.

K

non-negative integer representing polynomial degree (order).

mean_fixed

numeric value for the binary choice equation random error density mean parameter. Set it to NA (default) if this parameter should be estimated rather than fixed.

sd_fixed

numeric value for the binary choice equation random error density sd parameter. Set it to NA (default) if this parameter should be estimated rather than fixed.

constant_fixed

numeric value for the binary choice equation constant parameter. Set it to NA (default) if this parameter should be estimated rather than fixed.

coef_fixed

logical value indicating whether binary equation first independent variable coefficient should be fixed (TRUE) or estimated (FALSE).

is_x0_probit

logical; if TRUE (default) then initial points for the optimization routine will be obtained from a probit model estimated via the glm function.

is_sequence

if TRUE then the function calculates models with polynomial degrees from 0 to K, each time using initial values obtained from the previous step. In this case the function will return the list of models where the i-th list element corresponds to the model calculated under K = (i - 1).

x0

numeric vector of optimization routine initial values. Note that x0 = c(pol_coefficients[-1], mean, sd, coefficients).

cov_type

character determining the type of covariance matrix to be returned and used for summary. If cov_type = "hessian" then the negative inverse of the Hessian matrix will be applied. If cov_type = "gop" then the inverse of Jacobian outer products will be used. If cov_type = "sandwich" (default) then the sandwich covariance matrix estimator will be applied. If cov_type = "bootstrap" then bootstrap with boot_iter iterations will be used. If cov_type = "hessianFD" or cov_type = "sandwichFD" then a (probably) more accurate but computationally demanding central difference Hessian approximation will be calculated for the inverse Hessian and sandwich estimators correspondingly. Central differences are computed via the analytically provided gradient. This Hessian matrix estimation approach seems to be less accurate than the BFGS approximation if the polynomial order is high (usually greater than 5).

boot_iter

the number of bootstrap iterations for the cov_type = "bootstrap" covariance matrix estimator type.

is_parallel

if TRUE then multiple cores will be used for some calculations. It usually provides a speed advantage for large enough samples (roughly more than 1000 observations).

opt_type

string value determining the type of the optimization routine to be applied. The default is "optim" meaning that the BFGS method from the optim function will be applied. If opt_type = "GA" then the ga function will be additionally applied.

opt_control

a list containing arguments to be passed to the optimization routine depending on the opt_type argument value. Please see details to get additional information.

is_validation

logical value indicating whether function input arguments should be validated. Set it to FALSE for a slight performance boost (default value is TRUE).

Details

The densities Hermite polynomial approximation approach has been proposed by A. Gallant and D. W. Nychka in 1987. The main idea is to approximate an unknown distribution density with a scaled Hermite polynomial. For more information please refer to the literature listed below.

Let's use the notations introduced in the dhpa 'Details' section. Function hpaBinary maximizes the following quasi log-likelihood function:

\ln L(\gamma_{0}, \gamma, \alpha, \mu, \sigma; x) = \sum\limits_{i:z_{i}=1} \ln\left(\overline{F}_{\xi}(-(\gamma_{0} + x_{i}\gamma), \infty; \alpha, \mu, \sigma)\right) + \sum\limits_{i:z_{i}=0} \ln\left(\overline{F}_{\xi}(-\infty, -(\gamma_{0} + x_{i}\gamma); \alpha, \mu, \sigma)\right),

where (in addition to previously defined notations):

x_{i} - is a row vector of regressors derived from data according to formula.

\gamma - is a column vector of regression coefficients.

\gamma_{0} - is a constant.

z_{i} - is a binary (0 or 1) dependent variable defined in formula.

Note that \xi is one dimensional and K corresponds to K = K_{1}.

The first polynomial coefficient (zero powers) is set to 1 for identification purposes, i.e. \alpha_{0}=1.

If coef_fixed is TRUE then the coefficient for the first independent variable in formula will be fixed to 1, i.e. \gamma_{1}=1.

If mean_fixed is not NA then \mu = mean_fixed is fixed.

If sd_fixed is not NA then \sigma = sd_fixed is fixed. However, if is_x0_probit = TRUE then parameter \sigma will be scale adjusted in order to provide a better initial point for the optimization routine. Please extract the adjusted \sigma value from the function's output list. The same holds for mean_fixed.

Rows in data corresponding to variables mentioned in formula which have at least one NA value will be ignored.

All variables mentioned in formula should be numeric vectors.

The function calculates standard errors via the sandwich estimator and significance levels are reported taking into account quasi maximum likelihood estimator (QMLE) asymptotic normality. If one wants to switch from QMLE to the semi-nonparametric estimator (SNPE) during hypothesis testing then the covariance matrix should be estimated again using bootstrap.
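As a hedged sketch of that switch, reusing the data frame h and formula from the Titanic 'Examples' section below (bootstrap estimation may be slow):

# Re-estimate the model requesting bootstrap standard errors
# suitable for SNPE-based hypothesis testing
model_boot <- hpaBinary(survived ~ class_1 + class_2 + male +
                        age + sibl + parch + fare,
                        K = 3, data = h,
                        cov_type = "bootstrap", boot_iter = 100)
summary(model_boot)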

This function maximizes the (quasi) log-likelihood function via the optim function setting its method argument to "BFGS". If opt_type = "GA" then a genetic algorithm from the ga function will be additionally applied (after optim, putting its solution (par) into the suggestions matrix) in order to perform global optimization. Note that global optimization takes much more time (usually minutes but sometimes hours or even days). The number of iterations and the population size of the genetic algorithm will grow linearly along with the number of estimated parameters. If it seems that the global maximum has not been found then it is possible to continue the search by restarting the function setting its input argument x0 to the x1 output value. Note that if cov_type = "bootstrap" then the ga function will not be used for bootstrap iterations since it may be extremely time consuming.

Ifopt_type = "GA" thenopt_control should be thelist containing the values to be passed togafunction. It is possible to pass argumentslower,upper,popSize,pcrossover,pmutation,elitism,maxiter,suggestions,optim,optimArgs,seed andmonitor. Note that it is possible to setpopulation,selection,crossover andmutation arguments changingga default parameters viagaControl function. These arguments information reported inga.In order to provide manual values forlower andupper boundsplease follow parameters ordering mentioned above for thex0 argument. If these bounds are not provided manually thenthey (except those related to the polynomial coefficients)will depend on the estimates obtainedby local optimization viaoptim function(this estimates will be in the middlebetweenlower andupper).Specifically for each sd parameterlower (upper) boundis 5 times lower (higher) than thisparameteroptim estimate.For each mean and regression coefficient parameter its lower and upper bounds deviate from correspondingoptim estimateby two absolute values of this estimate.Finally, lower and upper bounds for each polynomialcoefficient are-10 and10 correspondingly (do not dependon theiroptim estimates).

The following arguments differ from their defaults in ga:

Let's denote by n_reg the number of regressors included into the formula. The arguments popSize and maxiter of the ga function have been set proportional to the number of estimated polynomial coefficients and independent variables:

Value

This function returns an object of class "hpaBinary".

An object of class "hpaBinary" is a list containing the following components:

References

A. Gallant and D. W. Nychka (1987) <doi:10.2307/1913241>

See Also

summary.hpaBinary, predict.hpaBinary, plot.hpaBinary, logLik.hpaBinary

Examples

## Estimate survival probability on Titanic
library("titanic")

# Prepare data set converting
# all variables to numeric vectors
h <- data.frame("male" = as.numeric(titanic_train$Sex == "male"))
h$class_1 <- as.numeric(titanic_train$Pclass == 1)
h$class_2 <- as.numeric(titanic_train$Pclass == 2)
h$class_3 <- as.numeric(titanic_train$Pclass == 3)
h$sibl <- titanic_train$SibSp
h$survived <- titanic_train$Survived
h$age <- titanic_train$Age
h$parch <- titanic_train$Parch
h$fare <- titanic_train$Fare

# Estimate model parameters
model_hpa_1 <- hpaBinary(survived ~ class_1 + class_2 +
                         male + age + sibl + parch + fare,
                         K = 3, data = h)
# Get summary
summary(model_hpa_1)

# Get predicted probabilities
pred_hpa_1 <- predict(model_hpa_1)

# Calculate number of correct predictions
hpa_1_correct_0 <- sum((pred_hpa_1 < 0.5) &
                       (model_hpa_1$dataframe$survived == 0))
hpa_1_correct_1 <- sum((pred_hpa_1 >= 0.5) &
                       (model_hpa_1$dataframe$survived == 1))
hpa_1_correct <- hpa_1_correct_1 + hpa_1_correct_0

# Plot random errors density approximation
plot(model_hpa_1)

## Estimate parameters on data simulated from Student distribution
library("mvtnorm")
set.seed(123)

# Simulate independent variables from normal distribution
n <- 5000
X <- rmvnorm(n = n, mean = c(0, 0),
             sigma = matrix(c(1, 0.5, 0.5, 1), ncol = 2))

# Simulate random errors from Student distribution
epsilon <- rt(n, 5) * (3 / sqrt(5))

# Calculate latent and observable variables values
z_star <- 1 + X[, 1] + X[, 2] + epsilon
z <- as.numeric((z_star > 0))

# Store the results into data frame
h <- as.data.frame(cbind(z, X))
names(h) <- c("z", "x1", "x2")

# Estimate model parameters
model <- hpaBinary(formula = z ~ x1 + x2, data = h, K = 3)
summary(model)

# Get predicted probabilities of 1 values
predict(model)

# Plot density function approximation
plot(model)

Probabilities and Moments Hermite Polynomial Approximation

Description

Approximation of truncated, marginal and conditional densities, moments and cumulative probabilities of multivariate distributions via the Hermite polynomial based approach proposed by Gallant and Nychka in 1987.

The density approximating function is a scale adjusted product of two terms. The first one is a squared multivariate polynomial of pol_degrees degrees with pol_coefficients coefficients vector. The second is a product of independent normal random variables' densities with expected values and standard deviations given by the mean and sd vectors correspondingly. The approximating function satisfies the properties of a density function, thus generating a broad family of distributions. Characteristics of these distributions (moments, quantiles, probabilities and so on) may provide accurate approximations to characteristics of other distributions. Moreover, it is usually possible to provide an arbitrarily close approximation by means of polynomial degree increase.

Usage

dhpa(
  x,
  pol_coefficients = numeric(0),
  pol_degrees = numeric(0),
  given_ind = numeric(0),
  omit_ind = numeric(0),
  mean = numeric(0),
  sd = numeric(0),
  is_parallel = FALSE,
  log = FALSE,
  is_validation = TRUE
)

phpa(
  x,
  pol_coefficients = numeric(0),
  pol_degrees = numeric(0),
  given_ind = numeric(0),
  omit_ind = numeric(0),
  mean = numeric(0),
  sd = numeric(0),
  is_parallel = FALSE,
  log = FALSE,
  is_validation = TRUE
)

ihpa(
  x_lower = numeric(0),
  x_upper = numeric(0),
  pol_coefficients = numeric(0),
  pol_degrees = numeric(0),
  given_ind = numeric(0),
  omit_ind = numeric(0),
  mean = numeric(0),
  sd = numeric(0),
  is_parallel = FALSE,
  log = FALSE,
  is_validation = TRUE
)

ehpa(
  x = numeric(0),
  pol_coefficients = numeric(0),
  pol_degrees = numeric(0),
  given_ind = numeric(0),
  omit_ind = numeric(0),
  mean = numeric(0),
  sd = numeric(0),
  expectation_powers = numeric(0),
  is_parallel = FALSE,
  is_validation = TRUE
)

etrhpa(
  tr_left = numeric(0),
  tr_right = numeric(0),
  pol_coefficients = numeric(0),
  pol_degrees = numeric(0),
  mean = numeric(0),
  sd = numeric(0),
  expectation_powers = numeric(0),
  is_parallel = FALSE,
  is_validation = TRUE
)

dtrhpa(
  x,
  tr_left = numeric(0),
  tr_right = numeric(0),
  pol_coefficients = numeric(0),
  pol_degrees = numeric(0),
  given_ind = numeric(0),
  omit_ind = numeric(0),
  mean = numeric(0),
  sd = numeric(0),
  is_parallel = FALSE,
  log = FALSE,
  is_validation = TRUE
)

itrhpa(
  x_lower = numeric(0),
  x_upper = numeric(0),
  tr_left = numeric(0),
  tr_right = numeric(0),
  pol_coefficients = numeric(0),
  pol_degrees = numeric(0),
  given_ind = numeric(0),
  omit_ind = numeric(0),
  mean = numeric(0),
  sd = numeric(0),
  is_parallel = FALSE,
  log = FALSE,
  is_validation = TRUE
)

dhpaDiff(
  x,
  pol_coefficients = numeric(0),
  pol_degrees = numeric(0),
  given_ind = numeric(0),
  omit_ind = numeric(0),
  mean = numeric(0),
  sd = numeric(0),
  type = "pol_coefficients",
  is_parallel = FALSE,
  log = FALSE,
  is_validation = TRUE
)

ehpaDiff(
  x = numeric(0),
  pol_coefficients = numeric(0),
  pol_degrees = numeric(0),
  given_ind = numeric(0),
  omit_ind = numeric(0),
  mean = numeric(0),
  sd = numeric(0),
  expectation_powers = numeric(0),
  type = "pol_coefficients",
  is_parallel = FALSE,
  log = FALSE,
  is_validation = TRUE
)

ihpaDiff(
  x_lower = numeric(0),
  x_upper = numeric(0),
  pol_coefficients = numeric(0),
  pol_degrees = numeric(0),
  given_ind = numeric(0),
  omit_ind = numeric(0),
  mean = numeric(0),
  sd = numeric(0),
  type = "pol_coefficients",
  is_parallel = FALSE,
  log = FALSE,
  is_validation = TRUE
)

qhpa(
  p,
  x = matrix(1, 1),
  pol_coefficients = numeric(0),
  pol_degrees = numeric(0),
  given_ind = numeric(0),
  omit_ind = numeric(0),
  mean = numeric(0),
  sd = numeric(0)
)

rhpa(
  n,
  pol_coefficients = numeric(0),
  pol_degrees = numeric(0),
  mean = numeric(0),
  sd = numeric(0)
)

Arguments

x

numeric matrix of function arguments and conditional values. Note that x rows are points (observations) while random vector components (variables) are columns.

pol_coefficients

numeric vector of polynomial coefficients.

pol_degrees

non-negative integer vector of polynomial degrees (orders).

given_ind

logical or numeric vector indicating whether the corresponding random vector component is conditioned. By default it is a logical vector of FALSE values. If given_ind[i] equals TRUE or i then the i-th column of the x matrix will contain conditional values.

omit_ind

logical or numeric vector indicating whether the corresponding random component is omitted. By default it is a logical vector of FALSE values. If omit_ind[i] equals TRUE or i then values in the i-th column of the x matrix will be ignored.

mean

numeric vector of expected values.

sd

positive numeric vector of standard deviations.

is_parallel

if TRUE then multiple cores will be used for some calculations. It usually provides a speed advantage for large enough samples (roughly more than 1000 observations).

log

logical; if TRUE then probabilities p are given as log(p) or derivatives will be given with respect to log(p).

is_validation

logical value indicating whether function input arguments should be validated. Set it to FALSE for a slight performance boost (default value is TRUE).

x_lower

numeric matrix of lower integration limits. Note that x_lower rows are observations while variables are columns.

x_upper

numeric matrix of upper integration limits. Note that x_upper rows are observations while variables are columns.

expectation_powers

integer vector of random vector components powers.

tr_left

numeric matrix of left (lower) truncation limits. Note that tr_left rows are observations while variables are columns. If tr_left and tr_right are single row matrices then the same truncation limits, determined by the first rows of these matrices, will be applied to all observations.

tr_right

numeric matrix of right (upper) truncation limits. Note that tr_right rows are observations while variables are columns. If tr_left and tr_right are single row matrices then the same truncation limits, determined by the first rows of these matrices, will be applied to all observations.

type

determines the partial derivatives to be included into the gradient. If type = "pol_coefficients" then the gradient will contain partial derivatives with respect to polynomial coefficients listed in the same order as pol_coefficients. Other available types are type = "mean" and type = "sd". For function dhpaDiff it is possible to take the gradient with respect to the x points setting type = "x". For function ihpaDiff it is possible to take the gradient with respect to the x lower and upper points setting type = "x_lower" or type = "x_upper" correspondingly. In order to get the full gradient please set type = "all".

p

numeric vector of probabilities

n

positive integer representing the number of observations

Details

It is possible to approximate densities (dhpa), cumulative probabilities (phpa, ihpa), moments (ehpa) as well as their truncated forms (dtrhpa, itrhpa, etrhpa) and gradients (dhpaDiff, ihpaDiff). Note that phpa is a special case of ihpa where x corresponds to x_upper while x_lower is a matrix of negative infinity values. So phpa is intended to approximate cumulative distribution functions while ihpa approximates probabilities that random vector components will be between values determined by the rows of the x_lower and x_upper matrices. Further details are given below.
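This relationship can be checked directly; a minimal univariate sketch (the polynomial coefficients and the evaluation point are illustrative):

pc <- c(1, 0.1, -0.01)   # polynomial coefficients
x  <- matrix(0.3)        # evaluation point as a 1 x 1 matrix
# Cumulative probability via phpa
phpa(x = x, pol_coefficients = pc, pol_degrees = 2,
     mean = 0, sd = 1)
# The same value via ihpa, with x_lower left at its default
# (a matrix of negative infinity values)
ihpa(x_upper = x, pol_coefficients = pc, pol_degrees = 2,
     mean = 0, sd = 1)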

Since the density approximating function is non-negative and integrates to 1 it is a density function for some m-variate random vector \xi. The approximating function f_{\xi}(x) has the following form:

f_{\xi }(x) = f_{\xi }(x;\mu, \sigma, \alpha) =\frac{1}{\psi }\prod\limits_{t=1}^{m}\phi ({x}_{t};{\mu }_{t},{\sigma }_{t}){{\left( \sum\limits_{{i}_{1}=0}^{{K}_{1}}{...}\sum\limits_{{i}_{m}=0}^{{K}_{m}}{{{\alpha }_{({{i}_{1}},...,{{i}_{m}})}}\prod\limits_{r=1}^{m}x_{r}^{{{i}_{r}}}} \right)}^{2}}

\psi =\sum\limits_{{i}_{1}=0}^{{K}_{1}}{...}\sum\limits_{{i}_{m}=0}^{{K}_{m}}{\sum\limits_{{j}_{1}=0}^{{K}_{1}}{...}\sum\limits_{{j}_{m}=0}^{{K}_{m}}{{{\alpha }_{({i}_{1},\cdots,{i}_{m})}}{{\alpha }_{({j}_{1},\cdots,{j}_{m})}}\prod\limits_{r=1}^{m}\mathcal{M}({i}_{r}+{j}_{r};{{\mu }_{r}},{\sigma }_{r})}},

where:

x = (x_{1},...,x_{m}) - is the vector of arguments, i.e. rows of the x matrix in dhpa.

{\alpha }_{({i}_{1},\cdots,{i}_{m})} - is the polynomial coefficient corresponding to the pol_coefficients[k] element. In order to investigate the correspondence between k and ({i}_{1},\cdots,{i}_{m}) values please see the 'Examples' section below or the polynomialIndex function 'Details', 'Value' and 'Examples' sections. Note that if m = 1 then pol_coefficients[k] simply corresponds to \alpha_{k-1}.

(K_{1},...,K_{m}) - are the polynomial degrees (orders) provided via the pol_degrees argument, so pol_degrees[i] determines K_{i}.

\phi (.;{\mu }_{t},{\sigma }_{t}) - is the normal random variable density function where \mu_{t} and \sigma_{t} are the mean and standard deviation determined by the mean[t] and sd[t] arguments values.

\mathcal{M}(q;{{\mu }_{r}},{\sigma }_{r}) - is the q-th order moment of the normal random variable with mean {\mu }_{r} and standard deviation {\sigma }_{r}. Note that function normalMoment allows to calculate and differentiate normal random variable's moments.

\psi - is the constant term ensuring that f_{\xi }(x) is a density function.

Therefore dhpa allows to calculate f_{\xi}(x) values at points determined by rows of the x matrix given polynomial degrees pol_degrees (K) as well as mean (\mu), sd (\sigma) and pol_coefficients (\alpha) parameters values. Note that mean, sd and pol_degrees are m-variate vectors while pol_coefficients has prod(pol_degrees + 1) elements.
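To connect the formula above with the function, here is a minimal univariate sketch (m = 1, K_1 = 1, with illustrative parameter values) comparing dhpa with a direct evaluation of f_{\xi}(x):

# Parameters: alpha = (1, 0.5), mu = 0, sigma = 1
a  <- c(1, 0.5)
x0 <- 0.3
# Normal moments M(0) = 1, M(1) = 0, M(2) = 1 for N(0, 1), hence
psi <- a[1] ^ 2 * 1 + 2 * a[1] * a[2] * 0 + a[2] ^ 2 * 1
# Density approximating function evaluated by hand
f_manual <- dnorm(x0) * (a[1] + a[2] * x0) ^ 2 / psi
# The same value via dhpa
f_dhpa <- dhpa(x = matrix(x0), pol_coefficients = a,
               pol_degrees = 1, mean = 0, sd = 1)
c(f_manual, f_dhpa)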

Cumulative probabilities could be approximated as follows:

P\left(\underline{x}_{1}\leq\xi_{1}\leq\overline{x}_{1},...,\underline{x}_{m}\leq\xi_{m}\leq\overline{x}_{m}\right) = \bar{F}_{\xi}(\underline{x},\bar{x}) = \bar{F}_{\xi}(\underline{x},\bar{x};\mu, \sigma, \alpha) =\frac{1}{\psi }\prod\limits_{t=1}^{m}(\Phi ({{{\bar{x}}}_{t}};{{\mu }_{t}},{{\sigma }_{t}})-\Phi ({{{\underline{x}}}_{t}};{{\mu }_{t}},{{\sigma }_{t}})) \times \sum\limits_{{{i}_{1}}=0}^{{{K}_{1}}}{...}\sum\limits_{{{i}_{m}}=0}^{{{K}_{m}}}{\sum\limits_{{{j}_{1}}=0}^{{{K}_{1}}}{...}\sum\limits_{{{j}_{m}}=0}^{{{K}_{m}}}{{{\alpha }_{({{i}_{1}},...,{{i}_{m}})}}{{\alpha }_{({{j}_{1}},...,{{j}_{m}})}}}}\prod\limits_{r=1}^{m}\mathcal{M}_{TR}\left({i}_{r}+{j}_{r};\underline{x}_{r},\overline{x}_{r},\mu_{r},\sigma_{r}\right)

where:

\Phi (.;{\mu }_{t},{\sigma }_{t}) - is the normal random variable's cumulative distribution function where \mu_{t} and \sigma_{t} are the mean and standard deviation determined by the mean[t] and sd[t] arguments values.

\mathcal{M}_{TR}(q;\underline{x}_{r},\overline{x}_{r},\mu_{r},\sigma_{r}) - is the q-th order moment of the truncated (from above by \overline{x}_{r} and from below by \underline{x}_{r}) normal random variable with mean {\mu }_{r} and standard deviation {\sigma }_{r}. Note that function truncatedNormalMoment allows to calculate and differentiate truncated normal random variable's moments.

\overline{x} = (\overline{x}_{1},...,\overline{x}_{m}) - is the vector of upper integration limits, i.e. rows of the x_upper matrix in ihpa.

\underline{x} = (\underline{x}_{1},...,\underline{x}_{m}) - is the vector of lower integration limits, i.e. rows of the x_lower matrix in ihpa.

Therefore ihpa allows to calculate interval distribution function \bar{F}_{\xi}(\underline{x},\bar{x}) values at points determined by rows of the x_lower (\underline{x}) and x_upper (\overline{x}) matrices. The rest of the arguments are similar to dhpa.

Expected value powered product approximation is as follows:

E\left( \prod\limits_{t=1}^{m}\xi_{t}^{{{k}_{t}}} \right)=\frac{1}{\psi }\sum\limits_{{{i}_{1}}=0}^{{{K}_{1}}}{...}\sum\limits_{{{i}_{m}}=0}^{{{K}_{m}}}{\sum\limits_{{{j}_{1}}=0}^{{{K}_{1}}}{...}\sum\limits_{{{j}_{m}}=0}^{{{K}_{m}}}{{{\alpha }_{({{i}_{1}},...,{{i}_{m}})}}{{\alpha }_{({{j}_{1}},...,{{j}_{m}})}}}}\prod\limits_{r=1}^{m}\mathcal{M}({{i}_{r}}+{{j}_{r}}+{{k}_{r}};{{\mu }_{r}},{{\sigma }_{r}})

where (k_{1},...,k_{m}) are integer powers determined by the expectation_powers argument of ehpa, so expectation_powers[t] assigns k_{t}. Note that argument x in ehpa allows to determine conditional values.
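A small univariate sketch of this formula (m = 1, K_1 = 1, \mu = 0, \sigma = 1, with illustrative coefficients), where the first raw moment can be checked by hand:

# E(xi) = (1 / psi) * sum over i, j of a_i * a_j * M(i + j + 1)
# With a = (1, 0.5): M(1) = 0, M(2) = 1, M(3) = 0 and psi = 1.25,
# so E(xi) = (2 * 1 * 0.5 * 1) / 1.25 = 0.8
ehpa(pol_coefficients = c(1, 0.5), pol_degrees = 1,
     mean = 0, sd = 1, expectation_powers = 1)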

Expanding the polynomial degrees (K_{1},...,K_{m}) it is possible to provide an arbitrarily close approximation to the density of some m-variate random vector \xi^{\star}. So actually f_{\xi}(x) approximates f_{\xi^{\star}}(x). Accurate approximation requires appropriate mean, sd and pol_coefficients values selection. In order to get sample estimates of these parameters please apply the hpaML function.

In order to perform calculation for the marginal distribution of some \xi components please provide omitted components via the omit_ind argument. For example, if one assumes an m = 5-variate distribution and wants to deal with the 1-st, 3-rd and 5-th components only, i.e. (\xi_{1},\xi_{3},\xi_{5}), then set omit_ind = c(FALSE, TRUE, FALSE, TRUE, FALSE) indicating that \xi_{2} and \xi_{4} should be 'omitted' from \xi since the 2-nd and 4-th values of omit_ind are TRUE. Then x still should be a 5 column matrix but values in the 2-nd and 4-th columns will not affect calculation results. Meanwhile note that the marginal distribution of t components of \xi usually does not coincide with any marginal distribution generated by a t-variate density approximating function.

In order to perform calculation for the conditional distribution, i.e. given fixed values for some \xi components, please provide these components via the given_ind argument. For example, if one assumes an m = 5-variate distribution and wants to deal with the 1-st, 3-rd and 5-th components given fixed values (suppose 8 and 10) for the other two components, i.e. (\xi|\xi_{2} = 8, \xi_{4} = 10), then set given_ind = c(FALSE, TRUE, FALSE, TRUE, FALSE) and x[2] = 8, x[4] = 10, where for simplicity it is assumed that x is a single row 5 column matrix; it is possible to provide different conditional values for the same components simply by setting different values to different x rows.

Note that it is possible to combine the given_ind and omit_ind arguments. However it is wrong to set both given_ind[i] and omit_ind[i] to TRUE. Also at least one value should be FALSE both for given_ind and omit_ind.

In order to consider the truncated distribution of \xi, i.e. \left(\xi|\overline{a}_{1}\leq\xi_{1}\leq\overline{b}_{1},\cdots,\overline{a}_{m}\leq\xi_{m}\leq\overline{b}_{m}\right), please set lower (left) truncation points \overline{a} and upper (right) truncation points \overline{b} via the tr_left and tr_right arguments correspondingly. Note that if lower truncation points are negative infinity and upper truncation points are positive infinity then dtrhpa, itrhpa and etrhpa are similar to dhpa, ihpa and ehpa correspondingly.

In order to calculate the Jacobian of f_{\xi }(x;\mu, \sigma, \alpha) and \bar{F}_{\xi}(\underline{x},\bar{x};\mu, \sigma, \alpha) w.r.t. all or some particular parameters please apply the dhpaDiff and ihpaDiff functions correspondingly, specifying parameters of interest via the type argument. If x or x_lower and x_upper are single row matrices then gradients will be calculated.

For further information please see the 'Examples' section. Note that examples are given separately for each function.

If given_ind and (or) omit_ind are numeric vectors then they are insensitive to the order of elements. For example given_ind = c(5, 2, 3) is similar to given_ind = c(2, 3, 5).

The densities Hermite polynomial approximation approach has been proposed by A. Gallant and D. W. Nychka in 1987. The main idea is to approximate an unknown distribution density with a scaled Hermite polynomial. For more information please refer to the literature listed below.

Value

Functions dhpa, phpa and dtrhpa return a vector of probabilities of length nrow(x).

Functions ihpa and itrhpa return a vector of probabilities of length nrow(x_upper).

If the x argument has not been provided or is a single row matrix then function ehpa returns a moment value. Otherwise it returns a vector of length nrow(x) containing moment values.

If the tr_left and tr_right arguments are single row matrices then function etrhpa returns a moment value. Otherwise it returns a vector of length max(nrow(tr_left), nrow(tr_right)) containing moment values.

Functions dhpaDiff and ihpaDiff return a Jacobian matrix. The number of columns depends on the type argument. The number of rows is nrow(x) for dhpaDiff and nrow(x_upper) for ihpaDiff.

If mean or sd are not specified they assume the default values of m-dimensional vectors of 0 and 1, respectively. If x_lower is not specified then it is the matrix of the same size as x_upper containing negative infinity values only. If expectation_powers is not specified then it is an m-dimensional vector of 0 values.

Please see 'Details' section for additional information.

References

A. Gallant and D. W. Nychka (1987) <doi:10.2307/1913241>

Examples

## Example demonstrating dhpa function application.## Let's approximate some three random variables (i.e. X1, X2 and X3) ## joint density function at points  x = (0,1, 0.2, 0.3) and ## y = (0.5, 0.8, 0.6) with Hermite polynomial of (1, 2, 3) degrees which ## polynomial coefficients equal 1 except coefficient related  to x1*(x^3) ## polynomial element which equals 2. Also suppose that normal density ## related mean vector equals (1.1, 1.2, 1.3) while standard deviations ## vector is (2.1, 2.2, 2.3).# Prepare initial valuesx <- matrix(c(0.1, 0.2, 0.3), nrow = 1)   # x point as a single row matrixy <- matrix(c(0.5, 0.8, 0.6), nrow = 1)   # y point as a single row matrixx_y <- rbind(x, y)                        # matrix which rows are x and ymean <- c(1.1, 1.2, 1.3)sd <- c(2.1, 2.2, 2.3)pol_degrees <- c(1, 2, 3)# Create polynomial powers and indexes correspondence matrixpol_ind <- polynomialIndex(pol_degrees)# Set all polynomial coefficients to 1pol_coefficients <- rep(1, ncol(pol_ind))pol_degrees_n <- length(pol_degrees)# Assign coefficient 2 to the polynomial element (x1 ^ 1)*(x2 ^ 0)*(x3 ^ 2)pol_coefficients[apply(pol_ind, 2, function(x) all(x == c(1, 0, 2)))] <- 2# Visualize correspondence between polynomial # elements and their coefficientsas.data.frame(rbind(pol_ind, pol_coefficients),   row.names = c("x1 power", "x2 power", "x3 power", "coefficients"),   optional = TRUE)printPolynomial(pol_degrees, pol_coefficients)# Calculate density approximation    # at point x (note that x should be a matrix)dhpa(x = x,     pol_coefficients = pol_coefficients,      pol_degrees = pol_degrees,     mean = mean, sd = sd)   # at points x and ydhpa(x = x_y,     pol_coefficients = pol_coefficients,      pol_degrees = pol_degrees,     mean = mean, sd = sd)# Condition second component to be 0.5 i.e. X2 = 0.5.# Substitute x and y second components with conditional value 0.5x <- matrix(c(0.1, 0.5, 0.3), nrow = 1) # or simply x[2] <- 0.5y <- matrix(c(0.4, 0.5, 0.6), nrow = 1) # or simply y[2] <- 0.5x_y <- rbind(x, y) # Set TRUE to the second component indicating that it is conditionedgiven_ind <- c(FALSE, TRUE, FALSE)# Calculate conditional (on X2 = 0.5) density approximation    # at point xdhpa(x = x,     pol_coefficients = pol_coefficients,      pol_degrees = pol_degrees,     mean = mean, sd = sd,     given_ind = given_ind)   # at points x and ydhpa(x = x_y,     pol_coefficients = pol_coefficients,      pol_degrees = pol_degrees,     mean = mean, sd = sd,     given_ind = given_ind)# Consider third component marginal distribution conditioned on the# second component 0.5 value i.e. (X3 | X2 = 0.5).# Set TRUE to the first component indicating that it is omittedomit_ind <- c(TRUE, FALSE, FALSE)# Calculate conditional (on x2 = 0.5) marginal (for x3) density approximation   # at point xdhpa(x = x,     pol_coefficients = pol_coefficients,      pol_degrees = pol_degrees,     mean = mean, sd = sd,     given_ind = given_ind,      omit_ind = omit_ind)   # at points x and ydhpa(x = x_y,     pol_coefficients = pol_coefficients,      pol_degrees = pol_degrees,     mean = mean, sd = sd,     given_ind = given_ind,      omit_ind = omit_ind)          ## Example demonstrating phpa function application.## Let's approximate some three random variables (X1, X2, X3) ## joint cumulative distribution function (cdf) at point (0,1, 0.2, 0.3)## with Hermite polynomial of (1, 2, 3) degrees which polynomial ## coefficients equal 1 except coefficient related to x1*(x^3) polynomial ## element which equals 2. 
Also suppose that normal density related## mean vector equals (1.1, 1.2, 1.3) while standard deviations## vector is (2.1, 2.2, 2.3).## Prepare initial valuesx <- matrix(c(0.1, 0.2, 0.3), nrow = 1)mean <- c(1.1, 1.2, 1.3)sd <- c(2.1, 2.2, 2.3)pol_degrees <- c(1, 2, 3)# Create polynomial powers and indexes correspondence matrixpol_ind <- polynomialIndex(pol_degrees)# Set all polynomial coefficients to 1pol_coefficients <- rep(1, ncol(pol_ind))pol_degrees_n <- length(pol_degrees)# Assign coefficient 2 to the polynomial element (x1 ^ 1)*(x2 ^ 0)*(x3 ^ 2)pol_coefficients[apply(pol_ind, 2, function(x) all(x == c(1, 0, 2)))] <- 2# Visualize correspondence between polynomial# elements and their coefficientsas.data.frame(rbind(pol_ind, pol_coefficients),           row.names = c("x1 power", "x2 power",                          "x3 power", "coefficients"),           optional = TRUE)printPolynomial(pol_degrees, pol_coefficients)# Calculate cdf approximation at point xphpa(x = x,     pol_coefficients = pol_coefficients,      pol_degrees = pol_degrees,     mean = mean, sd = sd)# Condition second component to be 0.5# Substitute x second component with conditional value 0.5x <- matrix(c(0.1, 0.5, 0.3), nrow = 1) # or simply x[2] <- 0.5# Set TRUE to the second component indicating that it is conditionedgiven_ind <- c(FALSE, TRUE, FALSE)# Calculate conditional (on X2 = 0.5) cdf approximation at point xphpa(x = x,     pol_coefficients = pol_coefficients,      pol_degrees = pol_degrees,     mean = mean, sd = sd,     given_ind = given_ind)# Consider third component marginal distribution# conditioned on the second component 0.5 value# Set TRUE to the first component indicating that it is omittedomit_ind <- c(TRUE, FALSE, FALSE)# Calculate conditional (on X2 = 0.5) marginal (for X3) cdf # approximation at point xphpa(x = x,      pol_coefficients = pol_coefficients,       pol_degrees = pol_degrees,      mean = mean, sd = sd,      given_ind = given_ind,       omit_ind = omit_ind)## Example demonstrating ihpa function application.## Let's approximate some three random variables (X1, X2, X3) joint interval ## distribution function (intdf) at lower and upper points (0,1, 0.2, 0.3) ## and (0,4, 0.5, 0.6) correspondingly with Hermite polynomial of (1, 2, 3) ## degrees which polynomial coefficients equal 1 except coefficient related ## to x1*(x^3) polynomial element which equals 2. 
Also suppose that normal## density related mean vector equals (1.1, 1.2, 1.3) while standard## deviations vector is (2.1, 2.2, 2.3).## Prepare initial valuesx_lower <- matrix(c(0.1, 0.2, 0.3), nrow=1)x_upper <- matrix(c(0.4, 0.5, 0.6), nrow=1)mean <- c(1.1, 1.2, 1.3)sd <- c(2.1, 2.2, 2.3)pol_degrees <- c(1, 2, 3)# Create polynomial powers and indexes correspondence matrixpol_ind <- polynomialIndex(pol_degrees)# Set all polynomial coefficients to 1pol_coefficients <- rep(1, ncol(pol_ind))pol_degrees_n <- length(pol_degrees)# Assign coefficient 2 to the polynomial element (x1 ^ 1)*(x2 ^ 0)*(x3 ^ 2)pol_coefficients[apply(pol_ind, 2, function(x) all(x == c(1, 0, 2)))] <- 2# Visualize correspondence between polynomial # elements and their coefficientsas.data.frame(rbind(pol_ind, pol_coefficients),           row.names = c("x1 power", "x2 power",                          "x3 power", "coefficients"),              optional = TRUE)printPolynomial(pol_degrees, pol_coefficients)# Calculate intdf approximation at points x_lower and x_upperihpa(x_lower = x_lower, x_upper = x_upper,      pol_coefficients = pol_coefficients,      pol_degrees = pol_degrees,     mean = mean, sd = sd)# Condition second component to be 0.7# Substitute x second component with conditional value 0.7x_upper <- matrix(c(0.4, 0.7, 0.6), nrow = 1)# Set TRUE to the second component indicating that it is conditionedgiven_ind <- c(FALSE, TRUE, FALSE)# Calculate conditional (on X2 = 0.5) intdf approximation # at points x_lower and x_upperihpa(x_lower = x_lower, x_upper = x_upper,     pol_coefficients = pol_coefficients,      pol_degrees = pol_degrees,     mean = mean, sd = sd,     given_ind = given_ind)# Consider third component marginal distribution# conditioned on the second component 0.7 value# Set TRUE to the first component indicating that it is omittedomit_ind <- c(TRUE, FALSE, FALSE)# Calculate conditional (on X2 = 0.5) marginal (for X3) # intdf approximation at points x_lower and x_upperihpa(x_lower = x_lower, x_upper = x_upper,     pol_coefficients = pol_coefficients,      pol_degrees = pol_degrees,     mean = mean, sd = sd,     given_ind = given_ind, omit_ind = omit_ind)## Example demonstrating ehpa function application.## Let's approximate some three random variables (X1, X2, X3) powered product ## expectation for powers (3, 2, 1) with Hermite polynomial of (1, 2, 3) ## degrees which polynomial coefficients equal 1 except coefficient ## related to x1*(x^3) polynomial element which equals 2.## Also suppose that normal density related mean vector equals ## (1.1, 1.2, 1.3) while standard deviations vector is (2.1, 2.2, 2.3).# Prepare initial valuesexpectation_powers = c(3,2,1)mean <- c(1.1, 1.2, 1.3)sd <- c(2.1, 2.2, 2.3)pol_degrees <- c(1, 2, 3)# Create polynomial powers and indexes correspondence matrixpol_ind <- polynomialIndex(pol_degrees)# Set all polynomial coefficients to 1pol_coefficients <- rep(1, ncol(pol_ind))pol_degrees_n <- length(pol_degrees)#Assign coefficient 2 to the polynomial element (x1 ^ 1)*(x2 ^ 0)*(x3 ^ 2)pol_coefficients[apply(pol_ind, 2, function(x) all(x == c(1, 0, 2)))] <- 2# Visualize correspondence between polynomial elements and their coefficientsas.data.frame(rbind(pol_ind, pol_coefficients),              row.names = c("x1 power", "x2 power",                             "x3 power", "coefficients"),              optional = TRUE)printPolynomial(pol_degrees, pol_coefficients)# Calculate expected powered product approximationehpa(pol_coefficients = pol_coefficients,      pol_degrees = pol_degrees,     
mean = mean, sd = sd,      expectation_powers = expectation_powers)# Condition second component to be 0.5# Substitute x second component with conditional value 0.5x <- matrix(c(NA, 0.5, NA), nrow = 1)#Set TRUE to the second component indicating that it is conditionedgiven_ind <- c(FALSE, TRUE, FALSE)# Calculate conditional (on X2 = 0.5) expected powered product approximationehpa(x = x,     pol_coefficients = pol_coefficients,      pol_degrees = pol_degrees,     mean = mean, sd = sd,      expectation_powers = expectation_powers,     given_ind = given_ind)# Consider third component marginal distribution# conditioned on the second component 0.5 value# Set TRUE to the first component indicating that it is omittedomit_ind <- c(TRUE, FALSE, FALSE)# Calculate conditional (on X2 = 0.5) marginal (for X3) expected powered # product approximation at points x_lower and x_upperehpa(x = x,     pol_coefficients = pol_coefficients,      pol_degrees = pol_degrees,     mean = mean, sd = sd,      expectation_powers = expectation_powers,     given_ind = given_ind,      omit_ind = omit_ind)## Example demonstrating etrhpa function application.## Let's approximate some three truncated random variables (X1, X2, X3) ## powered product expectation for powers (3, 2, 1) with Hermite polynomial ## of (1,2,3) degrees which polynomial coefficients equal 1 except ## coefficient related to x1*(x^3) polynomial element which equals 2. Also## suppose that normal density related mean vector equals (1.1, 1.2, 1.3) ## while standard deviations vector is (2.1, 2.2, 2.3). Suppose that lower  ## and upper truncation points are (-1.1,-1.2,-1.3) and (1.1,1.2,1.3) ## correspondingly.# Prepare initial valuesexpectation_powers = c(3,2,1)tr_left = matrix(c(-1.1,-1.2,-1.3), nrow = 1)tr_right = matrix(c(1.1,1.2,1.3), nrow = 1)mean <- c(1.1, 1.2, 1.3)sd <- c(2.1, 2.2, 2.3)pol_degrees <- c(1, 2, 3)# Create polynomial powers and indexes correspondence matrixpol_ind <- polynomialIndex(pol_degrees)# Set all polynomial coefficients to 1pol_coefficients <- rep(1, ncol(pol_ind))pol_degrees_n <- length(pol_degrees)# Assign coefficient 2 to the polynomial element (x1 ^ 1)*(x2 ^ 0)*(x3 ^ 2)pol_coefficients[apply(pol_ind, 2, function(x) all(x == c(1, 0, 2)))] <- 2# Visualize correspondence between polynomial elements and their coefficientsas.data.frame(rbind(pol_ind, pol_coefficients),              row.names = c("x1 power", "x2 power",                             "x3 power", "coefficients"),              optional = TRUE)printPolynomial(pol_degrees, pol_coefficients)# Calculate expected powered product approximation for truncated distributionetrhpa(pol_coefficients = pol_coefficients,        pol_degrees = pol_degrees,       mean = mean, sd = sd,        expectation_powers = expectation_powers,       tr_left = tr_left, tr_right = tr_right)       ## Example demonstrating dtrhpa function application.## Let's approximate some three random variables (X1, X2, X3) joint density ## function at point (0,1, 0.2, 0.3) with Hermite polynomial of (1,2,3)  ## degrees which polynomial coefficients equal 1 except coefficient related ## to x1*(x^3) polynomial element which equals 2. Also suppose that normal ## density related mean vector equals (1.1, 1.2, 1.3) while standard ## deviations vector is (2.1, 2.2, 2.3). 
Suppose that lower and upper ## truncation points are (-1.1,-1.2,-1.3) and (1.1,1.2,1.3) correspondingly.# Prepare initial valuesx <- matrix(c(0.1, 0.2, 0.3), nrow=1)tr_left = matrix(c(-1.1,-1.2,-1.3), nrow = 1)tr_right = matrix(c(1.1,1.2,1.3), nrow = 1)mean <- c(1.1, 1.2, 1.3)sd <- c(2.1, 2.2, 2.3)pol_degrees <- c(1, 2, 3)# Create polynomial powers and indexes correspondence matrixpol_ind <- polynomialIndex(pol_degrees)# Set all polynomial coefficients to 1pol_coefficients <- rep(1, ncol(pol_ind))pol_degrees_n <- length(pol_degrees)# Assign coefficient 2 to the polynomial element (x1 ^ 1)*(x2 ^ 0)*(x3 ^ 2)pol_coefficients[apply(pol_ind, 2, function(x) all(x == c(1, 0, 2)))] <- 2# Visualize correspondence between polynomial elements and their coefficientsas.data.frame(rbind(pol_ind, pol_coefficients),              row.names = c("x1 power", "x2 power",                             "x3 power", "coefficients"),              optional = TRUE)printPolynomial(pol_degrees, pol_coefficients)# Calculate density approximation at point xdtrhpa(x = x,       pol_coefficients = pol_coefficients,        pol_degrees = pol_degrees,       mean = mean, sd = sd,       tr_left = tr_left,        tr_right = tr_right)# Condition second component to be 0.5# Substitute x second component with conditional value 0.5x <- matrix(c(0.1, 0.5, 0.3), nrow = 1)# Set TRUE to the second component indicating that it is conditionedgiven_ind <- c(FALSE, TRUE, FALSE)# Calculate conditional (on x2 = 0.5) density approximation at point xdtrhpa(x = x,       pol_coefficients = pol_coefficients,        pol_degrees = pol_degrees,       mean = mean, sd = sd,       given_ind = given_ind,       tr_left = tr_left, tr_right = tr_right)# Consider third component marginal distribution# conditioned on the second component 0.5 value# Set TRUE to the first component indicating that it is omittedomit_ind <- c(TRUE, FALSE, FALSE)# Calculate conditional (on X2 = 0.5) marginal (for X3) # density approximation at point xdtrhpa(x = x,       pol_coefficients = pol_coefficients,        pol_degrees = pol_degrees,       mean = mean, sd = sd,       given_ind = given_ind, omit_ind = omit_ind,       tr_left = tr_left, tr_right = tr_right)       ## Example demonstrating itrhpa function application.## Let's approximate some three truncated random variables (X1, X2, X3) joint ## interval distribution function at lower and upper points (0,1, 0.2, 0.3) ## and (0,4, 0.5, 0.6) correspondingly with Hermite polynomial of (1 ,2, 3) ## degrees which polynomial coefficients equal 1 except coefficient## related to x1*(x^3) polynomial element which equals 2. Also suppose ## that normal density related mean vector equals (1.1, 1.2, 1.3) while ## standard deviations vector is (2.1, 2.2, 2.3). 
Suppose that lower and ## upper truncation are (-1.1,-1.2,-1.3) and (1.1,1.2,1.3) correspondingly.# Prepare initial valuesx_lower <- matrix(c(0.1, 0.2, 0.3), nrow=1)x_upper <- matrix(c(0.4, 0.5, 0.6), nrow=1)tr_left = matrix(c(-1.1,-1.2,-1.3), nrow = 1)tr_right = matrix(c(1.1,1.2,1.3), nrow = 1)mean <- c(1.1, 1.2, 1.3)sd <- c(2.1, 2.2, 2.3)pol_degrees <- c(1, 2, 3)# Create polynomial powers and indexes correspondence matrixpol_ind <- polynomialIndex(pol_degrees)# Set all polynomial coefficients to 1pol_coefficients <- rep(1, ncol(pol_ind))pol_degrees_n <- length(pol_degrees)# Assign coefficient 2 to the polynomial element (x1 ^ 1)*(x2 ^ 0)*(x3 ^ 2)pol_coefficients[apply(pol_ind, 2, function(x) all(x == c(1, 0, 2)))] <- 2# Visualize correspondence between polynomial # elements and their coefficientsas.data.frame(rbind(pol_ind, pol_coefficients),              row.names = c("x1 power", "x2 power",                             "x3 power", "coefficients"),              optional = TRUE)printPolynomial(pol_degrees, pol_coefficients)# Calculate intdf approximation at points x_lower and x_upperitrhpa(x_lower = x_lower, x_upper = x_upper,        pol_coefficients = pol_coefficients,        pol_degrees = pol_degrees,       mean = mean, sd = sd,       tr_left = tr_left, tr_right = tr_right)    # Condition second component to be 0.7# Substitute x second component with conditional value 0.7x_upper <- matrix(c(0.4, 0.7, 0.6), nrow = 1)# Set TRUE to the second component indicating that it is conditionedgiven_ind <- c(FALSE, TRUE, FALSE)# Calculate conditional (on X2 = 0.5) intdf # approximation at points x_lower and x_upperitrhpa(x_lower = x_lower, x_upper = x_upper,       pol_coefficients = pol_coefficients,        pol_degrees = pol_degrees,       mean = mean, sd = sd,       given_ind = given_ind,       tr_left = tr_left, tr_right = tr_right)    # Consider third component marginal distribution# conditioned on the second component 0.7 value# Set TRUE to the first component indicating that it is omittedomit_ind <- c(TRUE, FALSE, FALSE)# Calculate conditional (on X2 = 0.5) marginal (for X3) intdf # approximation at points x_lower and x_upperitrhpa(x_lower = x_lower, x_upper = x_upper,       pol_coefficients = pol_coefficients,        pol_degrees = pol_degrees,       mean = mean, sd = sd,       given_ind = given_ind, omit_ind = omit_ind,       tr_left = tr_left, tr_right = tr_right)       ## Example demonstrating dhpaDiff function application.## Let's approximate some three random variables (X1, X2, X3) joint density## function at point (0,1, 0.2, 0.3) with Hermite polynomial of (1,2,3)## degrees which polynomial coefficients equal 1 except coefficient related## to x1*(x^3) polynomial element which equals 2. Also suppose that normal## density related mean vector equals (1.1, 1.2, 1.3) while standard## deviations vector is (2.1, 2.2, 2.3). 
In this example let's calculate## density approximating function's gradient respect to various parameters# Prepare initial valuesx <- matrix(c(0.1, 0.2, 0.3), nrow = 1)mean <- c(1.1, 1.2, 1.3)sd <- c(2.1, 2.2, 2.3)pol_degrees <- c(1, 2, 3)# Create polynomial powers and indexes correspondence matrixpol_ind <- polynomialIndex(pol_degrees)# Set all polynomial coefficients to 1pol_coefficients <- rep(1, ncol(pol_ind))pol_degrees_n <- length(pol_degrees)# Assign coefficient 2 to the polynomial element (x1 ^ 1)*(x2 ^ 0)*(x3 ^ 2)pol_coefficients[apply(pol_ind, 2, function(x) all(x == c(1, 0, 2)))] <- 2# Visualize correspondence between polynomial# elements and their coefficientsas.data.frame(rbind(pol_ind, pol_coefficients),              row.names = c("x1 power", "x2 power",                             "x3 power", "coefficients"),              optional = TRUE)printPolynomial(pol_degrees, pol_coefficients)# Calculate density approximation gradient # respect to polynomial coefficients at point xdhpaDiff(x = x,         pol_coefficients = pol_coefficients,          pol_degrees = pol_degrees,         mean = mean, sd = sd)# Condition second component to be 0.5# Substitute x second component with conditional value 0.5x <- matrix(c(0.1, 0.5, 0.3), nrow = 1)# Set TRUE to the second component indicating that it is conditionedgiven_ind <- c(FALSE, TRUE, FALSE)# Calculate conditional (on x2 = 0.5) density approximation's # gradient respect to polynomial coefficients at point xdhpaDiff(x = x,         pol_coefficients = pol_coefficients,          pol_degrees = pol_degrees,         mean = mean, sd = sd,         given_ind = given_ind)# Consider third component marginal distribution# conditioned on the second component 0.5 value# Set TRUE to the first component indicating that it is omittedomit_ind <- c(TRUE, FALSE, FALSE)# Calculate conditional (on X2 = 0.5) marginal (for X3) density # approximation's gradient respect to:   # polynomial coefficientsdhpaDiff(x = x,         pol_coefficients = pol_coefficients,          pol_degrees = pol_degrees,         mean = mean, sd = sd,         given_ind = given_ind,          omit_ind = omit_ind)  # meandhpaDiff(x = x,         pol_coefficients = pol_coefficients,          pol_degrees = pol_degrees,         mean = mean, sd = sd,         given_ind = given_ind,          omit_ind = omit_ind,         type = "mean")  # sddhpaDiff(x = x,         pol_coefficients = pol_coefficients,          pol_degrees = pol_degrees,         mean = mean, sd = sd,         given_ind = given_ind,          omit_ind = omit_ind,         type = "sd") # xdhpaDiff(x = x,         pol_coefficients = pol_coefficients,          pol_degrees = pol_degrees,         mean = mean, sd = sd,         given_ind = given_ind,          omit_ind = omit_ind,         type = "x")         ## Example demonstrating ehpaDiff function application.## Let's approximate some three random variables (X1, X2, X3) expectation## of the form E((X1 ^ 3) * (x2 ^ 1) * (X3 ^ 2)) and calculate the gradient# Distribution parametersmean <- c(1.1, 1.2, 1.3)sd <- c(2.1, 2.2, 2.3)pol_degrees <- c(1, 2, 3)pol_coefficients_n <- prod(pol_degrees + 1)pol_coefficients <- rep(1, pol_coefficients_n)# Set powers for expectationexpectation_powers <- c(3, 1, 2)# Calculate expectation approximation gradient # respect to all parametersehpaDiff(pol_coefficients = pol_coefficients,          pol_degrees = pol_degrees,         mean = mean, sd = sd,         expectation_powers = expectation_powers,         type = "all")# Let's calculate gradient of E(X1 ^ 3 | (X2 = 1, 
X3 = 2))
x <- c(0, 1, 2)                  # x[1] may be arbitrary (not NA) value
expectation_powers <- c(3, 0, 0) # expectation_powers[2:3] may be
                                 # arbitrary (not NA) values
given_ind <- c(2, 3)
ehpaDiff(x = x,
         pol_coefficients = pol_coefficients,
         pol_degrees = pol_degrees,
         mean = mean, sd = sd,
         given_ind = given_ind,
         expectation_powers = expectation_powers,
         type = "all")

## Example demonstrating ihpaDiff function application.
## Let's approximate the joint interval distribution function (intdf) of
## three random variables (X1, X2, X3) at lower and upper points
## (0.1, 0.2, 0.3) and (0.4, 0.5, 0.6) correspondingly with a Hermite
## polynomial of degrees (1, 2, 3) whose coefficients equal 1 except the
## coefficient related to the x1 * (x3 ^ 2) polynomial element which equals 2.
## Also suppose that the normal density related mean vector equals
## (1.1, 1.2, 1.3) while the standard deviations vector is (2.1, 2.2, 2.3).
## In this example let's calculate the gradient of the interval distribution
## approximating function with respect to the polynomial coefficients.

# Prepare initial values
x_lower <- matrix(c(0.1, 0.2, 0.3), nrow = 1)
x_upper <- matrix(c(0.4, 0.5, 0.6), nrow = 1)
mean <- c(1.1, 1.2, 1.3)
sd <- c(2.1, 2.2, 2.3)
pol_degrees <- c(1, 2, 3)

# Create polynomial powers and indexes correspondence matrix
pol_ind <- polynomialIndex(pol_degrees)
# Set all polynomial coefficients to 1
pol_coefficients <- rep(1, ncol(pol_ind))
pol_degrees_n <- length(pol_degrees)
# Assign coefficient 2 to the polynomial element (x1 ^ 1) * (x2 ^ 0) * (x3 ^ 2)
pol_coefficients[apply(pol_ind, 2, function(x) all(x == c(1, 0, 2)))] <- 2
# Visualize correspondence between polynomial
# elements and their coefficients
as.data.frame(rbind(pol_ind, pol_coefficients),
              row.names = c("x1 power", "x2 power",
                            "x3 power", "coefficients"),
              optional = TRUE)
printPolynomial(pol_degrees, pol_coefficients)

# Calculate intdf approximation gradient with respect to
# polynomial coefficients at points x_lower and x_upper
ihpaDiff(x_lower = x_lower, x_upper = x_upper,
         pol_coefficients = pol_coefficients,
         pol_degrees = pol_degrees,
         mean = mean, sd = sd)

# Condition the second component to be 0.7
# Substitute x second component with conditional value 0.7
x_upper <- matrix(c(0.4, 0.7, 0.6), nrow = 1)
# Set TRUE for the second component indicating that it is conditioned
given_ind <- c(FALSE, TRUE, FALSE)
# Calculate conditional (on X2 = 0.7) intdf approximation gradient with
# respect to polynomial coefficients at points x_lower and x_upper
ihpaDiff(x_lower = x_lower, x_upper = x_upper,
         pol_coefficients = pol_coefficients,
         pol_degrees = pol_degrees,
         mean = mean, sd = sd,
         given_ind = given_ind)

# Consider the third component marginal distribution
# conditioned on the second component 0.7 value
# Set TRUE for the first component indicating that it is omitted
omit_ind <- c(TRUE, FALSE, FALSE)
# Calculate conditional (on X2 = 0.7) marginal (for X3) intdf approximation
# gradient with respect to:
  # polynomial coefficients
ihpaDiff(x_lower = x_lower, x_upper = x_upper,
         pol_coefficients = pol_coefficients,
         pol_degrees = pol_degrees,
         mean = mean, sd = sd,
         given_ind = given_ind, omit_ind = omit_ind)
  # mean
ihpaDiff(x_lower = x_lower, x_upper = x_upper,
         pol_coefficients = pol_coefficients,
         pol_degrees = pol_degrees,
         mean = mean, sd = sd,
         given_ind = given_ind, omit_ind = omit_ind,
         type = "mean")
  # sd
ihpaDiff(x_lower = x_lower, x_upper = x_upper,
         pol_coefficients = pol_coefficients,
         pol_degrees = pol_degrees,
         mean = mean, sd = sd,
         given_ind = given_ind, omit_ind = omit_ind,
         type = "sd")
  # x_lower
ihpaDiff(x_lower = x_lower, x_upper = x_upper,
         pol_coefficients = pol_coefficients,
         pol_degrees = pol_degrees,
         mean = mean, sd = sd,
         given_ind = given_ind, omit_ind = omit_ind,
         type = "x_lower")
  # x_upper
ihpaDiff(x_lower = x_lower, x_upper = x_upper,
         pol_coefficients = pol_coefficients,
         pol_degrees = pol_degrees,
         mean = mean, sd = sd,
         given_ind = given_ind, omit_ind = omit_ind,
         type = "x_upper")

## Examples demonstrating qhpa function application.

## Sub-example 1 - univariate distribution
## Consider random variable X

# Distribution parameters
mean <- 1
sd <- 2
pol_degrees <- 2
pol_coefficients <- c(1, 0.1, -0.01)
# The level of quantile
p <- 0.7
# Calculate quantile of X
qhpa(p = p,
     pol_coefficients = pol_coefficients,
     pol_degrees = pol_degrees,
     mean = mean, sd = sd)

## Sub-example 2 - marginal distribution
## Consider random vector (X1, X2) and quantile of X1

# Distribution parameters
mean <- c(1, 1.2)
sd <- c(2, 3)
pol_degrees <- c(2, 2)
pol_coefficients <- c(1, 0.1, -0.01, 0.2, 0.012,
                      0.0013, 0.0042, 0.00025, 0)
# The level of quantile
p <- 0.7
# Calculate quantile of X1
qhpa(p = p,
     pol_coefficients = pol_coefficients,
     pol_degrees = pol_degrees,
     mean = mean, sd = sd,
     omit_ind = 2)                          # set omitted variable index

## Sub-example 3 - marginal and conditional distribution
## Consider random vector (X1, X2, X3) and
## quantiles of X1|X3 and X1|(X2,X3)

mean <- c(1, 1.2, 0.9)
sd <- c(2, 3, 2.5)
pol_degrees <- c(1, 1, 1)
pol_coefficients <- c(1, 0.1, -0.01, 0.2, 0.012,
                      0.0013, 0.0042, 0.00025)
# The level of quantile
p <- 0.7
# Calculate quantile of X1|X3 = 0.2
qhpa(p = p,
     x = matrix(c(NA, NA, 0.2), nrow = 1),  # set any values to
                                            # unconditioned and
                                            # omitted components
     pol_coefficients = pol_coefficients,
     pol_degrees = pol_degrees,
     mean = mean, sd = sd,
     omit_ind = 2,                          # set omitted variable index
     given_ind = 3)                         # set conditioned variable index
# Calculate quantile of X1|(X2 = 0.5, X3 = 0.2)
qhpa(p = p,
     x = matrix(c(NA, 0.5, 0.2), nrow = 1), # set any values to
                                            # unconditioned and
                                            # omitted components
     pol_coefficients = pol_coefficients,
     pol_degrees = pol_degrees,
     mean = mean, sd = sd,
     given_ind = c(2, 3))                   # set conditioned
                                            # variables indexes

## Examples demonstrating rhpa function application.

# Set seed for reproducibility
set.seed(123)
# Distribution parameters
mean <- 1
sd <- 2
pol_degrees <- 2
pol_coefficients <- c(1, 0.1, -0.01)
# Simulate two observations from this distribution
rhpa(n = 2,
     pol_coefficients = pol_coefficients,
     pol_degrees = pol_degrees,
     mean = mean, sd = sd)

Fast pdf and cdf for standardized univariate PGN distribution

Description

This function uses fast algorithms to calculate densities and probabilities (along with their derivatives) of the standardized PGN distribution.

Usage

dhpa0(x, pc, mean = 0, sd = 1, is_parallel = FALSE, log = FALSE,
      is_validation = TRUE, is_grad = FALSE)

phpa0(x, pc, mean = 0, sd = 1, is_parallel = FALSE, log = FALSE,
      is_validation = TRUE, is_grad = FALSE)

Arguments

x

numeric vector of functions arguments.

pc

polynomial coefficients without the first term.

mean

expected value (mean) of the distribution.

sd

standard deviation of the distribution.

is_parallel

logical; if TRUE then multiple cores will be used for some calculations. Currently unavailable.

log

logical; if TRUE then probabilities p are given as log(p), or derivatives are taken with respect to log(p).

is_validation

logical value indicating whether function input arguments should be validated. Set it toFALSE for slightperformance boost (default value isTRUE).

is_grad

logical; if TRUE then the function additionally returns gradients with respect to x and pc (the default is FALSE).

Details

Functions dhpa0 and phpa0 are similar to dhpa and phpa correspondingly. However, there are two key differences. First, dhpa0 and phpa0 deal with the univariate PGN distribution only. Second, this distribution is standardized to zero mean and unit variance. Moreover, pc is similar to the pol_coefficients argument of dhpa but without the first component, i.e. pc = pol_coefficients[-1]. Also, mean and sd are not the arguments of the normal density but the actual mean and standard deviation of the resulting distribution. So if these arguments differ from 0 and 1 correspondingly then the standardized PGN distribution will be linearly transformed to have mean mean and standard deviation sd.
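As a hedged illustration of this standardization (a minimal numerical check, not part of the original documentation), the density returned by dhpa0 should integrate to one and, with mean = 1 and sd = 2, the resulting distribution should have approximately these first two moments:

pc <- c(0.5, -0.2)
f <- function(t) dhpa0(t, pc, mean = 1, sd = 2)$den
integrate(f, -Inf, Inf)$value                           # should be close to 1
m1 <- integrate(function(t) t * f(t), -Inf, Inf)$value  # should be close to 1
sqrt(integrate(function(t) (t - m1) ^ 2 * f(t),         # should be close to 2
               -Inf, Inf)$value)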

Value

Both functions return a list. Function dhpa0 returns a list with an element named "den" that is a numeric vector of density values. Function phpa0 returns a list with an element named "prob" that is a numeric vector of probabilities.

If is_grad = TRUE then elements "grad_x" and "grad_pc" will be added to the list, containing gradients with respect to the input argument x and the parameters pc correspondingly. If log = TRUE then additional elements will be added to the list containing density, probability and gradient values for the logarithms of the corresponding functions. These elements will be named "grad_x_log", "grad_pc_log", "grad_prob_log" and "grad_den_log".

Examples

# Calculate density and probability of standardized PGN
# distribution

# distribution parameters
pc <- c(0.5, -0.2)
# function arguments
x <- c(-0.3, 0.8, 1.5)
# probability density function
dhpa0(x, pc)
# cumulative distribution function
phpa0(x, pc)

# Additionally calculate gradients with respect to arguments
# and parameters of the PGN distribution
dhpa0(x, pc, is_grad = TRUE)
phpa0(x, pc, is_grad = TRUE)

# Let's denote by X standardized PGN random variable and repeat
# calculations for 2 * X + 1
dhpa0(x, pc, is_grad = TRUE, mean = 1, sd = 2)
phpa0(x, pc, is_grad = TRUE, mean = 1, sd = 2)

Semi-nonparametric maximum likelihood estimation

Description

This function performs semi-nonparametric (SNP)maximum likelihood estimation of unknown (possibly truncated) multivariate density using Hermite polynomial based approximating function proposed by Gallant and Nychka in 1987. Please, seedhpa 'Details' section to get more information concerning this approximating function.

Usage

hpaML(data, pol_degrees = numeric(0), tr_left = numeric(0),
      tr_right = numeric(0), given_ind = numeric(0), omit_ind = numeric(0),
      x0 = numeric(0), cov_type = "sandwich", boot_iter = 100L,
      is_parallel = FALSE, opt_type = "optim", opt_control = NULL,
      is_validation = TRUE)

Arguments

data

numeric matrix which rows are realizations of independent identically distributed random vectors while columns correspond tovariables.

pol_degrees

non-negative integer vector of polynomial degrees (orders).

tr_left

numeric vector of left (lower) truncation limits.

tr_right

numeric vector of right (upper) truncation limits.

given_ind

logical or numeric vector indicating whether the corresponding random vector component is conditioned. By default it is a logical vector of FALSE values. If given_ind[i] equals TRUE or i then the i-th column of x matrix will contain conditional values.

omit_ind

logical or numeric vector indicating whether correspondingrandom component is omitted. By default it is a logical vector ofFALSE values. Ifomit_ind[i] equalsTRUE ori then values ini-th column ofx matrix will be ignored.

x0

numeric vector of optimization routine initial values.Note thatx0=c(pol_coefficients[-1], mean, sd). Forpol_coefficients,mean andsd documentation seedhpa function.

cov_type

character determining the type of covariance matrix to be returned and used for summary. If cov_type = "hessian" then the negative inverse of the Hessian matrix will be applied. If cov_type = "gop" then the inverse of the Jacobian outer products will be used. If cov_type = "sandwich" (default) then the sandwich covariance matrix estimator will be applied. If cov_type = "bootstrap" then bootstrap with boot_iter iterations will be used. If cov_type = "hessianFD" or cov_type = "sandwichFD" then a (probably) more accurate but computationally demanding central difference Hessian approximation will be calculated for the inverse Hessian and sandwich estimators correspondingly. Central differences are computed via the analytically provided gradient. This Hessian matrix estimation approach seems to be less accurate than the BFGS approximation if the polynomial order is high (usually greater than 5).

boot_iter

the number of bootstrap iterationsforcov_type = "bootstrap" covariance matrix estimator type.

is_parallel

ifTRUE then multiple cores will beused for some calculations. It usually provides speed advantage forlarge enough samples (about more than 1000 observations).

opt_type

string value determining the type of the optimizationroutine to be applied. The default is"optim" meaning that BFGS methodfrom theoptim function will be applied.Ifopt_type = "GA" thenga function will beadditionally applied.

opt_control

a list containing arguments to be passed to theoptimization routine depending onopt_type argument value.Please see details to get additional information.

is_validation

logical value indicating whether function input arguments should be validated. Set it toFALSE for slightperformance boost (default value isTRUE).

Details

The Hermite polynomial based density approximation approach was proposed by A. Gallant and D. W. Nychka in 1987. The main idea is to approximate an unknown distribution density with a scaled Hermite polynomial. For more information please refer to the literature listed below.

Let's use notations introduced indhpa 'Details' section. FunctionhpaML maximizes the followingquasi log-likelihood function:

\ln L(\alpha, \mu, \sigma; x) = \sum\limits_{i=1}^{n} \ln\left(f_{\xi}(x_{i};\alpha, \mu, \sigma)\right),

where (in addition to previously defined notations):

x_{i} - are observations i.e.data matrix rows.

n - is sample size i.e. the number ofdata matrix rows.

Arguments pol_degrees, tr_left, tr_right, given_ind and omit_ind affect the form of f_{\xi}\left(x_{i};\alpha, \mu, \sigma\right) in a way described in dhpa 'Details' section. Note that a change of given_ind and omit_ind values may result in an estimator whose statistical properties have not been rigorously investigated yet.

The first polynomial coefficient (zero powers) is set to 1 for identification purposes, i.e. \alpha_{(0,...,0)}=1.

AllNA andNaN values will be removed fromdata matrix.

The function calculates standard errors via sandwich estimatorand significance levels are reported taking into account quasi maximumlikelihood estimator (QMLE) asymptotic normality. If one wants to switchfrom QMLE to semi-nonparametric estimator (SNPE) during hypothesis testingthen covariance matrix should be estimated again using bootstrap.
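A hedged sketch of this bootstrap re-estimation (the sample size and number of iterations are illustrative only and kept small):

# Re-estimate the covariance matrix via bootstrap (may take a while)
set.seed(123)
x <- matrix(rt(500, df = 5), ncol = 1)
fit_boot <- hpaML(data = x, pol_degrees = 2,
                  cov_type = "bootstrap", boot_iter = 20)
summary(fit_boot)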

This function maximizes (quasi) log-likelihood function viaoptim function setting itsmethod argument to "BFGS". Ifopt_type = "GA" then geneticalgorithm fromga functionwill be additionally (afteroptim putting itssolution (par) intosuggestions matrix) applied in order to perform global optimization. Note that global optimization takesmuch more time (usually minutes but sometimes hours or even days). The number of iterations and population size of the genetic algorithmwill grow linearly along with the number of estimated parameters. If it seems that global maximum has not been found then itis possible to continue the search restarting the function setting its input argumentx0 tox1 output value. Note thatifcov_type = "bootstrap" thengafunction will not be used for bootstrap iterations since itmay be extremely time consuming.
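A hedged sketch of such a warm restart, assuming (as stated above) that x1 is the estimates vector returned by hpaML and that x0 follows the ordering c(pol_coefficients[-1], mean, sd):

set.seed(123)
x <- matrix(rt(1000, df = 5), ncol = 1)
fit_1 <- hpaML(data = x, pol_degrees = 3)
# Continue the search from the previous solution
fit_2 <- hpaML(data = x, pol_degrees = 3, x0 = fit_1$x1)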

Ifopt_type = "GA" thenopt_control should be thelist containing the values to be passed togafunction. It is possible to pass argumentslower,upper,popSize,pcrossover,pmutation,elitism,maxiter,suggestions,optim,optimArgs,seed andmonitor. Note that it is possible to setpopulation,selection,crossover andmutation arguments changingga default parameters viagaControl function. These arguments information reported inga.In order to provide manual values forlower andupper boundsplease follow parameters ordering mentioned above for thex0 argument. If these bounds are not provided manually thenthey (except those related to the polynomial coefficients)will depend on the estimates obtainedby local optimization viaoptim function(this estimates will be in the middlebetweenlower andupper).Specifically for each sd parameterlower (upper) boundis 5 times lower (higher) than thisparameteroptim estimate.For each mean and regression coefficient parameter its lower and upper bounds deviate from correspondingoptim estimateby two absolute values of this estimate.Finally, lower and upper bounds for each polynomialcoefficient are-10 and10 correspondingly (do not dependon theiroptim estimates).
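A hedged sketch of passing genetic algorithm settings through opt_control when opt_type = "GA" (the values are illustrative and deliberately small; the call itself is commented out because global optimization may be time consuming):

ga_settings <- list(popSize = 20, maxiter = 10, seed = 123)
# fit_ga <- hpaML(data = x, pol_degrees = 3, opt_type = "GA",
#                 opt_control = ga_settings)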

The following arguments differ from their defaults in ga:

The argumentspopSize andmaxiter ofga function have been set proportional to the number ofestimated polynomial coefficients:

Value

This function returns an object of class "hpaML".

An object of class "hpaML" is a list containing the following components:

References

A. Gallant and D. W. Nychka (1987) <doi:10.2307/1913241>

See Also

summary.hpaML,predict.hpaML,logLik.hpaML,plot.hpaML

Examples

## Approximate Student (t) distribution# Set seed for reproducibilityset.seed(123)# Simulate 5000 realizations of Student distribution # with 5 degrees of freedomn <- 5000df <- 5x <- matrix(rt(n, df), ncol = 1)pol_degrees <- c(4)# Apply pseudo maximum likelihood routineml_result <- hpa::hpaML(data = x, pol_degrees = pol_degrees)summary(ml_result)# Get predicted probabilites (density values) approximationspredict(ml_result)# Plot density approximationplot(ml_result)## Approximate chi-squared distribution# Set seed for reproducibilityset.seed(123)# Simulate 5000 realizations of chi-squared distribution # with 5 degrees of freedomn <- 5000df <- 5x <- matrix(rchisq(n, df), ncol = 1)pol_degrees <- c(5)# Apply pseudo maximum likelihood routineml_result <- hpaML(data = x, pol_degrees = as.vector(pol_degrees), tr_left = 0)summary(ml_result)# Get predicted probabilites (density values) approximationspredict(ml_result)# Plot density approximationplot(ml_result)## Approximate multivariate Student (t) distribution## Note that calculations may take up to a minute# Set seed for reproducibilityset.seed(123)# Simulate 5000 realizations of three dimensional Student distribution # with 5 degrees of freedomlibrary("mvtnorm")cov_mat <- matrix(c(1, 0.5, -0.5, 0.5, 1, 0.5, -0.5, 0.5, 1), ncol = 3)x <- rmvt(n = 5000, sigma = cov_mat, df = 5)# Estimate approximating joint distribution parametersml_result <- hpaML(data = x, pol_degrees = c(1, 1, 1))# Get summarysummary(ml_result)# Get predicted values for joint density functionpredict(ml_result)# Plot density approximation for the# second random variableplot(ml_result, ind = 2)# Plot density approximation for the# second random variable conditioning# on x1 = 1plot(ml_result, ind = 2, given = c(1, NA, NA))## Approximate Student (t) distribution and plot densities approximated## under different hermite polynomial degrees against ## true density (of Student distribution)# Simulate 5000 realizations of t-distribution with 5 degrees of freedomn <- 5000df <- 5x <- matrix(rt(n, df), ncol=1)# Apply pseudo maximum likelihood routine# Create matrix of lists where i-th element contains hpaML results for K=iml_result <- matrix(list(), 4, 1)for(i in 1:4){ ml_result[[i]] <- hpa::hpaML(data = x, pol_degrees = i)}# Generate test valuestest_values <- seq(qt(0.001, df), qt(0.999, df), 0.001)n0 <- length(test_values)# t-distribution density function at test values pointstrue_pred <- dt(test_values, df)# Create matrix of lists where i-th element contains # densities predictions for K=iPGN_pred <- matrix(list(), 4, 1)for(i in 1:4){  PGN_pred[[i]] <- predict(object = ml_result[[i]],                            newdata = matrix(test_values, ncol=1))}# Plot the resultlibrary("ggplot2")# prepare the datah <- data.frame("values" = rep(test_values,5),                "predictions" = c(PGN_pred[[1]],PGN_pred[[2]],                                  PGN_pred[[3]],PGN_pred[[4]],                                  true_pred),                 "Density" = c(                  rep("K=1",n0), rep("K=2",n0),                  rep("K=3",n0), rep("K=4",n0),                  rep("t-distribution",n0))                  )                  # build the plotggplot(h, aes(values, predictions)) + geom_point(aes(color = Density)) +  theme_minimal() + theme(legend.position = "top",                           text = element_text(size=26),                          legend.title=element_text(size=20),                           legend.text=element_text(size=28)) +  guides(colour = guide_legend(override.aes = list(size=10))  
)# Get informative estimates summary for K=4summary(ml_result[[4]])

Perform semi-nonparametric selection model estimation

Description

This function performs semi-nonparametric (SNP) maximum likelihood estimation of sample selection model using Hermite polynomial based approximating function proposed by Gallant and Nychka in 1987. Please, seedhpa 'Details' section to get more information concerning this approximating function.

Usage

hpaSelection(selection, outcome, data, selection_K = 1L, outcome_K = 1L,
             pol_elements = 3L, is_Newey = FALSE, x0 = numeric(0),
             is_Newey_loocv = FALSE, cov_type = "sandwich", boot_iter = 100L,
             is_parallel = FALSE, opt_type = "optim", opt_control = NULL,
             is_validation = TRUE)

Arguments

selection

an object of class "formula" (or one that can be coerced to that class): a symbolic description of the selection equation form. All variables inselection should be numeric vectors of the same length.

outcome

an object of class "formula" (or one that can be coerced to that class): a symbolic description of the outcome equation form. All variables inoutcome should be numeric vectors of the same length.

data

data frame containing the variables in the model.

selection_K

non-negative integer representing polynomial degree related to selection equation.

outcome_K

non-negative integer representing polynomial degree related to outcome equation.

pol_elements

number of conditional expectation approximating terms for Newey's method. Ifis_Newey_loocv isTRUE then determines maximum number of these terms during leave-one-out cross-validation.

is_Newey

logical; if TRUE then returns only Newey's method estimation results (default value is FALSE).

x0

numeric vector of optimization routine initial values.Note thatx0 = c(pol_coefficients[-1], mean, sd, z_coef, y_coef).

is_Newey_loocv

logical; if TRUE then number of conditional expectation approximating terms for Newey's method will be selectedbased on leave-one-out cross-validation criteria iterating through 0 to pol_elements number of these terms.

cov_type

character determining the type of covariance matrix to be returned and used for summary. If cov_type = "hessian" then the negative inverse of the Hessian matrix will be applied. If cov_type = "gop" then the inverse of the Jacobian outer products will be used. If cov_type = "sandwich" (default) then the sandwich covariance matrix estimator will be applied. If cov_type = "bootstrap" then bootstrap with boot_iter iterations will be used. If cov_type = "hessianFD" or cov_type = "sandwichFD" then a (probably) more accurate but computationally demanding central difference Hessian approximation will be calculated for the inverse Hessian and sandwich estimators correspondingly. Central differences are computed via the analytically provided gradient. This Hessian matrix estimation approach seems to be less accurate than the BFGS approximation if the polynomial order is high (usually greater than 5).

boot_iter

the number of bootstrap iterationsforcov_type = "bootstrap" covariance matrix estimator type.

is_parallel

ifTRUE then multiple cores will beused for some calculations. It usually provides speed advantage forlarge enough samples (about more than 1000 observations).

opt_type

string value determining the type of the optimizationroutine to be applied. The default is"optim" meaning that BFGS methodfrom theoptim function will be applied.Ifopt_type = "GA" thenga function will beadditionally applied.

opt_control

a list containing arguments to be passed to theoptimization routine depending onopt_type argument value.Please see details to get additional information.

is_validation

logical value indicating whether function input arguments should be validated. Set it toFALSE for slightperformance boost (default value isTRUE).

Details

The Hermite polynomial based density approximation approach was proposed by A. Gallant and D. W. Nychka in 1987. The main idea is to approximate an unknown distribution density with a scaled Hermite polynomial. For more information please refer to the literature listed below.

Let's use notations introduced indhpa 'Details' section. FunctionhpaSelection maximizes the followingquasi log-likelihood function:

\ln L(\gamma, \beta, \alpha, \mu, \sigma; x) = \sum\limits_{i:z_{i}=1} \ln\left(\overline{F}_{\left(\xi_{1}|\xi_{2}=y_{i}-x_{i}^{o}\beta\right)}\left(-x_{i}^{s}\gamma, \infty;\alpha, \mu, \sigma\right)f_{\xi_{2}}\left(y_{i}-x_{i}^{o}\beta\right)\right)+

+\sum\limits_{i:z_{i}=0} \ln\left(\overline{F}_{\xi}(-\infty, -x_{i}^{s}\gamma;\alpha, \mu, \sigma)\right),

where (in addition to previously defined notations):

x_{i}^{s} - is row vector of selection equation regressors derived fromdata according toselection formula.

x_{i}^{o} - is row vector of outcome equation regressors derived fromdata according tooutcome formula.

\gamma - is column vector of selection equation regression coefficients (constant will not be added by default).

\beta - is column vector of outcome equation regression coefficients (constant will not be added by default).

z_{i} - binary (0 or 1) dependent variable defined inselection formula.

y_{i} - continuous dependent variable defined inoutcome formula.

Note that\xi is two dimensional andselection_K correspondstoK_{1} whileoutcome_K determinesK_{2}.

The first polynomial coefficient (zero powers) is set to 1 for identification purposes, i.e. \alpha_{0}=1.

Rows in data corresponding to variables mentioned in selection and outcome formulas which have at least one NA value will be ignored. The exception is the continuous dependent variable y, which may have NA values for observations where z_{i}=0.

Note that coefficient for the firstindependent variable inselection will be fixedto 1 i.e.\gamma_{1}=1.

All variables mentioned inselection andoutcome should be numeric vectors.

The function calculates standard errors via sandwich estimatorand significance levels are reported taking into account quasi maximumlikelihood estimator (QMLE) asymptotic normality. If one wants to switchfrom QMLE to semi-nonparametric estimator (SNPE) during hypothesis testingthen covariance matrix should be estimated again using bootstrap.

Initial values for the optimization routine are obtained by Newey's method (see the reference below). In order to obtain initial values via least squares please set pol_elements = 0. Initial values for the selection equation are obtained via the hpaBinary function setting K to selection_K.
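A hedged sketch of these initialization options (the calls are commented out; z, y, x1, x2, x3 and the data frame h are assumed to be defined as in the simulated-data example below):

# Newey's two-step estimates only, without the maximum likelihood step
# newey_fit <- hpaSelection(selection = z ~ x1 + x2, outcome = y ~ x1 + x3,
#                           data = h, is_Newey = TRUE)
# Least squares based initial values
# ls_fit <- hpaSelection(selection = z ~ x1 + x2, outcome = y ~ x1 + x3,
#                        data = h, pol_elements = 0)
# Select the number of Newey's correction terms by leave-one-out cross-validation
# cv_fit <- hpaSelection(selection = z ~ x1 + x2, outcome = y ~ x1 + x3,
#                        data = h, pol_elements = 5, is_Newey_loocv = TRUE)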

Note that the selection equation dependent variable should have exactly two levels (0 and 1), where "0" corresponds to the selection outcome which leads to unobservable values of the dependent variable in the outcome equation.

This function maximizes (quasi) log-likelihood function viaoptim function setting itsmethod argument to "BFGS". Ifopt_type = "GA" then geneticalgorithm fromga functionwill be additionally (afteroptim putting itssolution (par) intosuggestions matrix) applied in order to perform global optimization. Note that global optimization takesmuch more time (usually minutes but sometimes hours or even days). The number of iterations and population size of the genetic algorithmwill grow linearly along with the number of estimated parameters. If it seems that global maximum has not been found then itis possible to continue the search restarting the function setting its input argumentx0 tox1 output value. Note thatifcov_type = "bootstrap" thengafunction will not be used for bootstrap iterations since itmay be extremely time consuming.

Ifopt_type = "GA" thenopt_control should be thelist containing the values to be passed togafunction. It is possible to pass argumentslower,upper,popSize,pcrossover,pmutation,elitism,maxiter,suggestions,optim,optimArgs,seed andmonitor. Note that it is possible to setpopulation,selection,crossover andmutation arguments changingga default parameters viagaControl function. These arguments information reported inga.In order to provide manual values forlower andupper boundsplease follow parameters ordering mentioned above for thex0 argument. If these bounds are not provided manually thenthey (except those related to the polynomial coefficients)will depend on the estimates obtainedby local optimization viaoptim function(this estimates will be in the middlebetweenlower andupper).Specifically for each sd parameterlower (upper) boundis 5 times lower (higher) than thisparameteroptim estimate.For each mean and regression coefficient parameter its lower and upper bounds deviate from correspondingoptim estimateby two absolute values of this estimate.Finally, lower and upper bounds for each polynomialcoefficient are-10 and10 correspondingly (do not dependon theiroptim estimates).

The following arguments differ from their defaults in ga:

Let's denote byn_reg the number of regressorsincluded into theselection andoutcome formulas.The argumentspopSize andmaxiter ofga function have been set proportional to the number ofestimated polynomial coefficients and independent variables:

Value

This function returns an object of class "hpaSelection".

An object of class "hpaSelection" is a list containing the following components:

Abovementioned listNewey has class "hpaNewey" and contains the following components:

Abovementioned listre_moments contains the following components:

References

A. Gallant and D. W. Nychka (1987) <doi:10.2307/1913241>

W. K. Newey (2009) <https://doi.org/10.1111/j.1368-423X.2008.00263.x>

Mroz T. A. (1987) <doi:10.2307/1911029>

See Also

summary.hpaSelection,predict.hpaSelection,plot.hpaSelection,logLik.hpaSelection

Examples

## Let's estimate wage equation accounting for non-random selection.## See the reference to Mroz TA (1987) to get additional details about## the data this examples use# Prepare datalibrary("sampleSelection")data("Mroz87")h = data.frame("kids" = as.numeric(Mroz87$kids5 + Mroz87$kids618 > 0),"age" = as.numeric(Mroz87$age),"faminc" = as.numeric(Mroz87$faminc),"educ" = as.numeric(Mroz87$educ),"exper" = as.numeric(Mroz87$exper),"city" = as.numeric(Mroz87$city),"wage" = as.numeric(Mroz87$wage),"lfp" = as.numeric(Mroz87$lfp))# Estimate model parametersmodel <- hpaSelection(selection = lfp ~ educ + age + I(age ^ 2) +                                         kids + log(faminc),                      outcome = log(wage) ~ exper + I(exper ^ 2) +                                             educ + city,                                  selection_K = 2, outcome_K = 3,                                   data = h,                                   pol_elements = 3, is_Newey_loocv = TRUE)summary(model)# Plot outcome equation random errors densityplot(model, type = "outcome")# Plot selection equation random errors densityplot(model, type = "selection")## Estimate semi-nonparametric sample selection model## parameters on simulated data given chi-squared random errorsset.seed(100)library("mvtnorm")# Sample sizen <- 1000# Simulate independent variablesX_rho <- 0.5X_sigma <- matrix(c(1, X_rho, X_rho,                    X_rho, 1, X_rho,                     X_rho,X_rho,1),                   ncol=3)X <- rmvnorm(n=n, mean = c(0,0,0),              sigma = X_sigma)# Simulate random errorsepsilon <- matrix(0, n, 2)epsilon_z_y <- rchisq(n, 5)epsilon[, 1] <- (rchisq(n, 5) + epsilon_z_y) * (sqrt(3/20)) - 3.8736epsilon[, 2] <- (rchisq(n, 5) + epsilon_z_y) * (sqrt(3/20)) - 3.8736# Simulate selection equationz_star <- 1 + 1 * X[,1] + 1 * X[,2] + epsilon[,1]z <- as.numeric((z_star > 0))# Simulate outcome equationy_star <- 1 + 1 * X[,1] + 1 * X[,3] + epsilon[,2]z <- as.numeric((z_star > 0))y <- y_stary[z==0] <- NAh <- as.data.frame(cbind(z, y, X))names(h) <- c("z", "y", "x1", "x2", "x3")# Estimate parametersmodel <- hpaSelection(selection = z ~ x1 + x2,                       outcome = y ~ x1 + x3,                      data = h,                       selection_K = 1, outcome_K = 3)summary(model)# Get conditional predictions for outcome equationmodel_pred_c <- predict(model, is_cond = TRUE)# Conditional predictions y|z=1model_pred_c$y_1# Conditional predictions y|z=0model_pred_c$y_0# Get unconditional predictions for outcome equationmodel_pred_u <- predict(model, is_cond = FALSE)model_pred_u$y# Get conditional predictions for selection equation# Note that for z=0 these predictions are NApredict(model, is_cond = TRUE, type = "selection")# Get unconditional predictions for selection equationpredict(model, is_cond = FALSE, type = "selection")

Probabilities and Moments Hermite Spline Approximation

Description

A set of functions similar to the dhpa-like functions. The difference is that instead of a polynomial these functions utilize a spline.

Usage

dhsa(x, m, knots, mean = 0, sd = 1, log = FALSE)

ehsa(m, knots, mean = 0, sd = 1, power = 1)

Arguments

x

numeric vector of values for which the function should be estimated.

m

numeric matrix which rows correspond to spline intervalswhile columns represent variables powers. Therefore the element in i-th row and j-th column represents the coefficient associated withthe variable that 1) belongs to the i-th interval i.e. between i-th and(i + 1)-th knots 2) raised to the power of (j - 1).

knots

sorted in ascending order numeric vector representingknots of the spline.

mean

expected value of a normal distribution.

sd

standard deviation of a normal distribution.

log

logical; if TRUE then probabilities p are given as log(p), or derivatives are taken with respect to log(p).

power

non-negative integer representing the power of the expected value i.e. E(X ^ power) will be estimated.

Details

In contrast to the dhpa-like functions these functions may deal with univariate distributions only. In the future these functions will be generalized to work with multivariate distributions. The main idea of these functions is to use a squared spline instead of a squared polynomial in order to provide greater numeric stability and approximation accuracy. To provide spline parameters please use the m and knots arguments (i.e. instead of the pol_degrees and pol_coefficients arguments that were used to specify the polynomial for dhpa-like functions).

Value

Function dhsa returns a numeric vector of density values of the same length as x. Function ehsa returns the moment value.

See Also

dhpa,bsplineGenerate

Examples

## Examples demonstrating dhsa and ehsa functions application.

# Generate b-splines
b <- bsplineGenerate(knots = c(-2.1, 1.5, 1.5, 2.2, 3.7, 4.2, 5),
                     degree = 3)
# Combine b-splines into a spline
spline <- bsplineComb(splines = b, weights = c(1.6, -1.2, 3.2))
# Assign parameters using the spline created above
knots <- spline$knots
m <- spline$m
mean <- 1
sd <- 2

# Estimate the density at particular points
x <- c(2, 3.7, 8)
dhsa(x,
     m = m, knots = knots,
     mean = mean, sd = sd)

# Calculate expected value
ehsa(m = m, knots = knots,
     mean = mean, sd = sd,
     power = 1)

# Evaluate the third moment
ehsa(m = m, knots = knots,
     mean = mean, sd = sd,
     power = 3)

Calculates log-likelihood for "hpaBinary" object

Description

This function calculates log-likelihood for "hpaBinary" object

Usage

## S3 method for class 'hpaBinary'
logLik(object, ...)

Arguments

object

Object of class "hpaBinary"

...

further arguments (currently ignored)


Calculates log-likelihood for "hpaML" object

Description

This function calculates log-likelihood for "hpaML" object

Usage

## S3 method for class 'hpaML'
logLik(object, ...)

Arguments

object

Object of class "hpaML"

...

further arguments (currently ignored)


Calculates log-likelihood for "hpaSelection" object

Description

This function calculates log-likelihood for "hpaSelection" object

Usage

## S3 method for class 'hpaSelection'
logLik(object, ...)

Arguments

object

Object of class "hpaSelection"

...

further arguments (currently ignored)


Calculates log-likelihood for "hpaBinary" object

Description

This function calculates log-likelihood for "hpaBinary" object

Usage

logLik_hpaBinary(object)

Arguments

object

Object of class "hpaBinary"


Calculates log-likelihood for "hpaML" object

Description

This function calculates log-likelihood for "hpaML" object

Usage

logLik_hpaML(object)

Arguments

object

Object of class "hpaML"


Calculates log-likelihood for "hpaSelection" object

Description

This function calculates log-likelihood for "hpaSelection" object

Usage

logLik_hpaSelection(object)

Arguments

object

Object of class "hpaSelection"


Calculates multivariate empirical cumulative distribution function

Description

This function calculates the multivariate empirical cumulative distribution function at each point of the sample.

Usage

mecdf(x)

Arguments

x

numeric matrix which rows are observations
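A hedged usage sketch (it is assumed here that mecdf returns one empirical cdf value per row of x):

set.seed(123)
x <- cbind(rnorm(100), rnorm(100))
head(mecdf(x))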


Calculate k-th order moment of normal distribution

Description

This function recursively calculates k-th order moment of normal distribution.

Usage

normalMoment(k = 0L, mean = 0, sd = 1, return_all_moments = FALSE,
             is_validation = TRUE, is_central = FALSE, diff_type = "NO")

Arguments

k

non-negative integer moment order.

mean

numeric expected value.

sd

positive numeric standard deviation.

return_all_moments

logical; ifTRUE, function returns (k+1)-dimensional numeric vector of moments of normally distributed random variable with mean =mean and standard deviation =sd. Note that i-th vector's component value corresponds to the (i-1)-th moment.

is_validation

logical value indicating whether function input arguments should be validated. Set it toFALSE for slightperformance boost (default value isTRUE).

is_central

logical; ifTRUE, then central moments will be calculated.

diff_type

string value indicating the type of the argumentthe moment should be differentiated respect to.Default value is"NO" so the moments itself will be returned. Alternative values are"mean" and"sd". Also"x_lower" and"x_upper" values are available fortruncatedNormalMoment.

Details

This function estimatesk-th order moment of normal distribution which mean equals tomean and standard deviation equals tosd.

Note that parameterk value automatically converts to integer. So passing non-integerk value will not cause any errors but the calculations will be performed for roundedk value only.

Value

This function returnsk-th order moment ofnormal distribution which mean equals tomean and standard deviation issd. Ifreturn_all_moments isTRUE then see this argument description above for output details.

Examples

## Calculate 5-th order moment of normal random variable whose
## mean equals 3 and standard deviation is 5.

# 5-th moment
normalMoment(k = 5, mean = 3, sd = 5)
# (0-5)-th moments
normalMoment(k = 5, mean = 3, sd = 5, return_all_moments = TRUE)
# 5-th moment derivative with respect to mean
normalMoment(k = 5, mean = 3, sd = 5, diff_type = "mean")
# 5-th moment derivative with respect to sd
normalMoment(k = 5, mean = 3, sd = 5, diff_type = "sd")

Plot hpaBinary random errors approximated density

Description

Plot hpaBinary random errors approximated density

Usage

## S3 method for class 'hpaBinary'
plot(x, y = NULL, ...)

Arguments

x

Object of class "hpaBinary"

y

this parameter currently ignored

...

further arguments to be passed toplotfunction.


Plot approximated marginal density using hpaML output

Description

Plot approximated marginal density using hpaML output

Usage

## S3 method for class 'hpaML'
plot(x, y = NULL, ..., ind = 1, given = NULL)

Arguments

x

Object of class "hpaML"

y

this parameter currently ignored

...

further arguments to be passed toplotfunction.

ind

index of random variable for whichapproximation to marginal density should be plotted

given

numeric vector of the same length as given_indfromx. Determines conditional values for the correspondingcomponents.NA values ingiven vector indicate thatcorresponding random variable is not conditioned. By default allgiven components areNA so unconditional marginaldensity will be plotted for theind-th random variable.
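A hedged usage sketch of the ind and given arguments (ml_result is assumed to be an hpaML fit on three random variables, as in the hpaML examples):

# plot(ml_result, ind = 2)                        # marginal density of the 2nd variable
# plot(ml_result, ind = 2, given = c(1, NA, NA))  # the same density conditional on x1 = 1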


Plot hpaSelection random errors approximated density

Description

Plot hpaSelection random errors approximated density

Usage

## S3 method for class 'hpaSelection'
plot(x, y = NULL, ..., type = "outcome")

Arguments

x

Object of class "hpaSelection"

y

this parameter currently ignored

...

further arguments to be passed toplotfunction.

type

character; if "outcome" then function plots the graph for outcome equation random errors, if "selection" then plot for selection equation random errors will be generated.


Calculate normal cdf in parallel

Description

For each value in vector x calculate (in parallel, if requested) the cumulative distribution function of the normal distribution with mean equal to mean and standard deviation equal to sd.

Usage

pnorm_parallel(x, mean = 0, sd = 1, is_parallel = FALSE)

Arguments

x

vector of quantiles: should be numeric vector,not just double value.

mean

double value.

sd

double positive value.

is_parallel

ifTRUE then multiple cores will beused for some calculations. It usually provides speed advantage forlarge enough samples (about more than 1000 observations).
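A hedged usage sketch (values are illustrative):

pnorm_parallel(x = c(-1.5, 0, 1.5), mean = 0, sd = 1, is_parallel = FALSE)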


Multivariate Polynomial Representation

Description

Function polynomialIndex provides a matrix which allows one to iterate through the elements of a multivariate polynomial being aware of these elements' powers. So the (i, j)-th element of this matrix is the power of the i-th variable in the j-th multivariate polynomial element.

FunctionprintPolynomial prints multivariate polynomialgiven its degrees (pol_degrees) and coefficients (pol_coefficients) vectors.

Usage

polynomialIndex(pol_degrees = numeric(0), is_validation = TRUE)

printPolynomial(pol_degrees, pol_coefficients, is_validation = TRUE)

Arguments

pol_degrees

non-negative integer vector of polynomial degrees (orders).

is_validation

logical value indicating whether function input arguments should be validated. Set it toFALSE for slightperformance boost (default value isTRUE).

pol_coefficients

numeric vector of polynomial coefficients.

Details

Multivariate polynomial of degrees(K_{1},...,K_{m}) (pol_degrees) has the form:

a_{(0,...,0)}x_{1}^{0}*...*x_{m}^{0}+ ... + a_{(K_{1},...,K_{m})}x_{1}^{K_{1}}*...*x_{m}^{K_{m}},

wherea_{(i_{1},...,i_{m})} are polynomial coefficients, whilepolynomial elements are:

a_{(i_{1},...,i_{m})}x_{1}^{i_{1}}*...*x_{m}^{i_{m}},

where(i_{1},...,i_{m}) are polynomial element's powers correspondingto variables(x_{1},...,x_{m}) respectively. Note thati_{j}\in \{0,...,K_{j}\}.

FunctionprintPolynomial removes polynomial elements which coefficients are zero and variables which powers are zero. Output may contain long coefficients representation as they are not rounded.

Value

Function polynomialIndex returns a matrix whose rows correspond to the variables while columns correspond to the polynomial elements. So the (i, j)-th element of this matrix is the power of the x_{i} variable in the j-th polynomial element. Therefore the j-th column of this matrix contains the vector of powers (i_{1},...,i_{m}) for the j-th polynomial element. So the function transforms m-dimensional elements indexing to one-dimensional.

FunctionprintPolynomial returns the string which contains polynomial symbolic representation.

Examples

## Get polynomial indexes matrix for the polynomial
## which degrees are (1, 3, 5)
polynomialIndex(c(1, 3, 5))

## Consider multivariate polynomial of degrees (2, 1) such that coefficients
## for elements which powers sum is even are 2 and for those which powers sum
## is odd are 5. So the polynomial is 2+5x2+5x1+2x1x2+2x1^2+5x1^2x2 where
## x1 and x2 are polynomial variables.

# Create variable to store polynomial degrees
pol_degrees <- c(2, 1)
# Let's represent its powers (not coefficients) in a matrix form
pol_matrix <- polynomialIndex(pol_degrees)
# Calculate polynomial elements' powers sums
pol_powers_sum <- pol_matrix[1, ] + pol_matrix[2, ]
# Let's create polynomial coefficients vector filling it
# with NA values
pol_coefficients <- rep(NA, (pol_degrees[1] + 1) * (pol_degrees[2] + 1))
# Now let's fill coefficients vector with correct values
pol_coefficients[pol_powers_sum %% 2 == 0] <- 2
pol_coefficients[pol_powers_sum %% 2 != 0] <- 5
# Finally, let's check that correspondence is correct
printPolynomial(pol_degrees, pol_coefficients)

## Let's represent polynomial 0.3+0.5x2-x2^2+2x1+1.5x1x2+x1x2^2
pol_degrees <- c(1, 2)
pol_coefficients <- c(0.3, 0.5, -1, 2, 1.5, 1)
printPolynomial(pol_degrees, pol_coefficients)

Predict method for hpaBinary

Description

Predict method for hpaBinary

Usage

## S3 method for class 'hpaBinary'
predict(object, ..., newdata = NULL, is_prob = TRUE)

Arguments

object

Object of class "hpaBinary"

...

further arguments (currently ignored)

newdata

An optional data frame (for hpaBinary and hpaSelection) or numeric matrix (for hpaML) in which to look for variables with which to predict. If omitted, the original data frame (matrix) is used.

is_prob

logical; if TRUE (default) then function returns predicted probabilities. Otherwise latent variable(single index) estimates will be returned.

Value

This function returns predicted probabilities based onhpaBinary estimation results.
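A hedged usage sketch (model is assumed to be an object returned by hpaBinary; h_new is a hypothetical data frame of new observations):

# predict(model)                    # predicted probabilities
# predict(model, is_prob = FALSE)   # latent variable (single index) estimates
# predict(model, newdata = h_new)   # predictions for new data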


Predict method for hpaML

Description

Predict method for hpaML

Usage

## S3 method for class 'hpaML'
predict(object, ..., newdata = matrix(c(0)))

Arguments

object

Object of class "hpaML"

...

further arguments (currently ignored)

newdata

An optional data frame (for hpaBinary and hpaSelection) or numeric matrix (for hpaML) in which to look for variables with which to predict. If omitted, the original data frame (matrix) is used.

Value

This function returns predictions based onhpaML estimation results.


Predict outcome and selection equation values from hpaSelection model

Description

This function predicts outcome and selection equation values from hpaSelection model.

Usage

## S3 method for class 'hpaSelection'
predict(object, ..., newdata = NULL, method = "HPA",
        is_cond = TRUE, type = "outcome")

Arguments

object

Object of class "hpaSelection"

...

further arguments (currently ignored)

newdata

An optional data frame (for hpaBinary and hpaSelection) or numeric matrix (for hpaML) in which to look for variables with which to predict. If omitted, the original data frame (matrix) is used.

method

string value indicating prediction method based on hermite polynomial approximation "HPA" or Newey method "Newey".

is_cond

logical; ifTRUE (default) then conditional predictions will be estimated. Otherwise unconditional predictions will be returned.

type

character; if "outcome" (default) then predictions for the outcome equation will be estimated according to method. If "selection" then selection equation predictions (probabilities) will be returned.

Details

Note that Newey method can't predict conditional outcomes for zero selection equation value. Conditional probabilities for selection equation could be estimated only when dependent variable from outcome equation is observable.
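A hedged usage sketch of these options (model is assumed to be an object returned by hpaSelection):

# predict(model, is_cond = TRUE)                       # conditional outcome predictions
# predict(model, is_cond = FALSE)                      # unconditional outcome predictions
# predict(model, method = "Newey")                     # Newey's method based predictions
# predict(model, type = "selection", is_cond = FALSE)  # selection probabilities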

Value

This function returns a list whose structure depends on the method, is_cond and type values.


Predict method for hpaBinary

Description

Predict method for hpaBinary

Usage

predict_hpaBinary(object, newdata = NULL, is_prob = TRUE)

Arguments

object

Object of class "hpaBinary"

newdata

An optional data frame (for hpaBinary and hpaSelection) or numeric matrix (for hpaML) in which to look for variables with which to predict. If omitted, the original data frame (matrix) is used.

is_prob

logical; if TRUE (default) then function returns predicted probabilities. Otherwise latent variable(single index) estimates will be returned.

Value

This function returns predicted probabilities based onhpaBinary estimation results.


Predict method for hpaML

Description

Predict method for hpaML

Usage

predict_hpaML(object, newdata = matrix(1, 1))

Arguments

object

Object of class "hpaML"

newdata

An optional data frame (for hpaBinary and hpaSelection) or numeric matrix (for hpaML) in which to look for variables with which to predict. If omitted, the original data frame (matrix) is used.

Value

This function returns predictions based onhpaML estimation results.


Predict outcome and selection equation values from hpaSelection model

Description

This function predicts outcome and selection equation values from hpaSelection model.

Usage

predict_hpaSelection(object, newdata = NULL, method = "HPA",
                     is_cond = TRUE, is_outcome = TRUE)

Arguments

object

Object of class "hpaSelection"

newdata

An optional data frame (for hpaBinary and hpaSelection) or numeric matrix (for hpaML) in which to look for variables with which to predict. If omitted, the original data frame (matrix) is used.

method

string value indicating prediction method based on hermite polynomial approximation "HPA" or Newey method "Newey".

is_cond

logical; ifTRUE (default) then conditional predictions will be estimated. Otherwise unconditional predictions will be returned.

is_outcome

logical; if TRUE (default) then predictions for the outcome equation will be estimated using the "HPA" method. Otherwise selection equation predictions (probabilities) will be returned.

Details

Note that Newey method can't predict conditional outcomes for zero selection equation value. Conditional probabilities for selection equation could be estimated only when dependent variable from outcome equation is observable.

Value

This function returns a list whose structure depends on the method, is_cond and is_outcome values.


Print method for "hpaBinary" object

Description

Print method for "hpaBinary" object

Usage

## S3 method for class 'hpaBinary'
print(x, ...)

Arguments

x

Object of class "hpaBinary"

...

further arguments (currently ignored)


Print method for "hpaML" object

Description

Print method for "hpaML" object

Usage

## S3 method for class 'hpaML'
print(x, ...)

Arguments

x

Object of class "hpaML"

...

further arguments (currently ignored)


Print method for "hpaSelection" object

Description

Print method for "hpaSelection" object

Usage

## S3 method for class 'hpaSelection'
print(x, ...)

Arguments

x

Object of class "hpaSelection"

...

further arguments (currently ignored)


Summary for "hpaBinary" object

Description

Summary for "hpaBinary" object

Usage

## S3 method for class 'summary.hpaBinary'
print(x, ...)

Arguments

x

Object of class "hpaBinary"

...

further arguments (currently ignored)


Summary for hpaML output

Description

Summary for hpaML output

Usage

## S3 method for class 'summary.hpaML'
print(x, ...)

Arguments

x

Object of class "hpaML"

...

further arguments (currently ignored)


Summary for "hpaSelection" object

Description

Summary for "hpaSelection" object

Usage

## S3 method for class 'summary.hpaSelection'
print(x, ...)

Arguments

x

Object of class "hpaSelection"

...

further arguments (currently ignored)


Summary for hpaBinary output

Description

Summary for hpaBinary output

Usage

print_summary_hpaBinary(x)

Arguments

x

Object of class "hpaBinary"


Summary for hpaML output

Description

Summary for hpaML output

Usage

print_summary_hpaML(x)

Arguments

x

Object of class "hpaML"


Summary for hpaSelection output

Description

Summary for hpaSelection output

Usage

print_summary_hpaSelection(x)

Arguments

x

Object of class "hpaSelection"


Summarizing hpaBinary Fits

Description

Summarizing hpaBinary Fits

Usage

## S3 method for class 'hpaBinary'
summary(object, ...)

Arguments

object

Object of class "hpaBinary"

...

further arguments (currently ignored)

Value

This function returns the same list ashpaBinary function changing its class to "summary.hpaBinary".


Summarizing hpaML Fits

Description

Summarizing hpaML Fits

Usage

## S3 method for class 'hpaML'
summary(object, ...)

Arguments

object

Object of class "hpaML"

...

further arguments (currently ignored)

Value

This function returns the same list ashpaML function changing its class to "summary.hpaML".


Summarizing hpaSelection Fits

Description

This function summarizes hpaSelection fits.

Usage

## S3 method for class 'hpaSelection'
summary(object, ...)

Arguments

object

Object of class "hpaSelection"

...

further arguments (currently ignored)

Value

This function returns the same list ashpaSelection function changing its class to "summary.hpaSelection".


Summarizing hpaBinary Fits

Description

Summarizing hpaBinary Fits

Usage

summary_hpaBinary(object)

Arguments

object

Object of class "hpaBinary"

Value

This function returns the same list ashpaBinary function changing its class to "summary.hpaBinary".


Summarizing hpaML Fits

Description

Summarizing hpaML Fits

Usage

summary_hpaML(object)

Arguments

object

Object of class "hpaML"

Value

This function returns the same list ashpaML function changing its class to "summary.hpaML".


Summarizing hpaSelection Fits

Description

This function summarizes hpaSelection fits.

Usage

summary_hpaSelection(object)

Arguments

object

Object of class "hpaSelection".

Value

This function returns the same list ashpaSelection function changing its class to "summary.hpaSelection".


Calculate k-th order moment of truncated normal distribution

Description

This function recursively calculates k-th order moment of truncated normal distribution.

Usage

truncatedNormalMoment(k = 1L, x_lower = numeric(0), x_upper = numeric(0),
                      mean = 0, sd = 1, pdf_lower = numeric(0),
                      cdf_lower = numeric(0), pdf_upper = numeric(0),
                      cdf_upper = numeric(0), cdf_difference = numeric(0),
                      return_all_moments = FALSE, is_validation = TRUE,
                      is_parallel = FALSE, diff_type = "NO")

Arguments

k

non-negative integer moment order.

x_lower

numeric vector of lower truncation points.

x_upper

numeric vector of upper truncation points.

mean

numeric expected value.

sd

positive numeric standard deviation.

pdf_lower

non-negative numeric matrix of precalculated normaldensity functions with meanmean and standard deviationsd atpoints given byx_lower.

cdf_lower

non-negative numeric matrix of precalculated normal cumulative distribution functionswith meanmean and standard deviationsd at points given byx_lower.

pdf_upper

non-negative numeric matrix of precalculated normaldensity functions with meanmean and standard deviationsd atpoints given byx_upper.

cdf_upper

non-negative numeric matrix ofprecalculated normal cumulative distribution functionswith meanmean and standard deviationsd at points given byx_upper.

cdf_difference

non-negative numeric matrix of precalculatedcdf_upper-cdf_lower values.

return_all_moments

logical; ifTRUE, function returns the matrix of moments of normally distributed random variable with mean =mean and standard deviation =sd under lower and upper truncation pointsx_lower andx_upper correspondingly. Note that element in i-th row and j-th column of this matrix corresponds to the i-th observation (j-1)-th order moment.

is_validation

logical value indicating whether function input arguments should be validated. Set it toFALSE for slightperformance boost (default value isTRUE).

is_parallel

ifTRUE then multiple cores will beused for some calculations. It usually provides speed advantage forlarge enough samples (about more than 1000 observations).

diff_type

string value indicating the type of the argumentthe moment should be differentiated respect to.Default value is"NO" so the moments itself will be returned. Alternative values are"mean" and"sd". Also"x_lower" and"x_upper" values are available fortruncatedNormalMoment.

Details

This function estimatesk-th order moment ofnormal distribution which mean equals tomean and standard deviation equals tosd truncated at points given byx_lower andx_upper. Note that the function is vectorized so you can providex_lower andx_upper as vectors of equal size. If vectors values forx_lower andx_upper are not provided then their default values will be set to-(.Machine$double.xmin * 0.99) and(.Machine$double.xmax * 0.99) correspondingly.

Note that parameterk value automatically converts to integer. So passing non-integerk value will not cause any errors but the calculations will be performed for roundedk value only.

If there is precalculated density or cumulative distributionfunctions at standardized truncation points (subtractmean and then divide bysd) then it is possible to providethem throughpdf_lower,pdf_upper,cdf_lower andcdf_upper arguments inorder to decrease number of calculations.

Value

This function returns a vector of k-th order moments for a normally distributed random variable with mean = mean and standard deviation = sd under the truncation points x_lower and x_upper correspondingly. If return_all_moments is TRUE then see this argument description above for output details.

Examples

## Calculate 3-rd order moments of five truncated normal random
## variables (x1, ..., x5) whose mean is 3 and standard deviation is 5.
## These random variables truncation points are given
## as follows: -1<x1<1, 0<x2<2, 1<x3<3, -Inf<x4<2, -Inf<x5<Inf.
k <- 3
x_lower <- c(-1, 0, 1, -Inf, -Inf)
x_upper <- c(1, 2, 3, 2, Inf)
mean <- 3
sd <- 5
# get the moments
truncatedNormalMoment(k, x_lower, x_upper, mean, sd)
# get matrix of (0-3)-th moments (columns) for each variable (rows)
truncatedNormalMoment(k, x_lower, x_upper,
                      mean, sd,
                      return_all_moments = TRUE)
# get the moments derivatives with respect to mean
truncatedNormalMoment(k, x_lower, x_upper,
                      mean, sd,
                      diff_type = "mean")
# get the moments derivatives with respect to standard deviation
truncatedNormalMoment(k, x_lower, x_upper,
                      mean, sd,
                      diff_type = "sd")

Extract covariance matrix from hpaBinary object

Description

Extract covariance matrix from hpaBinary object

Usage

## S3 method for class 'hpaBinary'
vcov(object, ...)

Arguments

object

Object of class "hpaBinary"

...

further arguments (currently ignored)


Extract covariance matrix from hpaML object

Description

Extract covariance matrix from hpaML object

Usage

## S3 method for class 'hpaML'
vcov(object, ...)

Arguments

object

Object of class "hpaML"

...

further arguments (currently ignored)


Extract covariance matrix from hpaSelection object

Description

Extract covariance matrix from hpaSelection object

Usage

## S3 method for class 'hpaSelection'
vcov(object, ...)

Arguments

object

Object of class "hpaSelection"

...

further arguments (currently ignored)

