| Version: | 2.1.1 |
| Date: | 2023-04-04 |
| Title: | The Skew-Normal and Related Distributions Such as the Skew-t andthe SUN |
| Maintainer: | Adelchi Azzalini <adelchi.azzalini@unipd.it> |
| Depends: | R (≥ 3.0.0), methods, stats4 |
| Imports: | mnormt (≥ 2.0.0), numDeriv, utils, quantreg |
| Suggests: | R.rsp |
| VignetteBuilder: | R.rsp |
| Description: | Build and manipulate probability distributions of the skew-normal family and some related ones, notably the skew-t and the SUN families. For the skew-normal and the skew-t distributions, statistical methods are provided for data fitting and model diagnostics, in the univariate and the multivariate case. |
| License: | GPL-2 |GPL-3 |
| URL: | http://azzalini.stat.unipd.it/SN/ |
| Encoding: | UTF-8 |
| NeedsCompilation: | no |
| Packaged: | 2023-04-04 17:13:55 UTC; aa |
| Author: | Adelchi Azzalini |
| Repository: | CRAN |
| Date/Publication: | 2023-04-04 18:10:02 UTC |
Packagesn: overview, background and history
Description
Thesn package provides facilities to define and manipulate probability distributions of the skew-normal (SN) family and some related ones, notably the skew-t (ST) and the unified skew-normal (SUN) families. For a number of these families, statistical methods are provided, to perform data fitting and model diagnostics, in the univariate and the multivariate case.
Overview of the package structure and commands
A separatate document is entirely dedicated to the presentation of the package structure and its basic functions; see thepackage overview.
Background information and references
The package adopts the terminology, notation and general framework of themonograph by Azzalini and Capitanio (2014). This matching constitutes a reason for the numerous references to the book in the documentation of the package.
An additional reason for referring to that monograph instead of the originalresearch papers is that the book provides a relatively not-so-formal account of material which has been elaborated in a number of publications, sometimesvery technical, or re-elabotated over a few papers or possibly mixing the information of key interest with other material. In other words, the motivation behind this policy is readability,not indulgence in self-citation.
When one or a few original sources appeared to deliver the requiredinformation in a compact and accessible form, they have been cited directly. In any case, the cited sections of the book include bibliographic noteswhich refer back to the original sources.
A bit of history
The first version of the package was written in 1997, and it was uploaded onCRAN in 1998.Subsequent versions have evolved gradually up to version 0.4-18 in May 2013.
In January 2014, version 1.0-0 has been uploaded toCRAN.This represented a substantial re-writing of the earlier ‘version 0.x’,developed in broad connection with the book by Azzalini and Capitanio (2014). Differences between the ‘version 0’ and the ‘version 1’ series are radical; they concern the core computational and graphical part as well as the user interface. Since version 1.0-0, the S4 protocol for classes and methods has been adopted.
After various versions 1.x-y, version 2.0.0 has appeared in March 2021,providing support for theSUN distribution.
Additional information on the evolution of the package is provided inNEWS file, accessible from the package documentation index page.
Backward compatibility versus ‘version 0.4-18’
There is a partial backward compatibility of newer version versus‘version 0-4.18’ of the package.Some functions of the older version would work as beforewith virtually no change; a wider set arguments is now allowed. Functionsdsn,dst,dmsn and alike fall inthis category: in some cases, the names of the arguments have been altered, but they work as before if called with unnamed arguments; similar cases aremsn.mle,sn.cumulants andT.Owen.Notice, however, thatmsn.mle and other fitting functions haveeffectively been subsumed into the more comprehensive fitting functionselm.
A second group of functions will work with little or even minimal changes.Specific examples are functionssn.mle andst.mle which havebecomesn.mple andst.mple, with some additionalarguments (again, one can achieve the same result viaselm). Another example is constitude by the group of functionsdp.to.cp,cp.to.dp andst.cumulants.inversion, which have been replaced by the more general functionsdp2cp andcp2dp; one only needs to pay attention to conversion from 3rd and 4th order cumulants to their standardized form in connection with the replacement ofst.cumulants.inversion.
Finally, some functions are not there any longer, with no similarly-workingfunctions in the new version. This is the case ofsn.mle.groupedandst.mle.grouped for maximum likelihood estimation fromgrouped data, that is, data recorded as intervals and corresponding frequencies.
Requirements
R version 2.15-3 or higher, plus packagesmnormt,numDeriv andquantreg, in addition to standardpackages (methods,graphics,stats4, etc.)
Version
The commandcitation("sn") indicates, among other information,the running version of the package.The most recent version of the package can be obtained fromthe web page:http://azzalini.stat.unipd.it/SN/which also provides related material.
From the above-indicated web page, one can also obtain the package ‘sn0’ which is essentially the last ‘version 0’ (that is, 0.4-18) with suitable renaming of certain ingredients.This allows to have both the current and the old package installed at the same time.
Author
Adelchi Azzalini.Please send comments, error reportset cetera to the author, whose web page ishttp://azzalini.stat.unipd.it/.
Licence
This package and its documentation are usable under the terms of the “GNU General Public License” version 3 or version 2,as you prefer; a copy of them is available fromhttps://www.R-project.org/Licenses/.
While the software is freely usable, it would be appreciatedif a reference is inserted in publications or other workwhich makes use of it. For the appropriate way of referencing it, see the commandcitation("sn").
References
Azzalini, A. with the collaboration of Capitanio, A. (2014).The Skew-Normal and Related Families. Cambridge University Press, IMS Monographs series.
See Also
Penalty function for log-likelihood ofselm models
Description
Penalty function for the log-likelihood ofselm models whenmethod="MPLE".Qpenalty is the default function;MPpenalty is an example of a user-defined function effectivelycorresponding to a prior distributio onalpha.
Usage
Qpenalty(alpha_etc, nu = NULL, der = 0)MPpenalty(alpha, der = 0)Arguments
alpha_etc,alpha | in the univariate case, a single value |
nu | degrees of freedom, only required if |
der | a numeric value in the set0,1,2 which indicates therequired numer of derivatives of the function. In the multivariate casethe function will only be called with |
Details
The penalty is a function ofalpha, but its expression maydepend on other ingredients, specificallynu andcov2cor(Omega).See ‘Details’ ofselm for additional information.
The penalty mechanism allows to introduce a prior distribution\pi for\alpha by settingQ=-\log\pi, leading to a maximuma posteriori estimate in the stated sense.
As a simple illustration of this mechanism, functionMPpenaltyimplements the ‘matching prior’ distribution for the univariateSNdistribution studied by Cabraset al. (2012); a brief summary of theproposal is provided in Section 3.2 of Azzalini and Capitanio (2014). Notethat, besidesalpha=+/-Inf, this choice also penalizesalpha=0withQ=Inf, effectively removingalpha=0 from the parameterspace.
Starting from the code of functionMPpenalty, a user should be able to introduce an alternative prior distribution if so desired.
Value
A positive numberQ representing the penalty, possiblywith attributesattr(Q, "der1") andattr(Q, "der2"), depending onthe input valueder.
Author(s)
Adelchi Azzalini
References
Azzalini, A. with the collaboration of Capitanio, A. (2014).The Skew-Normal and Related Families. Cambridge University Press, IMS Monographs series.
Cabras, S., Racugno, W., Castellanos, M. E., and Ventura, L. (2012).A matching prior for the shape parameter of the skew-normal distribution.Scand. J. Statist.39, 236–247.
See Also
selm function
Examples
data(frontier)m2 <- selm(frontier ~ 1) # no penaltym2a <- selm(frontier ~ 1, method="MPLE") # penalty="Qpenalty" is implied herem2b <- selm(frontier ~ 1, method="MPLE", penalty="MPpenalty")Class"SECdistrMv"
Description
A class of objects representing multivariate skew-elliptically contoured (SEC) distributions.
Objects from the Class
Objects can be created by a call to functionmakeSECdistr,when its argumentdp is a list, or by a suitable transformation of some object of this class. They can also obtained from an object generatedbyselm using the functionextractSEDdistr.
Slots
family:a character string which identifies the parametricfamily; currently, possible values are:"SN","ESN","ST","SC".
dp:a list of parameters; its length depends on the selected
family.name:a character string with the name of the multivariatevariable; it can be an empty string.
compNames:a vector of character strings with the names of the component variables.
Methods
- show
signature(object = "SECdistrMv-class"): ...- plot
signature(x = "SECdistrMv-class"): ...- summary
signature(object = "SECdistrMv-class"): ...- mean
signature(x = "SECdistrMv"): ...- vcov
signature(object = "SECdistrMv"): ...
Note
SeemakeSECdistr for a detailed description offamily anddp.
Note that here methodsmean andvcov are not appliedto data or to a fitted model, but to aprobabilitydistribution instead, of which they provide the mean (vector) valueand the variance-covariance matrix. If methodsmean andvcov are applied to a distribution for which the mean or the variance donot exist, aNULL value is returned and a warning messageis issued.
Author(s)
Adelchi Azzalini
See Also
SECdistrUv,plot,SECdistrMv-method,summary,SECdistrMv-method,affineTransSECdistr,marginalSECdistr,extractSECdistr
Examples
dp0 <- list(xi=1:2, Omega=diag(3:4), alpha=c(3, -5)) f10 <- makeSECdistr(dp=dp0, family="SN", name="SN-2D", compNames=c("x", "y")) show(f10) plot(f10) summary(f10) mean(f10) # the mean value of the probability distribution vcov(f10) # the variance-covariance matrix of the probability distributionClass"SECdistrUv"
Description
A class of objects representing univariate skew-ellipticallycontoured (SEC) distributions.
Objects from the class
Objects can be created by a call to functionmakeSECdistr when its argumentdp is a vector. They can also obtained froman object generated byselm using the functionextractSEDdistr.
Slots
family:a character string which selects the parametricfamily; currently, possible values are:"SN","ESN","ST","SC".
dp:a numeric vector of parameters; its length depends on the selected
family.name:a character string with name of the distribution.
Methods
- show
signature(object = "SECdistrUv"): ...- plot
signature(x = "SECdistrUv"): ...- summary
signature(object = "SECdistrUv"): ...- mean
signature(x = "SECdistrUv"): ...- sd
signature(object = "SECdistrUv"): ...
Note
SeemakeSECdistr for a detailed description offamily anddp.
Unlike various other packages, methodsmean andsd here are not targeted to data or to a fitted model, but to aprobabilitydistribution instead, of which they provide the mean valueand the standard deviation. If these methods are appliedto a distribution of which the mean or the variance do not exist, aNULL value is returned and a warning message is issued.
Author(s)
Adelchi Azzalini
See Also
SECdistrMv,plot,SECdistrUv-method,summary,SECdistrUv-method,extractSECdistr
Examples
f2 <- makeSECdistr(dp=c(3, 5, -pi, 6), family="ST", name="My first ST")show(f2)plot(f2)plot(f2, probs=c(1,5,9)/10)plot(f2, range=c(-30,10), probs=NULL, col=2, main=NULL)summary(f2)mean(f2) # the mean value of the probability distributionsd(f2) # the standard deviation of the distributionThe Unified Skew-Normal (SUN) probability distribution
Description
Density, distribution function, random number generation, the mean value,the variance-covariance matrix and the Mardia's measures of multivariateskewness and kurtosis of theSUN probability distribution.
Usage
dsun(x, xi, Omega, Delta, tau, Gamma, dp = NULL, log = FALSE, silent=FALSE, ...)psun(x, xi, Omega, Delta, tau, Gamma, dp = NULL, log = FALSE, silent=FALSE, ...)rsun(n=1, xi, Omega, Delta, tau, Gamma, dp = NULL, silent=FALSE)sunMean(xi, Omega, Delta, tau, Gamma, dp = NULL, silent=FALSE, ...)sunVcov(xi, Omega, Delta, tau, Gamma, dp = NULL, silent=FALSE, ...)sunMardia(xi, Omega, Delta, tau, Gamma, dp = NULL, silent=FALSE, ...)Arguments
x | either a vector of length |
xi | a numeric vector of length |
Omega | a symmetric positive definite matrix of dimension |
Delta | a matrix of size |
tau | a vector of length |
Gamma | a symmetric positive definite matrix of dimension |
dp | a list with five elements, representing |
n | a positive integer value. |
log | a logical value (default value: |
silent | a logical value which indicates the action to take in the case |
... | additional tuning arguments passed either to |
Details
The optional arguments in... passed topmnorm,which usesptriv.nt whend=3,biv.nt.prob whend=2 and andsadmvn whend>2.In practice these arguments are effective only ifd>3, since for lower dimensions the computations are made to full available precision anyway.A similar fact applies to the... argument passed tomom.mtruncnorm.
Some numerical inaccuracy is inevitably involved in these computations.In most cases, they are of negligible extent, but they can possibly becomemore relevant, especially in the computation of higher order momentsinvolved bysunMardia, depending on the dimensiond and on the specific parameter values.Consider the ‘Warning’ section inrecintabwhich is used bymom.mtruncnorm.
The above-described functions operate following the traditionalR scheme for probability distributions. Another scheme, coexisting with the classical one, works withSUNdistr-class objects, which representSUN distributions, by encapsulating their parameters and other characteristics. These objects are created bymakeSUNdistr, and various methods exist for them; seeSUNdistr-class. Moreover these objects can be manipulated by a number of tools, described inSUNdistr-op, leading to new objects of the same class.
Value
The structure of the returned value depends on the called function, as follows:
dsun, psun | a vector of lengthnrow(x) representing density or probability values, |
or their log-transformed values iflog=TRUE, | |
rsun | a matrix of size(n,d), where each row represents aSUN random vectors, |
sunMean | a vector of lengthd representing the mean value, |
sunVcov | a matrix of size(d,d) representing the variance-covariance matrix, |
sunMardia | a vector of length two with the Mardia's measures of multivariate skewness and kurtosis. |
Background
\xi | xi | a vector of lengthd, |
\Omega | Omega | a matrix of size(d,d), |
\Delta | Delta | a matrix of size(d,m), |
\tau | tau | a vector of lengthm, |
\Gamma | Gamma | a matrix of size(m,m), |
and must satisfy the following conditions:
\Omegais a symmetric positive definite matrix;\Gammais a symmetric positive definite matrix with 1's on the main diagonal, hence a correlation matrix;if
\bar\Omegadenotes the correlation matrix associated to\Omega, the matrix of size(d+m)\times(d+m)formed by the2 x 2blocks\bar\Omega\Delta\Delta'\Gammamust be a positive definite correlation matrix.
The formulation adopted here has arisen as the evolution of earlier constructions, which are recalled very briefly next.A number of extensions of the multivariate skew-normal distributions, all involving a numberm (withm\ge1) of latent variables (instead ofm=1 like the skew-normal distribution), have been put-forward in close succession in the years 2003-2005. Special attention has been drawn by the ‘closed skew-normal (CSN)’distribution developed by González-Faríaset alii (2004a, 2004b)and the ‘fundamental skew-normal (FUSN)’ distributiondeveloped by Arellano-Valle and Genton (2005),but other formulations have been considered too.
Arellano Valle and Azzalini (2006) have shown the essential equivalenceof these apparently alternative constructions, after appropriatereparameterizations, and underlined the necessity of removing over-parameterizations in some cases, to avoid lack of identifiability. This elaboration has led to theSUN formulation. A relatively less technical account of their development is provided in Section 7.1 of Azzalini and Capitanio (2014), using very slightly modified notation and parameterization, which are the ones adopted here.
Additional results have been presented by Arellano-Valle and Azzalini (2021), such as expressions for the variance matrixand higher order moments, the Mardia's measures of multivariate skewnessand kurtosis, which are implemented here. Another result is theconditional distribution when the conditioning event is representedby an orthant.
Note
The present structure and user interface of this function, and of other ones related to theSUN distribution, must be considered experimental, and they might possibly change in the future.
Author(s)
Adelchi Azzalini
References
Arellano-Valle, R. B., and Azzalini, A. (2006).On the unification of families of skew-normal distributions.Scand. J. Stat.33, 561-574. Corrigendum in49 (2022), 1418-1419.
Arellano-Valle, R. B. and Azzalini, A. (2021).Some properties of the unified skew-normal distribution.Statistical Papers63, 461-487,doi:10.1007/s00362-021-01235-2;see alsoarXiv:2011.06316
Arellano-Valle, R. B. and Genton, M. G. (2005). On fundamental skew distributions.J. Multivariate Anal.96, 93–1116.
Azzalini, A. with the collaboration of Capitanio, A. (2014).The Skew-Normal and Related Families. Cambridge University Press, IMS Monographs series.
González-Farías, G., Domínguez-Molina, J. A., & Gupta, A. K. (2004a). Additive properties of skew normal random vectors.J. Statist. Plann. Inference126, 521-534.
González-Farías, G., Domínguez-Molina, J. A., & Gupta, A. K. (2004b). The closed skew-normal distribution. In M. G. Genton (Ed.),Skew-elliptical Distributions and Their Applications: a Journey Beyond Normality, Chapter 2, (pp. 25–42). Chapman & Hall/CRC.
See Also
makeSUNdistr to build aSUN distribution object, with related methods inSUNdistr-class,and other facilities inSUNdistr-op
convertCSN2SUNpar to convert a parameter set of the Closed Skew-Normal formulation to the equivalentSUN parameter set
Examples
xi <- c(1, 0, -1)Omega <- matrix(c(2,1,1, 1,3,1, 1,1,4), 3, 3)Delta <- matrix(c(0.72,0.20, 0.51,0.42, 0.88, 0.94), 3, 2, byrow=TRUE)Gamma <- matrix(c(1, 0.8, 0.8, 1), 2, 2)dp3 <- list(xi=xi, Omega=Omega, Delta=Delta, tau=c(-0.5, 0), Gamma=Gamma)x <- c(0.8, 0.5, -1.1)f1 <- dsun(x, xi, Omega, Delta, c(-0.5, 0), Gamma) # mode 1f2 <- dsun(x, dp=dp3) # mode 2, equivalent to mode 1set.seed(1)xm <- rsun(10, dp=dp3)f3 <- dsun(xm, dp=dp3) psun(xm, dp=dp3)sunMean(dp=dp3)sunVcov(dp=dp3)sunMardia(dp=dp3)Class"SUNdistr" and its methods
Description
A class of objects representing Unified Skew-Normal (SUN) distributions.
Details
SeeSUNdistr-base for a description of the required structure ofdp.
Note that here the methodsmean andvcov are not appliedto data or to a fitted model, but to aprobabilitydistribution, of which they provide the mean (expected) valueand the variance-covariance matrix.
The object of this class follow the S4 protocol.
Objects from the class
Objects can be created by a call to the functionmakeSUNdistr or by a suitable transformation of some object of this class.
Slots
dp:a list of parameters of length five, as described in
SUNdistr-basename:a character string with the name of the multivariatevariable; it can be an empty string.
compNames:a vector of character strings with the names of the component variables.
- HcompNames
a vector of character strings with the names of the hidden variables.
Methods
- show
signature(object = "SUNdistr-class"): ...- plot
signature(x = "SUNdistr-class"): ...- summary
signature(object = "SUNdistr-class"): ...- mean
signature(x = "SUNdistr"): ...- vcov
signature(object = "SUNdistr"): ...
Author(s)
Adelchi Azzalini
See Also
plot,SUNdistr-method,\quadsummary,SUNdistr-method,\quadaffineTransSUNdistr,marginalSUNdistr
convertSN2SUNdistr to convert aSECdistr object withfamily"SN" or"ESN" to the equivalentSUNdistr-classobject
Examples
xi <- c(1, 0, -1) Omega <- matrix(c(2,1,1, 1,3,1, 1,1,4), 3, 3) Delta <- matrix(c(0.72,0.20, 0.51,0.42, 0.88, 0.94), 3, 2, byrow=TRUE) Gamma <- matrix(c(1, 0.8, 0.8, 1), 2, 2) dp3 <- list(xi=xi, Omega=Omega, Delta=Delta, tau=c(-0.5, 0), Gamma=Gamma) sun3 <- makeSUNdistr(dp=dp3, name="firstSUN", compNames=c("x", "w", "z")) show(sun3) plot(sun3) mean(sun3) # the mean value of the probability distribution vcov(sun3) # the variance-covariance matrix of the probability distribution summary(sun3) # a more detailed summaryOperations on SUNdistr-class objects
Description
Given an object ofSUNdistr-class, or possibly two such thingsin some cases, the functions perform various operations,and produce a new object of the same class.
Usage
affineTransSUNdistr(object, a, A, name, compNames, HcompNames, drop = TRUE)conditionalSUNdistr(object, comp, values, eventType = "=", name, drop = TRUE) convolutionSUNdistr(object1, object2, name, compNames, HcompNames) joinSUNdistr(object1, object2, name, compNames, HcompNames) marginalSUNdistr(object, comp, name, drop=TRUE)Arguments
object,object1,object2 | objects of class |
a | a numeric vector; see ‘Details’ |
A | a numeric matrix; see ‘Details’ |
name | an optional character string with the name of the returneddistribution |
compNames | an optional vector of character strings with the names of the component variables of the returned distribution |
HcompNames | an optional vector of character strings with the names of the hidden variables of the returned distribution |
drop | a logical value (default: |
comp | a vector of integers representing the selected components |
values | a numeric vector which identifies the conditioning event |
eventType | a single character value which indicates the type of theconditioning event, as described in the ‘Details’ section;possible values are |
Details
For anobject which represents the distribution of a multivariateSUN random variableY of dimensiond, say, a number ofoperations are possible, producing a new object of the same class.Thisobject could have been created bymakeSUNdistr or it could be the outcome from some previous call to one of the functionsdescribed here.
The functionaffineTransSUNdistr computes the distribution ofa+A'Y, providedA is a full-rank matrix withnrow(A)=d andlength(a)=ncol(A).See equation (7.6) of Azzalini & Capitanio (2014).
The functionmarginalSUNdistr builds aSUN distribution from the components selected by thecomp vector.
A conditional distribution can be computed usingconditionalSUNdistr for two type of events, selected byeventType. The"=" case corresponds to the eventY_1=y_1 whereY_1 is the subset of components identifiedby thecomp argument,y_1 is vector specified by thevalues argument and the equality sign must hold for each component.See equation (7.6) of Azzalini & Capitanio (2014).
IfconditionalSUNdistr is used witheventType=">",the conditiong refers to the eventY_1>y_1, where the inequality must be interpreted components-wise;see Arellano-Valle & Azzalini (2021) for the underlying mathematical result.If the conditional distribution is required for the reverse inequality condition,"<" say, this is equivalent to consideration of the event-Y_1>-y_1. The corresponding distribution can be obtained in two steps: first a new variable is constructed reversing the sign of the requiredcomponents usingaffineTransSUNdistr;thenconditionalSUNdistr is applied to this new variable withthe">" condition and values-y_1. More complex conditions, where the"<" and">" signs are mixed for different component varables, can be handled similarly, by introducing a square matrixA foraffineTransSUNdistr having an appropriate combination of1s' and-1's on its maindiagonal, and 0's elsewhere, and matching changes of sign to the components ofy_1.
FunctionsconvolutionSUNdistr andjoinSUNdistr operate under the assumptions thatobject1 andobject2 refer to independent variables. Specifically,convolutionSUNdistr computes the convolution of thetwo objects (i.e. the distribution of the sum of two independent variables),which must have the same dimensiond. FunctionjoinSUNdistr combines two objects into a joint distribution.
If the argumentsname,compNames andHcompNamesare missing, they are composed from the supplied arguments.
Value
an object ofSUNdistr-class
Note
The present structure and user interface of this function, and of other ones related to theSUN distribution, must be considered experimental, and they might possibly change in the future.
Author(s)
Adelchi Azzalini
References
Arellano-Valle, R. B. and Azzalini, A. (2021).Some properties of the unified skew-normal distribution.Statistical Papers,doi:10.1007/s00362-021-01235-2andarXiv:2011.06316
Azzalini, A. with the collaboration of Capitanio, A. (2014).The Skew-Normal and Related Families. Cambridge University Press, IMS Monographs series.
See Also
SUNdistr-base,makeSUNdistr,SUNdistr-class
Examples
xi <- c(1, 0, -1)Omega <- matrix(c(2,1,1, 1,3,1, 1,1,4), 3, 3)Delta <- matrix(c(0.72,0.20, 0.51,0.42, 0.88, 0.94), 3, 2, byrow=TRUE)Gamma <- matrix(c(1, 0.8, 0.8, 1), 2, 2)dp3 <- list(xi=xi, Omega=Omega, Delta=Delta, tau=c(-0.5, 0), Gamma=Gamma)sun3 <- makeSUNdistr(dp=dp3, name="SUN3", compNames=c("x", "w", "z"))#a <- c(1,-2)A <- matrix(1:6, 3, 2)sun2at <- affineTransSUNdistr(sun3, a, A, "SUN2at", compNames=c("at1", "at2"))sun2m <- marginalSUNdistr(sun3, comp=c(1,3), name="SUN2m")sun1c <- conditionalSUNdistr(sun3, comp=c(1,3), values=c(1.1, 0.8), eventType=">", name="SUN1c", drop=FALSE)#Omega <- matrix(c(5, 1, 1, 6), 2, 2)Delta <- matrix(c(0.30, 0.50, 0.50, 0.85), 2, 2, byrow=TRUE)Gamma <- matrix(c(1, 0.18, 0.18, 1), 2, 2)tau <- c(0.4, -0.8)dp2 <- list(x=c(1, 0), Omega=Omega, Delta=Delta, tau=tau, Gamma=Gamma)sun2 <- makeSUNdistr(dp=dp2, name="SUN2", compNames=c("u", "v"))#sun2conv <- convolutionSUNdistr(sun2, sun2m, name="SUN2sum")sun5 <- joinSUNdistr(sun3, sun2)Owen's function
Description
Evaluates functionT(h,a) studied by D.B.Owen
Usage
T.Owen(h, a, jmax=50, cut.point=8)Arguments
h | a numeric vector. Missing values ( |
a | a numeric value. |
jmax | an integer scalar value which regulates the accuracy of the result.See Section ‘Details’ below for explanation. |
cut.point | a scalar value which regulates the behaviour of the algorithm,as explained in Section ‘Details’ below (default value: |
Details
Ifa>1 and0<h<=cut.point, a series expansion is used,truncated afterjmax terms.Ifa>1 andh>cut.point, an asymptotic approximation is used.In the other cases, various reflection properties of the functionare exploited. See the reference below for more information.
Value
a numeric vector.
Background
The functionT(h,a) studied by Owen (1956) is useful for the computation of the bivariate normal distribution function and related quantities, including the distribution function of a skew-normal variate; seepsn.See the reference below for more information on functionT(h,a).
Author(s)
Adelchi Azzalini and Francesca Furlan
References
Owen, D. B. (1956).Tables for computing bivariate normal probabilities.Ann. Math. Statist.27, 1075-1090.
See Also
Examples
owen <- T.Owen(1:10, 2)Affine transformations and marginals of a skew-elliptical distribution
Description
Given a multivariate random variableY with skew-elliptical(SEC) distribution, compute the distribution of a (possibly multivariate) marginal or the distributionof an affine transformationa + A^{\top}Y.
Usage
affineTransSECdistr(object, a, A, name, compNames, drop=TRUE) marginalSECdistr(object, comp, name, drop=TRUE)Arguments
object | an object of class |
a | a numeric vector with the length |
A | a full-rank matrix with |
name | an optional character string representing the name of the outcome distribution; if missing, one such string is constructed. |
compNames | an optional vector of length |
drop | a logical flag (default value: |
comp | a vector formed by a subset of |
Value
Ifobject defines the distribution of aSEC randomvariableY,affineTransSECdistr computes the distribution ofa+A'Y andmarginalSECdistr computes the marginaldistribution of thecomp components. In both cases the returnedobject is of classSECdistrMv, except whendrop=TRUEoperates, leading to an object of classSECdistrUv.
Background
These functions implement formulae given in Sections 5.1.4, 5.1.6 and 6.2.2 of the reference below.
References
Azzalini, A. with the collaboration of Capitanio, A. (2014).The Skew-Normal and Related Families. Cambridge University Press, IMS Monographs series.
See Also
makeSECdistr,extractSECdistr,SECdistrMv-class
Examples
dp3 <- list(xi=1:3, Omega=toeplitz(1/(1:3)), alpha=c(3,-1,2), nu=5)st3 <- makeSECdistr(dp3, family="ST", name="ST3", compNames=c("U", "V", "W"))A <- matrix(c(1,-1,1, 3,0,-2), 3, 2)new.st <- affineTransSECdistr(st3, a=c(-3,0), A=A)#st2 <- marginalSECdistr(st3, comp=c(3,1), name="2D marginal of ST3")Australian Institute of Sport data
Description
Data on 102 male and 100 female athletes collected at the Australian Institute of Sport, courtesy of Richard Telford and Ross Cunningham.
Usage
data(ais)Format
A data frame with 202 observations on the following 13 variables.
| [,1] | sex | categorical, levels:female,male |
| [,2] | sport | categorical, levels:B_Ball,Field,Gym,Netball,Row,Swim,T_400m, |
Tennis,T_Sprnt,W_Polo | ||
| [,3] | RCC | red cell count (numeric) |
| [,4] | WCC | white cell count (numeric) |
| [,5] | Hc | Hematocrit (numeric) |
| [,6] | Hg | Hemoglobin (numeric) |
| [,7] | Fe | plasma ferritin concentration (numeric) |
| [,8] | BMI | body mass index, weight/(height)^2 (numeric) |
| [,9] | SSF | sum of skin folds (numeric) |
| [,10] | Bfat | body fat percentage (numeric) |
| [,11] | LBM | lean body mass (numeric) |
| [,12] | Ht | height, cm (numeric) |
| [,13] | Wt | weight, kg (numeric) |
Details
The data have been made publicly available in connection with thebook by Cook and Weisberg (1994).
References
Cook and Weisberg (1994),An Introduction to Regression Graphics. John Wiley & Sons, New York.
Examples
data(ais, package="sn")pairs(ais[,c(3:4,10:13)], col=as.numeric(ais[,1]), main = "AIS data")Price of Barolo wine
Description
A data frame with prices of bottles of Barolo wine and some auxiliary variables
Usage
data(barolo)Format
A data frame with 307 observations on five variables, as follows:
reseller | reseller code (factor with levelsA, B, C, D) |
vintage | vintage year (numeric) |
volume | content volume in centilitres (numeric) |
price | price in Euro (numeric) |
age | age in 2010 (numeric) |
For six items,vintage isNA's and so alsoage.Three items have a non-standard volume of 50 cl.
Details
The data have been obtained in July 2010 from the websites of four Italian wine resellers, selecting only quotations of Barolo wine, which is produced in the Piedmont region of Italy. The price does not include the delivery charge.
The data have been presented in Section 4.3.2 of the reference below,where a subset of them has been used for illustrative purposes.This subset refers to reseller"A" and bottles of 75cl.
Source
Azzalini, A. with the collaboration of Capitanio, A. (2014).The Skew-Normal and Related Families. Cambridge University Press, IMS Monographs series.
Examples
data(barolo)attach(barolo) f <- cut(age, c(0, 5, 6, 8, 11, 30))table(volume, f)plot(volume, price, col=as.numeric(f), pch=as.character(reseller))legend(400, 990, col=1:5, lty=1, title="age class", legend=c("4-5", "6", "7-8", "9-11", "12-30"))#A75 <- (reseller=="A" & volume==75)hist(log(price[A75],10), col="gray85")# see Figure 4.7 of the sourceCoefficients of objects created byselm
Description
coef method for classes"selm" and"mselm".
Usage
## S4 method for signature 'selm'coef(object, param.type = "CP", ...)## S4 method for signature 'mselm'coef(object, param.type = "CP", vector=TRUE, ...)Arguments
object | an object of class |
param.type | a character string which indicates the required type of parameter type; possible values are |
vector | a logical value (default is |
... | not used, included for compatibility with the generic method |
Value
a numeric vector or a list(the latter only formselm-class objects ifvector=FALSE)
Note
The possible options ofparam.type are described in thedocumentation ofdp2cp; their corresponding outcomes differ by an additive constant only. With the"CP" option (that is,the ‘centred parametrization’), the residuals are centred around 0, at least approximately; this is a reason for setting"CP" as the default option. For more information, see the ‘Note’ in the documentation ofsummary.selm.
Author(s)
Adelchi Azzalini
References
Azzalini, A. with the collaboration of Capitanio, A. (2014).The Skew-Normal and Related Families. Cambridge University Press, IMS Monographs series.
See Also
dp2cp,summary.selm,selm function,selm-class
Examples
data(wines, package="sn")m5 <- selm(acidity ~ phenols + wine, family="SN", data=wines)coef(m5)coef(m5, "dp")#m12 <- selm(cbind(acidity, alcohol) ~ phenols + wine, family="SN", data=wines)coef(m12)coef(m12, "DP", vector=FALSE)Skew-normal conditional distribution
Description
For a multivariate (extended) skew-normal distribution, compute its conditional distribution for given values of some of its components.
Usage
conditionalSECdistr(object, fixed.comp, fixed.values, name, drop = TRUE)Arguments
object | an object of class |
fixed.comp | a vector containing a subset of |
fixed.values | a numeric vector of values taken on by the components |
name | an optional character string with the name of the outcome distribution; if missing, one such string is constructed. |
drop | logical (default= |
Details
For background information, see Section 5.3.2 of the reference below.
Value
an object of classSECdistrMv, except in the case whendrop=TRUE operates, leading to an object of classSECdistrUv-class.
References
Azzalini, A. and Capitanio, A. (2014).The Skew-normal and Related Families. Cambridge University Press, IMS Monographs series.
See Also
makeSECdistr,SECdistrMv-class,affineTransSECdistr
Examples
Omega <- diag(3) + outer(1:3,1:3)sn <- makeSECdistr(dp=list(xi=rep(0,3), Omega=Omega, alpha=1:3), family="SN")esn <- conditionalSECdistr(sn, fixed.comp=2, fixed.values=1.5)show(esn)Confidence intervals for parameters of a selm-class object
Description
Computes confidence intervals for parameters in a selm-class object producesbyselm fit when the response variable is univariate.
Usage
## S3 method for class 'selm'confint(object, parm, level=0.95, param.type, tol=1e-3, ...)Arguments
object | an object of class |
parm | a specification of which parameters are to be given confidence intervals, either a vector of numbers or a vector of names. If missing, all parameters are considered. |
level | the confidence level required (default value is |
param.type | a character string with the required parameterization; it must be either |
tol | the desired accuracy (convergence tolerance); this is a parameter passed to |
... | not used, only there for compatibility reasons. |
Details
A description of the methodology underlyingconfint.selm is provided in the technical note of Azzalini (2016).That document also explains why in certain cases an interval is not constructed andNA's are returned as endpoint.
Value
An invisible list whose components, described below, are partly different in the one- and the two-parameter cases.
call | the calling statement |
<param1> | values of the first parameter |
<param2> | values of the second parameter (in a two-parameter case) |
logLik | numeric vector or matrix of the profile log-likelihood values |
confint | in the one-parameter case, the confidence interval |
level | in the one-parameter case, the confidence level |
deviance.contour | in the two-parameter case, a list of lists whoseelements identify each curve of the contour plot |
Author(s)
Adelchi Azzalini
References
Azzalini, A. (2016). Derivation of various types of intervals from aselm object.Technical note distributed with the documentation of theR packagesn in fileselm-intervals.pdf within section ‘User guide, package vignettes and other documentation’.
See Also
selm,summary.selm,profile.selm,
makeSECdistr for theCP/DP parameterizations,
uniroot for itstol argument
Examples
data(ais)m1 <- selm(log(Fe) ~ BMI + LBM, family = "sn", data = ais)intervCP <- confint(m1)intervDP <- confint(m1, param.type="DP")confint(m1, parm=2:3) confint(m1, parm=c("omega", "alpha"), param.type="DP")Conversion ofCSN parameters toSUN parameters
Description
The parameter set of a Closed Skew-Normal (CSN)distribution is converted into the parameter set of the equivalent Unified Skew-Normal (SUN) distribution.
Usage
convertCSN2SUNpar(mu, Sigma, D, nu, Delta)Arguments
mu | a numeric vector of length |
Sigma | a positive definite variance matrix of size |
D | an arbitrary numeric matrix of size say |
nu | a numeric vector of length |
Delta | a positive definite variance matrix of size |
Details
The arguments of the function match the parameters(\mu, \Sigma, D, \nu, \Delta) of theCSN distributionpresented by González-Faríaset alii (2004a, 2004b). These parameters are converted into those of the equivalentSUN distribution, which is unique. The converse operation, that is,mapping parameters from theSUN to theCSN family, is not handled here. Its solution would be non-unique, because theCSN family is over-parameterized.
Note that, having retained the exact notation of the above-quoted papers,there is aDelta argument which must not be confused with one of thearguments for theSUN distribution inSUNdistr-base. The coincidence of these names is entirely accidental.
TheCSN parameters must only satisfy the requirements that\Sigma and\Delta are symmetric positive definite matrices. Since these conditions are somewhat simpler to check than those for theSUN parameters, as indicated inSUNdistr-base, this function may provide a simple option for the specification of aCSN/SUN distribution.
The parameter listdp produced by this function can be used as aninput for the functions inSUNdistr-base or formakeSUNdistr.
Value
a list representing thedp parameter set of thecorrespondingSUN distribution
Author(s)
Adelchi Azzalini
References
González-Farías, G., Domínguez-Molina, J. A., & Gupta, A. K. (2004a). Additive properties of skew normal random vectors.J. Statist. Plann. Inference126, 521-534.
González-Farías, G., Domínguez-Molina, J. A., & Gupta, A. K. (2004b). The closed skew-normal distribution. In M. G. Genton (Ed.),Skew-elliptical Distributions and Their Applications: a Journey Beyond Normality, Chapter 2, (pp. 25–42). Chapman & Hall/CRC.
See Also
Examples
p <- 3q <- 2mu <- 1:pSigma <- toeplitz(1/(1:p))D <- matrix(sqrt(1:(p*q)), q, p)nu <- 1/(1:q)Delta <- diag(q) + outer(rep(1,q), rep(1,q))dp <- convertCSN2SUNpar(mu, Sigma, D, nu, Delta)Convert a SN distribution into a SUN
Description
An object ofSECdistrMv-class orSECdistrUv-class representing aSN orESN distribution is converted into aSUNdistr-class object representing the same distribution.
Usage
convertSN2SUNdistr(object, HcompNames = "h", silent = FALSE)Arguments
object | an object of |
HcompNames | an optional character string for the hidden component |
silent | a logical value which controls the behaviour if the supplied |
Value
an object ofSUNdistr-class
Author(s)
Adelchi Azzalini
See Also
SUNdistr-class,SECdistrMv-class,SECdistrUv-class
Examples
esn <- makeSECdistr(dp=c(0, 1, 2, 0.5), family="ESN")sun <- convertSN2SUNdistr(esn)mean(sun) - mean(esn)vcov(sun) - sd(esn)^2#dp0 <- list(xi=1:2, Omega=diag(3:4), alpha=c(3, -5))f10 <- makeSECdistr(dp=dp0, family="SN", name="SN-2d", compNames=c("u1", "u2"))sun10 <- convertSN2SUNdistr(f10)mean(sun10) - mean(f10)vcov(sun10) - vcov(f10)Multivariate skew-normal distribution
Description
Probability density function, distribution function and random number generation for the multivariate skew-normal (SN) distribution.
Usage
dmsn(x, xi=rep(0,length(alpha)), Omega, alpha, tau=0, dp=NULL, log=FALSE)pmsn(x, xi=rep(0,length(alpha)), Omega, alpha, tau=0, dp=NULL, ...)rmsn(n=1, xi=rep(0,length(alpha)), Omega, alpha, tau=0, dp=NULL)Arguments
x | either a vector of length |
xi | a numeric vector of length |
Omega | a symmetric positive-definite matrix of dimension |
alpha | a numeric vector which regulates the slant of the density; see ‘Background’. |
tau | a single value representing the ‘hidden mean’ parameter of theESN distribution; |
dp | a list with three elements, corresponding to |
n | a numeric value which represents the number of random vectorsto be drawn. |
log | logical (default value: |
... | additional parameters passed to |
Details
Typical usages are
dmsn(x, xi=rep(0,length(alpha)), Omega, alpha, log=FALSE)dmsn(x, dp=, log=FALSE)pmsn(x, xi=rep(0,length(alpha)), Omega, alpha, ...)pmsn(x, dp=)rmsn(n=1, xi=rep(0,length(alpha)), Omega, alpha)rmsn(n=1, dp=)
For efficiency reasons,rmsn makes very limited checks on the validity of the arguments. For instance, failure to positive definiteness ofOmega would not be detected, and an uncontrolled crash occurs. Functionpmsn makes use ofpmnorm from packagemnormt;the accuracy of its computation can be controlled via...
Value
A vector of density values (dmsn) or of probabilities(pmsn) or a matrix of random points (rmsn).
Background
The multivariate skew-normal distribution is discussed by Azzalini and Dalla Valle (1996). The(Omega,alpha)parametrization adopted here is the one of Azzalini and Capitanio (1999).Chapter 5 of Azzalini and Capitanio (2014) provides an extensive account,including subsequent developments.
Notice that the location vectorxi does not represent the mean vector of the distribution. Similarly,Omega is notthe covariance matrix of the distribution, although it isa covariance matrix. Finally, the components ofalpha are not equal to the slant parameters of the marginal distributions; to fix the marginal parameters at prescribed values, it is convenient to start from the OP parameterization, as illustrated in the ‘Examples’ below. Another option is to start from theCPparameterization, but notice that, at variance from theOP, not allCP sets are invertible to lend aDP set.
References
Azzalini, A. and Capitanio, A. (1999).Statistical applications of the multivariate skew normal distribution.J.Roy.Statist.Soc. B61, 579–602. Full-length version available athttps://arXiv.org/abs/0911.2093
Azzalini, A. with the collaboration of Capitanio, A. (2014).The Skew-Normal and Related Families. Cambridge University Press, IMS Monographs series.
Azzalini, A. and Dalla Valle, A. (1996).The multivariate skew-normal distribution.Biometrika83, 715–726.
See Also
Examples
x <- seq(-3,3,length=15)xi <- c(0.5, -1)Omega <- diag(2)Omega[2,1] <- Omega[1,2] <- 0.5alpha <- c(2,-6)pdf <- dmsn(cbind(x, 2*x-1), xi, Omega, alpha)cdf <- pmsn(cbind(x, 2*x-1), xi, Omega, alpha)p1 <- pmsn(c(2,1), xi, Omega, alpha)p2 <- pmsn(c(2,1), xi, Omega, alpha, abseps=1e-12, maxpts=10000)#rnd <- rmsn(10, xi, Omega, alpha)## use OP parameters to fix marginal shapes at given lambda values:op <- list(xi=c(0,1), Psi=matrix(c(2,2,2,3), 2, 2), lambda=c(5, -2))rnd <- rmsn(10, dp=op2dp(op,"SN"))# # use CP parameters to fix mean vector, variance matrix and marginal skewness:cp <- list(mean=c(0,0), var.cov=matrix(c(3,2,2,3)/3, 2, 2), gamma1=c(0.8, 0.4))dp <- cp2dp(cp, "SN")rnd <- rmsn(5, dp=dp)Multivariate skew-t distribution and skew-Cauchy distribution
Description
Probability density function, distribution function and random number generation for the multivariate skew-t (ST) andskew-Cauchy (SC) distributions.
Usage
dmst(x, xi=rep(0,length(alpha)), Omega, alpha, nu=Inf, dp=NULL, log=FALSE)pmst(x, xi=rep(0,length(alpha)), Omega, alpha, nu=Inf, dp=NULL, ...)rmst(n=1, xi=rep(0,length(alpha)), Omega, alpha, nu=Inf, dp=NULL)dmsc(x, xi=rep(0,length(alpha)), Omega, alpha, dp=NULL, log=FALSE)pmsc(x, xi=rep(0,length(alpha)), Omega, alpha, dp=NULL, ...)rmsc(n=1, xi=rep(0,length(alpha)), Omega, alpha, dp=NULL)Arguments
x | for |
xi | a numeric vector of length |
Omega | a symmetric positive-definite matrix of dimension |
alpha | a numeric vector of length |
nu | a positive value representing the degrees of freedom ofST distribution; does not need to be integer. Default value is |
dp | a list with three elements named |
n | a numeric value which represents the number of random vectors to bedrawn; default value is |
log | logical (default value: |
... | additional parameters passed to |
Details
Typical usages are
dmst(x, xi=rep(0,length(alpha)), Omega, alpha, nu=Inf, log=FALSE)dmst(x, dp=, log=FALSE)pmst(x, xi=rep(0,length(alpha)), Omega, alpha, nu=Inf, ...)pmst(x, dp=, ...)rmst(n=1, xi=rep(0,length(alpha)), Omega, alpha, nu=Inf)rmst(n=1, dp=)dmsc(x, xi=rep(0,length(alpha)), Omega, alpha, log=FALSE)dmsc(x, dp=, log=FALSE)pmsc(x, xi=rep(0,length(alpha)), Omega, alpha, ...)pmsc(x, dp=, ...)rmsc(n=1, xi=rep(0,length(alpha)), Omega, alpha)rmsc(n=1, dp=)
For efficiency reasons,rmst, rmsc make very limited checks on the validity of the arguments. For instance, failure to positive definiteness ofOmega would not be detected, and an uncontrolled crash occurs.Functionpmst requiresdmt from packagemnormt; the accuracy of its computation can be controlled via argument....
Value
A vector of density values (dmst anddmsc) or a singleprobability (pmst andpmsc) or a matrix of random points(rmst andrmsc).
Background
The family of multivariateSC distributions is the subset of theST family, obtained whennu=1. While in the univariate casethere are specialized functions for theSC distribution,dmsc,pmsc andrmsc simply make a call todmst,pmst, rmst with argumentnu set equal to 1.
References
Azzalini, A. with the collaboration of Capitanio, A. (2014).The Skew-Normal and Related Families. Cambridge University Press, IMS Monograph series.
See Also
Examples
x <- seq(-4,4,length=15)xi <- c(0.5, -1)Omega <- diag(2)Omega[2,1] <- Omega[1,2] <- 0.5alpha <- c(2,2)pdf <- dmst(cbind(x,2*x-1), xi, Omega, alpha, 5)rnd <- rmst(10, xi, Omega, alpha, 6)p1 <- pmst(c(2,1), xi, Omega, alpha, nu=5)p2 <- pmst(c(2,1), xi, Omega, alpha, nu=5, abseps=1e-12, maxpts=10000)Conversion between parametrizations of a skew-elliptical distribution
Description
Convert direct parameters (DP) to centred parameters (CP) of a skew-elliptical distribution andvice versa.
Usage
dp2cp(dp, family, object = NULL, cp.type = "proper", upto = NULL) cp2dp(cp, family)dp2op(dp, family)op2dp(op, family)Arguments
dp | a vector (in the univariate case) or a list (in the multivariatecase) as described in |
cp | a vector or a list, in agreement with |
op | a vector or a list, in agreement with |
family | a characther string with the family acronym, as described in |
object | optionally, an S4 object of class |
cp.type | character string, which has effect only if |
upto | numeric value (in |
Value
Fordp2cp, a matching vector (in the univariate case) or a list (in the multivariate case) ofcp parameters. Forcp2dp andop2dp, a similar object ofdp parameters,provided the set of input parameters is in the admissible region.Fordp2op, a similar set ofop parameters.
Background
For a description of theDPparameters, see Section ‘Details’ ofmakeSECdistr. TheCP form of parameterization is cumulant-based. For a univariatedistribution, theCP components are the mean value (first cumulant), the standard deviation (square root of the 2nd cumulant), the coefficient of skewness (3rd standardized cumulant) and, for theST, the coefficient of excess kurtosis (4th standardized cumulant). For a multivariate distribution, there exists an extension based on the same logic; its components represent thevector mean value, the variance matrix, the vector of marginal coefficients ofskewness and, only for theST, the Mardia's coefficient of excesskurtosis. The pseudo-CP variant provides an ‘approximate form’ ofCP when not all required cumulants exist; however, this parameter setis not uniquely invertible toDP. The names of pseudo-CPcomponents printed in summary output are composed by adding a~ after the usual component name; for example, the first one is denotedmean~.
Additional information is provided by Azzalini and Capitanio (2014).Specifically, their Section 3.1.4 presentsCP in the univariateSN case, Section 4.3.4CP for theST case and the 'pseudo-CP' version. Section 5.2.3 presents the multivariateextension for theSN distribution, Section 6.2.5 for the multivariateST case. For a more detailed discussion, see Arellano-Valle & Azzalini (2013).
TheOP parameterization is very similar toDP, from which it differs only for the components which regulate dispersion (or scatter) and slant. Its relevance lies essentially in the multivariate case, where the components of the slant parameter can be interpreted component-wise andremain unaffected if marginalization with respect to some other components is performed. In the multivariateSN case, the components ofOP, denoted\xi, \Psi, \lambda, are associated to the expression of the densityfunction (5.30) of Azzalini & Capitanio (2014); see pp.128–131 for moreinformation. In the univariate case, the slant component ofDPand the one ofOP coincide, that is,\alpha=\lambda,Parameter\xi and other parameters which may exist with other families remain the same of theDP set. The termOP stands for‘original parameterization’ since this is, up to a negligible difference, the parameterization adopted by Azzalini & Dalla Valle (1996).
Details
While any choice of the components ofDP orOP isadmissible, this is not true forCP. An implication is that acall tocp2dp may fail with an error message"non-admissible CP"for certain input values. The most extreme case is represented by theSC family, for whichCP never exists; hence it makesto sense to callcp2dp withfamily="SC".
It is possible to call the functions withdp orcp having morecomponents than those expected for a given family as described above and inmakeSECdistr. In the univariate case, this means thatdporcp can be vectors of longer length than indicated earlier. Thisoccurrence is interpreted in the sense that the additional components afterthe first one are regarded as regression coefficients of aselm model,and they are transferred unchanged to the matching components of thetransformed parameter set; the motivation is given in Section 3.1.4 ofAzzalini and Capitanio (2014). In the multivariate case,dp[[1]] andcp[[1]] can be matrices instead of vectors; the rows beyond the firstone are transferred unchanged tocp[[1]] anddp[[1]],respectively.
References
Arellano-Valle, R. B. and Azzalini, A. (2013, available on-line 12 June 2011). The centred parameterization and related quantities of the skew-t distribution.J. Multiv. Analysis113, 73-90.
Azzalini, A. with the collaboration of Capitanio, A. (2014).The Skew-Normal and Related Families. Cambridge University Press, IMS Monographs series.
Azzalini, A. and Dalla Valle, A. (1996).The multivariate skew-normal distribution.Biometrika83, 715–726.
See Also
makeSECdistr,summary.SECdistr,sn.cumulants,
the ‘Note’ atsummary.selm for the reason whyCP is the default parameterization in that function and in related ones,
the ‘Examples’ atrmsn for use of theCPparameterization
Examples
# univariate casecp <- dp2cp(c(1, 2222, 3333, 2, 3), "SN")dp <- cp2dp(cp, "SN")# notice that the 2nd and the 3rd component remain unchanged## multivariate casedp3 <- list(xi=1:3, Omega=toeplitz(1/(1:3)), alpha=c(-3, 8, 5), nu=6)cp3 <- dp2cp(dp3, "ST")dp3.back <- cp2dp(cp3, "ST")#op3 <- dp2op(dp3, "ST")dp3back <- op2dp(op3,"ST")Skew-Cauchy Distribution
Description
Density function, distribution function, quantiles and randomnumber generation for the skew-Cauchy (SC) distribution.
Usage
dsc(x, xi = 0, omega = 1, alpha = 0, dp = NULL, log = FALSE)psc(x, xi = 0, omega = 1, alpha = 0, dp = NULL)qsc(p, xi = 0, omega = 1, alpha = 0, dp = NULL) rsc(n = 1, xi = 0, omega = 1, alpha = 0, dp = NULL)Arguments
x | vector of quantiles. Missing values ( |
p | vector of probabilities. Missing values ( |
xi | vector of location parameters. |
omega | vector of (positive) scale parameters. |
alpha | vector of slant parameters. |
dp | a vector of length 3 whose elements represent the parameters described above. If |
n | sample size. |
log | logical flag used in |
Value
density (dsc), probability (psc), quantile (qsc)or random sample (rsc) from the skew-Cauchy distribution with givenxi,omega andalpha parameters or from the extendedskew-normal iftau!=0
Details
Typical usages are
dsc(x, xi=0, omega=1, alpha=0, log=FALSE)dsc(x, dp=, log=FALSE)psc(x, xi=0, omega=1, alpha=0)psc(x, dp= )qsc(p, xi=0, omega=1, alpha=0)qsc(x, dp=)rsc(n=1, xi=0, omega=1, alpha=0)rsc(x, dp=)
Background
The skew-Cauchy distribution can be thought as a skew-t with tail-weightparameternu=1. In this case, closed-form expressions of the distribution function and the quantile function have been obtained byBehboodianet al. (2006).The key facts are summarized in Complement 4.2 of Azzalini and Capitanio (2014).A multivariate version of the distribution exists.
References
Azzalini, A. with the collaboration of Capitanio, A. (2014).The Skew-normal and Related Families. Cambridge University Press, IMS Monographs series.
Behboodian, J., Jamalizadeh, A., and Balakrishnan, N. (2006).A new class of skew-Cauchy distributions.Statist. Probab. Lett.76, 1488–1493.
See Also
Examples
pdf <- dsc(seq(-5,5,by=0.1), alpha=3)cdf <- psc(seq(-5,5,by=0.1), alpha=3)q <- qsc(seq(0.1,0.9,by=0.1), alpha=-2)p <- psc(q, alpha=-2) rn <- rsc(100, 5, 2, 5)Skew-Normal Distribution
Description
Density function, distribution function, quantiles and randomnumber generation for the skew-normal (SN) and the extended skew-normal (ESN) distribution.
Usage
dsn(x, xi=0, omega=1, alpha=0, tau=0, dp=NULL, log=FALSE)psn(x, xi=0, omega=1, alpha=0, tau=0, dp=NULL, engine, ...)qsn(p, xi=0, omega=1, alpha=0, tau=0, dp=NULL, tol=1e-8, solver="NR", ...) rsn(n=1, xi=0, omega=1, alpha=0, tau=0, dp=NULL)Arguments
x | vector of quantiles. Missing values ( |
p | vector of probabilities. Missing values ( |
xi | vector of location parameters. |
omega | vector of scale parameters; must be positive. |
alpha | vector of slant parameter(s); |
tau | a single value representing the ‘hidden mean’ parameter of theESN distribution; |
dp | a vector of length 3 (in theSN case) or 4 (in theESN case), whose components represent the individual parameters described above. If |
n | a positive integer representing the sample size. |
tol | a scalar value which regulates the accuracy of the result of |
log | logical flag used in |
engine | a character string which selects the computing engine;this is either |
solver | a character string which selects the numerical method used for solving the quantile equation; possible options are |
... | additional parameters passed to |
Value
density (dsn), probability (psn), quantile (qsn)or random sample (rsn) from the skew-normal distribution with givenxi,omega andalpha parameters or from the extendedskew-normal iftau!=0
Details
Typical usages are
dsn(x, xi=0, omega=1, alpha=0, log=FALSE)dsn(x, dp=, log=FALSE)psn(x, xi=0, omega=1, alpha=0, ...)psn(x, dp=, ...)qsn(p, xi=0, omega=1, alpha=0, tol=1e-8, ...)qsn(x, dp=, ...)rsn(n=1, xi=0, omega=1, alpha=0)rsn(x, dp=)
psn andqsn make use of functionT.Owenorbiv.nt.prob
Inqsn, the choicesolver="NR" selects the Newton-Raphson method for solving the quantile equation, while optionsolver="RFB"alternates a step ofregula falsi with one of bisection. The"NR" method is generally more efficient, but"RFB" is occasionally required in some problematic cases.
In version 1.6-2, the random number generation method forrsn haschanged; the so-called transformation method (also referred to as the‘additive representation’) has been adopted for all values oftau.Also, the code has been modified so that there is this form of consistency:providedset.seed() is reset similarly before calls, code likersn(5, dp=1:3) andrsn(10, dp=1:3), for instance, will start with the same initial values in the longer sequence as in the shorter sequence.
Background
The family of skew-normal distributions is an extension of the normalfamily, via the introdution of aalpha parameter which regulatesasymmetry; whenalpha=0, the skew-normal distribution reduces to the normal one. The density function of theSN distribution in the ‘normalized’ case havingxi=0 andomega=1 is2\phi(x)\Phi(\alpha x), if\phi and\Phi denote thestandard normal density and distribution function.An early discussion of the skew-normal distribution is given by Azzalini (1985); see Section 3.3 for theESN variant, up to a slight difference in the parameterization.
An updated exposition is provided in Chapter 2 of Azzalini and Capitanio (2014); theESN variant is presented Section 2.2. See Section 2.3 for an historical account. A multivariate version of the distribution is examined in Chapter 5.
References
Azzalini, A. (1985).A class of distributions which includes the normal ones.Scand. J. Statist.12, 171-178.
Azzalini, A. with the collaboration of Capitanio, A. (2014).The Skew-Normal and Related Families. Cambridge University Press, IMS Monographs series.
See Also
Functions used bypsn:T.Owen,biv.nt.prob
Related distributions:dmsn,dst,dmst
Examples
pdf <- dsn(seq(-3, 3, by=0.1), alpha=3)cdf <- psn(seq(-3, 3, by=0.1), alpha=3)q <- qsn(seq(0.1, 0.9, by=0.1), alpha=-2)r <- rsn(100, 5, 2, 5)qsn(1/10^(1:4), 0, 1, 5, 3, solver="RFB")Skew-t Distribution
Description
Density function, distribution function, quantiles and random number generation for the skew-t (ST) distribution.
Usage
dst(x, xi=0, omega=1, alpha=0, nu=Inf, dp=NULL, log=FALSE) pst(x, xi=0, omega=1, alpha=0, nu=Inf, dp=NULL, method=0, lower.tail=TRUE, log.p=FALSE, ...)qst(p, xi=0, omega=1, alpha=0, nu=Inf, tol=1e-08, dp=NULL, method=0, ...)rst(n=1, xi=0, omega=1, alpha=0, nu=Inf, dp=NULL)Arguments
x | vector of quantiles. Missing values ( |
p | vector of probabililities. |
xi | vector of location parameters. |
omega | vector of scale parameters; must be positive. |
alpha | vector of slant parameters. With |
nu | a single positive value representing the degrees of freedom;it can be non-integer. Default value is |
dp | a vector of length 4, whose elements represent location, scale(positive), slant and degrees of freedom, respectively. If |
n | a positive integer representing the sample size. |
log,log.p | logical; if |
tol | a scalar value which regulates the accuracy of the result of |
method | an integer value between |
lower.tail | logical; if |
... | additional parameters passed to |
Value
Density (dst), probability (pst), quantiles (qst) and random sample (rst) from the skew-t distribution with givenxi,omega,alpha andnu parameters.
Details
Typical usages are
dst(x, xi=0, omega=1, alpha=0, nu=Inf, log=FALSE)dst(x, dp=, log=FALSE)pst(x, xi=0, omega=1, alpha=0, nu=Inf, method=0, ...)pst(x, dp=, log=FALSE)qst(p, xi=0, omega=1, alpha=0, nu=Inf, tol=1e-8, method=0, ...)qst(x, dp=, log=FALSE)rst(n=1, xi=0, omega=1, alpha=0, nu=Inf)rst(x, dp=, log=FALSE)
Background
The family of skew-t distributions is an extension of the Student'st family, via the introduction of aalpha parameter which regulates skewness; whenalpha=0, the skew-t distribution reduces to the usual Student'st distribution. Whennu=Inf, it reduces to the skew-normal distribution. Whennu=1, it reduces to a form of skew-Cauchy distribution.See Chapter 4 of Azzalini & Capitanio (2014) for additional information. A multivariate version of the distribution exists; seedmst.
Details
For evaluation ofpst, and so indirectly ofqst, four different methods are employed.In all the cases, the actual computations are performed for the normalized valuesz=(x-xi)/omega).Method 1 consists in usingpmst with dimensiond=1.Method 2 appliesintegrate to the density functiondst.Method 3 again usesintegrate too but with a different integrand,as given in Section 4.2 of Azzalini & Capitanio (2003, full version ofthe paper).Method 4 consists in the recursive procedure of Jamalizadeh, Khosravi andBalakrishnan (2009), which is recalled in Complement 4.3 on Azzalini & Capitanio (2014); the recursion overnu starts from the explicit expression fornu=1 given bypsc.Method 5 is targeted to tail probabilities only, and it returnsNAsfor non-extremex values (those withabs(z)<=20); it is based on expressions given in Complement 4.4 of Azzalini and Capitanio (2014).Method 1 and 4 are only suitable for integer values ofnu.Method 4 becomes progressively less efficient asnu increases,because the value ofnu determines the number of nested calls, but the decay of efficiency is slower for larger values oflength(x).If the default argument valuemethod=0 is retained, an automatic choiceamong the above four methods is made, which depends on the values ofnu, alpha, z. The numerical accuracy of methods 1, 2 and 3 can be regulated via the... argument, while method 4 is conceptually exact, up to machine precision.
Ifqst is called withnu>1e4, the computation is transferred toqsn.
References
Azzalini, A. and Capitanio, A. (2003).Distributions generated by perturbation of symmetry with emphasis on a multivariate skew-t distribution.J.Roy. Statist. Soc. B65, 367–389.Full version of the paper athttps://arXiv.org/abs/0911.2342.
Azzalini, A. with the collaboration of Capitanio, A. (2014).The Skew-normal and Related Families. Cambridge University Press, IMS Monographs series.
Jamalizadeh, A., Khosravi, M., and Balakrishnan, N. (2009).Recurrence relations for distributions of a skew-t and a linearcombination of order statistics from a bivariate-t.Comp. Statist. Data An.53, 847–852.
See Also
Examples
pdf <- dst(seq(-4, 4, by=0.1), alpha=3, nu=5)rnd <- rst(100, 5, 2, -5, 8)q <- qst(c(0.25, 0.50, 0.75), alpha=3, nu=5)pst(q, alpha=3, nu=5) # must give back c(0.25, 0.50, 0.75)#p1 <- pst(x=seq(-3,3, by=1), dp=c(0,1,pi, 3.5))p2 <- pst(x=seq(-3,3, by=1), dp=c(0,1,pi, 3.5), method=2, rel.tol=1e-9)Extract the SEC error distribution from an object created byselm
Description
Given an object created by a call toselm, the function deliverstheSEC distribution representing the stochastic term of the fitted model
Usage
extractSECdistr(object, name, compNames)Arguments
object | an object of class |
name | an optional character string representing the name of the outcome distribution; if missing, a string is constructed from the |
compNames | in the multivariate case, an optional vector of character strings with the names of the components of the error distribution; if missing, one such vector is constructed from the |
Value
An object of classSECdistrMv orSECdistrUv, depending of the class ofobject.
Details
When the formula of the fitted model includes only the constant1,the returned object represents the fittedSEC distribution.If the formula includes additional terms, the linear predictor is eliminatedand the returned object corresponds to the error term of the model; hence the location parameterxi in theDP parameterizationis set to zero.
The returned object can be submitted to toolsavailable for objects created bymakeSECdistr, such assummary.SECdistr,conditionalSECdistr andand so on.
See Also
Examples
data(ais)m2 <- selm(log(Fe) ~ 1, family="ST", data=ais, fixed=list(nu=8))f2 <- extractSECdistr(m2)show(f2)#m4 <- selm(cbind(BMI, LBM) ~ 1, family="SN", data=ais)f4 <- extractSECdistr(m4)mean(f4)vcov(f4)Maximum-likelihood fitting of a univariate distribution from grouped data
Description
Maximum-likelihood fitting of a univariate distribution when the data arerepresented by a set of frequencies pertaining to given set of contiguous intervals.
Usage
fitdistr.grouped(breaks, counts, family, weights, trace = FALSE, wpar = NULL)Arguments
breaks | A numeric vector of strictly increasing values which identify a set of contiguous intervals on the real line.See ‘Details’ for additional information. |
counts | A vector of non-negative integers representing the number of observations falling in the intervals specified by |
family | A character string specifying the parametric family of distributions to be used for fitted. Admissible names are: |
weights | An alias for |
trace | A logical value which indicates whether intermediate evaluations of the optimization process are printed (default: |
wpar | An optional vector with initial values of the ‘working parameters’for starting the maximization of the log-likelihood function; see ‘Details’ for their description. |
Details
The original motivation of this function was fitting a univariateSN,ST orSC distribution from grouped data;its scope was later extended to include some other continuous distributions. The adopted name of the function reflects the broad similarity of its purpose with the one offitdistr, but there are substantial differences in the actual working of the two functions.
The parameter set of a givenfamily is the same as appearing in thecorrespondingd<basename> function, with the exception of the"t" distribution, for which a location and a scale parameter are included, besidesdf.
The range ofbreaks does not need to span the whole support of thechosenfamily of distributions, that is,(0, Inf) for"Weibull" and"gamma" families,(-Inf, Inf) for the other families.In fact, for the purpose of post-fitting plotting, an infiniterange(breaks) represents a complication, requiring anad hoc handling; so it is sensible to avoid it.However, at the maximum-likelihood fitting stage, the full support ofthe probability distribution is considered, with the following implications. Ifmax(breaks)=xR, say, andxR<Inf, then an additional interval(xR, Inf) is introduced, with valuecounts=0 assigned.A similar action is taken at the lower end: ifmin(breaks)=xL islarger than the infimum of the support of the distribution,an extra 0-counts interval is introduced as(0, xL) or(-Inf, xL), depending on the support of thefamily.
Maximum likelihood fitting is obtained by maximizing the pertaining multinomial log-likelihood using theoptim function withmethod="Nelder-Mead". For numerical convenience, the numerical search is performed using ‘working parameters’ in place of the originalones, with reverse conversion at the end. The working parameters coincidewith the original distribution parameters when they have unbounded range,while they are log-transformed in case of intrinsically positive parameters.This transformation applies to the parameters of the positive-valueddistributions ("gamma" and "Weibull"), all scale parameters anddfof the"t" distribution.
Value
An object of classfitdistr.grouped, whose components are described infitdistr.grouped-class.
Author(s)
Adelchi Azzalini
See Also
For methods pertaining to this class of objects, seefitdistr.grouped-class andplot.fitdistr.grouped; see alsodsn,dst,dsc,Distributions,dmultinom;see alsoselm for ungrouped data fitting and an exampleelaborated on below.
Examples
data(barolo)attach(barolo)A75 <- (reseller=="A" & volume==75)logPrice <- log(price[A75], 10) # used in documentation of 'selm'; see its fittingdetach(barolo)breaks<- seq(1, 3, by=0.25)f <- cut(logPrice, breaks = breaks)counts <- tabulate(f, length(levels(f))) logPrice.grouped <- fitdistr.grouped(breaks, counts, family='ST')summary(logPrice.grouped) # compare this fit with the ungrouped data fittingMethods for objects of class created byfitdistr.grouped
Description
A successful call to functionfitdistr.grouped creates an object of class, also namedfitdistr.grouped, for which a set of methods exist.The structure of an object of this class is described in section ‘Object components’.
Usage
## S3 method for class 'fitdistr.grouped'logLik(object, ...) ## S3 method for class 'fitdistr.grouped'coef(object, ...) ## S3 method for class 'fitdistr.grouped'vcov(object, ...) ## S3 method for class 'fitdistr.grouped'print(x, ...) ## S3 method for class 'fitdistr.grouped'summary(object, cor=FALSE, ...) ## S3 method for class 'fitdistr.grouped'fitted(object, full=FALSE, ...)Arguments
x,object | an object of class |
cor | logical (default= |
full | logical (default= |
... | further arguments passed to or from other methods. |
Object components
Components of an object of classfitdistr.grouped:
callthe matched call
familythe selected
familyof distributionslogLthe achieved maximum log-likelihood
parama vector of estimated parameters
vcovthe approximate variance-covariance matrix of the estimates
inputa list with the input quantities and some derived ones
opta list as returned by
optim
Author(s)
Adelchi Azzalini
References
Possolo, A., Merkatas, C. and Bodnar, O. (2019).Asymmetrical uncertainties.Metrologia 56, 045009.
See Also
the functionfitdistr.grouped, the plotting methodplot.fitdistr.grouped
Examples
data(barolo)attach(barolo)A75 <- (reseller=="A" & volume==75)logPrice <- log(price[A75], 10) # as used in selm documentation; see its fittingdetach(barolo)breaks <- seq(1, 3, by=0.25)f <- cut(logPrice, breaks = breaks)counts <- tabulate(f, length(levels(f))) fit.logPrice.gr <- fitdistr.grouped(breaks, counts, family='ST')summary(fit.logPrice.gr) # compare this fit with the ungrouped data fitting print(fit.logPrice.gr)coef(fit.logPrice.gr)vcov(fit.logPrice.gr)data.frame(intervals=levels(f), counts, fitted=format(fitted(fit.logPrice.gr)))full.intervals <- c("(-Inf, 1]", levels(f), "(3, Inf)")data.frame("full-range intervals" = full.intervals, "full-range counts" = c(0, counts, 0), "full-range fit" = fitted(fit.logPrice.gr, full=TRUE))sum(counts) - sum(fitted(fit.logPrice.gr)) sum(counts) - sum(fitted(fit.logPrice.gr, full=TRUE)) # must be "nearly 0" #---# Use first entry in Table 3 of Possolo et al. (2019) and do similar fitting# to the *probability* values, not observation countsafcrc59 <- 1.141breaks <- c(-Inf, afcrc59 - 0.033, afcrc59, afcrc59 + 0.037, Inf)prob <- c(0.16, 0.50, 0.84) cum.percent <- c(0, prob, 1)*100 fitSN <- fitdistr.grouped(breaks, counts=diff(cum.percent), family="SN") print(coef(fitSN))print(rbind(target=c(prob, 1)*100, fitted=cumsum(fitted(fitSN))), digits=5)# Note: given the nature of these data (i.e. probabilities, not counts), # there is no point to use vcov, logLik and summary on the fitted object.Four-number summary of a numeric vector
Description
Returns a quantile-based four-number summary of the input data
Usage
fournum(x, na.rm = TRUE, ...)Arguments
x | a numeric vector, maybe including |
na.rm | logical; if |
... | optional arguments passed to |
Details
Functionquantile is used to compute 7 octiles ofx, that is, quantiles of level(1:7)/8, denotedoct[1:7], and derive four summary quantities:
the median, which corresponds to
oct[4],the ‘(coefficient of) quartile deviation’ or semi-interquantile range:
(oct[6] - oct[2])/2;the Galton-Bowley measure of asymmetry, that is, skewness:
(oct[6] - 2 * oct[4] + oct[2])/(oct[6] - oct[2]);the Moors measure of kurtosis:
(oct[7] - oct[5] + oct[3] - oct[1])/(oct[6] - oct[2])
The term ‘coefficient of quartile deviation’ is adopted from theEncyclopedia of Statistical Sciences; see the references below.What is called Galton-Bowley measure here is often named ‘Bowley's measure’, but some sources attribute it to Francis Galton. For the Moors measure, see Moors (1988).
Value
a vector of length four containing the median, the quartile deviation, the Galton-Bowley measure and the Moors measure. However,ifx does not contain at least 8 values (after removingNAs),rep(NA,4) is returned.
Note
Computation of octiles makes real sense only iflength(x) issubstantially larger than 8.
Author(s)
Adelchi Azzalini
References
‘Quartile deviation, coefficient of’, in:Encyclopedia of Statistical Sciences, 2nd edition (2006). Editors: Samuel Kotz (Editor-in-Chief), Campbell B. Read, N. Balakrishnan, Brani Vidakovic. Volume 10, p.6743.
‘Skewness, Bowleys's measures of’, in:Encyclopedia of Statistical Sciences, 2nd edition (2006). Editors: Samuel Kotz (Editor-in-Chief), Campbell B. Read, N. Balakrishnan, Brani Vidakovic. Volume 12, p.7771-7773.
Moors, J.J.A. (1988). A quantile alternative for kurtosis.Source: Journal of the Royal Statistical Society. Series D (The Statistician), Vol. 37, pp. 25-32
See Also
Examples
fournum(datasets::rivers)Simulated sample from a skew-normal distribution
Description
A sample simulated from the SN(0,1,5) distribution with samplecoefficient of skewness inside the admissible range (-0.9952719, 0.9952719) for the skew-normal family butmaximum likelihood estimate on the frontier of the parameter space.
Usage
data(frontier)Format
A numeric vector of length 50.
Source
Generated by a run ofrsn(50, 0, 1, 5).
Examples
data(frontier, package="sn")fit <- selm(frontier ~ 1)plot(fit, which=2)#fit.p <- selm(frontier ~ 1, method="MPLE")plot(fit.p, which=2)Mapping of the (Galton-Bowley, Moors) measures to the (alpha,nu) parameters of a ST distribution
Description
Given a pair of (Galton-Bowley, Moors) measures of skewness and kurtosis for a given sample,galton_moors2alpha_nu delivers values (alpha,nu) such that a skew-t (ST) distribution with these slant and tail-weight parameter has its (Galton-Bowley, Moors) measures equal to the input values.Its simplified versiongalton2alpha uses only a Galton-Bowley measureto deliver aalpha value, assuming aSN distribution.These functions are mainly intended for internal package usage.
Usage
galton_moors2alpha_nu(galton, moors, quick = TRUE, move.in = TRUE, verbose = 0, abstol = 1e-04)galton2alpha(galton, move.in = TRUE)Arguments
galton | a numeric value, representing a Galton-Bowley measure |
moors | a numeric value, representing a Moors measure |
quick | a logical value; if |
move.in | if the input values |
verbose | a numeric value which regulates the amount of printed detail |
abstol | the tolerance value of the mapping, only relevant is |
Details
For background information about the Galton-Bowley's and the Moors measures, see the documentation offournum.The working of the mapping by described in Azzalini and Salehi (2020).
Value
forgalton_moors2alpha_nu, named vector of length two, with one or more descriptive attributes; forgalton2alpha, a singlealpha value.
Note
These functions are mainly intended for internal package usage.Specifically they are used byst.prelimFit.
Author(s)
Adelchi Azzalini
References
Azzalini, A. and Salehi, M. (2020).Some computational aspects of maximum likelihood estimation of the skew-t distribution.In:Computational and Methodological Statistics and Biostatistics,edited by Andriëtte Bekker, Ding-Geng Chen and Johannes T. Ferreira.Springer. DOI: 10.1007/978-3-030-42196-0
See Also
Examples
galton_moors2alpha_nu(0.5, 3, quick=FALSE) # input in the feasible areagalton_moors2alpha_nu(0.5, 3) # very similar output, much more quicklygalton_moors2alpha_nu(0.5, 0.5) # input outside the feasible areaBuild a skew-elliptically contoured distribution
Description
Build an object which identifies a skew-elliptically contoured distribution (SEC), in the univariate and in the multivariate case.The term ‘skew-elliptical distribution’ is a synonym ofSEC distribution.
Usage
makeSECdistr(dp, family, name, compNames)Arguments
dp | a numeric vector (in the univariate case) or a list (in themultivariate case) of parameters which identify the specific distributionwithin the named |
family | a character string which identifies the parametricfamily; currently, possible values are:"SN","ESN","ST","SC". See ‘Details’ for additional information. |
name | an optional character string with the name of the distribution.If missing, one is created. |
compNames | in the multivariate case, an optional vector of characterstrings with the names of the component variables; its length must be equal to the dimensionality of the distribution being generated. If missing and the first component of |
Details
Ifdp is a numeric vector, a univariate distribution is built. Alternatively, ifdp is a list, a multivariate distribution is built. In both cases, the required number of components ofdp depends onfamily: it must be3 for"SN" and"SC"; it must be4 for"ESN" and"ST".
In the univariate case, the first three components ofdp represent what for the specific distributions are denotedxi (location),omega (scale, positive) andalpha (slant); see functionsdsn,dst,dsc for theirdescription. The fourth component, when it exists, represents eithertau (hidden variable mean) for"ESN" ornu (degrees of freedom) for"ST". The names of the individual parameters are attachedto the components ofdp in the returned object.
In the multivariate case,dp is a list with components having similar role as in the univariate case, butxi=dp[[1]] andalpha=dp[[3]] are now vectors and the scale parameterOmega=dp[[2]] is a symmetric positive-definite matrix. For a multivariate distribution of dimension 1 (which can be created, although a warning message is issued),Omega corresponds to the square ofomega in the univariate case. Vectorsxi andalpha must be of lengthncol(Omega).See also functionsdmsn,dmst anddmsc.The fourth component, when it exists, is a scalar with the same role as in the univariate case.
In the univariate casealpha=Inf is allowed, but in the multivariatecase all components of the vectoralpha must be finite.
An object built by this function operates according to the S4 protocol.
Value
In the univariate case, an object of classSECdistrUv;in the multivariate case, an object of classSECdistrMv.SeeSECdistrUv-class andSECdistrMv-classfor their description.
Background
For background information, see Azzalini and Capitanio (2014), specificallyChapters 2 and 4 for univariate cases, Chapters 5 and 6 for multivariatecases; Section 6.1 provides a general formulation ofSECdistributions.
If the slant parameteralpha is0 (or a vector of0's,in the multivariate case), the distribution is of classical elliptical type.
TheESN distribution is included here as a members of theSEC class, with a very slight extension of the original definitionof this class, since the only difference is the non-zero truncation pointof the unobserved component of the(d+1)-dimensionalEC variable.
Author(s)
Adelchi Azzalini
References
Azzalini, A. with the collaboration of Capitanio, A. (2014).The Skew-Normal and Related Families. Cambridge University Press, IMS Monographs series.
See Also
The description of classesSECdistrUv-class andSECdistrMv-class
plot.SECdistr for plotting andsummary.SECdistr for summaries
Related functionsdsn,dst,dsc,dmsn,dmst,dp2cp
FunctionsaffineTransSECdistr andconditionalSECdistr to manipulate objects of classSECdistrMv-class
FunctionextractSECdistr to extract objects of classSECdistrMv-class andSECdistrUv-class representing theSEC distribution of aselm fit
Examples
f1 <- makeSECdistr(dp=c(3,2,5), family="SN", name="First-SN")show(f1)summary(f1)plot(f1)plot(f1, probs=c(0.1, 0.9))#f2 <- makeSECdistr(dp=c(3, 5, -4, 8), family="ST", name="First-ST")f9 <- makeSECdistr(dp=c(5, 1, Inf, 0.5), family="ESN", name="ESN,alpha=Inf")#dp0 <- list(xi=1:2, Omega=diag(3:4), alpha=c(3, -5))f10 <- makeSECdistr(dp=dp0, family="SN", name="SN-2d", compNames=c("u1", "u2"))#dp1 <- list(xi=1:2, Omega=diag(1:2)+outer(c(3,3),c(2,2)), alpha=c(-3, 5), nu=6)f11 <- makeSECdistr(dp=dp1, family="ST", name="ST-2d", compNames=c("t1", "t2"))Build an object representing a SUN distribution
Description
Build an object which identifies a Unified Skew-Normal distribution (SUN) within this parametric family.TheSUN family is essentially equivalent to some other parametric families examined in the literature, notably the Closed Skew-Normal.
Usage
makeSUNdistr(dp, name, compNames, HcompNames, drop = TRUE)Arguments
dp | a list of parameters as described at |
name | an optional character string with the name of the distribution. If missing, one is created. |
compNames | an optional vector of character strings with the names of the component variables; its length must be equal to the dimensionality |
HcompNames | an optional vector of character strings with the names of the hidden component variables; its length must be equal to the dimensionality component |
drop | a logical value (default: |
Details
The argumentdp is a list, whose components are described atSUNdistr-base; see especially the ‘Details’ there.In this respect, there is no difference between the univariate and theunivariate case, differently from the similar commandmakeSECdistr.
If the argumentsname,compNames andHcompNamesare missing, they are composed from the supplied arguments.
ASUNdistr-class object operates according to the S4 protocol.
Value
An object ofSUNdistr-class
Note
The present structure and user interface of this function, and of other ones related to theSUN distribution, must be considered experimental, and they might possibly change in the future.
Author(s)
Adelchi Azzalini
See Also
Basic information on the SUN distributionSUNdistr-base,the description of the classSUNdistr-class,
Related methods:show.SUNdistr for displaying the object constituents,plot.SUNdistr for plotting,mean.SUNdistr for the mean value,vcov.SUNdistr for the variance matrix,summary.SUNdistr for various summary quantities
FunctionsSUNdistr-op manipulate objects created by this function, producing newSUNdistr-class objects
Examples
xi <- c(1, 0, -1)Omega <- matrix(c(2,1,1, 1,3,1, 1,1,4), 3, 3)Delta <- matrix(c(0.72,0.20, 0.51,0.42, 0.88, 0.94), 3, 2, byrow=TRUE)Gamma <- matrix(c(1, 0.8, 0.8, 1), 2, 2)dp3 <- list(xi=xi, Omega=Omega, Delta=Delta, tau=c(-0.5, 0), Gamma=Gamma)sun3 <- makeSUNdistr(dp=dp3, name="SUN3", compNames=c("x", "w", "z"))show(sun3)vech, tr and other matrix operators
Description
vech and other matrix operators
Usage
vech(A)vech2mat(v)duplicationMatrix(n)tr(x)blockDiag(...)Arguments
A | a (symmetric) square numeric matrix. |
v | a numeric vector such that |
n | a positive integer number; default is |
x | a square numeric matrix. |
... | an abitrary numer of matrices or objects coercible into matrices. |
Value
a vector in case ofvech, a scalar in case oftr, otherwise a matrix.
Details
For a square matrixA,vech(A) returns the vector formedby the lower triangular portion of the matrix, including the diagonal; usually, this only makes sense for a symmetric matrix of numeric values. Ifv=vech(M) whereM is a symmetric numeric matrix,vech2mat(v) performs the inverse operation and returns the original matrixM; this explain the requirement onlength(v).
For a positive integern,D=duplicationMatrix(n) is a matrix of dimension(n^2, n*(n+1)/2) such thatD %*% vech(M) returns thevec-form of a symmetric matrixM of ordern, that is, the vector which stacks the columns ofM;for more information, see Section 3.8 of Magnus and Neudecker (1988).
For a square numeric matrixx,tr(x) returns its trace.
blockDiag(...) creates a block-diagonal matrix from a set of matrices or objects coercible into matrices. Generally, this is useful only fornumeric objects.
Author
Adelchi Azzalini; the original Octave code ofduplicationMatrix is by Kurt Hornik.
References
Magnus, Jan R. and Neudecker, Heinz (1988).Matrix differentialcalculus with application in statistics and econometrics. Wiley series in probability and statistics.
Examples
M <- toeplitz(1:4)v <- vech(M)vech2mat(v) - MD <- duplicationMatrix(ncol(M))# D %*% vech(M) - as.vector(M), must be a one-column matrix of 0str(outer(1:4,2:5))blockDiag(M[1:2,], 1:2, diag(5:6))The mode of a skew-elliptically contoured (SEC) distribution
Description
Compute compute the mode of a univariate or multivariateSECdistribution.
Usage
modeSECdistr(dp, family, object=NULL)Arguments
dp | a numeric vector (in the univariate case, for class |
family | a character string which identifies the parametricfamily among those admissible for classes |
object | an object of class |
Value
a numeric vector
Background
The mode is obtained through numerical maximization.In the multivariate case, the problem is reduced to a one-dimensional search using Propositions 5.14 and 6.2 of the reference below.
References
Azzalini, A. with the collaboration of Capitanio, A. (2014).The Skew-Normal and Related Families. Cambridge University Press, IMS Monographs series.
See Also
makeSECdistr andextractSECdistrfor additional information and for constructing a suitableobject,
SECdistrUv-class andSECdistrMv-classfor methodsmean andvcov which compute the mean (vector) and the variance (matrix) of theobject distribution
Examples
dp3 <- list(xi=1:3, Omega=toeplitz(1/(1:3)), alpha=c(3,-1,2), nu=5)st3 <- makeSECdistr(dp3, family="ST", name="ST3", compNames=c("U", "V", "W"))mode1 <- modeSECdistr(dp3, "ST")mode2 <- modeSECdistr(object=st3) # the same of mode1Packagesn: overview of the structure and main commands
Description
The package provides facilities to build and manipulate probability distributions of the skew-normal (SN) and some related families, notably the skew-t (ST) and the ‘unified skew-normal’ (SUN) families. For theSN,ST and skew-Cauchy (SC) families,statistical methods are made available for data fitting and model diagnostics, in the univariate and the multivariate case.
Two main sides
The package comprises two main sides: one side provides facilities for the pertaining probability distributions; the other one deals with relatedstatistical methods.Underlying formulation, parameterizations of distributions and terminology are in agreement with the monograph of Azzalini and Capitanio (2014).
Probability side
There are two layers of support for the probability distributions of interest.At the basic level, there exist functions which follow the classicalR scheme for distributions.In addition, there exists facilities to build an object which incapsulates a probability distribution and then certain operations can be be performed on such an object;these probability objects operate according to the S4 protocol. The two schemes are described next.
- ClassicalR scheme
The following functions work similary to
{d,p,q,r}normand otherRfunctions for probability distributions:skew-normal (SN): functions
{d,p,q,r}snfor the univariate case, functions{d,p,r}msnfor the multivariate case, where in both cases the ‘Extended skew-normal’ (ESN) variant form is included;skew-
t(ST): functions{d,p,q,r}stfor theunivariate case, functions{d,p,r}mstfor the multivariate case;skew-Cauchy (SC): functions
{d,p,q,r}scfor the univariate case, functions{d,p,r}mscfor the multivariate case.
In addition to the usual specification of their parameters as a sequence ofindividual components, a parameter set can be specified as a single
dpentity, namely a vector in the univariate case, a list in the multivariatecase;dpstands for ‘Direct Parameters’ (DP).Another set of parameters is in use, denoted Centred Parameters (CP),which are more convenient for interpretability, since they correspond tofamiliar quantifies, such as the mean and the standard deviation. Conversion from the
dpparameter set to the correspondingCPparameters can be accomplished using the functiondp2cp,while functioncp2dpperforms the inverse transformation.TheSUN family is mainly targeted to the multivariate context, and this is reflected in the organization of the pertaining functions, although univariateSUN distributions are supported.Density, distribution function and random numbers are handled by
{d,p,r}sun. Mean value, variance matrix and Mardia's measuresof multivariate skewness and kurtosis are computed bysun{Mean,Vcov,Mardia}.In addition, one can introduce a user-specified density function using
dSymmModulatedanddmSymmModulated, in the univariate and themultivariate case, respectively. These densities are of the‘symmetry-modulated’ type, also called ‘skew-symmetric’, whereone can specify the base density and the modulation factor with high degree offlexibility. Random numbers can be sampled using the corresponding functionsrSymmModulatedandrmSymmModulated. In the bivariate case,a dedicated plotting function exists.- Probability distribution objects:SEC families
Function
makeSECdistrcan be used to build a ‘SECdistribution’ object representing a member of a specified parametric family(among the typesSN, ESN, ST, SC) with a givendpparameterset. This object can be used for various operations such as plotting orextraction of moments and other summary quantities. Another way of constructing aSEC distribution object is viaextractSECdistrwhich extracts suitable components of an object produced by functionselmto be described below.Additional operations on these objects are possible in the multivariate case,namely
marginalSECdistrandaffineTransSECdistrfor marginalization and affine trasformations. For the multivariateSN family only (but includingESN),conditionalSECdistrperforms a conditioning on the values taken on by some components of the multivariate variable.- Probability distribution objects: theSUN family
Function
makeSUNdistrcan be used to build aSUNdistribution object representing a member of theSUN parametric family.This object can be used for various operations such as plotting orextraction of moments and other summary quantities.Moreover there are several trasformation operations which can be performedon aSUN distribution object, or two such objects in some cases:computing a (multivariate) marginal distribution, a conditional distribution(on given values of some components or on one-sided intervals), an affinetrasformation, a convolution (that is, the distribution of the sum of twoindependent variables), and joining two distributions under assumption ofindependence.
Statistics side
The main function for data fitting is represented byselm, which allowsto specify a linear regression model for the location parameter, similarly to functionlm, but assuming askew-elliptical distributionof the random component;this explains the nameselm=(se+lm). Allowed types of distributionsareSN (but notESN),ST andSC.The fitted distribution is univariate or multivariate, depending on the natureof the response variable of the posited regression model. The model fittingmethod is either maximum likelihood or maximum penalized likelihood; the latter option effectively allows the introduction of a prior distribution on the slant parameter of the error distribution, hence leading to a ‘maximum a posteriori’ estimate.
Once the fitting process has been accomplished, an object of class eitherselm (for univariate response) ormselm (for multivariate response) is produced.A number of ‘methods’ are available for these objects:show,plot,summary,coef,residuals,logLik and others.For univariateselm-class objects, univariate and bivariate profilelog-likelihood functions can be obtained; apredict method also exists.These methods are built following the S4 protocol; however, the user must notbe concerned with the choice of the adopted protocol (unless this is wished).
The actual fitting process invoked viaselm is actually performed by aset of lower-level procedures. These are accessible for direct call, if so wished, typically for improved efficiency, at the expense of a little additional programming effort. Similarly, functions to compute the Fisher information matrix are available, in the expected and the observed form (with some restrictions depending on the selected distribution).
TheextractSECdistr function extracts the fittedSEC distribution fromselm-class andmselm-class objects, henceproviding a bridge with the probability side of the package.
The facilities for statistical work do not support theSUN family.
Additional material
Additional material is available in the section ‘User guides, package vignettes and other documentation’accessible from the front page of the documentation. See especially the documentpkg_sn-intro.pdf
Author
Adelchi Azzalini.Please send comments, error reportset cetera to the author, whose web page ishttp://azzalini.stat.unipd.it/.
References
Azzalini, A. with the collaboration of Capitanio, A. (2014).The Skew-Normal and Related Families. Cambridge University Press, IMS Monographs series.
Plotting methods for classesSECdistrUv andSECdistrMv
Description
Plotting methods for classesSECdistrUv andSECdistrMv
Usage
## S4 method for signature 'SECdistrUv'plot(x, range, probs, main, npt = 251, ...)## S4 method for signature 'SECdistrMv'plot(x, range, probs, npt, landmarks = "auto", main, comp, compLabs, data = NULL, data.par = NULL, gap = 0.5, ...)Arguments
x | an object of class |
range | in the univariate case, a vector of length 2 which defines the plotting range; in the multivariate case, a matrix with two rows where each column defines the plotting range of the corresponding component variable. If missing, a sensible choice is made. |
probs | a vector of probability values. In the univariate case, thecorresponding quantiles are plotted on the horizontal axis; it can beskipped by setting |
npt | a numeric value or vector (in the univariate and in the multivariate case, respectively) to assign the number of evaluation pointsof the distribution, on an equally-spaced grid over the |
landmarks | a character string which affects the placement of somelandmark values in the multivariate case, that is, the origin, the modeand the mean (or its substitute pseudo-mean), which are all aligned.Possible values: |
main | a character string for main title; if missing, one is builtfrom the available ingredients. |
comp | a subset of the vector |
compLabs | a vector of character strings or expressions used to denotethe variables in the plot; if missing, |
data | an optional set of data of matching dimensionity of |
data.par | an optional list of graphical parameters used for plotting |
gap | a numeric value which regulates the gap between panels of amultivariate plot when |
... | additional graphical parameters |
Value
an invisible list. In the univariate case the list has three components:the input object representing the distribution and two numeric vectors with the coordinates of the plotted density values. In the multivariate case, the first element of the list is the input object representing the distribution and all subsequent list elements are lists with components of the panels comprising the matrix plot; the elements of these sub-lists are: the vectors ofx andy coordinates, the names of the variables, the density values at the(x,y) points, a vector of the density levels of the curves appearing in each panel plot, with the corresponding approximateprobability content as a vector attribute.
Details
For univariate density plots,probs are used to compute quantilesfrom the appropriate distribution, and these are superimposed to the plot ofthe density function, unlessprobs=NULL. In the multivariate case,each bivariate plot is constructed as a collection of contour curves,one curve for each probability level; consequently,probs cannot be missing orNULL. The level of the density contour lines are chosen so that each curve circumscribes a region with the quoted probability, to a good degree of approssimation; for additional information, see Azzalini and Capitanio (2014), specifically Complement 5.2 and p.179, and references therein.
Methods
signature(x = "SECdistrUv")Plot an object
xof classSECdistrUv.signature(x = "SECdistrMv")Plot an object
xof classSECdistrMv.
Author(s)
Adelchi Azzalini
References
Azzalini, A. with the collaboration of Capitanio, A. (2014).The Skew-Normal and Related Families. Cambridge University Press, IMS Monographs series.
See Also
makeSECdistr,summary.SECdistr,dp2cp
Examples
# d=1f1 <- makeSECdistr(dp=c(3,2,5), family="SC", name="Univariate Skew-Cauchy")plot(f1)plot(f1, range=c(-3,40), probs=NULL, col=4)# # d=2Omega2 <- matrix(c(3, -3, -3, 5), 2, 2) f2 <- makeSECdistr(dp=list(c(10,30), Omega=Omega2, alpha=c(-3, 5)), family="sn", name="SN-2d", compNames=c("x1","x2"))plot(f2) x2 <- rmsn(100, dp=slot(f2,"dp"))plot(f2, main="Distribution 'f2'", probs=c(0.5,0.9), cex.main=1.5, col=2, cex=0.8, compLabs=c(expression(x[1]), expression(log(z[2]-beta^{1/3}))), data=x2, data.par=list(col=4, cex=0.6, pch=5))Plotting method for classSUNdistr
Description
Plotting method for classSUNdistr
Usage
## S4 method for signature 'SUNdistr'plot(x, range, nlevels = 8, levels, npt, main, comp, compLabs, gap = 0.5, ...)Arguments
x | an object of class |
range | in the univariate case, a vector of length 2 which defines the plotting range; in the multivariate case, a matrix with two rows where each column defines the plotting range of the corresponding component variable. If missing, a sensible choice is made. |
nlevels | number of contour levels desirediff levels is not supplied. |
levels | numeric vector of levels at which to draw contour lines. |
npt | a numeric value or vector (in the univariate and in the multivariate case, respectively) to assign the number of evaluation pointsof the distribution, on an equally-spaced grid over the |
main | a character string for main title; if missing, one is builtfrom the available ingredients. |
comp | an optional integer vector representing the subset of the vector |
compLabs | a vector of character strings or expressions used to labelthe variables in the plot; if missing, |
gap | a numeric value which regulates the gap between panels of amultivariate plot when |
... | additional graphical parameters |
Details
For univariate density plots,probs are used to compute quantilesfrom the appropriate distribution, and these are superimposed to the plot ofthe density function, unlessprobs=NULL. In the multivariate case,each bivariate plot is constructed as a collection of contour curves,one curve for each probability level; consequently,probs cannot be missing orNULL. The level of the density contour lines are chosen so that each curve circumscribes a region with the quoted probability, to a good degree of approssimation; for additional information, see Azzalini and Capitanio (2014), specifically Complement 5.2 and p.179, and references therein.
Value
an invisible list. In the univariate case the list has three components:the input object representing the distribution and two numeric vectors with the coordinates of the plotted density values. In the multivariate case, the first element of the list is the input object representing the distribution and all subsequent list elements are lists with components of the panels comprising the matrix plot; the elements of these sub-lists are: the vectors ofx andy coordinates, the names of the variables, the density values at the(x,y) points, a vector of the density levels of the curves appearing in each panel plot.
Author(s)
Adelchi Azzalini
See Also
Examples
xi <- c(1, 0, -1)Omega <- matrix(c(2,1,1, 1,3,1, 1,1,4), 3, 3)Delta <- matrix(c(0.72,0.20, 0.51,0.42, 0.88, 0.94), 3, 2, byrow=TRUE)Gamma <- matrix(c(1, 0.8, 0.8, 1), 2, 2)dp3 <- list(xi=xi, Omega=Omega, Delta=Delta, tau=c(-0.5, 0), Gamma=Gamma)sun3 <- makeSUNdistr(dp=dp3, name="SUN3", compNames=c("x", "w", "z"))plot(sun3, npt=rep(51,3))p <- plot(sun3, comp=2:3, compLabs=c(expression(x[2]), expression(x[3])))# str(p)Plot an object generated byfitdistr.grouped
Description
The object produced byfitdistr.grouped is plotted in the formof an histogram of the input original observed frequencies with superimposed the fitted parametric density function.
Usage
## S3 method for class 'fitdistr.grouped'plot(x, freq = FALSE, col = "grey90", border = "grey80", pdfcol = "blue", main, sub = NULL, xlab, ylab, xlim, ylim, axes = TRUE, labels = FALSE, ...)Arguments
x | a |
freq | logical; if |
col | a colour to be used to fill the bars. |
border | the colour of the border around the bars. |
pdfcol | the colour of the fitted parametric distribution. |
main,sub,xlab,ylab | these arguments to |
xlim,ylim | the range of |
axes | logical, indicating if axes should be drawn. |
labels | logical or character. Additionally draw labels on top of bars, if not FALSE; if TRUE, draw the counts or rounded densities; if labels is a character, draw itself. |
... | further graphical parameters to title and axis. |
Details
The function builds onplot.histogram, but some lesser features have been dropped.
Value
A list with the following components
hist | an object of class |
x,y | the values used for plotting the fitted parametric distribution. |
Author(s)
Adelchi Azzalini
See Also
fitdistr.grouped for generating a suitable object,plot.histogram for the related more general function.
Examples
data(barolo)attach(barolo)A75 <- (reseller=="A" & volume==75)logPrice <- log(price[A75],10) # as used in selm documentation; see its fittingdetach(barolo)breaks<- seq(1, 3, by=0.25)f <- cut(logPrice, breaks = breaks)freq <- tabulate(f, length(levels(f))) logPrice.grouped <- fitdistr.grouped(breaks, freq, family='ST')plot(logPrice.grouped)Diagnostic plots forselm fits
Description
Diagnostic plots for objects of classselmandmselm generated by a call to functionselm
Usage
## S4 method for signature 'selm'plot(x, param.type="CP", which = c(1:4), caption, panel = if (add.smooth) panel.smooth else points, main = "", ask = prod(par("mfcol")) < length(which) && dev.interactive(), ..., id.n = 3, labels.id = names(x@residuals.dp), cex.id = 0.75, identline = TRUE, add.smooth = getOption("add.smooth"), label.pos = c(4, 2), cex.caption = 1) ## S4 method for signature 'mselm'plot(x, param.type="CP", which, caption, panel = if (add.smooth) panel.smooth else points, main = "", ask = prod(par("mfcol")) < length(which) && dev.interactive(), ..., id.n = 3, labels.id = names(x@residuals.dp), cex.id = 0.75, identline = TRUE, add.smooth = getOption("add.smooth"), label.pos = c(4, 2), cex.caption = 1)Arguments
x | an object of class |
param.type | a character string which selects the type of residualsto be used for some of of the plots; possible values are: |
which | if a subset of the plots is required, specify a subset of |
caption | a vector of character strings with captions to appear above the plots. |
panel | panel function. The useful alternative to |
main | title to each plot, in addition to the above caption. |
ask | logical; if |
... | other parameters to be passed through to plotting functions. |
id.n | number of points to be labelled in each plot, starting with themost extreme. |
labels.id | vector of labels, from which the labels for extreme pointswill be chosen. |
cex.id | magnification of point labels. |
identline | logical indicating if an identity line should be added toQQ-plot and PP-plot (default: |
add.smooth | logical indicating if a smoother should be added to most plots; see also |
label.pos | positioning of labels, for the left half and right half of the graph respectively, for plots 1-3. |
cex.caption | controls the size of |
Details
The meaning ofparam.type is described indp2cp. However, for these plot only the first parametercomponent is relevant, which affects the location of the residuals; the othercomponents are not computed. Moreover, forQQ-plot andPP-plot,DP-residuals are used irrespectively ofparam.type; see Section ‘Background’.
Valueswhich=1 andwhich=2 have adifferent effect for object of class"selm" and class"mselm".In the univariate case,which=1 plots the residual values versus thefitted values ifp>1, wherep denotes the number of covariatesincluding the constant; ifp=1, a boxplot of the response is produced.Valuewhich=2 produces an histogram of the residuals with superimposedthe fitted curve, whenp>1; ifp=1, a similar plot is generatedusing the response variable instead of the residuals. Default value forwhich is1:4.
In the multivariate case,which=1 is feasible only ifp=1 and itdisplays the data scatter with superimposed the fitted distribution. Valuewhich=2 produces a similar plot but for residuals instead ofdata. Default value for codewhich is2:4 ifp>1, otherwisec(1,3,4).
Valuewhich=3 produces a QQ-plot, both in the univariate and in themultivariate case; the difference is that the squares of normalized residualsand suitably defined Mahalanobis distances, respectively, are used in the twocases. Similarly,which=4 produces a PP-plot, working in a similarfashion.
Background
Healy-type graphical diagnostics, in the form of QQ- and PP-plots, for the multivariate normal distribution have been extended to the skew-normaldistribution by Azzalini and Capitanio (1999, section 6.1), and subsequently to the skew-t distribution in Azzalini and Capitanio (2003). A brief explanation in the univariateSN case is provided in Section 3.1.1 of Azzalini and Capitanio (2014); see also Section 3.1.6.For the univariateST case, see p.102 and p.111 of the monograph.The multivariate case is discussed in Section 5.2.1 as for theSNdistribution, in Section 6.2.6 as for theST distribution.
Author(s)
Adelchi Azzalini
References
Azzalini, A. and Capitanio, A. (1999).Statistical applications of the multivariate skew normal distribution.J.Roy.Statist.Soc. B61, 579-602. Full-length version available athttps://arXiv.org/abs/0911.2093
Azzalini, A. and Capitanio, A. (2003).Distributions generated by perturbation of symmetry with emphasis ona multivariate skewt distribution.J.Roy. Statist. Soc. B65, 367-389.Full-length version available athttps://arXiv.org/abs/0911.2342
Azzalini, A. with the collaboration of Capitanio, A. (2014).The Skew-Normal and Related Families. Cambridge University Press, IMS Monographs series.
See Also
Examples
data(wines)#m10 <- selm(flavanoids ~ 1, family="SN", data=wines, subset=(wine=="Barolo"))plot(m10)plot(m10, which=c(1,3)) # fig 3.1 and 3.2(a) of Azzalini and Capitanio (2014)#m18 <- selm(acidity ~ sugar + nonflavanoids + wine, family="SN", data=wines)plot(m18)plot(m18, param.type="DP")#m28 <- selm(cbind(acidity, alcohol) ~ sugar + nonflavanoids + wine, family="SN", data=wines)plot(m28, col=4) # data(ais)m30 <- selm(cbind(RCC, Hg, Fe) ~ 1, family="SN", data=ais)plot(m30, col=2, which=2)# multiple plots on the same sheet par(mfcol=c(2,2)) plot(m30, which=1:3)par(mfcol=c(1,1))The distribution of the product of two jointly normal ort variables
Description
Consider the productW=X_1 X_2 from a bivariate random variable(X_1, X_2) having joint normal or Student'st distribution, with 0 location and unit scale parameters. Functions are provided for the distribution function ofW in the normal and thet case, and the quantile function for thet case.
Usage
pprodn2(x, rho)pprodt2(x, rho, nu)qprodt2(p, rho, nu, tol=1e-5, trace=0)Arguments
x | a numeric vector |
p | a numeric vector of probabilities |
rho | a scalar value representing the correlation (or the matchingterm in thet case when correlation does not exists) |
nu | a positive scalar representing the degrees of freedom |
tol | the desired accuracy (convergence tolerance),passed to function |
trace | integer number for controlling tracing information, passed on to |
Details
Functionpprodt2 implements formulae in Theorem 1 of Wallgren (1980).Corresponding quantiles are obtained byqprodt2 by solving the pertaining non-linear equations with the aid ofuniroot,one such equation for each element ofp.
Functionpprodn2 implements results for the central case inTheorem 1 of Aroian et al. (1978).
Value
a numeric vector
Author(s)
Adelchi Azzalini
References
Aroian, L.A., Taneja, V.S, & Cornwell, L.W. (1978).Mathematical forms of the distribution of the product of two normal variables.Communications in statistics. Theory and methods, 7, 165-172
Wallgren, C. M. (1980).The distribution of the product of two correlatedt variates.Journal of the American Statistical Association, 75, 996-1000
See Also
Examples
p <- pprodt2(-3:3, 0.5, 8)qprodt2(p, 0.5, 8)Predict method for selm-class objects
Description
Predicted values based on a model object produced byselm with univariate response.
Usage
## S3 method for class 'selm'predict(object, newdata, param.type = "CP", interval = "none", level = 0.95, na.action = na.pass, ...)Arguments
object | an object of class |
newdata | an optional data frame in which to look for variables withwhich to predict. If omitted, the fitted values are used. |
param.type | a character string with the required parameterization; it must be one of |
interval | type of interval calculation among |
level | tolerance/confidence level (default value is |
na.action | function determining what should be done with missing values in newdata. The default is to predict |
... | not used, only there for compatibility reasons. |
Details
Predicted values are obtained by evaluating the regression function in the dataframenewdata (which defaults tomodel.frame(object)).Settinginterval other than"none" produces computation ofconfidence or prediction (tolerance) intervals at the specified level.
Ifnewdata is omitted the predictions are based on the data used for the fit.
The action taken in case of missing data is regulated by argumentna.action, along the lines of functionpredict.lm.
A detailed description of the methodology underlyingpredict.selmis available in the technical note of Azzalini (2016).
Value
a vector of predictions (ifinterval="none") or a matrix ofpredictions and bounds with column namesfit,lwr, andupr, ifinterval is set.
Author(s)
Adelchi Azzalini
References
Azzalini, A. (2016). Derivation of various types of intervals from aselm object.Technical note distributed with the documentation of theR packagesn, in fileselm-intervals.pdf within section ‘User guide, package vignettes and other documentation’.
See Also
makeSECdistr for theCP/DP parameterizations,
predict.lm for usage ofna.action
Examples
data(barolo)attach(barolo) A75 <- (reseller=="A" & volume==75)detach(barolo)m3 <- selm(log(price, 10) ~ age, data=barolo[A75,], family="ST")Profile log-likelihood function of selm-class objects
Description
One- or two-dimensional profile (penalized) log-likelihood function of aselm fit and corresponding confidence interval or regions
Usage
## S3 method for class 'selm'profile(fitted, param.type, param.name, param.values, npt, opt.control = list(), plot.it = TRUE, log = TRUE, levels, trace = FALSE, ...)Arguments
fitted | an object of class |
param.type | a character string with the required parameterization; it must be either |
param.name | either a single character string or a vector of two such terms with the name(s) of the parameter(s) for which the profile log-likelihood is required; these names must match those appearing in |
param.values | in the one-parameter case, a numeric vector with the values where the log-likelihood must be evaluated; in the two-parametercase, a list of two such vectors used to build a grid of coordinates of points. Their range must identify an interval or a rectangle whichincludes theMLE orMPLE obtained by |
npt | in case the vector or any of the vectors of argument |
opt.control | an optional list passed as argument |
plot.it | a logical value; if |
log | a logical value (default: |
levels | a single probability value (in the one-parameter case) or a vector of such values (in the two-parameter case) for which the confidenceinterval or region is requited; see ‘Details’ for more information. |
trace | a logical value (default: |
... | optional graphical parameters passed to the plotting functions. |
Details
For each chosen point of the parameter(s) to be profiled, the log-likelihood is maximized with respect to the remaining parameters.The optimization process is accomplished using theoptimoptimization function, withmethod="BFGS". This step can be regulated bythe user viaopt.control which is passed tooptim ascontrol argument, apart from elementfnscale whose use isreserved.
If the originalfitted object included a fixed parameter value, this is kept fixed here. If the estimation method was"MPLE",that choice carries on here; in case the penalty function was user-defined,it must still be accessible.
For plotting purposes and also in the numerical output, the deviancefunctionD is used, namely
D = 2\left[\max(\log L) - \log L\right]
whereL denotes the likelihood.
The range ofparam.values must enclose the maximum (penalized)likelihood estimates (MLE orMPLE) by an adequate extent such that suitable confidence intervals or regions can be established fromstandard asymptotic theory. If this condition does not hold, the functionstill proceeds, but no confidence interval or region is delivered.For theSN family andDP parameterization, the asymptotictheory is actually non-standard near the important point\alpha=0, but the correspondence with the regular case of theCP parameterization, still allows to derive confidence regions using standardprocedures; for additional information, see Section 3.1.6 ofAzzalini and Capitanio (2014).When theMLE occurs on the frontier of the parameter space, a message is issued and no confidence interval is produced, while in thetwo-parameter case the plot is not labelled with probability values, but onlywith deviance levels.
Value
An invisible list whose components, described below, are partly different in the one- and the two-parameter cases.
call | the calling statement |
<param1> | values of the first parameter |
<param2> | values of the second parameter (in the two-parameter case) |
logLik | numeric vector or matrix of the profile log-likelihood values |
confint | in the one-parameter case, the confidence interval |
level | in the one-parameter case, the confidence level |
deviance.contour | in the two-parameter case, a list of lists whoseelements identify each curve of the contour plot |
Warnings
This function is experimental and changes in future versionsof the package may occur. Users should not rely on the persistence of the same user interface or the same name(s).
It is a known fact that, in some critical situations, peculiar outcomesare produced.
Author(s)
Adelchi Azzalini
References
Azzalini, A. with the collaboration of Capitanio, A. (2014).The Skew-Normal and Related Families. Cambridge University Press, IMS Monographs series.
See Also
makeSECdistr for theCP/DP parameterizations,
optim for itscontrol argument
Examples
data(ais, package="sn")m1 <- selm(log(Fe) ~ BMI + LBM, family = "sn", data = ais)pll <- profile(m1, "dp", param.name="alpha", param.val=c(-3,2))profile(m1, "cp", param.name="gamma1", param.val=seq(-0.7, 0.4, by=0.1))# in the next example, we reduce the grid points to save execution timepll <- profile(m1, "cp", param.name=c("(Intercept.CP)", "gamma1"), param.val = list(c(1.5, 4), c(-0.8, 0.5)), npt=c(11,11) )Residuals and fitted values fromselm fits
Description
residuals andfitted methods for classes"selm" and"mselm".
Usage
## S4 method for signature 'selm'residuals(object, param.type = "CP", ...)## S4 method for signature 'mselm'residuals(object, param.type = "CP", ...)## S4 method for signature 'selm'fitted(object, param.type = "CP", ...)## S4 method for signature 'mselm'fitted(object, param.type = "CP", ...)Arguments
object | an object of class |
param.type | a character string which indicates the required type of parameter type; possible values are |
... | not used, included for compatibility with the generic method. |
Value
a numeric vector (forselm-class objects) or a matrix(formselm-class objects).
Note
The possible options ofparam.type are described in thedocumentation ofdp2cp; their corresponding outcomes differ by an additive constant only. With the"CP" option (that is,the ‘centred parametrization’), the residuals are centred around 0, at least approximately; this is a reason for setting"CP" as the default option. For more information, see the ‘Note’ in the documentation ofsummary.selm.
Author(s)
Adelchi Azzalini
References
Azzalini, A. with the collaboration of Capitanio, A. (2014).The Skew-Normal and Related Families. Cambridge University Press, IMS Monographs series.
See Also
dp2cp,summary.selm,selm function,selm-class
Examples
data(wines, package="sn")m5 <- selm(acidity ~ phenols + wine, family="SN", data=wines)residuals(m5)residuals(m5, "dp")fitted(m5, "dp")#m12 <- selm(cbind(acidity, alcohol) ~ phenols + wine, family="SN", data=wines)residuals(m12)## see other examples at function selmStandard deviation
Description
Thesd function from thestats is replaced by a newmethod in order to introduce a separate method to deal with objects of classSECdistrUv. The functionsd.default is an alias of the originalfunctionsd.
Usage
sd(x, ...) ## Default S3 method:sd(x, na.rm = FALSE, ...)Arguments
x | a numeric vector, matrix or data frame. |
na.rm | logical. Should missing values be removed? |
... | further arguments passed to or from other methods. |
See Also
Fitting linear models with skew-elliptical error term
Description
Functionselm fits alinearmodelwithskew-elliptical error term. The term ‘skew-elliptical distribution’ is an abbreviated equivalent of skew-elliptically contoured (SEC) distribution.The function works for univariate and multivariate response variables.
Usage
selm(formula, family = "SN", data, weights, subset, na.action, start = NULL, fixed.param = list(), method = "MLE", penalty=NULL, model = TRUE, x = FALSE, y = FALSE, contrasts = NULL, offset, ...)Arguments
formula | an object of class |
family | a character string which selects the parametric familyofSEC type assumed for the error term. It must be one of |
data | an optional data frame containing the variables inthe model. If not found in |
weights | a numeric vector of weights associated to individualobservations. Weights are supposed to represent frequencies, hence must benon-negative integers (not all 0) and |
subset | an optional vector specifying a subset of observationsto be used in the fitting process. It works like the same parameterin |
na.action | a function which indicates what should happenwhen the data contain |
start | a vector (in the univariate case) or a list (in the multivariate case) of initialDP values for searching the parameter estimates. See ‘Details’ about a choice ofstart to be avoided. If |
fixed.param | a list of assignments of parameter values which mustbe kept fixed in the estimation process. Currently, there only two types of admissible constraint: one is toset |
method | a character string which selects the estimation method to beused for fitting. Currently, two options exist: |
penalty | a character string which denotes the penalty function to besubtracted to the log-likelihood function, when |
model,x,y | logicals. If |
contrasts | an optional list. See the |
offset | this can be used to specify ana priori knowncomponent to be included in the linear predictor during fitting. Thisshould be |
... | optional control parameters, as follows.
|
Details
By default,selm fits the selected model by maximumlikelihood estimation (MLE), making use of some numericaloptimization method. Maximization is performed in oneparameterization, usuallyDP, and then the estimates are mapped toother parameter sets,CP and pseudo-CP; seedp2cp for more information on parameterizations. These parameter transformations are carried out trasparently to the user. The observed information matrix is used to obtain the estimated variance matrix of theMLE's and from this the standard errors. Background information onMLE in the context ofSEC distributions is provided by Azzalini and Capitanio (2014); see specifically Chapter 3, Sections 4.3, 5.2, 6.2.5–6. For additionalinformation, see the original research work referenced therein as well asthe sources quoted below.
Although the density functionof SEC distributions are expressed usingDP parameter sets, the methods associated to the objects createdby this function communicate, by default, their outcomes in theCPparameter set, or its variant form pseudo-CP whenCPdoes not exist; the ‘Note’ atsummary.selm explains why. A more detailed discussion is provided by Azzalini and Capitanio (1999, Section 5.2) and Arellano-Valle and Azzalini (2008, Section 4), for the univariate and the multivariate SN case, respectively; an abriged account is available in Sections 3.1.4–6 and 5.2.3 of Azzalini and Capitanio (2014). For the ST case, see Arellano-Valle and Azzalini (2013).
There is a known open issue which affects computation of the informationmatrix of the multivariate skew-normal distribution when the slantparameter\alpha approaches the null vector; see p.149 ofAzzalini and Capitanio (2014). Consequently, if a model withmultivariate response is fitted withfamily="SN" and the estimatealpha of\alpha is at the origin or neary so, theinformation matrix and the standard errors are not computed and awarning message is issued. In this unusual circumstance, a simplework-around is to re-fit the model withfamily="ST", which willwork except in remote cases when (i) the estimated degrees of freedomnu diverge and (ii) stillalpha remains at the origin.
The optional argumentfixed.param=list(alpha=0) imposes theconstraint\alpha=0 in the estimation process; in the multivariate case, the expression is interpreted in the sense that all the components of vector\alpha are zero, which implies symmetry of theerror distribution, irrespectively of the parameterization subsequently adopted for summaries and diagnostics.When this restriction is selected, the estimation method cannot beset to"MPLE". Under the constraint\alpha=0,iffamily="SN", the model is fitted similarly tolm, exceptthat hereMLE is used for estimation of the covariance matrix. Iffamily="ST" orfamily="SC", a symmetric Student'st or Cauchy distribution is adopted.
Under the constraint\alpha=0, the location parameter\xicoincides with the mode and the mean of the distribution, when the latterexists. In addition, when the covariance matrix of aSTdistribution exists, it differs from\Omega only by a multiplicativefactor. Consequently, the summaries of a model of this sort automaticallyadopt theDP parametrization.
The other possible form of constraint allows to fix the degrees offreedom whenfamily="ST". The two constraints can be combined writing, for instance,fixed.param=list(alpha=0, nu=6).The constraintnu=1 is equivalent to selectfamily="SC".In practice, an expression of typefixed.param=list(..) can beabbreviated tofixed=list(..).
Since version 1.6.0, a new initialization procedure has been introducedfor the casefamily="ST", which adopts the method proposed by Azzalini & Salehi (2020), implemented in functionsst.prelimFit andmst.prelimFit.Correspondingly, thestart argument can now be of different type,namely a character with possible values"M0","M2" (detault in the univariate case) and"M3" (detault in the multivariate case).The choice"M0" selects the older method, in use prior to version1.6.0. For more information, see Azzalini & Salehi (2020).
In some cases, especially for small sample size, theMLE occurs onthe frontier of the parameter space, leading toDP estimates withabs(alpha)=Inf or to a similar situation in the multivariate case or in an alternative parameterization. Such outcome is regared by many asunsatisfactory; surely it prevents using the observed information matrix tocompute standard errors. This problem motivates the use of maximum penalizedlikelihood estimation (MPLE), where the regular log-likelihoodfunction\log~L is penalized by subtracting an amountQ, say, increasingly large as|\alpha| increases. Hence the function which is maximized at the optimization stage is now\log\,L~-~Q. Ifmethod="MPLE" andpenalty=NULL, the default functionQpenalty is used,which implements the penalization:
Q(\alpha) = c_1 \log(1 + c_2 \alpha_*^2)
wherec_1 andc_2 are positive constants, whichdepend on the degrees of freedomnu in theST case,
\alpha_*^2 = \alpha^\top \bar\Omega \alpha
and\bar\Omega denotes the correlation matrix associated to the scale matrixOmega described in connection withmakeSECdistr. In the univariate case\bar\Omega=1,so that\alpha_*^2=\alpha^2. Further information onMPLE and this choice of the penalty function is given in Section 3.1.8 and p.111 of Azzalini and Capitanio (2014); for a more detailed account, see Azzalini and Arellano-Valle (2013) and references therein.
It is possible to change the penalty function, to be declared via the argumentpenalty. For instance, if the calling statement includespenalty="anotherQ", the user must have defined
anotherQ <- function(alpha_etc, nu = NULL, der = 0)
with the following arguments.
alpha_etc: in the univariate case, a single valuealpha;in the multivariate case, a two-component list whose first component isthe vectoralpha, the second one is matrix equal tocov2cor(Omega).nu: degrees of freedom, only relevant iffamily="ST".der: a numeric value which indicates the required order ofderivation; ifder=0(default value), only the penaltyQneeds to be retuned by the function; ifder=1,attr(Q, "der1")must represent thefirst order derivative ofQwith respect toalpha; ifder=2, alsoattr(Q, "der2")must be assigned, containingthe second derivative (only required in the univariate case).
This function must return a single numeric value, possibly with requiredattributes when is called withder>1.Sincesn imports functionsgrad andhessian from packagenumDeriv, one can rely on them for numerical evaluation of the derivatives, if they are not available in an explicit form.
This penalization scheme allows to introduce a prior distribution\pi for\alpha by settingQ=-\log\pi, leading to a maximuma posteriori estimate in the stated sense. SeeQpenalty for more information and an illustration.
The actual computations are not performed withinselm which only sets-up ingredients for work ofselm.fit and other functionsfurther below this one. Seeselm.fit for more information.
Value
an S4 object of classselm ormselm, depending on whetherthe response variable of the fitted model is univariate or multivariate;these objects are described in theselm class.
Cautionary notes
The first of these notes applies to the stagepreceding theuse ofselm and related fitting procedures. Before fitting a model ofthis sort, consider whether you have enough data for this task. In this respect, the passage below taken from p.63 of Azzalini and Capitanio (2014) is relevant.
“Before entering technical aspects, it is advisable to underline a qualitative effect of working with a parametric family which effectively is regulated by moments up to the third order. The implication is that the traditional rule of thumb by which a sample size is small up to ‘aboutn = 30’, and then starts to become ‘large’,while sensible for a normal population or other two-parameter distribution, is not really appropriate here. To give an indication of a new threshold is especially difficult, because the value of\alpha also has a role here. Under thiscaveat, numerical experience suggests that ‘aboutn = 50’ may be a more appropriate guideline in this context.”
The above passage referred to the univariate SN context. In the multivariate case, increase the sample size appropriately, especially so with theST family.This is not to say that one cannot attempt fitting these models with small or moderate sample size. However, one must be aware of the implications and not be surprised if problems appear.
The second cautionary note refers instead to the outcome of a call toselm and related function, or the lack of it.The estimates are obtained by numerical optimization methods and, asusual in similar cases, there is no guarantee that the maximum of theobjective function is achieved. Consideration of model simplicityand of numerical experience indicate that models withSN errorterms generally produce more reliable results compared to those with theST family. Take into account that models involving a traditional Student'st distribution with unknown degrees of freedom can already be problematic; the presence of the (multivariate) slant parameter\alpha in theST family cannot make things any simpler. Consequently, care must be exercised, especially so if one works with the (multivariate)ST family. Consider re-fitting a model with different starting values and, in theST case, building the profile log-likelihood for a range of\nu values; functionprofile.selm can be useful here.
Details on the numerical optimization which has produced objectobj can be extracted withslot(obj, "opt.method"); inspection of thiscomponent can be useful in problematic cases.# Be aware that occasionallyoptim andnlminb declare successful# completion of a regular minimization problem at a point where the Hessian # matrix is not positive-definite.
Author(s)
Adelchi Azzalini
References
Arellano-Valle, R. B., and Azzalini, A. (2008).The centred parametrization for the multivariate skew-normal distribution.J. Multiv. Anal.99, 1362–1382.Corrigendum:100 (2009), 816.
Arellano-Valle, R. B., and Azzalini, A. (2013, available online 12 June 2011).The centred parametrization and related quantities for the skew-t distribution.J. Multiv. Anal.113, 73–90.
Azzalini, A. and Capitanio, A. (1999).Statistical applications of the multivariate skew normal distribution.J.Roy.Statist.Soc. B61, 579–602. Full-length version available athttps://arXiv.org/abs/0911.2093
Azzalini, A. and Arellano-Valle, R. B. (2013, available online 30 June 2012). Maximum penalized likelihood estimation for skew-normal and skew-t distributions.J. Stat. Planning & Inference143, 419–433.
Azzalini, A. with the collaboration of Capitanio, A. (2014).The Skew-Normal and Related Families. Cambridge University Press, IMS Monographs series.
Azzalini, A. and Salehi, M. (2020).Some computational aspects of maximum likelihood estimation of the skew-t distribution. InComputational and Methodological Statistics and Biostatistics, edited by A. Bekker, Ding-Geng Chen and Johannes T. Ferreira, pp.3-28. Springer Nature Switzerland.
See Also
selm-classfor classes"selm"and"mselm",summary.selmfor summaries,plot.selmfor plots,residuals.selmfor residuals and fitted valuesthe generic functions
coef,logLik,vcov,profile,confint,predictthe underlying function
selm.fitand those further downthe selection of a penalty function of the log-likelihood, such as
Qpenaltythe function
extractSECdistrto extract theSECerror distribution from an object returned byselmthe broad underlying logic and a number of ingredients are like in function
lm
Examples
data(ais)m1 <- selm(log(Fe) ~ BMI + LBM, family="SN", data=ais)print(m1)summary(m1)s <- summary(m1, "DP", cov=TRUE, cor=TRUE)plot(m1)plot(m1, param.type="DP")logLik(m1)coef(m1)coef(m1, "DP")var <- vcov(m1)#m1a <- selm(log(Fe) ~ BMI + LBM, family="SN", method="MPLE", data=ais)m1b <- selm(log(Fe) ~ BMI + LBM, family="ST", fixed.param=list(nu=8), data=ais)#data(barolo)attach(barolo)A75 <- (reseller=="A" & volume==75)logPrice <- log(price[A75],10) m <- selm(logPrice ~ 1, family="ST", opt.method="Nelder-Mead")summary(m)summary(m, "DP")plot(m, which=2, col=4, main="Barolo log10(price)")# cfr Figure 4.7 of Azzalini & Capitanio (2014), p.107detach(barolo)#-----# examples with multivariate response#m3 <- selm(cbind(BMI, LBM) ~ WCC + RCC, family="SN", data=ais)plot(m3, col=2, which=2)summary(m3, "dp")coef(m3)coef(m3, vector=FALSE)#data(wines)m28 <- selm(cbind(chloride, glycerol, magnesium) ~ 1, family="ST", data=wines)dp28 <- coef(m28, "DP", vector=FALSE) pcp28 <- coef(m28, "pseudo-CP", vector=FALSE) # the next statement takes a little more time than othersplot(m28)#m4 <- selm(cbind(alcohol,sugar)~1, family="ST", data=wines)m5 <- selm(cbind(alcohol,sugar)~1, family="ST", data=wines, fixed=list(alpha=0))print(1 - pchisq(2*as.numeric(logLik(m4)-logLik(m5)), 2)) # test for symmetryClassesselm andmselm of objects created by functionselm
Description
A successful call to functionselm creates an object ofeither of these classes, having a structure described in section‘Slots’. A set of methods for these classes of objects exist, listed insection ‘Methods’.
Objects from the class
An object can be created by a successful call to functionselm.
Slots
call:the calling statement.
family:the parametric family of skew-ellitically contoured distributed (SEC) type.
logL:log-likelihood or penalized log-likelihood valueachieved at the end of the maximization process.
method:estimation method (
"MLE"or"MPLE").param:estimated parameters, for various parameterizations.
param.var:approximate variance matrices of the parameter estimates, for various parameterizations.
size:a numeric vector with size of various components.
fixed.param:a vector of parameters which have been keptfixed in the fitting process, if any.
residuals.dp:residual values, for DP-type parameters.
fitted.values.dp:fitted values, for DP-type parameters.
control:a list with control parameters.
input:a list of selected input values.
opt.method:a list with details on the optimization method.
Methods
coef | signature(object = "selm"): ... |
logLik | signature(object = "selm"): ... |
plot | signature(x = "selm"): ... |
show | signature(object = "selm"): ... |
summary | signature(object = "selm"): ... |
residuals | signature(object = "selm"): ... |
fitted | signature(object = "selm"): ... |
vcov | signature(object = "selm"): ... |
weights | signature(object = "selm"): ... |
profile | signature(fitted = "selm"): ... |
confint | signature(object = "selm"): ... |
predict | signature(object = "selm"): ... |
coef | signature(object = "mselm"): ... |
logLik | signature(object = "mselm"): ... |
plot | signature(x = "mselm"): ... |
show | signature(object = "mselm"): ... |
summary | signature(object = "mselm"): ... |
residuals | signature(object = "mselm"): ... |
fitted | signature(object = "mselm"): ... |
vcov | signature(object = "mselm"): ... |
weights | signature(object = "mselm"): ... |
Note
Seedp2cp for a description of possible parameter sets.WhenlogLik is used on an object obtained using the MPLE estimationmethod, the value reported is actually thepenalized log-likelihood.
Author(s)
Adelchi Azzalini
See Also
See alsoselm function,plot.selm,summary.selm,dp2cp
Examples
data(ais)m1 <- selm(log(Fe) ~ BMI + LBM, family="SN", data=ais)summary(m1)plot(m1)logLik(m1)res <- residuals(m1)fv <- fitted(m1)# data(wines, package="sn")m2 <- selm(alcohol ~ malic + phenols, data=wines)#m12 <- selm(cbind(acidity, alcohol) ~ phenols + wine, family="SN", data=wines)coef(m12)cp <- coef(m12, vector=FALSE)dp <- coef(m12, "DP", vector=FALSE)plot(m12)plot(m12, which=2, col="gray60", pch=20)Fitting functions forselm models
Description
A call toselm activates a call toselm.fit andfrom here to some other function which actually performs the parametersearch, among those listed below. These lower-level functions can be called directly for increased efficiency, at the expense of some more programming effort and lack of methods for the returned object.
Usage
selm.fit(x, y, family = "SN", start = NULL, w, fixed.param = list(), offset = NULL, selm.control=list())sn.mple(x, y, cp = NULL, w, penalty = NULL, trace = FALSE, opt.method = c("nlminb", "Nelder-Mead", "BFGS", "CG", "SANN"), control = list()) st.mple(x, y, dp = NULL, w, fixed.nu = NULL, symmetr = FALSE, penalty = NULL, trace = FALSE, opt.method = c("nlminb", "Nelder-Mead", "BFGS", "CG", "SANN"), control = list()) msn.mle(x, y, start = NULL, w, trace = FALSE, opt.method = c("nlminb", "Nelder-Mead", "BFGS", "CG", "SANN"), control = list())msn.mple(x, y, start = NULL, w, trace = FALSE, penalty = NULL, opt.method = c("nlminb", "Nelder-Mead", "BFGS", "CG", "SANN"), control = list()) mst.mple(x, y, start = NULL, w, fixed.nu = NULL, symmetr=FALSE, penalty = NULL, trace = FALSE, opt.method = c("nlminb", "Nelder-Mead", "BFGS", "CG", "SANN"), control = list())Arguments
x | a full-rank design matrix with the first column of all 1's. |
y | a vector or a matrix of response values such that |
family | a character string which selects the parametric family ofdistributions assumed for the error term of the regression model. It must one of |
start,dp,cp | a vector or a list of initial parameter values,depeding whether |
w | a vector of non-negative integer weights of length equal to |
fixed.param | a list of assignments of parameter values to be keptfixed during the optimization process. Currently, there is only one suchoption, namely |
penalty | an optional character string with the name of the penaltyfunction of the log-likelihood; default value |
offset | this can be used to specify ana priori knowncomponent to be included in the linear predictor during fitting. Thisshould be |
trace | a logical value which regulates printing of successive calls to the target function; default value is |
fixed.nu | a positive value to keep fixed the parameter |
symmetr | a logical flag indicating whether a contraint of symmetry isimposed on the slant parameter; default is |
opt.method | a character string which selects the optimization methodwithin the set |
selm.control | a list whose components regulate the working of |
control | a list of control items passed to the optimization function. |
Details
A call toselm produces a call toselm.fit whichselects the appropriate function amongsn.mple,st.mple,msn.mle,msn.mple,mst.mple, depending on thearguments of the calling statement. In the adopted scheme for function names,msn refers to a multivariate skew-normal distribution andmst refers to a multivariate skew-t distribution, whilemle andmple refers to maximum likelihood and maximumpenalized likelihood estimation, respectively.Of these functions,sn.mple works inCP space; the othersin theDP space. In all cases, a correspondig mapping to the alternative parameter space is performed before exitingselm.fit,in addition to the selected parameter set.
The components ofselm.control are as follows:
method: the estimation method,"MLE"or"MPLE".penalty: a string with the name of the penalty function.info.type: a string with the name of the information matrix,"observed"or"expected"; currently fixed at "observed".opt.method: a character string which selects the optimizationmethod.opt.control: a list of control parameters ofopt.method.
Functionmsn.mle, forMLE estimation of linear models withSN errors, is unchanged from version 0.4-x of the package. Functionmsn.mple is similar tomsn.mle but allows to introducea penalization of the log-likelihood; whenpenalty=NULL, a call tomsn.mle is more efficient.Functionssn.mple andmst.mple work likesn.mle andmst.mle in version 0.4-x if the argumentpenalty is not set or it is set toNULL, except thatmst.mple does nothandle a univariate response (usest.mple for that).
Value
A list whose specific components depend on the named function.Typical components are:
call | the calling statement |
dp | vector or list of estimatedDP parameters |
cp | vector or list of estimatedCP parameters |
logL | the maximized (penalized) log-likelihood |
aux | a list with auxiliary output values, depending on the function |
opt.method | a list produced by the numerical |
Background
Computational aspects of maximum likelihood estimation for univariateSN distributions are discussed in Section 3.1.7 of Azzalini andCapitanio (2014). The working ofsn.mple follows these lines; maximization is performed in theCP space. All other functionsoperate on theDP space.
The technique underlyingmsn.mle is based on a partial analytical maximization, leading implicitly to a form of profile log-likelihood.This scheme is formulated in detail in Section 6.1 of Azzalini and Capitanio (1999) and summarized in Section 5.2.1 of Azzalini and Capitanio (2014). The same procedure is not feasible when one adoptsMPLE; hence functionmsn.mple has to maximize over a larger parameter space.
When the SN family is fitted with the constraintalpha=0, this amountsto adopt a classical linear model with Gaussian distributional assumption.The correspondingMLE's are the same as those produced bylm, except that the denominator the of theMLE variance (matrix) has the ‘uncorrected’ form.In the multivariate case, the covariance matrix ofMLE is computed using expression (10) in Section 15.8 of Magnus and Neudecker (2007).
Maximization of the univariateST log-likelihood is speeded-up by using the expressions of the gradient given by DiCiccio and Monti (2011),reproduced with inessential variants in Section 4.3.3 of Azzalini and Capitanio (2014).
The working ofmst.mple is based on a re-parameterization described in Section 5.1 of Azzalini and Capitanio (2003). The expressions of the corresponding log-likelihood derivatives are given in Appendix B of the full version of the paper.
Author(s)
Adelchi Azzalini
References
Azzalini, A. and Capitanio, A. (1999).Statistical applications of the multivariate skew normal distribution.J.Roy.Statist.Soc. B61, 579–602. Full-length version available athttps://arXiv.org/abs/0911.2093
Azzalini, A. and Capitanio, A. (2003).Distributions generated by perturbation of symmetry with emphasis ona multivariate skewt distribution.J.Roy. Statist. Soc. B65, 367–389.Full-length version available athttps://arXiv.org/abs/0911.2342
Azzalini, A. with the collaboration of Capitanio, A. (2014).The Skew-Normal and Related Families. Cambridge University Press, IMS Monographs series.
DiCiccio, T. J. and Monti, A. C. (2011). Inferential aspects of the skewt-distribution.Quaderni di Statistica13, 1–21.
Magnus, J. R. and Neudecker, H. (2007).Matrix Differential Calculus with Applications in Statistics and Econometrics, third edition. John Wiley & Sons.
See Also
selm for a comprehensive higher level fitting function,Qpenalty for specification of a penalty function
Examples
data(wines, package="sn")X <- model.matrix(~ phenols + wine, data=wines)fit <- msn.mle(x=X, y=cbind(wines$acidity, wines$alcohol), opt.method="BFGS")fit <- st.mple(x=X, y = wines$acidity, fixed.nu=4, penalty="Qpenalty")Cumulants of univariate skew-normal and skew-t distributions
Description
Compute cumulants of univariate (extended) skew-normal and skew-t distributions up to a given order.
Usage
sn.cumulants(xi=0, omega=1, alpha=0, tau=0, dp=NULL, n=4) st.cumulants(xi=0, omega=1, alpha=0, nu=Inf, dp=NULL, n=4)Arguments
xi | location parameters (numeric vector). |
omega | scale parameters (numeric vector, positive). |
alpha | slant parameters (numeric vector). |
tau | hidden mean parameter (numeric scalar). |
nu | degrees of freedom (numeric scalar, positive); the default valueis |
dp | a vector containing the appropriate set of parameters. If |
n | maximal order of the cumulants. For |
Value
A vector of lengthn or a matrix withn columns, in case the input values are vectors.
Background
See Sections 2.1.4, 2.2.3 and 4.3.1 of the reference below
Author(s)
Adelchi Azzalini
References
Azzalini, A. with the collaboration of Capitanio, A. (2014).The Skew-Normal and Related Families. Cambridge University Press, IMS Monographs series.
See Also
Examples
sn.cumulants(omega=2, alpha=c(0, 3, 5, 10), n=5)sn.cumulants(dp=c(0, 3, -8), n=6)st.cumulants(dp=c(0, 3, -8, 5), n=6) # only four of them are computedst.cumulants(dp=c(0, 3, -8, 3))Expected and observed Fisher information forSN andST distributions
Description
Computes Fisher information for parameters of simple sample havingskew-normal (SN) or skew-t (ST) distribution orfor a regression model with errors term having such distributions, in theDP andCP parametrizations.
Usage
sn.infoUv(dp=NULL, cp=NULL, x=NULL, y, w, penalty=NULL, norm2.tol=1e-06) sn.infoMv(dp, x=NULL, y, w, penalty=NULL, norm2.tol=1e-06, at.MLE=TRUE)st.infoUv(dp = NULL, cp = NULL, x = NULL, y, w, fixed.nu = NULL, symmetr = FALSE, penalty = NULL, norm2.tol = 1e-06) st.infoMv(dp, x = NULL, y, w, fixed.nu = NULL, symmetr = FALSE, penalty = NULL, norm2.tol = 1e-06)Arguments
dp,cp | direct or centred parameters, respectively; one of themcan be a non- |
x | an optional matrix which represents the design matrix of a regression model |
y | a numeric vector (for |
w | an optional vector of weights (only meaningful for the observed information, hence if |
fixed.nu | an optional numeric value which declares a fixed value of thedegrees of freedom, |
symmetr | a logical flag which indicates whether a symmetry condition of the distribution is being imposed; default is |
penalty | a optional character string with the name of the penalty function used in the call to |
norm2.tol | for the observed information case, the Mahalanobis squareddistance of the score function from 0 is evaluated; if it exceeds |
at.MLE | a logical flag; if |
Value
a list containing the following components:
dp,cp | one of the two arguments is the one supplied on input; the other one matches the previous one in the alternative parametrization. |
type | the type of information matrix: "observed" or "expected". |
info.dp,info.cp | matrices of Fisher (observed or expected) information in the two parametrizations. |
asyvar.dp,asyvar.cp | inverse matrices of Fisher information in the twoparametrizations, when available; See ‘Details’ for additionalinformation. |
aux | a list containing auxiliary elements, depending of the selected function and the type of computation. |
Details
If the observed information matrix is required,dp orcp shouldrepresent the maximum likelihood estimates (MLE) for the giveny,otherwise the information matrix may fail to be positive-definite and itwould be meaningless anyway. Therefore, the squared Mahalobis norm of the score vector is evaluated and compared withnorm2.tol. If it exceeds this threshold, this is taken as an indication that the suppliedparameter list is not at theMLE and a warning message is issued.The returned list still includesinfo.dp andinfo.cp, but in this case these represent merely the matrices of second derivatives;asyvar.dp andasyvar.cp are set toNULL.
Background
The information matrix for the the univariateSN distribution inthe two stated parameterizations in discussed in Sections 3.1.3–4 of Azzalini and Capitanio (2014). For the multivariate distribution, Section 5.2.2 of this monograph summarizes briefly the findings of Arellano-Valle and Azzalini (2008).
References
Arellano-Valle, R. B. (2010).The information matrix of the multivariate skew-t distribution.Metron,LXVIII, 371–386.
Arellano-Valle, R. B., and Azzalini, A. (2008).The centred parametrization for the multivariate skew-normal distribution.J. Multiv. Anal.99, 1362–1382.Corrigendum:100 (2009), 816.
Azzalini, A. with the collaboration of Capitanio, A. (2014).The Skew-Normal and Related Families. Cambridge University Press, IMS Monographs series.
DiCiccio, T. J. and Monti, A. C. (2011). Inferential aspects of the skewt-distribution.Quaderni di Statistica13, 1–21.
See Also
Examples
infoE <- sn.infoUv(dp=c(0,1,5)) # expected informationset.seed(1); rnd <- rsn(100, dp=c(0, 1, 3))fit <- selm(rnd~1, family="SN")infoO <- sn.infoUv(cp=coef(fit), y=rnd) # observed information#data(wines)X <- model.matrix(~ pH + wine, data=wines)fit <- sn.mple(x=X, y=wines$alcohol)infoE <- sn.infoUv(cp=fit$cp, x=X)infoO <- sn.infoUv(cp=fit$cp, x=X, y=wines$alcohol)Spreading grouped data over intervals
Description
Assuming thatcounts represents the frequencies of observationsfalling into intervals identified bybreaks, the function returnsa vector of values obtained by uniformly spreading each group of data over the pertaining interval.
Usage
spread.grouped(breaks, counts, shift = "centre")Arguments
breaks | A numeric vector of strictly increasing finite values which identify a set of contiguous intervals on the real line. |
counts | A vector of non-negative integers representing the number of observations falling in the intervals specified by |
shift | a character string which regulates the positioning of the constructed points within a given interval, with possible values |
Value
A numeric vector of lengthsum(counts) of values withinrange(breaks).
Author(s)
Adelchi Azzalini
See Also
fitdistr.grouped
Examples
breaks <- c(10, 12, 15, 20)counts <- c(3, 2, 4)spread.grouped(breaks, counts)spread.grouped(breaks, counts, "l")Compute preliminary estimates for a linear model with ST-distributed error term
Description
For a univariate or multivariate linear model where the error termis assumed to have skew-t (ST) distribution and the location parameteris a linear function of a set of explanatory values, the functions computepreliminary estimates to be used as initial values for a subsequentmaximization of the likelihood function.These functions are mainly intended for internal package use.
Usage
st.prelimFit(x, y, w, quick = TRUE, verbose = 0, max.nu = 30, SN=FALSE)mst.prelimFit(x, y, w, quick = TRUE, verbose = 0, max.nu = 30, SN=FALSE)Arguments
x | design matrix of numeric values. It may be missing;if present, the first column must contain all 1's. |
y | vector of observations of length |
w | a vector of non-negative integer weights of length |
quick | logical value which regulates whether a very quick estimate is produced (default value |
verbose | an integer value which regulates the amount of messagesprinted out; default value is 0. |
max.nu | threshold for the estimated degrees of freedom |
SN | logical value (default value: |
Details
The underlying methodology is the one presented by Azzalini and Salehi (2020).In its essence, it is based on the selection of parameter values achieving the best matching between certain quantile-based summaries of the ST distributionand the corresponding empirical quantities for the sample or, in the presence of explanatory variables, the same quantities computed from the residuals after fitting a median regression.
Argumentquick selects whether the above-described matching is performedin a quick or in an accurate way. Since the output values of this function are intended to be only initial values for subsequent likelihood maximization,this explains the default optionquick=TRUE. Other possible valuesareFALSE andNULL; the latter simply setsalpha=0andnu=10.
Since the methodology hinges on some selected sample quantiles, it can occasionally be spoiled by poor behaviour of these basic quantiles,especially for small or moderate sample sizes.The more visible effect of such situation is a very large value of theestimated degrees of freedom, which then hampers rather than help asubsequent likelihood maximization. It is therefore appropriate to setan upper limitmax.nu to this component.
Argumentx may be missing. In this case, a one-column matrix withall elements 1 is created.
Value
A call tost.prelimFit returns a list with these components:
dp | a vector of estimates in the DP parameterization |
residuals | a vector of residual values |
logLik | the corresponding log-likelihood value |
A call tomst.prelimFit returns a list with these components:
dp | a list with the estimates in the DP parameterization |
shrink.steps | the number of shrinking steps applied to the original estimate of the scale matrix to obtain an admissible matrix |
dp.matrix | a numeric matrix formed by the component-wise DP estimates |
logLik | the corresponding log-likelihood value |
Note
These functions are mainly intended to be called byselm,but they could be of interest for people developing their own procedures.
Author(s)
Adelchi Azzalini
References
Azzalini, A. and Salehi, M. (2020).Some computational aspects of maximum likelihood estimation of the skew-t distribution.In:Computational and Methodological Statistics and Biostatistics,edited by Andriëtte Bekker, Ding-Geng Chen and Johannes T. Ferreira.Springer. DOI: 10.1007/978-3-030-42196-0
See Also
selm and eitherdst ordmst for explanation of the DP parameters
Examples
data(barolo)attach(barolo) A75 <- (reseller=="A" & volume==75) log.price <- log(price[A75], 10)prelimFit <- st.prelimFit(y=log.price)detach(barolo)# data(ais)attach(ais)prelim32 <- mst.prelimFit(y=cbind(BMI, LBM), x=cbind(1, Ht, Wt))detach(ais)Summary of aSEC distribution object
Description
Produce a summary of an object of class either"SECdistrUv" or"SECdistrMv", which refer to a univariate or amultivariateSEC distribution, respectively. Both types ofobjects can be produced bymakeSECditr.
Usage
## S4 method for signature 'SECdistrUv'summary(object, cp.type = "auto", probs)## S4 method for signature 'SECdistrMv'summary(object, cp.type = "auto")Arguments
object | an object of class |
cp.type | a character string to select the required variance ofCP parameterization; possible values are |
probs | in the univariate case, a vector of probabilities for which the corresponding quantiles are required. If missing, the vector |
Details
For a description of theDP,CP and pseudo-CP parameter sets included in the returned object, seedp2cp.
Theaux slot of the returned object includes other summary quantities,as described next. In the univariate case, the reported quantile-based measures of skewness andkurtosis refer to the Bowley and Moors measures, respectively;see Groeneveld (2006) and Moors (1988) for their specifications.In the multivariate case, the Mardia's measures of skewness and kurtosisare computed from the expressions given on p.153 and p.178 ofAzzalini and Capitanio (2014).
In the univariate case,delta is a simple transformation of the slant parameteralpha; it takes values in(-1, 1). In the multivariate case,delta is a vector with componentsof similar type; they correspond to the matching terms of the univariate components. Thealpha* anddelta* coefficients are univariatecomprehensive summary quantities of slant; see pp.132-3 of Azzalini and Capitanio (2014) for their expressions. These quantitiesplay an important role inSEC distributions; for instance,the Mardia's measures of multivariare skewness and kurtosis dependon the vector of slant parameters only viadelta* or, equivalently,viaalpha*.
The mode, which is unique for all these distributions, is computed by a numerical line search between theDP location and theCPlocation (or the pseudo-DP location, when the latter doesexists). This line search is univariate also in the multivariate case, using Propositions 5.14 and 6.2 of Azzalini and Capitanio (2014);see also Problem 5.14.
The examples below illustrate how extract various components fromaux and other slots of the returned object.
Value
A list with the following components:
family | name of the family within theSEC class, character |
dp | |
name | the name of the distribution, character string |
compNames | in the multivariate case the names of the components,a character vector |
cp | |
cp.type | the name of the selected variant of theCP set |
aux | a list with auxiliary ingredients (mode, coefficients of skewness and kurtosis, in the parametric and non-parametric variants, and more); see Section ‘Details’ for more information. |
The list itemsdp andcp are vectors ifclass(object) isSECdistrUv (univariate distribution); they are lists ifclass(object) isSECdistrMv (multivariate distribution).
Author(s)
Adelchi Azzalini
References
Azzalini, A. with the collaboration of Capitanio, A. (2014).The Skew-Normal and Related Families. Cambridge University Press, IMS Monographs series.
Moors, I. J. A. (1988). A quantile alternative for kurtosis.The Statistician37, 25-32.
Groeneveld, R. A. (2006). Skewness, Bowley's measures of. In volume12, 7771-3, ofEncyclopedia of Statistical Sciences, 2nd edition, edited by Kotz et al. Wiley, New York.
See Also
makeSECdistr for building aSEC distribution
extractSECdistr for extracting aSECdistribution from aselm fit
methodsmean andsd for computing the mean and the standard deviation ofSECdistrUv-class objects,methodsmean andvcov for computing the mean vector and the variance matrix ofSECdistrMv-class objects
modeSECdistr for computing the mode directly
Examples
f3 <- makeSECdistr(dp=c(3,2,5), family="SC")summary(f3)s <- summary(f3, probs=(1:9)/10)print(slotNames(s)) print(names(slot(s,"aux"))) # the components of the 'aux' slotslot(s, "aux")$mode # the same of modeSECdistr(object=f3)slot(s, "aux")$q.measures # quantile-based measures of skewness and kurtosis#dp3 <- list(xi=1:3, Omega=toeplitz(1/(1:3)), alpha=c(-3, 8, 5), nu=6)st3 <- makeSECdistr(dp=dp3, family="ST", name="ST3", compNames=c("U", "V", "W"))s <- summary(st3)dp <- slot(s, "dp") # the same of slot(st3, "dp")slot(s, "cp")$var.cov # the same of vcov(st3)slot(s, "aux")$delta.star # comprehensive coefficient of shapeslot(s, "aux")$mardia # Mardia's measures of skewness and kurtosis#dp2 <- list(xi=rep(0,2), Omega=matrix(c(2,2,2,4),2,2), alpha=c(3,-5), tau=-1)esn2 <- makeSECdistr(dp=dp2, family="ESN", name="ESN-2d")summary(esn2)Classessummary.SECdistrMv andsummary.SECdistrUv
Description
Summaries of objects of classesSECdistrMv andSECdistrUv
Objects from the Class
Objects can be created by calls of typesummary(object) whenobject is of class either"SECdistrMv" or"SECdistrUv".
Slots
family:A character string which representsthe parametric family ofSEC type
dp:Object of class
"list"or"vector"for"SECdistrMv"and"SECdistrUv", respectivelyname:Object of class
"character"with the name of distributioncompNames:For
"SECdistrMv"objects, a character vector with names of the components of the multivariate distributioncp:Object of class
"list"or"vector"for"SECdistrMv"and"SECdistrUv", respectivelycp.type:a character string of theCP version
aux:A list of auxiliary quantities
Methods
- show
signature(object = "summary.SECdistrMv"): ...- show
signature(object = "summary.SECdistrUv"): ...
Author(s)
Adelchi Azzalini
See Also
summary.SECdistrMv,summary.SECdistrUv,
Summary of aSUN distribution object
Description
Produce a summary of an object of class"SUNdistr"
Usage
## S4 method for signature 'SUNdistr'summary(object, ...)Arguments
object | an object of class |
... | optional arguments passed to |
Value
An S4-object with the following slots:
dp | the parameters of the distrbution, a list |
name | the name of the distribution, a character string |
compNames | the names of the components, a character vector |
HcompNames | the names of the hidden components, a character vector |
mean | the mean value, a vector |
var.cov | the variance-covariance matrix |
gamma1 | the marginal indices of asymmetry, a vector |
cum3 | the third order cumulants, a three-dimensional array |
mardia | the Mardia's measures of multivariate asymmetry andskewness, a vector of length two |
Author(s)
Adelchi Azzalini
References
Arellano-Valle, R. B. and Azzalini, A. (2021).Some properties of the unified skew-normal distribution.Statistical Papers,doi:10.1007/s00362-021-01235-2andarXiv:2011.06316
Azzalini, A. with the collaboration of Capitanio, A. (2014).The Skew-Normal and Related Families. Cambridge University Press, IMS Monographs series.
See Also
makeSUNdistr for building aSUN distribution object
methodsmean andvcov for computing the mean vector and the variance matrix ofSUNdistr-class objects
Examples
Omega <- matrix(c(5, 1, 1, 6), 2, 2)Delta <- matrix(c(0.30, 0.50, 0.50, 0.85), 2, 2, byrow=TRUE)Gamma <- matrix(c(1, 0.18, 0.18, 1), 2, 2)tau <- c(0.4, -0.8)dp2 <- list(x=c(1, 0), Omega=Omega, Delta=Delta, tau=tau, Gamma=Gamma)sun2 <- makeSUNdistr(dp=dp2, name="SUN2", compNames=c("u", "v"))s <- summary(sun2)Classsummary.SUNdistr
Description
Summaries of objects of classesSUNdistr
Objects from the Class
Objects can be created by calls of typesummary(object) whenobject is of class"SUNdistr".
Slots
dp:a list of parameters
- name
the name of the distribution, a character string
- compNames
the names of the components, a character vector
- HcompNames
the names of the hidden components, a character vector
- mean
the mean value, a vector
- var.cov
the variance-covariance matrix
- gamma1
the marginal indices of asymmetry, a vector
- cum3
the third order cumulants, a three-dimensional array
- mardia
the Mardia's measures of multivariate asymmetry andskewness, a vector of length two
Methods
- show
signature(object = "summary.SUNdistr"): ...
Author(s)
Adelchi Azzalini
See Also
Summarizingselm fits
Description
summary method for class"selm" and"mselm".
Usage
## S4 method for signature 'selm'summary(object, param.type = "CP", cov = FALSE, cor = FALSE)## S4 method for signature 'mselm'summary(object, param.type = "CP", cov = FALSE, cor = FALSE)Arguments
object | an object of class |
param.type | a character string which indicates the required type of parameter type; possible values are |
cov | a logical value, to indicate if an estimate of the variance and covariance matrix of the estimates is required (default: |
cor | a logical value, to indicate if an estimate of the correlation matrix of the estimates is required (default: |
Value
An S4 object of classsummary.selm with 12 slots.
call: | the calling statement. |
family: | the parametric family of skew-ellitically contoured distributed (SEC) type. |
logL: | the maximized log-likelihood or penalized log-likelihood value |
method: | estimation method ( |
param.type: | a characer string with the chosen parameter set. |
param.table: | table of parameters, std.errors and z-values |
fixed.param: | a list of fixed parameter values |
resid: | residual values |
control: | a list with control parameters |
aux: | a list of auxiliary quantities |
size: | a numeric vector with various lengths and dimensions |
boundary: | a logical value which indicates whether the estimates are on the boundary of the parameter space |
Note
There are two reasons why the default choice ofparam.type isCP. One is the the easier interpretation of cumulant-based quantitiessuch as mean value, standard deviation, coefficient of skewness.
The other reason is more technical and applies only to cases when the estimate of the slant parameteralpha of theSN distribution is close to the origin: standard asymptotic distribution theory of maximum likelihood estimates (MLE's) does not apply in this case and thecorresponding standard errors are not trustworthy. The problem is especialy severe at\alpha=0 but to some extent propagates to its vicinity. Ifd=1, adoption ofCP leads to MLE's with regular asymptotic distribution across the parameter space, including\alpha=0. Ford>1 and\alpha=0, the problem is still unsolved at the present time, which is the reason whyselm issues a warningmessage when the MLE is in the vicinity of\alpha=0; see ‘Details’ ofselm. For background information, see Sections 3.1.4–6 and 5.2.3 of Azzalini and Capitanio (2014) and references therein.
This problem does not occur with the theSC and theSTdistribution (unless its tail-weight parameternu diverges, that is, when we are effectively approaching theSN case).
Author(s)
Adelchi Azzalini
References
Azzalini, A. with the collaboration of Capitanio, A. (2014).The Skew-Normal and Related Families. Cambridge University Press, IMS Monographs series.
See Also
selm function,selm (andmselm) class,plot.selm,dp2cp
Examples
data(wines, package="sn")m5 <- selm(acidity ~ phenols + wine, family="SN", data=wines)summary(m5)summary(m5, "dp")s5 <- summary(m5, "dp", cor=TRUE, cov=TRUE)dp.cor <- slot(s5, "aux")$param.corcov2cor(vcov(m5, "dp")) # the same## m6 <- selm(acidity ~ phenols + wine, family="ST", data=wines) # boundary!?#m12 <- selm(cbind(acidity, alcohol) ~ phenols + wine, family="SN", data=wines)s12 <- summary(m12)coef(m12, 'dp')coef(m12, "dp", vector=FALSE)## see other examples at function selmSymmetry-modulated distributions
Description
Symmetry-modulated distributions, univariate and multivariate,AKA skew-symmetric distributions
Usage
dSymmModulated(x, xi=0, omega=1, f0, G0, w, par.f0, par.G0, odd="check", log=FALSE, ...)rSymmModulated(n=1, xi=0, omega=1, f0, G0, w, par.f0, par.G0, odd="check", ...) dmSymmModulated(x, xi, Omega, f0, G0, w, par.f0, par.G0, odd="check", log=FALSE, ...) rmSymmModulated(n=1, xi, Omega, f0, G0, w, par.f0, par.G0, odd="check", ...)plot2D.SymmModulated(range, npt=rep(101,2), xi=c(0,0), Omega, f0, G0, w, par.f0, par.G0, odd="check", ...)Arguments
x | a vector of coordinates where the density must be evaluated; for multivariate densities, evaluated by |
xi | a numeric vector representing the location parameter; if must have length 1 for |
omega | a positive value representing the scale parameter. |
f0 | a character string denoting the symmetric density to be modulated;admissible values for |
G0 | a character string denoting the symmetric distribution used in themodulating factor; admissible values are |
w | the name (not as a character string) of a user-defined function which satisfies the condition |
par.f0,par.G0 | parameters required by |
odd | a character string, with possible values |
log | logical (default: |
n | an integer value (default: |
Omega | a symmetric positive-definite matrix which regulates the dependence structure of |
range | a two-column matrix whose column-wise range is taken as theplotting intervals on the coordinated axes forming a bivariate grid ofpoints over which the density is plotted. |
npt | a numeric vector with two elements representing the number of equally-spaced points on each axis spanning the |
... | optional parameters regulating the function |
Value
FordSymmModulated,rSymmModulated anddmSymmModulated, a numeric vector; fordmSymmModulated a matrix, unlessn=1.
Forplot2D.SymmModulated an invisible list containing thexandy coordinates forming the grid over which the densitypdfhas been evaluated for plotting.
Background
In the univariate case, start from symmetric density functionf_0, such thatf_0(z)=f_0(-z) for allz, and ‘modulate’ it in the form
f(z) = 2\, f_0(z)\, G_0\{w(z)\}
whereG_0 is a univariate symmetric (about 0) distribution function andw(z)is a real-valued odd function, hence satisfying the conditionw(-z)=-w(z);then $f(z)$ is a proper density function wich integrates to 1. A subsequent location and scale transformation applied tof(z) delivers the final density. Specifically, ifZ denotes a univariate random variable with densityf(z), then the computed density pertains to the transformed variable
\xi + \omega Z.
In the multivariate case, the scheme is similar, with natural adaptation.Densityf_0 is nowd-dimensional, whileG_0 is still univariate. The conditionsf_0(z)=f_0(-z) andw(-z)=-w(z) refer to ad-dimensional vectorz. Given ad \times d symmetric positive-definite matrix\Omega, we extract the the square roots\omega of the diagonal element of\Omega and correspondingly obtain the scale-free matrix
\bar\Omega = \mathrm{diag}(\omega)^{-1}\, \Omega\, \mathrm{diag}(\omega)^{-1}
which is used to regulate the dependence structure off_0(z) and so off(z). IfZ is multivariate random variable with densityf(z), then the final distribution refers to
\xi + \mathrm{diag}(\omega)\,Z
where\xi is ad-dimensional vector of location parametes.
This construction was put forward by Azzalini and Capitanio (2003). An essentially equivalent formulation has been presented by Wang et al. (2004).A summary account is available in Section 1.2 of Azzalini and Capitanio(2014); this includes, inter alia, an explanation of why the term‘symmetry-modulated’ distributions is preferred to ‘skew-symmetric’ distributions.
Random number generation is based on expression (1.11a) ofAzzalini and Capitanio (2014).
Details
FunctionsdSymmModulated andrSymmModulated deal with univariate distributions, for computing densities and generating random rumbers, respectively;dmSymmModulated andrmSymmModulated act similarly for multivariate distributions. For the bivariate case only,plot2D.SymmModulated computes a density over a grid ofcoordinates and produces acontour plot.
The distribution names used inf0 andG0 have, in the univariatecase, the same meaning as described in theDistributionspage, with the following exceptions, to achive symmetry about 0:"uniform" denotes a uniform distribution over the interval(-1, 1);"beta" denotes the a symmetric Beta distribution withsupport over the interval(-1, 1) and a common value of the shape parameters.
In the multivariate case, the available options"normal" and"t"forf0 refer to densities computed bydmnorm anddmt with 0 location and correlation matrix\bar\Omega, implied by\Omega. ArgumentG0 has the same meaning as in the univariate case.
Options"beta" and"t" forf0 andG0 require thespecification of a shape parameter, via the argumentspar.f0 andpar.G0, respectively. For"beta" the parameter representsthe common value of the shape parameters ofBeta;for"t", it representsdf ofTDist anddmt.
Functionw most be of the formw <- function(z, ...) where... are optional additional parameters andz represents valuedof the standardized form of the density; in the univariate case,x andz are related byz=(x-xi)/omega and an analogous fact holds inthe multivariate setting. The function must satisfy the conditionw(-z)=-w(z). It is assumed that the function is vectorized and, in themultivariate case, it will be called withz representing a matrix withd columns, ifd denotes the dimensionality of the randomvariable.
Argumentodd regulates the behaviour with respect to the conditionw(-z)=-w(z). If its value is"assume", the condition is justassumed to hold, and no action is taken. If the value is"check"(deafult), alimited check is performed; namely, in case of densities,the check is at 0 and the suppliedx points, while for random numbers the check is at 0 and the generated points.The value"force" ensures that the condition is satisfied by actually constructing a modified version of the user-supplied functionw, such that the required condition is enforced.
Author(s)
Adelchi Azzalini
References
Arellano-Valle, R. B., Gómez, H. W. and Quintana, F. A. (2004).A new class of skew-normal distributions.Comm. Stat., Theory &Methods,58, 111-121.
Azzalini, A. and Capitanio, A. (2003).Distributions generated by perturbation of symmetry with emphasis on a multivariate skew-t distribution.J.Roy. Statist. Soc. B65, 367–389.Full version of the paper athttps://arXiv.org/abs/0911.2342
Azzalini, A. with the collaboration of Capitanio, A. (2014).The Skew-Normal and Related Families. Cambridge University Press, IMS Monographs series.
Wang, J., Boyer, J. and Genton, M. G. (2004).A skew-symmetric representation of multivariate distributions.Statistica Sinica,14, 1259-1270.
See Also
Distributions,Beta,TDist,dmnorm,dmt,contour
Examples
x <- seq(2, 13, length=45)wLinear <- function(z, lambda) lambda*zy <- dSymmModulated(x, 5, 2, f0="normal", G0="normal", w=wLinear, lambda=3)# the same of dsn(x, 5, 2, 3), up to negligible numerical differences#wSGN <- function(z, lambda) z*lambda[1]/sqrt(1 + lambda[2]*z^2) y <- dSymmModulated(x, 5, 2, f0="normal", G0="normal", w=wSGN, lambda=c(3,5))# SGN distribution of Arellano-Valle et al. (2004)#wST <- function(z, lambda, nu) lambda*z*sqrt((nu+1)/(nu+z^2))y <- rSymmModulated(n=100, 5, 2, f0="t", G0="t", w=wST, par.f0=8, par.G0=9, lambda=3, nu=8)# equivalent to rst(n=100, 5, 2, 3, 8) #wTrigs <- function(z, p, q) sin(z * p)/(1 + cos(z * q))x <- seq(-1, 1, length=51)y <- dSymmModulated(x, 0, 1, f0="beta", G0="logistic", w=wTrigs, par.f0=2, par.G0=NULL, p=5, q=0.5) plot(x, y, type="l") # univariate analogue of the bivariate distribution on pp.372-3 of# Azzalini & Capitanio (2003) #range <- cbind(c(-3,3), c(-3,3))wMvTrigs <- function(z, p, q) sin(z %*% p)/(1 + cos(z %*% q))plot2D.SymmModulated(range, xi=c(0,0), Omega=diag(2), f0="normal", G0="normal", w=wMvTrigs, par.f0=NULL, par.G0=NULL, p=c(2,3), q=c(1,1), col=4)# w(.) as in (1.6) of Azzalini & Capitanio (2014, p.4) and plot as in # bottom-right panel of their Figure 1.1.Piedmont wines data
Description
Data refer to chemical properties of 178 specimens of three types of wine produced in the Piedmont region of Italy.
Usage
data(wines)Format
A data frame with 178 observations on the following 28 variables.
wine | wine name (categorical, levels:Barbera,Barolo,Grignolino) |
alcohol | alcohol percentage (numeric) |
sugar | sugar-free extract (numeric) |
acidity | fixed acidity (numeric) |
tartaric | tartaric acid (numeric) |
malic | malic acid (numeric) |
uronic | uronic acids (numeric) |
pH | pH (numeric) |
ash | ash (numeric) |
alcal_ash | alcalinity of ash (numeric) |
potassium | potassium (numeric) |
calcium | calcium (numeric) |
magnesium | magnesium (numeric) |
phosphate | phosphate (numeric) |
cloride | chloride (numeric) |
phenols | total phenols (numeric) |
flavanoids | flavanoids (numeric) |
nonflavanoids | nonflavanoid phenols (numeric) |
proanthocyanins | proanthocyanins (numeric) |
colour | colour intensity (numeric) |
hue | hue (numeric) |
OD_dw | OD_{280}/OD_{315} of diluted wines (numeric) |
OD_fl | OD_{280}/OD_{315} of flavanoids (numeric) |
glycerol | glycerol (numeric) |
butanediol | 2,3-butanediol (numeric) |
nitrogen | total nitrogen (numeric) |
proline | proline (numeric) |
methanol | methanol (numeric) |
Details
The data represent 27 chemical measurements on each of 178 wine specimens belonging to three types of wine produced in the Piedmont region of Italy. The data have been presented and examined by Forinaet al. (1986) and were freely accessible from thePARVUS web-site until it was active.These data or, more often, a subset of them are now available from variousplaces, including someR packages. The present dataset includes all variables available on thePARVUS repository, which arethe variables listed by Forinaet al. (1986) with the exception of ‘Sulphate’. Moreover, it reveals the undocumented fact thatthe original dataset appears to include also the vintage year;see the final portion of the ‘Examples’ below.
Source
Forina, M., Lanteri, S. Armanino, C., Casolino, C., Casale, M. and Oliveri, P.V-PARVUS 2008: an extendible package of programs for esplorative data analysis, classification and regression analysis. Dip. Chimica e Tecnologie Farmaceutiche ed Alimentari, Università di Genova, Italia. Web-site (not accessible as of 2014): ‘http://www.parvus.unige.it’
References
Forina M., Armanino C., Castino M. and Ubigli M. (1986). Multivariate data analysis as a discriminating method of the origin of wines.Vitis25, 189–201.
Examples
data(wines)pairs(wines[,c(2,3,16:18)], col=as.numeric(wines$wine))#code <- substr(rownames(wines), 1, 3)table(wines$wine, code)#year <- as.numeric(substr(rownames(wines), 6, 7))table(wines$wine, year)# coincides with Table 1(a) of Forina et al. (1986)Function\log(2\,\Phi(x)) and its derivatives
Description
The functionlog(2*pnorm(x)) and its derivatives, including inverse Mills ratio.
Usage
zeta(k, x)Arguments
k | an integer number between 0 and 5. |
x | a numeric vector. Missing values ( |
Details
Fork between 0 and 5, the derivative of orderk of\log(2\,\Phi(x)) is evaluated, where\Phi(x) denotes theN(0,1) cumulative distribution function.The derivative of orderk=0 refers to the function itself.Ifk is not an integer within0,..., 5,NULL is returned.
Value
a vector representing thek-th order derivative evaluated atx.
Background
The computation fork>1 is reduced to the casek=1, making useof expressions given by Azzalini and Capitanio (1999); see especially Section 4 of the full-length version of the paper. The main facts are summarized in Section 2.1.4 of Azzalini and Capitanio (2014).
For numerical stability, the evaluation ofzeta(1,x) whenx < -50 makes use of the asymptotic expansion (26.2.13) of Abramowitz and Stegun (1964).
zeta(1,-x) equalsdnorm(x)/pnorm(-x) (in principle, apart fromthe above-mentioned asymptotic expansion), called theinverse Mills ratio.
References
Abramowitz, M. and Stegun, I. A., editors (1964).Handbook of Mathematical Functions. Dover Publications.
Azzalini, A. and Capitanio, A. (1999).Statistical applications of the multivariate skew normal distribution.J.Roy.Statist.Soc. B61, 579–602. Full-length version available athttps://arXiv.org/abs/0911.2093
Azzalini, A. with the collaboration of Capitanio, A. (2014).The Skew-Normal and Related Families. Cambridge University Press, IMS Monographs series.
Examples
y <- zeta(2,seq(-20,20,by=0.5))#for(k in 0:5) curve(zeta(k,x), from=-1.5, to=5, col = k+2, add = k > 0)legend(3.5, -0.5, legend=as.character(0:5), col=2:7, lty=1)