Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

License

Unknown, MIT licenses found

Licenses found

Unknown
LICENSE
MIT
LICENSE.md
NotificationsYou must be signed in to change notification settings

manueleleonelli/extrememix

Repository files navigation

extrememix implements Bayesian estimation of extreme value mixturemodels, estimating the threshold over which a Generalized Paretodistribution can be assumed as well as high quantiles and other measuresof interest in extreme value theory.

Installation

The packageextrememix can be installed from GitHub using the command

# install.packages("devtools")#devtools::install_github("manueleleonelli/extrememix")

and loaded in R with

library(extrememix)library(ggplot2)

An applied analysis

We consider therainfall dataset reporting the monthly maxima dailyrainfall (in mm) recorded at the Retiro station in the city of Madridbetween 1985 and 2020. The data consists of 414 monthly maxima since 18months were discarded in which no rain was observed.

data("rainfall")ggplot(data=data.frame(rainfall), aes(x=rainfall))+  geom_histogram(binwidth=2*length(rainfall)^(-1/3)*IQR(rainfall),colour="black",fill="white")+ theme_bw()

The data histogram shows that the maximum rainfall observed in a day isaround 50mm. Although there are some extreme observations, thedistribution appears to be right-bounded.

We start our analysis fitting the GGPD model (gamma bulk/GPD tail) usingthe functionfggpd. We specify the number of iterations, the burn-inand the thinning via the optionsit,burn andthin, respectively.The prior distribution, the starting values and the variances of theproposal distributions are automatically chosen, but these can be set bythe user (see below).

rainfall_ggpd<- fggpd(rainfall,it=50000,burn=10000,thin=40)rainfall_ggpd

Theprint method for the objectmodel1 gives us an overview of theestimation process, stating the model fitted, its log-likelihood, theposterior estimate of the shape parameter$\xi$ of the GPD and theprobability that the distribution is right-unbounded. Additional detailscan be gathered using thesummary function.

summary(rainfall_ggpd)#>       estimate lower_ci upper_ci#> xi       -0.14    -0.27     0.05#> sigma     8.83     6.86    11.80#> u        13.97     9.05    14.88#> mu       16.33    14.35    20.40#> eta       1.16     0.96     1.36

Thesummary reports the posterior estimates as well as 95% posteriorcredibility intervals of the models’ parameters. The threshold isestimated at 12.88 and the GPD is therefore estimated over a proportionof the data equal to 0.4541063.

extrememix includes the functioncheck_convergence which reports thetraceplot and the auto-correlation plot for the estimated 0.99 quantile.It can be used as a quick check to ensure convergence of the estimationalgorithm. Other R packages can be used for more in-depth analyses.According to the output, the estimation of the quantile is stable andtherefore it is likely that the chain reached convergence.

check_convergence(rainfall_ggpd)

As an alternative model we consider the MGPD (mixture of gammas bulk/GPDtail) with 2 mixture components. It can be fitted using the functionfmgpd, which needs as input also the number of componentsk. In thiscase we fully specify the model and the estimation procedure by alsogiving the starting values, the variances and the prior distribution.

The starting values can be set creating a list with entriesxi,sigma,u,mu,eta andw. The proposal variances can be setcreating a list with entriesxi,sigma,u,mu andw (for eachmixture component the parameters$\mu$ and$\eta$ are sampled jointly).The prior distribution can be set creating a list with entriesu (avector with the mean and standard deviation of the prior normaldistribution for$u$),mu_mu (a vector with the prior means of theGamma distributions for$\mu$),mu_eta (a vector with the prior shapesof the Gamma distributions for$\mu$),eta_mu (a vector with the priormeans of the Gamma distributions for$\eta$) andeta_eta (a vectorwith the prior shapes of the Gamma distributions for$\eta$).

start<-list(xi=0.2,sigma=5,u= quantile(rainfall,0.9),mu= c(4,10),eta= c(1,4),w= c(0.5,0.5))var<-list(xi=0.001,sigma=1,u=2,mu= c(0.1,0.1),w=0.1)prior<-list(u= c(22,5),mu_mu= c(4,16),mu_eta= c(0.001,0.001),eta_mu= c(1,4),eta_eta= c(0.001,0.001))rainfall_mgpd<- fmgpd(rainfall,k=2,it=50000,burn=10000,thin=40,start=start,var=var,prior=prior)

The summary below shows that an MGPD model is not actually requiredsince the estimate of the weight of one of the two components is zero.

rainfall_mgpd#> EVMM with 2 Mixtures of Gamma bulk. LogLik -1441.789#> xi estimated as  -0.1348735#> Probability of unbounded distribution  0.05894106summary(rainfall_mgpd)#>       estimate lower_ci upper_ci#> xi       -0.13    -0.26     0.04#> sigma     8.70     6.82    11.96#> u        13.97     9.10    14.95#> mu1       0.00     0.00     0.00#> mu2      16.30    14.19    18.87#> eta1      0.00     0.00     0.00#> eta2      1.16     0.98     1.34#> w1        0.00     0.00     0.00#> w2        1.00     1.00     1.00

Let’s anyway check the convergence of the algorithm to ensure theestimation process went ok.

check_convergence(rainfall_mgpd)

Since the MGPD model has one weight equal to zero, a GGPD isrecommended. We can anyway check that this is the case using modelselection criteria.extrememix implements the AIC, AICc, BIC, DIC andWAIC criteria in the equally-named functions.

rbind(c(BIC(rainfall_ggpd),BIC(rainfall_mgpd)),c(DIC(rainfall_ggpd),DIC(rainfall_mgpd)),c(WAIC(rainfall_ggpd),WAIC(rainfall_mgpd)))#>          [,1]     [,2]#> [1,] 2913.918 2931.784#> [2,] 2891.035 2892.770#> [3,] 2897.241 2896.992

For simplicity, here we considered three model selection criteria. BIC,DIC and WAIC all favor the GGPD. As already noticed in the literature,the use of WAIC is recommended and indeed it selects the GGPD model.

We therefore next investigate the use of the GGPD model to assessrainfall in the city of Madrid. Theplot method gives an overview ofthe model reporting the histogram of the distributions of$\xi$ and$\sigma$, a plot of the quantiles and a plot of the predictivedistribution.

plot(rainfall_ggpd)

The predictive distribution can be further obtained using the functionpred. The plot shows that the model gives a faithful description ofthe tail of the data.

pred(rainfall_ggpd)

In extreme value analysis there are many measures that are used toquantify risk: quantiles (implemented inquant), return levels (inreturn_level), Value-at-Risk (inVaR), Expected shortfall (inES)and Tail VaR (inTVaR). For instance here we compute the returnlevels, i.e. the value that is expected to be equaled or exceeded onaverage once every interval of time (T).

return_level(rainfall_ggpd)#>       Level estimate lower_ci upper_ci empirical#>  [1,]    20    30.47    28.24    32.87     30.20#>  [2,]    25    31.93    29.58    34.49     32.44#>  [3,]    30    33.05    30.65    35.84     34.05#>  [4,]    40    34.79    32.33    37.92     34.87#>  [5,]    50    36.06    33.52    39.57     35.27#>  [6,]    60    37.09    34.50    40.88     35.80#>  [7,]    70    37.93    35.23    42.33     36.65#>  [8,]    80    38.64    35.84    43.68     37.02#>  [9,]    90    39.26    36.37    44.65     37.39#> [10,]   100    39.83    36.79    45.50     37.71#> [11,]   150    41.79    38.46    48.61     38.52#> [12,]   200    43.15    39.53    50.89     38.87#> [13,]   250    44.17    40.35    52.62     41.13plot(return_level(rainfall_ggpd))

From the output we can see that we expect a rainfall of 30.42 mms to beequaled or exceeded every 20 months. The width of the credibilityintervals can be chosen with thecred input and the values at which tocompute the return levels can be chosen withvalues.

As a different measure, we consider next the expected shortfall, definedas the expected value in the q% of the worst cases. For instance, thecode below computes the 1% expected shortfall.

ES(rainfall_ggpd,values=1)#>      ES_Level estimate lower_ci upper_ci empirical#> [1,]        1    44.26    40.35    53.45     42.12plot(ES(rainfall_ggpd,values=1))

In other words, conditional on observing a value above the 0.99quantile, the expected rainfall is equal to 44.46mms.

Since we selected a uniquevalues the plotting method reports theposterior histogram of the estimated quantity.

To conclude the analysis, we can further estimate what is the largestpossible rainfall that could ever be observed in the city of Madrid,since we observed that$\xi$ was often estimated as negative. This canbe done with the functionupper_bound.

upper_bound(rainfall_ggpd)#> Probability of unbounded distribution:  0.052#> Estimated upper bound at  74.07  with probability  0.948#>  Credibility interval at  0.95 %: ( 53.79 , 321.48 )plot(upper_bound(rainfall_ggpd),xlim= c(20,400))#> Upper Bound, with probability  0.948

The maximum rainfall that could be observed in Madrid is estimated as74.07. Furthermore, since in the posterior sample there are some valuesof$\xi$ which are positive, we have a non-zero probability that thedistribution is right-unbounded. The limits of the histogram are setwith the inputxlim.

About

No description, website, or topics provided.

Resources

License

Unknown, MIT licenses found

Licenses found

Unknown
LICENSE
MIT
LICENSE.md

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

[8]ページ先頭

©2009-2025 Movatter.jp