
gkwdist implements theGeneralized Kumaraswamy(GKw) distribution family and its seven nested sub-models forbounded continuous data on\((0,1)\).All functions are implemented inC++ via RcppArmadillofor maximum computational efficiency.
Key Features: - Seven flexible distributions forproportions, rates, and bounded data - Standard R distribution API:d*,p*,q*,r* -Analytical log-likelihood, gradient, and Hessian functions -Allfunctions implemented in C++ for optimal performance - 10-50×faster than equivalent R implementations - Numerically stable forextreme parameter values
# Install from GitHub# install.packages("devtools")devtools::install_github("evandeilton/gkwdist")| Distribution | Code | Parameters | Functions |
|---|---|---|---|
| GeneralizedKumaraswamy | gkw | \(\alpha, \beta,\gamma, \delta, \lambda\) | dgkw,pgkw,qgkw,rgkw,llgkw,grgkw,hsgkw |
| Beta-Kumaraswamy | bkw | \(\alpha, \beta,\gamma, \delta\) | dbkw,pbkw,qbkw,rbkw,llbkw,grbkw,hsbkw |
kkw | \(\alpha, \beta,\delta, \lambda\) | dkkw,pkkw,qkkw,rkkw,llkkw,grkkw,hskkw | |
| ExponentiatedKumaraswamy | ekw | \(\alpha, \beta,\lambda\) | dekw,pekw,qekw,rekw,llekw,grekw,hsekw |
| McDonald (BetaPower) | mc | \(\gamma,\delta, \lambda\) | dmc,pmc,qmc,rmc,llmc,grmc,hsmc |
| Kumaraswamy | kw | \(\alpha,\beta\) | dkw,pkw,qkw,rkw,llkw,grkw,hskw |
| Beta | beta_ | \(\gamma,\delta\) | dbeta_,pbeta_,qbeta_,rbeta_,llbeta,grbeta,hsbeta |
Distribution Functions (C++ implementation with Rinterface):
d*() — Probability density function (PDF)p*() — Cumulative distribution function (CDF)q*() — Quantile function (inverse CDF)r*() — Random number generationAnalytical Functions for Maximum Likelihood (C++implementation):
All analytical functions use the signature:function(par, data) wherepar is a numericvector of parameters.
ll*(par, data) — Negative log-likelihood:\[-\ell(\boldsymbol{\theta}; \mathbf{x}) =-\sum_{i=1}^n \log f(x_i; \boldsymbol{\theta})\]
gr*(par, data) — Negative gradient (negative scorevector):\[-\nabla_{\boldsymbol{\theta}}\ell(\boldsymbol{\theta}; \mathbf{x})\]
hs*(par, data) — Negative Hessian matrix:\[-\nabla^2_{\boldsymbol{\theta}}\ell(\boldsymbol{\theta}; \mathbf{x})\]
Note: These functions returnnegative values to facilitate direct use withoptimization routines likeoptim(), which performminimization by default.
Notation: All parameters are strictly positive(\(\alpha, \beta, \gamma, \delta, \lambda >0\)); support\(x \in (0, 1)\).The beta function is\(B(a,b) =\Gamma(a)\Gamma(b)/\Gamma(a+b)\), and the regularized incompletebeta function is\(I_z(a,b) =B_z(a,b)/B(a,b)\) where\(B_z(a,b) =\int_0^z t^{a-1}(1-t)^{b-1}dt\).
Parameters:\(\alpha,\beta, \gamma, \delta, \lambda > 0\)
PDF:
\[f_{\text{GKw}}(x; \alpha, \beta, \gamma,\delta, \lambda) = \frac{\lambda \alpha \beta}{B(\gamma, \delta)}x^{\alpha-1} (1-x^\alpha)^{\beta-1}\left(1-(1-x^\alpha)^\beta\right)^{\gamma\lambda-1}\left(1-\left(1-(1-x^\alpha)^\beta\right)^\lambda\right)^{\delta-1}\]
CDF:
\[F_{\text{GKw}}(x; \alpha, \beta, \gamma,\delta, \lambda) = I_{\left(1-(1-x^\alpha)^\beta\right)^\lambda}(\gamma,\delta)\]
Quantile: Numerical inversion of the CDF viaroot-finding algorithms.
Moments: Analytical expressions not available inclosed form. Numerical integration or simulation methods required.
Relationship: Special case of GKw with
PDF:
\[f_{\text{BKw}}(x; \alpha, \beta, \gamma,\delta) = \frac{\alpha \beta}{B(\gamma, \delta)} x^{\alpha-1}(1-x^\alpha)^{\beta-1} \left(1-(1-x^\alpha)^\beta\right)^{\gamma-1}\left(1-\left(1-(1-x^\alpha)^\beta\right)\right)^{\delta-1}\]
Simplifying:
\[f_{\text{BKw}}(x; \alpha, \beta, \gamma,\delta) = \frac{\alpha \beta}{B(\gamma, \delta)} x^{\alpha-1}(1-x^\alpha)^{\beta-1} \left(1-(1-x^\alpha)^\beta\right)^{\gamma-1}(1-x^\alpha)^{\beta(\delta-1)}\]
\[= \frac{\alpha \beta}{B(\gamma, \delta)}x^{\alpha-1} (1-x^\alpha)^{\beta\delta-1}\left(1-(1-x^\alpha)^\beta\right)^{\gamma-1}\]
CDF:
\[F_{\text{BKw}}(x; \alpha, \beta, \gamma,\delta) = I_{1-(1-x^\alpha)^\beta}(\gamma, \delta)\]
Quantile: Numerical inversion via root-finding. Forthe inverse:
\[u = I_y(\gamma, \delta) \quad\text{where} \quad y = 1-(1-x^\alpha)^\beta\]
Solving for\(x\):
\[x = \left(1-\left(1-I_u^{-1}(\gamma,\delta)\right)^{1/\beta}\right)^{1/\alpha}\]
Relationship: Special case of GKw with
PDF:
\[f_{\text{KKw}}(x; \alpha, \beta, \delta,\lambda) = \alpha \beta \delta \lambda \, x^{\alpha-1}(1-x^\alpha)^{\beta-1} \left(1-(1-x^\alpha)^\beta\right)^{\lambda-1}\left(1-\left(1-(1-x^\alpha)^\beta\right)^\lambda\right)^{\delta-1}\]
CDF:
\[F_{\text{KKw}}(x; \alpha, \beta, \delta,\lambda) = 1 -\left(1-\left(1-(1-x^\alpha)^\beta\right)^\lambda\right)^\delta\]
Quantile (closed-form):
\[Q_{\text{KKw}}(p; \alpha, \beta, \delta,\lambda) = \left(1 - \left(1 -\left(1-(1-p)^{1/\delta}\right)^{1/\lambda}\right)^{1/\beta}\right)^{1/\alpha}\]
Moments: Analytical expressions not available inclosed form.
Relationship: Special case of GKw with
PDF:
\[f_{\text{EKw}}(x; \alpha, \beta,\lambda) = \lambda \alpha \beta \, x^{\alpha-1} (1-x^\alpha)^{\beta-1}\left(1-(1-x^\alpha)^\beta\right)^{\lambda-1}\]
CDF:
\[F_{\text{EKw}}(x; \alpha, \beta,\lambda) = \left(1-(1-x^\alpha)^\beta\right)^\lambda\]
Quantile (closed-form):
\[Q_{\text{EKw}}(p; \alpha, \beta,\lambda) =\left(1-\left(1-p^{1/\lambda}\right)^{1/\beta}\right)^{1/\alpha}\]
Moments:
\[\mathbb{E}(X^r) = \lambda\sum_{k=0}^{\infty} \frac{(-1)^k \binom{\lambda}{k+1}}{k+1} \cdot \beta\, B\left(1 + \frac{r}{\alpha}, (k+1)\beta\right)\]
where the binomial coefficient is generalized:
Relationship: Special case of GKw with
PDF:
\[f_{\text{MC}}(x; \gamma, \delta,\lambda) = \frac{\lambda}{B(\gamma, \delta)} x^{\gamma\lambda-1}(1-x^\lambda)^{\delta-1}\]
CDF:
\[F_{\text{MC}}(x; \gamma, \delta,\lambda) = I_{x^\lambda}(\gamma, \delta)\]
Quantile:
\[Q_{\text{MC}}(p; \gamma, \delta,\lambda) = \left(I_p^{-1}(\gamma,\delta)\right)^{1/\lambda}\]
where\(I_p^{-1}(\gamma, \delta)\)is the inverse regularized incomplete beta function (quantile functionof the Beta distribution).
Moments:
\[\mathbb{E}(X^r) = \frac{B(\gamma +r/\lambda, \delta)}{B(\gamma, \delta)}\]
which is valid for\(r/\lambda >-\gamma\).
Relationship: Special case of GKw with
PDF:
\[f_{\text{Kw}}(x; \alpha, \beta) = \alpha\beta \, x^{\alpha-1} (1-x^\alpha)^{\beta-1}\]
CDF:
\[F_{\text{Kw}}(x; \alpha, \beta) = 1 -(1-x^\alpha)^\beta\]
Quantile (closed-form):
\[Q_{\text{Kw}}(p; \alpha, \beta) =\left(1-(1-p)^{1/\beta}\right)^{1/\alpha}\]
Moments:
\[\mathbb{E}(X^r) = \beta \, B\left(1 +\frac{r}{\alpha}, \beta\right) = \frac{\beta \, \Gamma(1+r/\alpha) \,\Gamma(\beta)}{\Gamma(1+r/\alpha+\beta)}\]
which is valid for\(r/\alpha >-1\).
Special Cases:
\[\mathbb{E}(X) = \frac{\beta \,\Gamma(1+1/\alpha) \,\Gamma(\beta)}{\Gamma(1+1/\alpha+\beta)}\]
\[\mathbb{E}(X^2) = \frac{\beta \,\Gamma(1+2/\alpha) \,\Gamma(\beta)}{\Gamma(1+2/\alpha+\beta)}\]
\[\text{Var}(X) = \mathbb{E}(X^2) -\left(\mathbb{E}(X)\right)^2\]
Relationship: Special case of GKw with
PDF:
\[f_{\text{Beta}}(x; \gamma, \delta) =\frac{1}{B(\gamma, \delta)} x^{\gamma-1} (1-x)^{\delta-1}\]
CDF:
\[F_{\text{Beta}}(x; \gamma, \delta) =I_x(\gamma, \delta)\]
Quantile:
\[Q_{\text{Beta}}(p; \gamma, \delta) =I_p^{-1}(\gamma, \delta)\]
Moments:
\[\mathbb{E}(X^r) = \frac{B(\gamma+r,\delta)}{B(\gamma, \delta)} = \frac{\Gamma(\gamma+r) \,\Gamma(\gamma+\delta)}{\Gamma(\gamma) \,\Gamma(\gamma+\delta+r)}\]
\[\mathbb{E}(X) =\frac{\gamma}{\gamma+\delta}\]
\[\text{Var}(X) =\frac{\gamma\delta}{(\gamma+\delta)^2(\gamma+\delta+1)}\]
GKw(α, β, γ, δ, λ) / \ λ = 1 γ = 1 / \ BKw(α, β, γ, δ) KKw(α, β, δ, λ) | | α = β = 1 δ = 1 | | MC(γ, δ, λ) EKw(α, β, λ) | | λ = 1 γ = δ = 1 | | Beta(γ, δ) Kw(α, β)Note: The Beta distribution is obtained from MC bysetting\(\lambda = 1\), or from GKw bysetting\(\alpha = \beta = \lambda =1\). The Kumaraswamy distribution is obtained from EKw by setting\(\lambda = 1\), or from GKw by setting\(\gamma = \delta = \lambda = 1\).
library(gkwdist)# Set parametersalpha<-2; beta<-3; gamma<-1.5; delta<-2; lambda<-1.2x<-seq(0.01,0.99,length.out =100)# Densitydens<-dgkw(x, alpha, beta, gamma, delta, lambda)# CDFcdf<-pgkw(x, alpha, beta, gamma, delta, lambda)# Quantilesq<-qgkw(c(0.25,0.5,0.75), alpha, beta, gamma, delta, lambda)print(q)# Random generationset.seed(123)random_sample<-rgkw(1000, alpha, beta, gamma, delta, lambda)# Visualizationpar(mfrow =c(1,2))plot(x, dens,type ="l",lwd =2,col ="blue",main ="GKw PDF",xlab ="x",ylab ="Density")grid()plot(x, cdf,type ="l",lwd =2,col ="red",main ="GKw CDF",xlab ="x",ylab ="F(x)")grid()library(gkwdist)par(mfrow =c(1,1),mar =c(3,3,2,2))x<-seq(0.001,0.999,length.out =500)# Compute densities for all familiesd_gkw<-dgkw(x,2,3,1.5,2,1.2)d_bkw<-dbkw(x,2,3,1.5,2)d_kkw<-dkkw(x,2,3,2,1.2)d_ekw<-dekw(x,2,3,1.5)d_mc<-dmc(x,2,3,1.2)d_kw<-dkw(x,2,5)d_beta<-dbeta_(x,2,3)# Plot comparisonplot(x, d_gkw,type ="l",lwd =2,col ="black",ylim =c(0,max(d_gkw, d_bkw, d_kkw, d_ekw, d_mc, d_kw, d_beta)),main ="Distribution Family Comparison",xlab ="x",ylab ="Density")lines(x, d_bkw,lwd =2,col ="red")lines(x, d_kkw,lwd =2,col ="blue")lines(x, d_ekw,lwd =2,col ="green")lines(x, d_mc,lwd =2,col ="purple")lines(x, d_kw,lwd =2,col ="orange")lines(x, d_beta,lwd =2,col ="brown")legend("topright",legend =c("GKw","BKw","KKw","EKw","MC","Kw","Beta"),col =c("black","red","blue","green","purple","orange","brown"),lwd =2,cex =0.85)grid()library(gkwdist)# Generate synthetic data from Kumaraswamy distributionset.seed(2024)n<-2000true_alpha<-2.5true_beta<-3.5data<-rkw(n, true_alpha, true_beta)# Get starting valuespar_ini<-gkwgetstartvalues(data,family ="kw",n_starts =2)# Optimization using BFGS with analytical gradientfit<-optim(par = par_ini,fn = llkw,# Negative log-likelihood functiongr = grkw,# Negative gradient (negative score function)data = data,method ="BFGS",hessian =TRUE)# Standard errors from observed information matrixse<-sqrt(diag(solve(fit$hessian)))# 95% Confidence intervalsci<-cbind(Lower = fit$par-1.96* se,Estimate = fit$par,Upper = fit$par+1.96* se)rownames(ci)<-c("alpha","beta")cat("True parameters: ", true_alpha, true_beta,"\n")cat("Estimated parameters:", fit$par,"\n\n")print(ci)library(gkwdist)# Using fitted model from Example 3x_grid<-seq(0.001,0.999,length.out =200)# Fitted densityfitted_dens<-dkw(x_grid, fit$par[1], fit$par[2])# True density (for comparison)true_dens<-dkw(x_grid, true_alpha, true_beta)# Diagnostic plothist(data,breaks =30,probability =TRUE,col ="lightgray",border ="white",main ="Kumaraswamy Distribution Fit",xlab ="Data",ylab ="Density")lines(x_grid, fitted_dens,col ="red",lwd =2,lty =1)lines(x_grid, true_dens,col ="blue",lwd =2,lty =2)legend("topright",legend =c("Observed Data","Fitted Model","True Model"),col =c("gray","red","blue"),lwd =c(10,2,2),lty =c(1,1,2))grid()library(gkwdist)# Generate data from Exponentiated Kumaraswamyset.seed(456)n<-1500data<-rekw(n,alpha =2,beta =3,lambda =1.5)# Define competing modelsmodels<-list(Beta =list(nll =function(par)llbeta(par, data),gr =function(par)grbeta(par, data),start =gkwgetstartvalues(data,family ="beta"),npar =2 ),Kw =list(nll =function(par)llkw(par, data),gr =function(par)grkw(par, data),start =gkwgetstartvalues(data,family ="kw"),npar =2 ),EKw =list(nll =function(par)llekw(par, data),gr =function(par)grekw(par, data),start =gkwgetstartvalues(data,family ="ekw"),npar =3 ),MC =list(nll =function(par)llmc(par, data),gr =function(par)grmc(par, data),start =gkwgetstartvalues(data,family ="mc"),npar =3 ))# Fit all models using optim with analytical gradientsfits<-lapply(models,function(m) {optim(par = m$start,fn = m$nll,gr = m$gr,method ="BFGS")})# Extract log-likelihoods and compute information criterialoglik<-sapply(fits,function(f)-f$value)k<-sapply(models,`[[`,"npar")results<-data.frame(Model =names(models),Coefs =sapply(fits,function(f)paste0(round(f$par,3),collapse ="|")),LogLik =round(loglik,2),nPar = k,AIC =round(-2* loglik+2* k,2),BIC =round(-2* loglik+ k*log(n),2))# Sort by AICresults<- results[order(results$AIC), ]print(results,row.names =FALSE)cat("\nBest model by AIC:", results$Model[1],"\n")library(gkwdist)# Generate data from EKwset.seed(789)n<-1000data<-rekw(n,alpha =2,beta =3,lambda =1.5)params<-c(2,3,1.5)# Compute analytical functions at true parametersnll<-llekw(params, data)neg_score<-grekw(params, data)neg_hess<-hsekw(params, data)# Fisher information matrix (negative of negative Hessian = Hessian)fisher<--neg_hess# Asymptotic standard errorsse<-sqrt(diag(solve(fisher)))names(se)<-c("alpha","beta","lambda")cat("Negative log-likelihood:", nll,"\n")cat("\nNegative score vector (should be close to zero at true params):\n")print(neg_score)cat("\nNegative Hessian matrix:\n")print(neg_hess)cat("\nFisher Information matrix:\n")print(fisher)cat("\nAsymptotic standard errors:\n")print(se)library(gkwdist)# Generate data and fit modelset.seed(101)n<-2000data<-rkw(n,alpha =2,beta =3)# Fit using optimfit<-optim(par =c(1,1),fn =function(par)llkw(par, data),gr =function(par)grkw(par, data),method ="BFGS")# Theoretical quantilesp<-ppoints(n)theoretical_q<-qkw(p, fit$par[1], fit$par[2])# Empirical quantilesempirical_q<-sort(data)# Q-Q plotplot(theoretical_q, empirical_q,xlab ="Theoretical Quantiles",ylab ="Empirical Quantiles",main ="Q-Q Plot: Kumaraswamy Distribution",pch =19,col =rgb(0,0,1,0.5))abline(0,1,col ="red",lwd =2,lty =2)grid()library(gkwdist)# Generate data from GKwset.seed(2025)n<-10000true_params<-c(alpha =2,beta =2.5,gamma =1.5,delta =2,lambda =1.3)data<-rgkw(n, true_params[1], true_params[2], true_params[3], true_params[4], true_params[5])# Get starting valuespar_ini<-gkwgetstartvalues(data,family ="gkw",n_starts =5)# Fit using optim with analytical gradientfit_gkw<-optim(par = par_ini,fn = llgkw,# Negative log-likelihoodgr = grgkw,# Negative gradientdata = data,method ="BFGS",hessian =TRUE,control =list(maxit =1000))# Resultsse<-sqrt(diag(solve(fit_gkw$hessian)))estimates<-data.frame(Parameter =names(true_params),True = true_params,Estimate = fit_gkw$par,SE = se)print(estimates,digits =4)library(gkwdist)# Generate dataset.seed(999)n<-10000data<-rkw(n,alpha =2,beta =3)par<-c(2,3)# Negative analytical gradientneg_grad_analytical<-grkw(par, data)# Numerical gradient (using finite differences)neg_grad_numerical<- numDeriv::grad(func =function(p)llkw(p, data),x = par)# Comparecomparison<-data.frame(Parameter =c("alpha","beta"),Analytical = neg_grad_analytical,Numerical = neg_grad_numerical,Difference =abs(neg_grad_analytical- neg_grad_numerical))print(comparison,digits =8)The C++ implementation provides substantial performance gains:
library(microbenchmark)# Generate large datasetn<-10000data<-rkw(n,2,3)# Compare log-likelihood computationbenchmark<-microbenchmark(R_sum_log_d =-sum(log(dkw(data,2,3))),# Manual negative log-likelihoodCpp_ll =llkw(c(2,3), data),# C++ negative log-likelihoodtimes =100)print(benchmark)plot(benchmark)# Typical results: C++ implementation is 10-50× fasterWhy C++ Implementation Matters:
| Data Characteristics | Recommended Distribution | Rationale |
|---|---|---|
| Unimodal, symmetric | Beta | Parsimony; well-studied properties |
| Unimodal, asymmetric | Kumaraswamy | Closed-form CDF and quantile |
| Bimodal or U-shaped | GKw orBKw | Maximum flexibility |
| Extreme skewness | KKw orEKw | Fine control over tail behavior |
| J-shaped (monotonic) | Kw orBeta | With appropriate parameter values |
| Power transformations | McDonald | Explicit power parameter |
| Unknown shape | GKw | Fit and test nested models |
Model Selection Workflow:
Cordeiro, G. M., & de Castro, M. (2011). Anew family of generalized distributions.Journal of StatisticalComputation and Simulation, 81(7), 883-898.doi:10.1080/00949650903530745
Carrasco, J. M. F., Ferrari, S. L. P., & Cordeiro, G.M. (2010). A new generalized Kumaraswamy distribution.arXiv:1004.0911.arxiv.org/abs/1004.0911
Kumaraswamy, P. (1980). A generalizedprobability density function for double-bounded random processes.Journal of Hydrology, 46(1-2), 79-88.doi:10.1016/0022-1694(80)90036-0
Jones, M. C. (2009). Kumaraswamy’s distribution:A beta-type distribution with some tractability advantages.Statistical Methodology, 6(1), 70-81.doi:10.1016/j.stamet.2008.04.001
Lemonte, A. J., & Cordeiro, G. M. (2013). Anextended Lomax distribution.Statistics, 47(4), 800-816.doi:10.1080/02331888.2011.568119
Cordeiro, G. M., & Lemonte, A. J. (2011).The β-Birnbaum–Saunders distribution: An improved distribution forfatigue life modeling.Computational Statistics & DataAnalysis, 55(3), 1445-1461.doi:10.1016/j.csda.2010.10.007
McDonald, J. B. (1984). Some generalizedfunctions for the size distribution of income.Econometrica,52(3), 647-663.doi:10.2307/1913469
Cordeiro, G. M., & Brito, R. S. (2012). Thebeta power distribution.Brazilian Journal of Probability andStatistics, 26(1), 88-112.doi:10.1214/10-BJPS124
citation("gkwdist")@Manual{gkwdist2025,title = {gkwdist: Generalized Kumaraswamy Distribution Family},author = {José Evandeilton Lopes},year = {2025},note = {R package},url = {https://github.com/evandeilton/gkwdist}}José Evandeilton Lopes
LEG - Laboratory of Statistics and Geoinformation
PPGMNE - Graduate Program in Numerical Methods in Engineering
Federal University of Paraná (UFPR), Brazil
Email:evandeilton@gmail.com
MIT License. SeeLICENSE file for details.