| Title: | Bayesian Quantile Regression Models for Complex Survey DataAnalysis |
| Version: | 0.1.4 |
| Description: | Provides Bayesian quantile regression models for complex survey data under informative sampling using survey-weighted estimators. Both single- and multiple-output models are supported. To accelerate computation, all algorithms are implemented in 'C++' using 'Rcpp', 'RcppArmadillo', and 'RcppEigen', and are called from 'R'. See Nascimento and Gonçalves (2024) <doi:10.1093/jssam/smae015> and Nascimento and Gonçalves (2025, in press)https://academic.oup.com/jssam. |
| License: | MIT + file LICENSE |
| Encoding: | UTF-8 |
| RoxygenNote: | 7.3.2 |
| LinkingTo: | Rcpp, RcppArmadillo, RcppEigen |
| Imports: | Rcpp, stats, graphics, methods, pracma, ggplot2, rlang,posterior |
| Suggests: | MASS, knitr, rmarkdown, grDevices, testthat (≥ 3.0.0) |
| Config/testthat/edition: | 3 |
| SystemRequirements: | BLAS, LAPACK |
| URL: | https://github.com/torodriguezt/bayesQRsurvey |
| BugReports: | https://github.com/torodriguezt/bayesQRsurvey/issues |
| NeedsCompilation: | yes |
| Packaged: | 2025-10-18 21:02:55 UTC; Tomas |
| Author: | Tomás Rodríguez Taborda [aut, cre], Johnatan Cardona Jiménez [aut], Marcus L. Nascimento [aut], Kelly Cristina Mota Gonçalves [aut] |
| Maintainer: | Tomás Rodríguez Taborda <torodriguezt@unal.edu.co> |
| Depends: | R (≥ 3.5.0) |
| Repository: | CRAN |
| Date/Publication: | 2025-10-22 19:10:10 UTC |
bayesQRsurvey: Bayesian Weighted Quantile Regression for complex survey designswith EM and MCMC Algorithm
Description
The bayesQRsurvey package provides Bayesian quantile regression methods for complexsurvey designs with two main functions:
Details
bqr.svy(): Bayesian methods for estimating quantile regression modelsusing MCMC methodsmo.bqr.svy(): Bayesian approach to multiple-output quantile regressionusing EM algorithm
Main functions
bqr.svyFits Bayesian quantile regression for multiple quantiles using MCMC methods (ALD, Score, Approximate)
mo.bqr.svyFits Bayesian quantile regression for multiple quantiles using EM algorithm
plotStandard plot method for bqr.svy objects
summaryUnified summary method for all bayesQRsurvey model objects
priorUnified interface for creating prior distributions
MCMC Methods
The bqr.svy function can estimate three types of models, where the quantile regressioncoefficients are defined at the super-population level, and their estimators arebuilt upon the survey weights.:
ALD (Asymmetric Laplace Distribution): Uses asymmetric Laplace likelihood
Score: Uses score-based approach
Approximate: Uses approximate methods for faster computation
EM Algorithm
Implements a Bayesian approach to multiple-output quantile regression for complexsurvey data analysis.
Author(s)
Marcus L. Nascimento, Kelly Cristina Mota Goncalves,Johnatan Cardona Jimenez, Tomas Rodriguez Taborda
References
Yu, K. and Moyeed, R. A. (2001). Bayesian quantile regression.Statistics & Probability Letters, 54(4), 437-447.
Kozumi, H. and Kobayashi, G. (2011). Gibbs sampling methods for Bayesianquantile regression.Journal of Statistical Computation and Simulation,81(11), 1565-1578.
See Also
Useful links:
Report bugs athttps://github.com/torodriguezt/bayesQRsurvey/issues
Children anthropometric data
Description
The dataset consists of 1007 observations and 6 variables of children aged between 0 and 60 months from the Central-West of Brazil. The data is a subset of the 2006 Brazilian National Demographic Health Survey of Women and Children (NDHS), which is a complex survey where the sampling units were selected in two stages: primary sampling units (PSUs) comprised census tracts, and secondary sampling units (SSUs) consisted of households. Tract selection in each stratum was designed to ensure a minimum number of blood samples, based on the prevalence of vitamin A deficiency in children.
Usage
data("Anthro")Format
The data frame has the following components:
wgt : weight
hgt : height
ruc : rural-urban classification (urban = 1 and rural = 2)
sex : boy = 1 and girl = 2
age : age in months
dweight : sampling weights
Source
2006 Brazilian National Demographic Health Survey of Women and Children (NDHS) published by the Brazilian Institute of Geography and Statistics.
Bayesian quantile regression for complex survey data
Description
bqr.svy implements Bayesian methods for estimating quantile regression modelsfor complex survey data analysis regarding single (univariate) outputs. Toimprove computational efficiency, the Markov Chain Monte Carlo (MCMC) algorithmsare implemented in 'C++'.
Usage
bqr.svy( formula, weights = NULL, data = NULL, quantile = 0.5, method = c("ald", "score", "approximate"), prior = NULL, niter = 50000, burnin = 0, thin = 1, verbose = TRUE, estimate_sigma = FALSE)Arguments
formula | a symbolic description of the model to be fit. |
weights | an optional numerical vector containing the survey weights. If |
data | an optional data frame containing the variables in the model. |
quantile | numerical scalar or vector containing quantile(s) of interest (default=0.5). |
method | one of |
prior | a |
niter | number of MCMC draws. |
burnin | number of initial MCMC draws to be discarded.(default = 0) |
thin | thinning parameter, i.e., keep every keepth draw (default=1). |
verbose | logical flag indicating whether to print progress messages (default=TRUE). |
estimate_sigma | logical flag indicating whether to estimate the scale parameterwhen method = "ald" (default=FALSE and |
Details
The bqr.svy function can estimate three types of models, where the quantile regressioncoefficients are defined at the super-population level, and their estimators are built uponthe survey weights.
"ald"– The asymmetric Laplace distribution as working likelihood."score"– A score based likelihood function."approximate"– A pseudolikelihood function based on a Gaussian approximation.
Value
An object of class"bqr.svy", containing:
beta | Posterior mean estimates of regression coefficients. |
draws | Posterior draws from the MCMC sampler. |
accept_rate | Average acceptance rate (if available). |
warmup,thin | MCMC control parameters used during sampling. |
quantile | The quantile(s) fitted. |
prior | Prior specification used. |
formula,terms,model | Model specification details. |
runtime | Elapsed runtime in seconds. |
method | Estimation method |
estimate_sigma | Logical flag indicating whether the scale parameter |
References
Nascimento, M. L. & Gonçalves, K. C. M. (2024).Bayesian Quantile Regression Models for Complex Survey Data Under Informative Sampling.Journal of Survey Statistics and Methodology, 12(4), 1105–1130.doi:10.1093/jssam/smae015
Examples
# Generate population dataset.seed(123)N <- 10000x1_p <- runif(N, -1, 1)x2_p <- runif(N, -1, 1)y_p <- 2 + 1.5 * x1_p - 0.8 * x2_p + rnorm(N)# Generate sample datan <- 500z_aux <- rnorm(N, mean = 1 + y_p, sd = .5)p_aux <- 1 / (1 + exp(2.5 - 0.5 * z_aux))s_ind <- sample(1:N, n, replace = FALSE, prob = p_aux)y_s <- y_p[s_ind]x1_s <- x1_p[s_ind]x2_s <- x2_p[s_ind]w <- 1 / p_aux[s_ind]data <- data.frame(y = y_s, x1 = x1_s, x2 = x2_s, w = w)# Basic usage with default method ('ald') and priors (vague)fit1 <- bqr.svy(y ~ x1 + x2, weights = w, data = data)# Specify informative priorsprior <- prior( beta_x_mean = c(2, 1.5, -0.8), beta_x_cov = diag(c(0.25, 0.25, 0.25)), sigma_shape = 1, sigma_rate = 1)fit2 <- bqr.svy(y ~ x1 + x2, weights = w, data = data, prior = prior)# Specify different methodsfit_score <- bqr.svy(y ~ x1 + x2, weights = w, data = data, method = "score")fit_approx <- bqr.svy(y ~ x1 + x2, weights = w, data = data, method = "approximate")Multiple-Output Bayesian quantile regression for complex survey data
Description
mo.bqr.svy implements a Bayesian approach to multiple-output quantile regressionfor complex survey data analysis. The method builds a quantile region based ona directional approach. To improve computational efficiency, an Expectation-Maximization (EM)algorithm is implemented instead of the usual Markov Chain Monte Carlo (MCMC).
Usage
mo.bqr.svy( formula, weights = NULL, data = NULL, quantile = 0.5, prior = NULL, U = NULL, gamma_U = NULL, n_dir = NULL, epsilon = 1e-06, max_iter = 1000, verbose = FALSE, estimate_sigma = FALSE)Arguments
formula | a symbolic description of the model to be fit. |
weights | an optional numerical vector containing the survey weights. If |
data | an optional data frame containing the variables in the model. |
quantile | numerical scalar or vector containing quantile(s) of interest (default=0.5). |
prior | a |
U | an optional |
gamma_U | an optional list with length equal to |
n_dir | numerical scalar corresponding to the number of directions (if |
epsilon | numerical scalar indicating the convergence tolerance for the EM algorithm (default = 1e-6). |
max_iter | numerical scalar indicating maximum number of EM iterations (default = 1000). |
verbose | logical flag indicating whether to print progress messages (default=FALSE). |
estimate_sigma | logical flag indicating whether to estimate the scale parameterwhen method = "ald" (default=FALSE and |
Value
An object of class"mo.bqr.svy" containing:
call | The matched call |
formula | The model formula |
terms | The terms object |
quantile | Vector of fitted quantiles |
prior | List of priors used for each quantile |
fit | List of fitted results for each quantile, each containing one sub-list per direction |
coefficients | Coefficients organized by quantile |
sigma | List of scale parameters by quantile and direction.If |
n_dir | Number of directions |
U | Matrix of projection directions ( |
Gamma_list | List of orthogonal complement bases, one per direction |
n_obs | Number of observations |
n_vars | Number of covariates |
response_dim | Dimension of the response |
estimate_sigma | Logical flag indicating whether the scale parameter |
References
Nascimento, M. L. & Gonçalves, K. C. M. (2024).Bayesian Quantile Regression Models for Complex Survey Data Under Informative Sampling.Journal of Survey Statistics and Methodology, 12(4), 1105–1130.doi:10.1093/jssam/smae015
Examples
library(MASS)# Generate population dataset.seed(123)N <- 10000data <- mvrnorm(N, rep(0, 3), matrix(c(4, 0, 2, 0, 1, 1.5, 2, 1.5, 9), 3, 3))x_p <- as.matrix(data[, 1])y_p <- data[, 2:3] + cbind(rep(0, N), x_p)# Generate sample datan <- 500z_aux <- rnorm(N, mean = 1 + y_p, sd = 0.5)p_aux <- 1 / (1 + exp(2.5 - 0.5 * z_aux))s_ind <- sample(1:N, n, replace = FALSE, prob = p_aux)y_s <- y_p[s_ind, ]x_s <- x_p[s_ind, ]w <- 1 / p_aux[s_ind]data_s <- data.frame(y1 = y_s[, 1], y2 = y_s[, 2], x1 = x_s, w = w)# Basic usage with default priors when U and gamma_U are givenfit1 <- mo.bqr.svy( cbind(y1, y2) ~ x1, weights = w, data = data_s, quantile = c(0.1, 0.2), U = matrix(c(0, 1, 1/sqrt(2), 1/sqrt(2)), 2), gamma_U = list(c(1, 0), c(1/sqrt(2), -1/sqrt(2))))# Basic usage with default priors when n_dir is givenfit2 <- mo.bqr.svy( cbind(y1, y2) ~ x1, weights = w, data = data_s, quantile = c(0.1, 0.2), n_dir = 2)Plot Method for Bayesian Weighted Quantile Regression
Description
Plot method for objects of classbqr.svy produced bybqr.svy().It can display fitted quantile curves, coefficient–quantile profiles,MCMC trace plots, and posterior densities.
Usage
## S3 method for class 'bqr.svy'plot( x, y = NULL, type = c("fit", "quantile", "trace", "density"), predictor = NULL, tau = NULL, which = NULL, add_points = TRUE, combine = TRUE, show_ci = FALSE, ci_probs = c(0.1, 0.9), at = NULL, grid_length = 200, points_alpha = 0.4, point_size = 1.5, line_size = 1.2, main = NULL, use_ggplot = TRUE, theme_style = c("minimal", "classic", "bw", "light"), color_palette = c("viridis", "plasma", "set2", "dark2"), add_h0 = FALSE, add_ols = FALSE, ols_fit = NULL, ols_weights = NULL, ...)## S3 method for class 'bwqr_fit'plot(x, ...)## S3 method for class 'bwqr_fit_multi'plot(x, ...)Arguments
x | Object of class |
y | Ignored (S3 signature). |
type | One of |
predictor | (fit) Name of a numeric predictor; if |
tau | Quantile(s) to plot; must appear in |
which | (quantile/trace/density) Coefficient name or index to display.The default is the first coefficient associated with the first variable in the model. |
add_points | (fit) Logical; overlay observed data points. |
combine | (fit) Logical; if multiple |
show_ci | (fit) Logical; draw credible bands. |
ci_probs | (fit) Length-2 numeric vector with lower/upper probabilitiesfor credible bands. |
at | (fit) Named list of fixed values for non- |
grid_length | (fit) Integer; number of points in the predictor grid. |
points_alpha | (fit) Point transparency in |
point_size | (fit) Point size. |
line_size | (fit/quantile) Line width for fitted/summary lines. |
main | Optional main title. |
use_ggplot | Logical; if |
theme_style | (ggplot) One of |
color_palette | (ggplot) One of |
add_h0 | (quantile) Logical; add a horizontal reference at |
add_ols | (quantile) Logical; add the OLS estimate (dotted line) for theselected coefficient. |
ols_fit | (quantile) Optional precomputed |
ols_weights | (quantile) Optional numeric vector of weights when fittingOLS internally (length must match |
... | Accepted for compatibility; ignored by internal plotting code. |
Details
Supported plot types:
type = "fit": Fitted quantile curves versus a singlenumeric predictor. Optionally overlay observed points and crediblebands. Other covariates can be held fixed viaat.type = "quantile": A single coefficient as a functionof the quantile\tau. Optionally add a reference line at 0 andthe corresponding OLS estimate.type = "trace": MCMC trace for one selectedcoefficient at a chosen\tau.type = "density": Posterior density for one selectedcoefficient at a chosen\tau.
Notes:
taumust be included inx$quantile. IfNULL, allavailable quantiles in the object are used.For
type = "fit",predictormust be a numeric column inthe original model. IfNULL, the first numeric predictor(different from the response) is chosen automatically.For
type = "fit",atis a named list(list(var = value, ...)) used to fix other covariates whileplotting versuspredictor. Provide valid levels for factors.When
use_ggplot = TRUE, a ggplot object is returned and theappearance is controlled bytheme_styleandcolor_palette. Otherwise, base graphics are used and thefunction returnsinvisible(NULL).
Value
invisible(NULL) for base R graphics, or a ggplot object ifuse_ggplot = TRUE.
Examples
data(mtcars)fit <- bqr.svy(mpg ~ wt + hp + cyl, data = mtcars, quantile = c(0.5, 0.75), method = "ald", niter = 20000, burnin = 10000, thin = 5)plot(fit, type = "fit", predictor = "wt", show_ci = TRUE)plot(fit, type = "quantile", which = "wt", add_h0 = TRUE, add_ols = TRUE)plot(fit, type = "trace", which = "wt", tau = 0.5)plot(fit, type = "density", which = "wt", tau = 0.5)Print methods for bayesQRsurvey model objects
Description
print.bayesQRsurvey is an S3 method that prints the content of an S3 object of classbqr.svy ormo.bqr.svy to the console.
Usage
## S3 method for class 'bqr.svy'print(x, digits = 3, ...)## S3 method for class 'mo.bqr.svy'print(x, digits = 3, max_rows = NULL, ...)Arguments
x | An object of class |
digits | Integer specifying the number of decimal places to print. Defaults to |
... | Additional arguments that are passed to the generic |
max_rows | Optional integer indicating the maximum number of coefficient rowsto display for each quantile. If |
Examples
set.seed(123)N <- 10000x1_p <- runif(N, -1, 1)x2_p <- runif(N, -1, 1)y_p <- 2 + 1.5 * x1_p - 0.8 * x2_p + rnorm(N)# Generate sample datan <- 500z_aux <- rnorm(N, mean = 1 + y_p, sd = .5)p_aux <- 1 / (1 + exp(2.5 - 0.5 * z_aux))s_ind <- sample(1:N, n, replace = FALSE, prob = p_aux)y_s <- y_p[s_ind]x1_s <- x1_p[s_ind]x2_s <- x2_p[s_ind]w <- 1 / p_aux[s_ind]data <- data.frame(y = y_s, x1 = x1_s, x2 = x2_s, w = w)# Fit a modelfit1 <- bqr.svy(y ~ x1 + x2, weights = w, data = data, niter = 2000, burnin = 500, thin = 2)print(fit1)Create prior for Bayesian quantile regression models for complex survey data
Description
prior creates prior distributions for both single (bqr.svy) and multiple-output(mo.bqr.svy) Bayesian quantile regression models for complex survey data.
Usage
prior( beta_x_mean = NULL, beta_x_cov = NULL, sigma_shape = 0.001, sigma_rate = 0.001, beta_y_mean = NULL, beta_y_cov = NULL)Arguments
beta_x_mean | vector of prior means for the regression coefficients. (default = NULL). |
beta_x_cov | prior covariance matrix for the regression coefficients. (default = NULL). |
sigma_shape | shape parameter for inverse Gamma prior for |
sigma_rate | rate parameter for inverse Gamma prior for |
beta_y_mean | prior means for the coefficients related to the variables that emerge from the product between the orthogonal basis and the outputs(default = NULL). |
beta_y_cov | prior covariance matrix for the coefficients related to the variables that emerge from the product between the orthogonal basis and the outputs.(default = NULL). |
Details
The functionprior builds prior distributions for the three methods implemented in the functionbqr.svy and for the multiple-output quantile regression implemented in the functionmo.bqr.svy.Every nonspecified prior parameter will get the default value.
method = "ald"in functionbqr.svyallow the specification of hyperparametersbeta_x_mean,beta_x_cov,sigma_shape, andsigma_rate.method = "score"in functionbqr.svyallow the specification of hyperparametersbeta_x_meanandbeta_x_cov.method = "approximate"in functionbqr.svyallow the specification of hyperparametersbeta_x_meanandbeta_x_cov.In function
mo.bqr.svy, the specification of hyperparametersbeta_x_mean,beta_x_cov,sigma_shape,sigma_rate,beta_y_mean, andbeta_y_covare allowed.
Value
An object of class"prior".
See Also
Examples
# Simulate dataset.seed(123)n <- 200x1 <- rnorm(n, 0, 1)x2 <- runif(n, -1, 1)w <- runif(n, 0.5, 2) # survey weightsy1 <- 2 + 1.5*x1 - 0.8*x2 + rnorm(n, 0, 1)y2 <- 1 + 0.5*x1 - 0.2*x2 + rnorm(n, 0, 1)data <- data.frame(y1 = y1, y2 = y2, x1 = x1, x2 = x2, w = w)# Define a general informative priorprior_general <- prior( beta_x_mean = c(2, 1.5, -0.8), beta_x_cov = diag(c(0.25, 0.25, 0.25)), sigma_shape = 3, sigma_rate = 2, beta_y_mean = 1, beta_y_cov = 0.25)# Estimate the model parameters with informative priorfit_ald <- bqr.svy(y1 ~ x1 + x2, weights = w, data = data, prior = prior_general, method = "ald")fit_scr <- bqr.svy(y1 ~ x1 + x2, weights = w, data = data, prior = prior_general, method = "score")fit_apx <- bqr.svy(y1 ~ x1 + x2, weights = w, data = data, prior = prior_general, method = "approximate")# Multiple-output methodfit_mo <- mo.bqr.svy(cbind(y1, y2) ~ x1 + x2, weights = w, data = data, prior = prior_general, n_dir = 10)plot(fit_ald, type = "trace", which = "x1", tau = 0.5)plot(fit_ald, type = "trace", which = "x2", tau = 0.5)print(fit_mo)Summary methods for bayesQRsurvey
Description
summary.bayesQRsurvey is an S3 method that summarizes the output of thebqr.svy ormo.bqr.svy function. For thebqr.svy the posterior mean,posterior credible interval and convergence diagnostics are calculated. For themo.bqr.svythe iterations for convergence, the MAP and the direction are calculated.
Usage
## S3 method for class 'bqr.svy'summary(object, probs = c(0.025, 0.975), digits = 2, ...)## S3 method for class 'mo.bqr.svy'summary(object, digits = 4, ...)Arguments
object | An object of class |
probs | Two-element numeric vector with credible interval probabilities.Default |
digits | Integer; number of decimals used by printing helpers. Default |
... | Unused. |
Value
An object of classsummary.bqr.svy with one block per\tau.
An object of classsummary.mo.bqr.svy.