| Title: | Probability Theory for Selecting Candidates in Plant Breeding |
| Version: | 1.0.4.9 |
| Description: | Use probability theory under the Bayesian framework for calculating the risk of selecting candidates in a multi-environment context. Contained are functions used to fit a Bayesian multi-environment model (based on the available presets), extract posterior values and maximum posterior values, compute the variance components, check the model’s convergence, and calculate the probabilities. For both across and within-environments scopes, the package computes the probability of superior performance and the pairwise probability of superior performance. Furthermore, the probability of superior stability and the pairwise probability of superior stability across environments are estimated. A joint probability of superior performance and stability is also provided. |
| URL: | https://github.com/saulo-chaves/ProbBreed, https://saulo-chaves.github.io/ProbBreed_site/, https://saulo-chaves.github.io/ProbBreed/ |
| BugReports: | https://github.com/saulo-chaves/ProbBreed/issues |
| License: | AGPL (≥ 3) |
| Depends: | R (≥ 3.5.0) |
| Imports: | ggplot2, lifecycle, methods, Rcpp (≥ 0.12.0), RcppParallel (≥ 5.0.1), rlang, rstan (≥ 2.32.0), rstantools (≥ 2.4.0), stats, utils |
| LinkingTo: | StanHeaders (≥ 2.32.0), rstan (≥ 2.32.0), BH (≥ 1.72.0-2), Rcpp (≥ 0.12.0), RcppEigen (≥ 0.3.3.3.0), RcppParallel (≥ 5.0.1) |
| Suggests: | knitr, rmarkdown |
| Encoding: | UTF-8 |
| UseLTO: | true |
| NeedsCompilation: | yes |
| RoxygenNote: | 7.3.3 |
| LazyData: | true |
| Biarch: | true |
| SystemRequirements: | GNU make |
| Packaged: | 2025-12-15 14:52:08 UTC; saulo |
| Author: | Saulo Chaves |
| Maintainer: | Saulo Chaves <saulochaves@usp.br> |
| Repository: | CRAN |
| Date/Publication: | 2025-12-15 20:10:23 UTC |
The 'ProbBreed' package.
Description
ProbBreed uses probability theory under the Bayesian framework for calculating the risk of selecting candidates in a multi-environment context. It contains functions to fit a Bayesian multi-environment model (based on the available presets), extract posterior values and maximum posterior values, compute the variance components, check the model’s convergence, and calculate the probabilities. For both across and within-environments scopes, the package computes the probability of superior performance and the pairwise probability of superior performance. Furthermore, the probability of superior stability and the pairwise probability of superior stability across environments are estimated.
Author(s)
Maintainer: Saulo Chaves saulochaves@usp.br (ORCID)
Authors:
Kaio Dias kaio.o.dias@ufv.br (ORCID) [copyright holder]
Matheus Krause mdkrause@iastate.edu (ORCID)
References
Stan Development Team (NA). RStan: the R interface to Stan. R package version 2.32.6. https://mc-stan.org
Dias, K. O. G., Santos, J. P. R., Krause, M. D., Piepho, H.-P., Guimarães, L. J. M., Pastina, M. M., and Garcia, A. A. F. (2022). Leveraging probability concepts for cultivar recommendation in multi-environment trials. Theoretical and Applied Genetics, 133(2):443-455. doi:10.1007/s00122-022-04041-y
See Also
Useful links:
Report bugs at https://github.com/saulo-chaves/ProbBreed/issues
Bayesian model for multi-environment trials
Description
Fits a Bayesian multi-environment model using rstan, the R interface to Stan.
Usage
bayes_met(
  data, gen, loc, repl, trait,
  reg = NULL, year = NULL, res.het = FALSE,
  iter = 2000, cores = 1, chains = 4, pars = NA,
  warmup = floor(iter/2), thin = 1,
  seed = sample.int(.Machine$integer.max, 1),
  init = "random", verbose = FALSE,
  algorithm = c("NUTS", "HMC", "Fixed_param"),
  control = NULL, include = TRUE, show_messages = TRUE, ...
)
Arguments
data | A data frame in which to interpret the variables declared in the other arguments. |
gen, loc | A string. The names of the columns that contain the evaluated candidates and locations (or environments, if you are working with factor combinations), respectively. |
repl | A string, a vector, or |
trait | A string. The analysed variable. Currently, only single-trait models are fitted. |
reg | A string or NULL. The name of the column that contains information on regions or mega-environments. |
year | A string or NULL. The name of the column that contains information on years (or seasons). |
res.het | Should the model consider heterogeneous residual variances? Defaults to FALSE. |
iter | A positive integer specifying the number of iterations for each chain (including warmup). The default is 2000. |
cores | Number of cores to use when executing the chains in parallel, which defaults to 1 but we recommend setting the |
chains | A positive integer specifying the number of Markov chains. The default is 4. |
pars | A vector of character strings specifying parameters of interest. The default is NA. |
warmup | A positive integer specifying the number of warmup (aka burn-in) iterations per chain. If step-size adaptation is on (which it is by default), this also controls the number of iterations for which adaptation is run (and hence these warmup samples should not be used for inference). The number of warmup iterations should be smaller than iter. |
thin | A positive integer specifying the period for saving samples. The default is 1, which is usually the recommended value. |
seed | The seed for random number generation. The default is generated from 1 to the maximum integer supported by R on the machine. Even if multiple chains are used, only one seed is needed, with other chains having seeds derived from that of the first chain to avoid dependent samples. When a seed is specified by a number, |
init | Initial values specification. See the detailed documentation for the init argument in |
verbose |
|
algorithm | One of the sampling algorithms that are implemented in Stan. Current options are "NUTS", "HMC", and "Fixed_param" |
control | A named |
include | Logical scalar defaulting to TRUE |
show_messages | Either a logical scalar (defaulting to |
... | Additional arguments can be |
Details
The function has nine available models, which will be fitted according to the options set in the arguments:
- Entry-mean model: fitted when repl = NULL, reg = NULL and year = NULL:
y = \mu + g + l + \varepsilon
where y is the phenotype, \mu is the intercept, g is the genotypic effect, l is the location (or environment) effect, and \varepsilon is the error (which contains the genotype-by-location interaction, in this case).
- Randomized complete blocks design: fitted when repl is a single string. It will fit different models depending on whether reg and year are NULL:
  - reg = NULL and year = NULL:
  y = \mu + g + l + gl + r + \varepsilon
  where gl is the genotype-by-location effect, and r is the replicate effect.
  - reg = "reg" and year = NULL:
  y = \mu + g + m + l + gl + gm + r + \varepsilon
  where m is the region effect, and gm is the genotype-by-region effect.
  - reg = NULL and year = "year":
  y = \mu + g + t + l + gl + gt + r + \varepsilon
  where t is the year effect, and gt is the genotype-by-year effect.
  - reg = "reg" and year = "year":
  y = \mu + g + m + t + l + gl + gm + gt + r + \varepsilon
- Incomplete blocks design: fitted when repl is a string vector of size 2. It will fit different models depending on whether reg and year are NULL:
  - reg = NULL and year = NULL:
  y = \mu + g + l + gl + r + b + \varepsilon
  where b is the block within replicates effect.
  - reg = "reg" and year = NULL:
  y = \mu + g + m + l + gl + gm + r + b + \varepsilon
  - reg = NULL and year = "year":
  y = \mu + g + t + l + gl + gt + r + b + \varepsilon
  - reg = "reg" and year = "year":
  y = \mu + g + m + t + l + gl + gm + gt + r + b + \varepsilon
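The mapping from arguments to model terms listed above can be summarised in a few lines of base R. This is only an illustrative sketch: met_model_terms is a hypothetical helper that is not part of ProbBreed, and it merely prints the linear predictor implied by repl, reg and year (with e standing for the error term).

```r
# Hypothetical helper (NOT a ProbBreed function): returns the linear predictor
# implied by the `repl`, `reg` and `year` arguments, following the list above.
met_model_terms <- function(repl = NULL, reg = NULL, year = NULL) {
  if (is.null(repl)) {
    # Entry-mean model: documented only for reg = NULL and year = NULL
    stopifnot(is.null(reg), is.null(year))
    return("y = mu + g + l + e")
  }
  stopifnot(is.character(repl), length(repl) %in% c(1L, 2L))
  # Main effects: genotype, optional region (m) and year (t), location
  main <- c("g", if (!is.null(reg)) "m", if (!is.null(year)) "t", "l")
  # Interactions of the genotype with location, region and year
  inter <- c("gl", if (!is.null(reg)) "gm", if (!is.null(year)) "gt")
  # Design effects: replicate, plus block within replicate for incomplete blocks
  design <- c("r", if (length(repl) == 2L) "b")
  paste("y = mu +", paste(c(main, inter, design, "e"), collapse = " + "))
}

met_model_terms()                                              # entry-mean model
met_model_terms(repl = "Rep", year = "Year")                   # complete blocks
met_model_terms(repl = c("Rep", "Block"), reg = "Region")      # incomplete blocks
```

Counting the combinations gives one entry-mean model plus four variants for each of the two block designs, which matches the nine models mentioned above.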
The models described above have predefined priors:
x \sim \mathcal{N} \left( 0, S^{[x]} \right)
\sigma \sim \mathcal{HalfCauchy}\left( 0, S^{[\sigma]} \right)
where x can be any effect but the error, and \sigma is the standard deviation of the likelihood. If res.het = TRUE, then \sigma_k \sim \mathcal{HalfCauchy}\left( 0, S^{\left[ \sigma_k \right]} \right). The hyperpriors are set as follows:
S^{[x]} \sim \mathcal{HalfCauchy}\left( 0, \phi \right)
where \phi is the known global hyperparameter, defined such that \phi = max(y) \times 10.
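To make the prior hierarchy concrete, the sketch below draws once from it in base R. The phenotype values are made up, the seed is arbitrary, and S^{[x]} is read as a standard deviation, which is an assumption; the package itself defines these priors inside its Stan models.

```r
set.seed(2025)                 # arbitrary seed, for reproducibility only
y <- c(4, 5, 6, 5.5)           # toy phenotypes (made-up values)

# Known global hyperparameter: phi = max(y) * 10
phi <- max(y) * 10

# Hyperprior S ~ HalfCauchy(0, phi): absolute value of a Cauchy(0, phi) draw
S_x <- abs(rcauchy(1, location = 0, scale = phi))

# Prior for an effect x ~ N(0, S_x), treating S_x as the standard deviation
# (assumption made for this illustration)
x <- rnorm(5, mean = 0, sd = S_x)

phi
S_x
x
```

Because \phi scales with the largest observed phenotype, the hyperprior is wide relative to the data, so a single draw of S_x can be very large.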
More details about the usage of bayes_met and other functions of the ProbBreed package can be found at https://saulo-chaves.github.io/ProbBreed_site/. Solutions to convergence or mixing issues can be found at https://mc-stan.org/misc/warnings.html.
Value
An object of S4 class stanfit representing the fitted results. Slot mode for this object indicates if the sampling is done or not.
Methods
sampling: signature(object = "stanmodel"). Call a sampler (NUTS, HMC, or Fixed_param depending on parameters) to draw samples from the model defined by S4 class stanmodel given the data, initial values, etc.
See Also
rstan::sampling, rstan::stan, rstan::stanfit
Examples
# Incomplete blocks model with a region effect and heterogeneous residuals
mod = bayes_met(data = maize, gen = "Hybrid", loc = "Location",
                repl = c("Rep", "Block"), trait = "GY",
                reg = "Region", year = NULL, res.het = TRUE,
                iter = 2000, cores = 2, chains = 4)
Bayesian Probabilistic Selection Index (BPSI)
Description
This function estimates the genotype's merit for multiple traits using the probabilities of superior performance across environments.
Usage
bpsi(problist, increase = NULL, lambda = NULL, int, save.df = FALSE)
Arguments
problist | A list of objects of class probsup |
increase | Optional logical vector with size corresponding to the number of traits of |
lambda | A numeric representing the weight of each trait. Defaults to 1 (equal weights). Traits of greater economic interest should receive larger weights. |
int | A numeric representing the selection intensity (between 0 and 1), considering the selection index. |
save.df | Logical. Should the data frames be saved in the work directory? |
Details
Bayesian Probabilistic Selection Index
BPSI_i = \sum_{m=1}^{t} \frac{\gamma_{pm} - \gamma_{im}}{(1/\lambda_m)}
where \gamma_{pm} is the probability of superior performance of the worst genotype for trait m, \gamma_{im} is the probability of superior performance of genotype i for trait m, t is the total number of traits evaluated \left(m = 1, 2, ..., t \right), and \lambda_m is the weight for trait m.
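The formula can be evaluated by hand on a small table of probabilities. The sketch below is a literal reading of the expression as written, with made-up values, and it assumes that the worst genotype for a trait is the one with the lowest probability of superior performance. It is not the internal code of bpsi, and how the package orients and ranks the index should be checked in its own documentation.

```r
# Toy across-environment probabilities: 3 genotypes (rows) x 2 traits (columns)
gamma <- matrix(c(0.10, 0.40, 0.80,    # trait T1
                  0.30, 0.20, 0.60),   # trait T2
                nrow = 3,
                dimnames = list(c("G1", "G2", "G3"), c("T1", "T2")))
lambda <- c(T1 = 1, T2 = 2)            # trait weights (made up)

# Probability of the worst genotype per trait (assumption: the minimum)
gamma_worst <- apply(gamma, 2, min)

# (gamma_worst - gamma) / (1 / lambda), summed over traits;
# dividing by 1 / lambda is the same as multiplying by lambda
diffs <- -sweep(gamma, 2, gamma_worst)   # sweep() subtracts gamma_worst columnwise
bpsi_literal <- drop(diffs %*% lambda)
bpsi_literal
```

Under this reading the index is zero for a genotype that is worst in every trait and becomes more negative as its probabilities increase, with the second trait counting twice as much as the first.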
More details about the usage of bpsi can be found at https://tiagobchagas.github.io/BPSI/.
Value
The function returns an object of class bpsi, which contains two lists, one with the BPSI (Bayesian Probabilistic Selection Index), and another with the original data, with across-environments probabilities of superior performance for each trait.
Author(s)
José Tiago Barroso Chagas
References
Chagas, J. T. B., Dias, K. O. G., Carneiro, V. Q., Oliveira, L. M. C., Nunes, N. X., Pereira Júnior, J. D., Carneiro, P. C. S., & Carneiro, J. E. S. (2025). Bayesian probabilistic selection index in the selection of common bean families. Crop Science, 65(3). doi:10.1002/CSC2.70072
See Also
Examples
# Fit one entry-mean model per trait
mod = bayes_met(data = soy_pat, gen = "gen", loc = "env", repl = NULL,
                trait = "PH", reg = NULL, year = NULL, res.het = TRUE,
                iter = 2000, cores = 2, chains = 4)
mod2 = bayes_met(data = soy_pat, gen = "gen", loc = "env", repl = NULL,
                 trait = "GY", reg = NULL, year = NULL, res.het = TRUE,
                 iter = 2000, cores = 2, chains = 4)
mod3 = bayes_met(data = soy_pat, gen = "gen", loc = "env", repl = NULL,
                 trait = "NDM", reg = NULL, year = NULL, res.het = TRUE,
                 iter = 2000, cores = 2, chains = 4)
models = list(mod, mod2, mod3)
names(models) <- c("PH", "GY", "NDM")
# Selection direction per trait (TRUE = select for higher values)
increase = c(FALSE, TRUE, FALSE)
names(increase) <- names(models)
# Extract outputs and compute the probabilities for each trait
probs = list()
for (i in names(models)) {
  outs <- extr_outs(model = models[[i]], probs = c(0.05, 0.95), verbose = TRUE)
  probs[[i]] <- prob_sup(extr = outs, int = .2, increase = increase[[i]],
                         save.df = FALSE, verbose = TRUE)
}
index = bpsi(problist = probs, increase = increase, int = 0.1,
             lambda = c(1, 2, 1), save.df = FALSE)
Extract outputs from stanfit objects obtained from bayes_met
Description
Extracts outputs of the Bayesian model fitted using bayes_met(), and provides some diagnostics.
Usage
extr_outs(model, probs = c(0.025, 0.975), verbose = FALSE)
Arguments
model | An object of class stanfit |
probs | A vector with two elements representing the probabilities (in decimal scale) that will be considered for computing the quantiles. |
verbose | A logical value. If |
Details
More details about the usage of extr_outs and other functions of the ProbBreed package can be found at https://saulo-chaves.github.io/ProbBreed_site/.
Value
The function returns an object of class extr, which is a list with:
- variances: a data frame containing the variance components of the model effects, their standard deviation, naive standard error and highest posterior density interval.
- post: a list with the posterior of the effects, and the data generated by the model.
- map: a list with the maximum posterior values of each effect.
- ppcheck: a matrix containing the p-values of maximum, minimum, median, mean and standard deviation; effective number of parameters, WAIC2 value, Rhat and effective sample size.
See Also
rstan::stan_diag, ggplot2::ggplot, rstan::check_hmc_diagnostics, plot.extr
Examples
mod = bayes_met(data = maize, gen = "Hybrid", loc = "Location",
                repl = c("Rep", "Block"), trait = "GY",
                reg = "Region", year = NULL, res.het = TRUE,
                iter = 2000, cores = 2, chains = 4)
# Extract posterior summaries using the 5% and 95% quantiles
outs = extr_outs(model = mod, probs = c(0.05, 0.95), verbose = TRUE)
Maize real dataset
Description
This dataset comes from the value of cultivation and use maize trials of Embrapa Maize and Sorghum, and was used by Dias et al. (2022). It contains the grain yield of 32 single-cross hybrids and four commercial checks (36 genotypes in total) evaluated in 16 locations across five regions or mega-environments. These trials were laid out in an incomplete block design, using a block size of 6 and two replications per trial.
Usage
maize
Format
maize
A data frame with 823 rows and 6 columns:
- Location
16 locations
- Region
5 regions
- Rep
2 replicates
- Block
6 blocks
- Hybrid
36 genotypes
- GY
Grain yield (phenotypes)
Source
Dias, K. O. G., Santos, J. P. R., Krause, M. D., Piepho, H.-P., Guimarães, L. J. M., Pastina, M. M., and Garcia, A. A. F. (2022). Leveraging probability concepts for cultivar recommendation in multi-environment trials. Theoretical and Applied Genetics, 133(2):443-455. doi:10.1007/s00122-022-04041-y
Plots for the bpsi object
Description
Build plots using the outputs stored in the bpsi object.
Usage
## S3 method for class 'bpsi'
plot(x, ..., category = "BPSI")
Arguments
x | An object of class bpsi |
... | currently not used |
category | A string indicating which plot to build. There are currently two types of visualizations. Set "Ranks" for bar plots along each trait, and "BPSI" (default) for multitrait circular bar plots. |
Author(s)
José Tiago Barroso Chagas
References
Chagas, J. T. B., Dias, K. O. G., Carneiro, V. Q., Oliveira, L. M. C., Nunes, N. X., Pereira Júnior, J. D., Carneiro, P. C. S., & Carneiro, J. E. S. (2025). Bayesian probabilistic selection index in the selection of common bean families. Crop Science, 65(3). doi:10.1002/CSC2.70072
See Also
Examples
# Fit one entry-mean model per trait
mod = bayes_met(data = soy_pat, gen = "gen", loc = "env", repl = NULL,
                trait = "PH", reg = NULL, year = NULL, res.het = TRUE,
                iter = 2000, cores = 2, chains = 4)
mod2 = bayes_met(data = soy_pat, gen = "gen", loc = "env", repl = NULL,
                 trait = "GY", reg = NULL, year = NULL, res.het = TRUE,
                 iter = 2000, cores = 2, chains = 4)
mod3 = bayes_met(data = soy_pat, gen = "gen", loc = "env", repl = NULL,
                 trait = "NDM", reg = NULL, year = NULL, res.het = TRUE,
                 iter = 2000, cores = 2, chains = 4)
models = list(mod, mod2, mod3)
names(models) <- c("PH", "GY", "NDM")
increase = c(FALSE, TRUE, FALSE)
names(increase) <- names(models)
probs = list()
for (i in names(models)) {
  outs <- extr_outs(model = models[[i]], probs = c(0.05, 0.95), verbose = TRUE)
  probs[[i]] <- prob_sup(extr = outs, int = .2, increase = increase[[i]],
                         save.df = FALSE, verbose = TRUE)
}
index = bpsi(problist = probs, increase = increase, int = 0.1,
             lambda = c(1, 2, 1), save.df = FALSE)
# Circular bar plot of the index, then bar plots of the ranks per trait
plot(index, category = "BPSI")
plot(index, category = "Ranks")
Plots for the extr object
Description
Build plots using the outputs stored in the extr object.
Usage
## S3 method for class 'extr'
plot(x, ..., category = "ppdensity")
Arguments
x | An object of class extr |
... | Passed to ggplot2::geom_histogram, when |
category | A string indicating which plot to build. See options in the Details section. |
Details
The available options are:
- ppdensity: Density plots of the empirical and sampled data, useful to assess the model's convergence.
- density: Density plots of the model's effects.
- histogram: Histograms of the model's effects.
- traceplot: Trace plot showing the changes in the effects' values across iterations and chains.
See Also
Examples
mod = bayes_met(data = maize, gen = "Hybrid", loc = "Location",
                repl = c("Rep", "Block"), trait = "GY",
                reg = "Region", year = NULL, res.het = TRUE,
                iter = 2000, cores = 2, chains = 4)
outs = extr_outs(model = mod, probs = c(0.05, 0.95), verbose = TRUE)
# One call per available plot category
plot(outs, category = "ppdensity")
plot(outs, category = "density")
plot(outs, category = "histogram")
plot(outs, category = "traceplot")
Plots for the probsup object
Description
Build plots using the outputs stored in the probsup object.
Usage
## S3 method for class 'probsup'
plot(x, ..., category = "perfo", level = "across")
Arguments
x | An object of class probsup |
... | currently not used |
category | A string indicating which plot to build. See options in the Details section. |
level | A string indicating the information level to be used for building the plots. Options are "across" and "within" |
Details
The available options are:
- hpd: a caterpillar plot representing the marginal genotypic value of each genotype, and their respective highest posterior density interval (95% represented by the thick line, and 97.5% represented by the thin line). Available only if level = "across".
- perfo: if level = "across", a lollipop plot illustrating the probabilities of superior performance. If level = "within", a heatmap with the probabilities of superior performance within environments. If a model with reg and/or year is fitted, multiple plots are produced.
- stabi: a lollipop plot with the probabilities of superior stability. If a model with reg and/or year is fitted, multiple plots are produced. Available only if level = "across". Unavailable if an entry-mean model was used in bayes_met.
- pair_perfo: if level = "across", a heatmap representing the pairwise probability of superior performance (the probability of genotypes at the x-axis being superior to those on the y-axis). If level = "within", a list of heatmaps representing the pairwise probability of superior performance within environments. If a model with reg and/or year is fitted, multiple plots (and multiple lists) are produced. If this option is set, it is mandatory to store the outputs in an object (e.g., pl <- plot(obj, category = "pair_perfo", level = "within")) so they can be visualized one at a time. The option level = "within" is unavailable if an entry-mean model was used in bayes_met.
- pair_stabi: a heatmap with the pairwise probabilities of superior stability (the probability of genotypes at the x-axis being more stable than those on the y-axis). If a model with reg and/or year is fitted, multiple plots are produced. Available only if level = "across". Unavailable if an entry-mean model was used in bayes_met.
- joint: a lollipop plot with the joint probabilities of superior performance and stability. Unavailable if an entry-mean model was used in bayes_met.
See Also
Examples
mod = bayes_met(data = maize, gen = "Hybrid", loc = "Location",
                repl = c("Rep", "Block"), trait = "GY",
                reg = "Region", year = NULL, res.het = TRUE,
                iter = 2000, cores = 2, chains = 4)
outs = extr_outs(model = mod, probs = c(0.05, 0.95), verbose = TRUE)
results = prob_sup(extr = outs, int = .2, increase = TRUE,
                   save.df = FALSE, verbose = FALSE)
plot(results, category = "hpd")
plot(results, category = "perfo", level = "across")
plot(results, category = "perfo", level = "within")
plot(results, category = "stabi")
plot(results, category = "pair_perfo", level = "across")
# Within-environment pairwise plots must be stored in an object
plwithin = plot(results, category = "pair_perfo", level = "within")
plot(results, category = "pair_stabi")
plot(results, category = "joint")
Print an object of class bpsi
Description
Print a bpsi object in the R console
Usage
## S3 method for class 'bpsi'
print(x, ...)
Arguments
x | An object of class bpsi |
... | currently not used |
Author(s)
José Tiago Barroso Chagas
See Also
Print an object of class extr
Description
Print an extr object in the R console
Usage
## S3 method for class 'extr'
print(x, ...)
Arguments
x | An object of class extr |
... | currently not used |
See Also
Print an object of class probsup
Description
Print a probsup object in the R console
Usage
## S3 method for class 'probsup'
print(x, ...)
Arguments
x | An object of class probsup |
... | currently not used |
See Also
Probabilities of superior performance and stability
Description
This function estimates the probabilities of superior performance and stability across environments, and probabilities of superior performance within environments.
Usage
prob_sup(extr, int, increase = TRUE, save.df = FALSE, verbose = FALSE)
Arguments
extr | An object of class extr |
int | A numeric representing the selection intensity (between 0 and 1) |
increase | Logical. |
save.df | Logical. Should the data frames be saved in the work directory? |
verbose | A logical value. If |
Details
Probabilities provide the risk of recommending a selection candidate for a target population of environments or for a specific environment. prob_sup computes the probabilities of superior performance and the probabilities of superior stability:
Probability of superior performance
Let \Omega represent the subset of selected genotypes based on their performance across environments. A given genotype j will belong to \Omega if its genotypic marginal value (\hat{g}_j) is high or low enough compared to its peers. prob_sup leverages the Monte Carlo discretized sampling from the posterior distribution to emulate the occurrence of S trials. Then, the probability of the j^{th} genotype belonging to \Omega is the ratio of success (\hat{g}_j \in \Omega) events and the total number of sampled events, as follows:
Pr\left(\hat{g}_j \in \Omega \vert y \right) = \frac{1}{S}\sum_{s=1}^S{I\left(\hat{g}_j^{(s)} \in \Omega \vert y\right)}
where S is the total number of samples \left(s = 1, 2, ..., S \right), and I\left(\hat{g}_j^{(s)} \in \Omega \vert y\right) is an indicator variable that can assume two values: (1) if \hat{g}_j^{(s)} \in \Omega in the s^{th} sample, and (0) otherwise. S depends on the number of iterations and chains previously set in bayes_met.
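The indicator average can be reproduced on a toy matrix of posterior draws. In the sketch below, prob_superior is a hypothetical helper (not a ProbBreed function), the draws are made up, and \Omega is taken to be the ceiling(int * J) best genotypes in each sample, which is an assumption about how the selection intensity defines the subset.

```r
# Hypothetical helper: rows of `g_draws` are posterior samples (s = 1..S),
# columns are genotypes (j = 1..J).
prob_superior <- function(g_draws, int, increase = TRUE) {
  # Size of the selected subset Omega (assumption: ceiling of int * J)
  n_sel <- ceiling(int * ncol(g_draws))
  # Flip the sign so that rank 1 is always the best genotype in a sample
  signed <- if (increase) -g_draws else g_draws
  ranks <- t(apply(signed, 1, rank, ties.method = "first"))
  # Indicator I(g_j^(s) in Omega), averaged over the S samples
  colMeans(ranks <= n_sel)
}

# Toy posterior draws: S = 4 samples, J = 4 genotypes (made-up values)
g_draws <- matrix(c(3, 1, 2, 0,
                    3, 2, 1, 0,
                    1, 3, 2, 0,
                    3, 0, 2, 1),
                  nrow = 4, byrow = TRUE,
                  dimnames = list(NULL, c("G1", "G2", "G3", "G4")))

p_up   <- prob_superior(g_draws, int = 0.5, increase = TRUE)
p_down <- prob_superior(g_draws, int = 0.5, increase = FALSE)
p_up    # G1 = 0.75, G2 = 0.50, G3 = 0.75, G4 = 0.00
p_down  # G1 = 0.25, G2 = 0.50, G3 = 0.25, G4 = 1.00
```

With int = 0.5 two genotypes are selected in every sample, so the probabilities add up to 2 in both directions.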
Similarly, the within-environment probability of superior performance can be applied to individual environments. Let \Omega_k represent the subset of superior genotypes in the k^{th} environment, so that the probability of the j^{th} genotype belonging to \Omega_k can be calculated as follows:
Pr\left(\hat{g}_{jk} \in \Omega_k \vert y\right) = \frac{1}{S} \sum_{s=1}^S I\left(\hat{g}_{jk}^{(s)} \in \Omega_k \vert y\right)
where I\left(\hat{g}_{jk}^{(s)} \in \Omega_k \vert y\right) is an indicator variable mapping success (1) if \hat{g}_{jk}^{(s)} exists in \Omega_k, and failure (0) otherwise, and \hat{g}_{jk}^{(s)} = \hat{g}_j^{(s)} + \widehat{ge}_{jk}^{(s)}. Note that when computing within-environment probabilities, we are accounting for the interaction of the j^{th} genotype with the k^{th} environment.
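A small base R sketch shows how adding the interaction draws changes the answer by environment. All values are made up, in_top is a hypothetical helper, and the subset \Omega_k is assumed to hold the single best genotype of each sample.

```r
# Hypothetical helper: TRUE where a genotype is among the n_sel largest
# values of its sample (row).
in_top <- function(draws, n_sel) {
  t(apply(-draws, 1, rank, ties.method = "first")) <= n_sel
}

gen <- c("G1", "G2", "G3")
# Main-effect draws: S = 2 samples x J = 3 genotypes (toy values)
g <- matrix(c(2, 1, 0,
              2, 0, 1),
            nrow = 2, byrow = TRUE, dimnames = list(NULL, gen))
# Interaction draws ge[s, j, k]: S x J x K array with K = 2 environments
ge <- array(0, dim = c(2, 3, 2), dimnames = list(NULL, gen, c("E1", "E2")))
ge[, "G3", "E2"] <- 3   # G3 interacts favourably with E2 in both samples

n_sel <- 1              # assumption: Omega_k holds the single best genotype
# g_jk^(s) = g_j^(s) + ge_jk^(s), then average the indicator per environment
p_within <- sapply(dimnames(ge)[[3]], function(k) {
  colMeans(in_top(g + ge[, , k], n_sel))
})
p_within
```

G1 is always selected in E1, where there is no interaction, while G3 is always selected in E2 once its interaction effect is added.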
The pairwise probabilities of superior performance can also be calculated across or within environments. This metric assesses the probability of the j^{th} genotype being superior to another experimental genotype or a commercial check. The calculations are as follows, across and within environments, respectively:
Pr\left(\hat{g}_{j} > \hat{g}_{j^\prime} \vert y\right) = \frac{1}{S} \sum_{s=1}^S I\left(\hat{g}_{j}^{(s)} > \hat{g}_{j^\prime}^{(s)} \vert y\right)
or
Pr\left(\hat{g}_{jk} > \hat{g}_{j^\prime k} \vert y\right) = \frac{1}{S} \sum_{s=1}^S I\left(\hat{g}_{jk}^{(s)} > \hat{g}_{j^\prime k}^{(s)} \vert y\right)
These equations are set for when the selection direction is positive. If increase = FALSE, > is simply replaced by <.
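The across-environment version of the pairwise comparison is sketched below on made-up draws. pairwise_prob is a hypothetical helper, not the package's implementation; rows of the result play the role of j and columns the role of j'.

```r
# Hypothetical helper: pairwise probability that genotype j (row) is superior
# to genotype j' (column), estimated from posterior draws.
pairwise_prob <- function(draws, increase = TRUE) {
  J <- ncol(draws)
  out <- matrix(NA_real_, J, J,
                dimnames = list(colnames(draws), colnames(draws)))
  for (j in seq_len(J)) {
    for (jp in seq_len(J)) {
      if (j == jp) next   # a genotype is not compared with itself
      better <- if (increase) draws[, j] > draws[, jp] else draws[, j] < draws[, jp]
      out[j, jp] <- mean(better)   # average of the indicator over samples
    }
  }
  out
}

# Toy posterior draws: S = 4 samples, J = 3 genotypes (made-up values)
g_draws <- matrix(c(3, 1, 2,
                    3, 2, 1,
                    1, 3, 2,
                    3, 0, 2),
                  nrow = 4, byrow = TRUE,
                  dimnames = list(NULL, c("G1", "G2", "G3")))

pp <- pairwise_prob(g_draws, increase = TRUE)
pp
```

Because these toy draws contain no ties, the entries for (j, j') and (j', j) add up to one; setting increase = FALSE gives the transposed matrix.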
Probability of superior stability
This probability makes a direct analogy with the method of Shukla (1972): a stable genotype is the one that has a low genotype-by-environment interaction variance [var(\widehat{ge})]. Using the same probability principles previously described, the probability of superior stability is given as follows:
Pr \left[var \left(\widehat{ge}_{jk}\right) \in \Omega \vert y \right] = \frac{1}{S} \sum_{s=1}^S I\left[var \left(\widehat{ge}_{jk}^{(s)} \right) \in \Omega \vert y \right]
where I\left[var \left(\widehat{ge}_{jk}^{(s)} \right) \in \Omega \vert y \right] indicates if var\left(\widehat{ge}_{jk}^{(s)}\right) exists in \Omega (1) or not (0). Pairwise probabilities of superior stability are also possible in this context:
Pr \left[var \left(\widehat{ge}_{jk} \right) < var\left(\widehat{ge}_{j^\prime k} \right) \vert y \right] = \frac{1}{S} \sum_{s=1}^S I \left[var \left(\widehat{ge}_{jk}^{(s)} \right) < var \left(\widehat{ge}_{j^\prime k}^{(s)} \right) \vert y \right]
Note that j will be superior to j^\prime if it has a lower variance of the genotype-by-environment interaction effect. This is true regardless of whether increase is set to TRUE or FALSE.
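The stability probabilities follow the same recipe, applied to the per-sample variance of the interaction effects across environments. The sketch below uses made-up interaction draws and assumes that \Omega contains the single genotype with the lowest variance in each sample.

```r
gen <- c("G1", "G2", "G3")
# Interaction draws ge[s, j, k]: S = 2 samples, J = 3 genotypes, K = 3 environments
ge <- array(0, dim = c(2, 3, 3), dimnames = list(NULL, gen, NULL))
ge[1, "G1", ] <- c(-1, 0, 1)   # variance 1
ge[1, "G2", ] <- c(-2, 0, 2)   # variance 4
ge[1, "G3", ] <- c(-3, 0, 3)   # variance 9
ge[2, "G1", ] <- c(-2, 0, 2)   # variance 4
ge[2, "G2", ] <- c(-1, 0, 1)   # variance 1
ge[2, "G3", ] <- c(-3, 0, 3)   # variance 9

# var(ge_jk^(s)) across environments, for every sample and genotype (S x J)
ge_var <- apply(ge, c(1, 2), var)

# Probability of superior stability: lowest variances are selected
n_sel <- 1                     # assumption: Omega holds one genotype
stable <- t(apply(ge_var, 1, rank, ties.method = "first")) <= n_sel
prob_stab <- colMeans(stable)

# Pairwise probability that G1 is more stable than G2 and than G3
p_g1_g2 <- mean(ge_var[, "G1"] < ge_var[, "G2"])
p_g1_g3 <- mean(ge_var[, "G1"] < ge_var[, "G3"])
prob_stab
```

G1 and G2 each have the smallest variance in one of the two samples, while G3 never does, so the selection direction of the trait plays no role here.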
The joint probability of independent events is the product of the individual probabilities. The estimated genotypic main effects and the variances of GEI effects are independent by design, thus the joint probability of superior performance and stability is as follows:
Pr \left[\hat{g}_j \in \Omega \cap var \left(\widehat{ge}_{jk} \right) \in \Omega \right] = Pr \left(\hat{g}_j \in \Omega \right) \times Pr \left[var \left(\widehat{ge}_{jk} \right) \in \Omega \right]
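As a worked instance of the product rule, the lines below combine made-up probabilities of superior performance and superior stability for three genotypes. The values are illustrative only and are not outputs of prob_sup.

```r
# Toy probabilities for three genotypes (made-up values)
prob_perfo <- c(G1 = 0.75, G2 = 0.50, G3 = 0.25)   # superior performance
prob_stabi <- c(G1 = 0.50, G2 = 0.50, G3 = 0.00)   # superior stability

# Joint probability under independence: element-wise product
joint <- prob_perfo * prob_stabi
sort(joint, decreasing = TRUE)   # G1 = 0.375, G2 = 0.25, G3 = 0
```

A genotype needs a reasonable probability on both criteria to rank well: G3 drops to zero because it is never among the stable genotypes.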
The estimation of these probabilities is closely related to some key questions that constantly arise in plant breeding:
What is the risk of recommending a selection candidate for a target population of environments?
What is the probability of a given selection candidate having good performance if recommended to a target population of environments? And for a specific environment?
What is the probability of a given selection candidate having better performance than a cultivar check in the target population of environments? And in specific environments?
How probable is it that a given selection candidate performs similarly across environments?
What are the chances that a given selection candidate is more stable than a cultivar check in the target population of environments?
What is the probability that a given selection candidate has a superior and invariable performance across environments?
More details about the usage of prob_sup, as well as the other functions of the ProbBreed package, can be found at https://saulo-chaves.github.io/ProbBreed_site/.
Value
The function returns an object of class probsup, which contains two lists, one with the across-environments probabilities, and another with the within-environments probabilities. If an entry-mean model was used in ProbBreed::bayes_met, only the across list will be available.
The across list has the following elements:
- g_hpd: Highest posterior density (HPD) of the posterior genotypic main effects.
- perfo: the probabilities of superior performance.
- pair_perfo: the pairwise probabilities of superior performance.
- stabi: a list with the probabilities of superior stability. It contains the data frames gl, gm (when reg is not NULL) and gt (when year is not NULL). Unavailable if an entry-mean model was used in bayes_met.
- pair_stabi: a list with the pairwise probabilities of superior stability. It contains the data frames gl, gm (when reg is not NULL) and gt (when year is not NULL). Unavailable if an entry-mean model was used in bayes_met.
- joint_prob: the joint probabilities of superior performance and stability. Unavailable if an entry-mean model was used in bayes_met.
The within list has the following elements:
- perfo: a list of data frames containing the probabilities of superior performance within locations (gl), regions (gm) and years (gt).
- pair_perfo: lists with the pairwise probabilities of superior performance within locations (gl), regions (gm) and years (gt).
References
Dias, K. O. G., Santos, J. P. R., Krause, M. D., Piepho, H.-P., Guimarães, L. J. M., Pastina, M. M., and Garcia, A. A. F. (2022). Leveraging probability concepts for cultivar recommendation in multi-environment trials. Theoretical and Applied Genetics, 133(2):443-455. doi:10.1007/s00122-022-04041-y
Shukla, G. K. (1972). Some statistical aspects of partitioning genotype-environmental components of variability. Heredity, 29:237-245. doi:10.1038/hdy.1972.87
See Also
Examples
mod = bayes_met(data = maize, gen = "Hybrid", loc = "Location",
                repl = c("Rep", "Block"), trait = "GY",
                reg = "Region", year = NULL, res.het = TRUE,
                iter = 2000, cores = 2, chains = 4)
outs = extr_outs(model = mod, probs = c(0.05, 0.95), verbose = TRUE)
# Selection intensity of 20%, selecting for higher grain yield
results = prob_sup(extr = outs, int = .2, increase = TRUE,
                   save.df = FALSE, verbose = FALSE)
Soybean real dataset
Description
This dataset belongs to the USDA Northern Region Uniform Soybean Tests, and it is a subset of the data used by Krause et al. (2023). It contains the empirical best linear unbiased estimates of genotypic means of the seed yield from 39 experimental genotypes evaluated in 14 locations. The original data, available in the SoyURT package, has 4,257 experimental genotypes evaluated at 63 locations and 31 years, resulting in 591 location-year combinations (environments) with 39,006 yield values.
Usage
soy
Format
soy
A data frame with 823 rows and 3 columns:
- Loc
14 locations
- Gen
39 experimental genotypes
- Y
435 EBLUEs (phenotypes)
Source
Krause, M. D., Dias, K. O. G., Singh, A. K., Beavis, W. D. (2023). Using soybean historical field trial data to study genotype by environment variation and identify mega-environments with the integration of genetic and non-genetic factors. Agronomy Journal, 117(1):170023. doi:10.1002/agj2.70023
Soybean Pan-African Trials data set
Description
This data set belongs to the Soybean Pan-African Trials (PAT). This subset has the empirical best linear unbiased estimates of grain yield (GY), plant height (PH) and number of days to maturity (NDM) of 65 soybean genotypes evaluated over 19 environments. The complete data set is available in Araújo et al. (2025) (see Source).
Usage
soy_pat
Format
soy_pat
A data frame with 540 rows and 5 columns:
- Env
19 environments
- Gen
65 experimental genotypes
- Plant_Height
395 BLUEs - Plant height measurements
- Grain_Yield
525 BLUEs - Grain yield measurements
- Days_to_Maturity
312 BLUEs - Number of days to maturity
Source
Araújo, M. S., Chaves, S., Ferreira, G. N. C., Chigeza, G., Leles, E. P., Santos, M. F. S., Diers, B. W., Goldsmith, P., and Pinheiro, J. B. (2025). High-resolution soybean trial data supporting the expansion of agriculture in Africa. Scientific Data, 12:1908. doi:10.1038/s41597-025-06190-3