The continuous Bernoulli distribution arises in deep learning and computer vision, specifically in the context of variational autoencoders,[4][5] for modeling the pixel intensities of natural images. As such, it defines a proper probabilistic counterpart for the commonly used binary cross entropy loss, which is often applied to continuous, $[0,1]$-valued data.[6][7][8][9] This practice amounts to ignoring the normalizing constant of the continuous Bernoulli distribution, since the binary cross entropy loss only defines a true log-likelihood for discrete, $\{0,1\}$-valued data.
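The relationship between the two losses can be illustrated numerically. The following is a minimal sketch, assuming NumPy and the closed-form normalizing constant $C(\lambda) = 2\tanh^{-1}(1-2\lambda)/(1-2\lambda)$ with $C(1/2) = 2$; the function names are illustrative and do not come from any particular library.

```python
import numpy as np

def log_norm_const(lam, eps=1e-6):
    """log C(lam), with C(lam) = 2*artanh(1 - 2*lam) / (1 - 2*lam) and C(1/2) = 2."""
    lam = np.asarray(lam, dtype=float)
    near_half = np.abs(lam - 0.5) < eps
    # Placeholder value avoids evaluating 0/0 at lam = 1/2; it is masked out below.
    safe = np.where(near_half, 0.25, lam)
    c = 2.0 * np.arctanh(1.0 - 2.0 * safe) / (1.0 - 2.0 * safe)
    return np.where(near_half, np.log(2.0), np.log(c))

def neg_bce(x, lam):
    """Negative binary cross entropy: x*log(lam) + (1 - x)*log(1 - lam)."""
    return x * np.log(lam) + (1.0 - x) * np.log1p(-lam)

def cb_log_likelihood(x, lam):
    """Continuous Bernoulli log-density: negative BCE plus the log normalizer."""
    return neg_bce(x, lam) + log_norm_const(lam)

if __name__ == "__main__":
    x, lam = 0.8, 0.9
    print("negative BCE:      ", float(neg_bce(x, lam)))
    print("CB log-likelihood: ", float(cb_log_likelihood(x, lam)))
    print("ignored term log C:", float(log_norm_const(lam)))
```

The two quantities differ only by $\log C(\lambda)$, which depends on $\lambda$ and therefore changes the optimum when it is dropped.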
The continuous Bernoulli also defines an exponential family of distributions. Writing $\eta = \log\left(\lambda/(1-\lambda)\right)$ for the natural parameter, the density can be rewritten in canonical form: $p(x \mid \eta) \propto \exp(\eta x)$.[10]
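The canonical form can be obtained in a few steps. The derivation below is a sketch, assuming the unnormalized density $\lambda^{x}(1-\lambda)^{1-x}$ on $[0,1]$ and the natural parameter $\eta = \log(\lambda/(1-\lambda))$; the symbol $A(\eta)$ for the log-partition function is notation introduced here for illustration.

```latex
\begin{align}
\lambda^{x}(1-\lambda)^{1-x}
  &= (1-\lambda)\exp\!\left(x \log\frac{\lambda}{1-\lambda}\right)
   = (1-\lambda)\, e^{\eta x}, \\
\int_0^1 e^{\eta x}\,dx
  &= \frac{e^{\eta}-1}{\eta} \qquad (\eta \neq 0), \\
p(x \mid \eta)
  &= \frac{\eta}{e^{\eta}-1}\, e^{\eta x}
   = \exp\bigl(\eta x - A(\eta)\bigr),
  \qquad A(\eta) = \log\frac{e^{\eta}-1}{\eta}.
\end{align}
```

For $\eta = 0$ (that is, $\lambda = 1/2$) the density reduces to the uniform distribution on $[0,1]$, with $A(0) = 0$ taken as the limiting value.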
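The maximum likelihood estimate $\hat{\lambda}$ computed from a sample with mean $\bar{x}$ satisfies $\mathbb{E}_{\hat{\lambda}}[X] = \bar{x}$, where the left-hand side is the expected value of a continuous Bernoulli with parameter $\hat{\lambda}$. Although $\hat{\lambda}$ does not admit a closed-form expression, it can be easily calculated with numerical inversion.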
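One way to carry out such a numerical inversion is bisection on the mean function. The sketch below assumes that the estimate is defined by matching the distribution mean to the sample mean, and it works with the natural parameter $\eta = \log(\lambda/(1-\lambda))$, in which the mean is $1/(1-e^{-\eta}) - 1/\eta$. The function names `cb_mean_from_eta` and `cb_mle` are hypothetical, and bisection is only one of several root-finding methods that could be used.

```python
import math

def cb_mean_from_eta(eta):
    """Mean of the continuous Bernoulli as a function of eta = log(lam / (1 - lam))."""
    if abs(eta) < 1e-4:
        return 0.5 + eta / 12.0  # Taylor expansion around eta = 0
    if eta < 0:
        # exp(eta) / (exp(eta) - 1), written to avoid overflow for very negative eta
        return math.exp(eta) / math.expm1(eta) - 1.0 / eta
    return 1.0 / -math.expm1(-eta) - 1.0 / eta

def sigmoid(eta):
    """Numerically stable map from eta back to lam."""
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    z = math.exp(eta)
    return z / (1.0 + z)

def cb_mle(sample, iterations=200):
    """Return (eta_hat, lam_hat) such that the distribution mean equals the sample mean."""
    xbar = sum(sample) / len(sample)
    if not 0.0 < xbar < 1.0:
        raise ValueError("sample mean must lie strictly between 0 and 1")
    lo, hi = -1.0, 1.0
    # Expand the bracket until it contains the root; the mean is increasing in eta.
    while cb_mean_from_eta(lo) > xbar:
        lo *= 2.0
    while cb_mean_from_eta(hi) < xbar:
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if cb_mean_from_eta(mid) < xbar:
            lo = mid
        else:
            hi = mid
    eta_hat = 0.5 * (lo + hi)
    return eta_hat, sigmoid(eta_hat)

if __name__ == "__main__":
    eta_hat, lam_hat = cb_mle([0.9, 0.8, 0.95, 0.7])
    print("eta_hat =", eta_hat, "lam_hat =", lam_hat)
```

Working in $\eta$ rather than $\lambda$ keeps the bracket well behaved for sample means close to 0 or 1, where $\lambda$ approaches the boundary only logarithmically slowly.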
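The Bernoulli distribution is defined on the discrete set $\{0,1\}$ by the probability mass function $p(x) = \lambda^{x}(1-\lambda)^{1-x}$, where $\lambda$ is a scalar parameter between 0 and 1. Applying this same functional form on the continuous interval $[0,1]$ results in the continuous Bernoulli probability density function, up to a normalizing constant.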
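The need for a normalizing constant can be checked directly: the Bernoulli functional form $\lambda^{x}(1-\lambda)^{1-x}$ does not integrate to 1 over $[0,1]$. The following sketch compares a midpoint-rule approximation of the integral with the closed form $(2\lambda-1)/\log(\lambda/(1-\lambda))$, which equals $1/2$ at $\lambda = 1/2$. The helper names are illustrative assumptions.

```python
import math

def unnormalized(x, lam):
    """Bernoulli functional form evaluated at a continuous x in [0, 1]."""
    return lam ** x * (1.0 - lam) ** (1.0 - x)

def integral_midpoint(lam, n=100_000):
    """Midpoint-rule approximation of the integral of the unnormalized form over [0, 1]."""
    h = 1.0 / n
    return h * sum(unnormalized((i + 0.5) * h, lam) for i in range(n))

def integral_closed_form(lam):
    """Closed-form value of the same integral; the limit at lam = 1/2 is 1/2."""
    if abs(lam - 0.5) < 1e-9:
        return 0.5
    return (2.0 * lam - 1.0) / math.log(lam / (1.0 - lam))

def normalizing_constant(lam):
    """Constant that rescales the unnormalized form into a proper density."""
    return 1.0 / integral_closed_form(lam)

if __name__ == "__main__":
    for lam in (0.1, 0.5, 0.9):
        print(lam, integral_midpoint(lam), integral_closed_form(lam),
              normalizing_constant(lam))
```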
Loaiza-Ganem, G., & Cunningham, J. P. (2019). The continuous Bernoulli: fixing a pervasive error in variational autoencoders. In Advances in Neural Information Processing Systems (pp. 13266-13276).
Kingma, D. P., & Welling, M. (2013). Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114.
Kingma, D. P., & Welling, M. (2014, April). Stochastic gradient VB and the variational auto-encoder. In Second International Conference on Learning Representations, ICLR (Vol. 19).
Larsen, A. B. L., Sønderby, S. K., Larochelle, H., & Winther, O. (2016, June). Autoencoding beyond pixels using a learned similarity metric. In International conference on machine learning (pp. 1558-1566).
Jiang, Z., Zheng, Y., Tan, H., Tang, B., & Zhou, H. (2017, August). Variational deep embedding: an unsupervised and generative approach to clustering. In Proceedings of the 26th International Joint Conference on Artificial Intelligence (pp. 1965-1972).
Lee, C. J., Dahl, B. K., Ovaskainen, O., & Dunson, D. B. (2025). Scalable and robust regression models for continuous proportional data. arXiv preprint arXiv:2504.15269. https://arxiv.org/abs/2504.15269
Gordon-Rodriguez, E., Loaiza-Ganem, G., & Cunningham, J. P. (2020). The continuous categorical: a novel simplex-valued exponential family. In 36th International Conference on Machine Learning, ICML 2020. International Machine Learning Society (IMLS).