Support units of measurement in PyMC #7812

drbenvincent started this conversation in Ideas

Many statistical models (especially in scientific, engineering, and health applications) are built around physical quantities that carry units (e.g., meters, seconds, kilograms). Currently, PyMC treats all variables as unitless, which may lead to misinterpretations or errors when combining data with different units or when interpreting model parameters.

Adding optional support for units would:

  • Improve model readability and transparency
  • Enable automated unit-checking to catch errors (e.g., adding kg to m)
  • Facilitate interpretation of parameter estimates and priors

This could be implemented via integration with existing Python libraries like pint.

In general, the idea would be to optionally specify the units. Initially it might be required to specify the units of all data and parameters, and PyMC could help with unit checking to avoid errors.

This would be useful as we could check unit consistency, but also make errors less likely (e.g. expressing slope priors in the wrong units).
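To make the unit-consistency check concrete, here is a minimal sketch in plain Python. The `Quantity` and `UnitError` names are hypothetical stand-ins invented for illustration; they are not part of PyMC or pint, which would provide a far more complete version of the same idea.

```python
from dataclasses import dataclass


class UnitError(TypeError):
    """Raised when quantities with different dimensions are combined."""


@dataclass(frozen=True)
class Quantity:
    """Toy quantity: a number plus a tuple of (base_unit, exponent) pairs."""
    value: float
    dims: tuple

    def __add__(self, other):
        # Addition only makes sense when both operands share the same dimensions.
        if self.dims != other.dims:
            raise UnitError(f"cannot add {self.dims} to {other.dims}")
        return Quantity(self.value + other.value, self.dims)


kg = (("kg", 1),)
m = (("m", 1),)

total = Quantity(50.0, kg) + Quantity(2.5, kg)  # fine: both are masses

try:
    Quantity(50.0, kg) + Quantity(1.8, m)       # kg + m is rejected
except UnitError as err:
    message = str(err)
```

The point is only that the error surfaces when the model is built, rather than as a silently wrong posterior.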

It could be very fun to explore unit inference. For example, if you specify the units of the data but not the parameters, and we are regressing weight ~ age where weight is in kg and age is in years, the model could infer that the slope is in units of kg/year and the intercept is in units of kg.
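The inference rule for a linear predictor can be sketched in a few lines. This is an illustration under stated assumptions: dimensions are represented as plain `{base_unit: exponent}` dictionaries, and `divide` and `infer_linear_units` are hypothetical helper names, not an existing PyMC or pint API.

```python
from collections import Counter


def divide(num, den):
    """Dimensions of num / den, dropping base units whose exponent cancels to 0."""
    out = Counter(num)
    out.subtract(den)
    return {unit: exp for unit, exp in out.items() if exp != 0}


def infer_linear_units(y_dims, x_dims):
    """Units implied by y = intercept + slope * x, given the units of y and x."""
    return {
        "intercept": dict(y_dims),          # must match y so the sum is valid
        "slope": divide(y_dims, x_dims),    # y-units per x-unit
    }


# weight in kg regressed on age in years
inferred = infer_linear_units({"kg": 1}, {"year": 1})
```

Here `inferred["slope"]` has exponent 1 for kg and -1 for year (kg/year), and `inferred["intercept"]` is kg.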

I'll leave it there - this is an initial proposal which is intended to spark discussion.


Replies: 3 comments 4 replies


Code examples?

2 replies
@drbenvincent

drbenvincent Jun 8, 2025
Collaborator Author

Something vaguely like this?

    import pint
    import pymc as pm

    ureg = pint.UnitRegistry()

    age = pm.Data("age", [10, 20, 30, 40] * ureg.year)
    weight = pm.Data("weight", [50, 60, 65, 67] * ureg.kg)

    with pm.Model() as model:
        intercept = pm.Normal("intercept", mu=1.2 * ureg.kg, sigma=0.2 * ureg.kg)
        beta = pm.Normal("beta", mu=0.5 * ureg.kg / ureg.year, sigma=0.1 * ureg.kg / ureg.year)
        mu = pm.Deterministic("mu", intercept + beta * age)  # PyMC infers mu should be in kg
        pm.Normal("obs", mu=mu, sigma=0.5 * ureg.kg, observed=weight)

  • PyMC does unit checks and throws errors if there are incompatibilities
  • PyMC optionally infers units of any nodes where units are not provided, or throws an error if it is not possible, asking for units of more variables.
  • In theory I guess you could allow the intercept's mu to be provided in kg and the sigma in another weight unit and auto-convert, but perhaps emit a warning.
  • Units would be incorporated in the idata
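The auto-convert-with-a-warning idea from the third bullet could look roughly like the sketch below. Everything here is an assumption made for illustration: the `TO_BASE` table, `convert`, and `harmonize` are hypothetical names, and a real implementation would presumably delegate conversions to a units library rather than a hand-written table.

```python
import warnings

# Hypothetical conversion table: unit -> (dimension, factor to that dimension's base unit).
TO_BASE = {"kg": ("mass", 1.0), "g": ("mass", 1e-3), "year": ("time", 1.0)}


def convert(value, unit, target):
    """Convert value from unit to target, refusing to cross dimensions."""
    dim, factor = TO_BASE[unit]
    target_dim, target_factor = TO_BASE[target]
    if dim != target_dim:
        raise ValueError(f"cannot convert {unit} ({dim}) to {target} ({target_dim})")
    return value * factor / target_factor


def harmonize(mu, sigma):
    """Return (mu_value, sigma_value, unit) with sigma expressed in mu's unit."""
    (mu_value, mu_unit), (sigma_value, sigma_unit) = mu, sigma
    if sigma_unit != mu_unit:
        warnings.warn(f"sigma given in {sigma_unit}, converting to {mu_unit}")
        sigma_value = convert(sigma_value, sigma_unit, mu_unit)
    return mu_value, sigma_value, mu_unit


# Intercept prior with mu in kg and sigma in grams: converted, with a warning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    params = harmonize((1.2, "kg"), (200.0, "g"))
```

Passing a sigma in years instead would raise a `ValueError`, since time cannot be converted to mass.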
@Armavica

A few thoughts:

  • "PyMC optionally infers units of any nodes where units are not provided": I am not sure how this would work. How would you distinguish a dimensionless variable from a variable with unspecified units?
  • "PyMC infers mu should be in kg": would there be a way to impose that? Something like pm.Deterministic("mu", [...], unit=ureg.kg) that would throw an error if the expression is incompatible?
  • Perhaps it could also allow unit=ureg.gram, which is compatible because it's also a mass, and make the conversion transparently?
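One possible convention for the first two questions, sketched in plain Python: treat "no unit given" (`None`) as different from "explicitly dimensionless" (an empty dimension mapping), and reconcile a declared unit against the one implied by the expression. The names `DIMENSIONLESS`, `UNSPECIFIED`, and `resolve_unit` are hypothetical and only illustrate the distinction; nothing like this exists in PyMC today.

```python
from typing import Optional

DIMENSIONLESS: dict = {}   # known to carry no units (e.g. a probability)
UNSPECIFIED = None         # the user gave no unit; a candidate for inference


def resolve_unit(declared: Optional[dict], inferred: Optional[dict]) -> dict:
    """Reconcile a user-declared unit with the one implied by the expression."""
    if declared is UNSPECIFIED and inferred is UNSPECIFIED:
        raise ValueError("units cannot be inferred; please specify more units")
    if declared is UNSPECIFIED:
        return inferred                      # fall back to inference
    if inferred is not UNSPECIFIED and declared != inferred:
        raise ValueError(f"declared {declared} but expression has {inferred}")
    return declared


kg = {"kg": 1}
mu_unit = resolve_unit(UNSPECIFIED, kg)      # inferred from intercept + beta * age
checked = resolve_unit(kg, kg)               # explicit unit agrees with the graph

try:
    resolve_unit(DIMENSIONLESS, kg)          # "dimensionless" is a claim, so it conflicts
except ValueError as err:
    conflict = str(err)
```

With this convention an explicit `unit=` acts as an assertion, while omitting it leaves the node open to inference.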

Chiming in here to say that I really like the idea of explicit units. Mismanaging units is a really common error in data analysis, leading to mistakes in published papers (a couple off the top of my head: https://www.pnas.org/doi/10.1073/pnas.1900438116, https://www.sciencedirect.com/science/article/pii/S004565352402811X).

0 replies

I could have sworn there was another discussion thread on this somewhere else started by @williambdean where I put some thoughts on this, but I can't find it now.

First, I love this idea, and I would like to have it. I think there's a ton of powerful stuff we can do with automatic reparameterization if we know units and we know conversions between the units a scientist wants to "think" in and units that are naturally more compatible for sampling. These could form the basis for RV transformations (with appropriate Jacobian correction), the same way we handle sampling RVs that don't live on R+.
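As a small worked example of the Jacobian correction for a unit change: if a mass x has a Normal prior specified in kg and we work with y = 1000 x in grams, the density of y is the density of x evaluated at y / 1000, times |dx/dy| = 1/1000. The sketch below checks this numerically with the standard library only; `logpdf_in_grams` is a hypothetical helper, not a PyMC function.

```python
import math


def normal_logpdf(x, mu, sigma):
    """Log density of Normal(mu, sigma) at x."""
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma) - 0.5 * math.log(2 * math.pi)


KG_TO_G = 1000.0


def logpdf_in_grams(y_g, mu_kg, sigma_kg):
    """Log density of a mass in grams when the prior was specified in kg."""
    x_kg = y_g / KG_TO_G
    log_jacobian = -math.log(KG_TO_G)   # log |dx/dy| for x = y / 1000
    return normal_logpdf(x_kg, mu_kg, sigma_kg) + log_jacobian


# Prior Normal(1.2 kg, 0.2 kg) is the same distribution as Normal(1200 g, 200 g).
direct = normal_logpdf(1300.0, 1200.0, 200.0)
via_jacobian = logpdf_in_grams(1300.0, 1.2, 0.2)
```

The two values agree up to floating-point error, which is exactly what the Jacobian term guarantees.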

That said, I think it's something that should be developed on top of pytensor first. We really want to be able to reason graphically about metadata. I've been thinking mostly about mathematical properties like "strictly positive" or "real", or matrix structure like "lower triangular", "banded", "block diagonal". But I think units also fit very naturally into this structure, and it's an incredibly exciting direction to go in.
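A toy illustration of what reasoning about metadata on a graph could mean, using the "strictly positive" property: each node carries a flag, and each operation has a rule for deriving the flag of its output. The `Node` class and the `exp`, `add`, and `mul` helpers are hypothetical and deliberately unrelated to the actual pytensor API; units could be propagated by the same kind of per-operation rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Toy graph node with one piece of metadata attached."""
    op: str
    inputs: tuple = ()
    positive: bool = False   # True: known strictly positive; False: unknown


def exp(x):
    # exp of any real number is strictly positive
    return Node("exp", (x,), positive=True)


def add(a, b):
    # a sum is known positive only if both terms are
    return Node("add", (a, b), positive=a.positive and b.positive)


def mul(a, b):
    # conservative rule: a product is flagged positive only if both factors are
    return Node("mul", (a, b), positive=a.positive and b.positive)


x = Node("input")                      # nothing known about x
sigma = Node("input", positive=True)   # e.g. a scale parameter
scale = mul(exp(x), sigma)             # flagged positive
shifted = add(x, sigma)                # not flagged: x could be negative
```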

2 replies
@williambdean

This one? arviz-devs/preliz#674

@jessegrabowski

Yes exactly!

6 participants
@drbenvincent @Armavica @ricardoV94 @jessegrabowski @williambdean @ErikRingen