Transforming PyMC models to ModelGraph and back #112

lucianopaz started this conversation in Ideas

This started out as an internal discussion some months ago. Since @ricardoV94 has opened #111, I thought that it would be best to condense all of the discussion here and open it up to everyone who is interested.

Goal

PyMC provides a way to define a generative model (and also non-generative models through the use of Potentials) and then gives access to automatic ways of drawing samples from the prior, posterior and posterior predictive distributions. Since random variables are now a Tensor (at some point via theano, then aesara and now pytensor), we can leverage the computational backend to do rewrites of models. I'll list a few relevant use cases of model rewrites:

  • It allows us to implement a do operator (discussed here), where we say "replace some random variable with a given value" (a minimal graph-level sketch follows this list).
  • It allows us to implement an observe operator. This would allow us to first define a model, and then say "this RV should have these observed values". Maybe do and observe are equivalent (I don't know enough about do-calculus to say), but from my naive point of view, observe adds a logp term conditioned on the observed values, whereas do simply sets the values and ignores the logp.
  • We can define the model in some form and then automatically get different variants. For example, get a model that marginalizes out a variable, exploits a conjugacy relation or changes a parametrization.
  • We can also easily address cases where we want different forms for observed and unobserved variables: Implement DiffTransform for RandomWalk distributions pymc#6098
  • More generally, we can write arbitrary functions that take models as inputs and return new models as outputs. For instance, we can use this for GP predictions, where we replace the GP prior with the GP conditional so that all variables and deterministics downstream depend on the conditional when you run sample_posterior_predictive. Before, users had to manually recreate their deterministics on top of the conditional.
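
To make the do idea concrete, here is a minimal sketch of what it amounts to at the graph level: replacing an RV with a fixed value everywhere downstream. It uses pytensor.graph.replace.graph_replace directly on .dist() variables; the point of this proposal is to expose the same kind of operation at the Model level.

```python
import pymc as pm
import pytensor.tensor as pt
from pytensor.graph.replace import graph_replace

x = pm.Normal.dist()
y = pm.Normal.dist(mu=x)

# do(x := 1.5): y's graph now depends on the constant instead of x's RV
y_do = graph_replace([y], {x: pt.constant(1.5)})[0]
```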

PyTensor (and aesara before it) provides ways to rewrite the computational graph. The missing piece is to connect PyMC Models with the entities that rewrites are applied to: FunctionGraphs.

What do PyMC models store?

PyMC models work as bookkeepers of a few things:

  • Free random variables
  • Observed random variables
  • Deterministics
  • Potentials
  • Dimension names
  • Dimension coordinate values or lengths
  • Model name to prepend to newly added entities
  • A compile configuration (unless this has been deprecated at some point and I'm not aware of it)
  • The parent model context if the current model is nested within another one
  • A mapping between variables and their transformed counterparts in unconstrained space (used for running inference, but not for forward sampling)

Many of the above entities are simple TensorVariables and mappings between them. This means that we could quite plainly build a FunctionGraph that takes the random variables, deterministics and potentials as its outputs, and we'd get the PyMC model's induced function graph. We will need to store the rest of the information somewhere to be able to make the leap back into a Model instance.
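
As a rough illustration of that first leap, here is a sketch built around a toy model (free_RVs, observed_RVs, deterministics and potentials are the Model's existing bookkeeping lists; the FunctionGraph derives its inputs from the given outputs):

```python
import pymc as pm
from pytensor.graph.fg import FunctionGraph

with pm.Model() as m:
    x = pm.Normal("x")
    y = pm.Normal("y", mu=x, observed=[0.5, 1.0])
    d = pm.Deterministic("d", pm.math.exp(x))

# The model's induced function graph: its outputs are the free RVs,
# observed RVs, deterministics and potentials tracked by the Model.
outputs = m.free_RVs + m.observed_RVs + m.deterministics + m.potentials
fgraph = FunctionGraph(outputs=outputs, clone=True)
```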

How to target rewrites?

We need to choose how to track some of the meta information; in particular, how to track which TensorVariable is a free random variable, an observed random variable, a deterministic or a potential. I see two alternatives:

  • Keep the graph itself unchanged and track the special variables outside of it (e.g. in mappings or FunctionGraph Features).
  • Introduce new dummy Ops (e.g. FreeRV, ObservedRV) that wrap the special variables and mark them directly in the graph.

The benefit of the first is that rewrites don't need to reason about newly invented Ops, since they can work with whatever was already in the computational graph. The benefit of the second approach is that the new Ops can include extra information that conditions the shapes and value variables of the resulting RVs (e.g. #111 includes the dimension information in the Ops directly).

There is still a lot of extra information that we need to carry around with us: the mappings, the configuration and the scopes. All of these could potentially be included through Features that are appended to the FunctionGraph.
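
One hypothetical shape for that, attached to the fgraph from the sketch above (ModelMetaFeature is an invented name; the attributes are just examples of the bookkeeping listed earlier):

```python
from pytensor.graph.features import Feature

class ModelMetaFeature(Feature):
    """Invented example: stash model metadata on the FunctionGraph."""

    def __init__(self, coords, model_name, check_bounds=True):
        self.coords = coords              # dim coordinate values or lengths
        self.model_name = model_name      # name prepended to new entities
        self.check_bounds = check_bounds  # an example compile setting

    def on_attach(self, fgraph):
        # Make the metadata reachable from the graph itself
        fgraph.model_meta = self

fgraph.attach_feature(ModelMetaFeature(coords={"obs": range(2)}, model_name=""))
```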


Replies: 1 comment

ricardoV94:

@lucianopaz thanks for the write-up. My original approach did not introduce the new dummy Ops, but I came to think we actually want them. Let me try to explain.

Value variables

For others reading the discussion: The value variables define the conditioning points of the logp graph and, possibly, the input variables of a random graph (do-operator, posterior predictive based on traced values). The goal obviously matters, but in general, I think we want to reason explicitly about the "placement" of value variables in our rewrites.

To give a concrete example, note that a graph like:

```python
with pm.Model() as m:
    x = pm.Normal.dist()
    y = pm.Normal.dist() + x
    m.register_rv(x, "x")
    m.register_rv(y, "y")
```

is very different from the following, for the purposes of logp evaluation / MCMC sampling:

```python
with pm.Model() as m:
    x = pm.Normal.dist()
    y = pm.Normal.dist()
    z = x + y
    m.register_rv(x, "x")
    m.register_rv(y, "y")
    m.add_named_variable(z, "z")  # Deterministic
```

which is also different from the following (whose logp / MCMC sampling is currently unsupported by PyMC):

```python
with pm.Model() as m:
    x = pm.Normal.dist()
    y = pm.Normal.dist()
    z = x + y
    m.register_rv(z, "z")
```

The new FreeRV / ObservedRV Ops naturally force us to treat those 3 graphs differently. In general, I think that's what we would want to do anyway.
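
To picture these markers, a minimal sketch of such a dummy Op could look like the following (illustrative only; the actual design lives in #111):

```python
from pytensor.graph.basic import Apply
from pytensor.graph.op import Op

class FreeRV(Op):
    """Identity-like marker: labels a variable as a free RV and ties it
    to its value variable, without changing any computation."""

    view_map = {0: [0]}  # the output is just a view of the first input

    def make_node(self, rv, value):
        return Apply(self, [rv, value], [rv.type()])

    def perform(self, node, inputs, outputs):
        outputs[0][0] = inputs[0]
```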

Other times, our rewrites may just be about changing the conditioning points, without altering anything in the random graph:

```
FreeRV(Cumsum(Normal(0, 1, size=10), axis=0)) -> Cumsum(FreeRV(Normal(0, 1, size=10)), axis=0)
```

Again, it helps that they are an explicit part of the graph: we can use the same "language" to do these rewrites, as in the sketch below.
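
For instance, the Cumsum example could be phrased as an ordinary PyTensor node rewrite (a sketch building on the illustrative FreeRV Op above; remapping the value variable to the differenced space is deliberately elided):

```python
from pytensor.graph.rewriting.basic import node_rewriter
from pytensor.tensor.extra_ops import CumOp

@node_rewriter([FreeRV])
def local_lift_freerv_through_cumsum(fgraph, node):
    rv, value = node.inputs
    if rv.owner is not None and isinstance(rv.owner.op, CumOp):
        # Move the conditioning point inside the Cumsum. NOTE: a real
        # rewrite would also remap `value` to the differenced space.
        inner_free = FreeRV()(rv.owner.inputs[0], value)
        return [rv.owner.op(inner_free)]
    return None
```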

It may be worthwhile to note that this type of marker Op was introduced in the IR representation of Aeppl recently, because we always needed to check that the "source of measurability" was not already being conditioned on. This is a more specific reason related to the logp rewrites, but I think it shows how these markers may be generally useful: aesara-devs/aeppl#78

Some rewrites require manipulating the value variables themselves. Examples: splitting observed/missing components; splitting the initial and innovation steps of a time-series so that they can be sampled separately; removing a value variable during marginalization.

Having the value variables directly as inputs to these dummy Ops gives us a very natural hook to manipulate them. Old Aeppl and current PyMC had to add an update_rv_maps method to be able to do this sort of manipulation. It feels much cleaner to use the same native PyTensor rewrite features as we do for changes in the random graphs, which requires having value variables explicitly in the graph.

https://github.com/pymc-devs/pymc/blob/f96594bb215b44197615c695130c9d60e1bf9601/pymc/logprob/rewriting.py#L121-L150

Potentials

I also think it makes a lot of sense to label Potentials, because those correspond to expressions that exist in logp space and have nothing to do with the random space. We usually don't want to mess with them when we manipulate "random" graphs.

Deterministics

The exception here is Deterministics! Initially I didn't add a dummy Op for them, and they were just an "un-wrapped" output. I ended up adding one just because it looked cleaner, but I think my initial hunch was correct: Deterministics shouldn't constrain our rewrites at all.

I think we should add them as new "copy" outputs, and leave them out of the main random graph. So the following user-defined graph:

```python
with pm.Model() as m:
    x = pm.Normal("x")
    exp_x = pm.Deterministic("exp_x", pm.math.exp(x))
    y = pm.Normal("y", exp_x)
```

Should be represented internally as the following:

```python
with pm.Model() as m:
    x = pm.Normal("x")
    exp_x = pm.math.exp(x)
    y = pm.Normal("y", exp_x)
    pm.Deterministic("exp_x", exp_x.copy())
```

We can still add the dummy Deterministic Op when we put exp_x.copy() as one of the outputs of the FunctionGraph, just for ease of labelling, but this label should never show up in the graph between y and x. I will update the PR soon with this change.
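
In FunctionGraph terms, that can be as simple as appending the labeled copy as an extra output (a one-liner sketch reusing the fgraph and names from the earlier sketches):

```python
# exp_x feeds y directly inside the random graph; the Deterministic is
# only a labeled copy hanging off the side as an extra output.
fgraph.add_output(exp_x.copy(name="exp_x"))
```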

Category: Ideas
Labels: none yet
2 participants: @lucianopaz, @ricardoV94
