Transforming PyMC models to ModelGraph and back #112

lucianopaz started this conversation in Ideas

This started out as an internal discussion some months ago. Since @ricardoV94 has opened #111, I thought that it would be best to condense all of the discussion here and open it up to everyone who is interested.

Goal

PyMC provides a way to define a generative model (and also non-generative models through the use of Potentials) and then gives access to automatic ways of drawing samples from the prior, posterior and posterior predictive distributions. Since random variables are now a Tensor (at some point via theano, then aesara and now pytensor), we can leverage the computational backend to do rewrites of models. I'll list a few relevant use cases of model rewrites:

  • It allows us to implement a do operator (discussed here), where we say "replace some random variable with a given value" (a minimal graph-level sketch follows this list).
  • It allows us to implement an observe operator. This would allow us to first define a model, and then say "this RV should have these observed values". Maybe do and observe are equivalent (I don't know enough about do-calculus to say), but from my naive point of view, observe adds a logp term conditioned on the observed values, whereas do simply sets the values and ignores the logp.
  • We can define the model in some form and then automatically get different variants. For example, get a model that marginalizes out a variable, exploits a conjugacy relation or changes a parametrization.
  • We can also easily address cases where we want different forms for observed and unobserved variables: Implement DiffTransform for RandomWalk distributions pymc#6098
  • More generally, we can write arbitrary functions that take models as inputs and return new models as outputs. For instance, we can use this for GP predictions, where we replace the GP prior with the GP conditional so that all variables and deterministics downstream depend on the conditional when you run sample_posterior_predictive. Before, users had to manually recreate their deterministics on top of the conditional.
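
To make the do idea concrete, here is a minimal sketch of what it amounts to at the graph level: replacing an RV with a fixed value everywhere downstream. It uses pytensor.graph.replace.graph_replace directly on .dist() variables; the point of this proposal is to expose the same kind of operation at the Model level.

```python
import pymc as pm
import pytensor.tensor as pt
from pytensor.graph.replace import graph_replace

x = pm.Normal.dist()
y = pm.Normal.dist(mu=x)

# do(x := 1.5): y's graph now depends on the constant instead of x's RV
y_do = graph_replace([y], {x: pt.constant(1.5)})[0]
```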

PyTensor (and aesara before it) provides ways to rewrite the computational graph. The missing piece is to connect PyMC Models with the entities that rewrites are applied to: FunctionGraphs.

What do PyMC models store?

PyMC models work as bookkeepers of a few things:

  • Free random variables
  • Observed random variables
  • Deterministics
  • Potentials
  • Dimension names
  • Dimension coordinate values or lengths
  • Model name to prepend to newly added entities
  • A compile configuration (unless this has been deprecated at some point and I'm not aware of it)
  • The parent model context if the current model is nested within another one
  • A mapping between variables and their transformed counterparts in unconstrained space (used for running inference, but not for forward sampling)

Many of the above entities are simple TensorVariables and mappings between them. This means that we could quite plainly build a FunctionGraph that takes the random variables, deterministics and potentials as its outputs, and we'd get the PyMC model's induced function graph. We will need to store the rest of the information somewhere to be able to make the leap back into a Model instance.
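
As a rough illustration of that first leap, here is a sketch built around a toy model (free_RVs, observed_RVs, deterministics and potentials are the Model's existing bookkeeping lists; the FunctionGraph derives its inputs from the given outputs):

```python
import pymc as pm
from pytensor.graph.fg import FunctionGraph

with pm.Model() as m:
    x = pm.Normal("x")
    y = pm.Normal("y", mu=x, observed=[0.5, 1.0])
    d = pm.Deterministic("d", pm.math.exp(x))

# The model's induced function graph: its outputs are the free RVs,
# observed RVs, deterministics and potentials tracked by the Model.
outputs = m.free_RVs + m.observed_RVs + m.deterministics + m.potentials
fgraph = FunctionGraph(outputs=outputs, clone=True)
```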

How to target rewrites?

We need to choose how to track some of the meta information; in particular, how to track which TensorVariable is a free random variable, an observed random variable, a deterministic or a potential. I see two alternatives:

  • Keep the graph itself unchanged and track the special variables outside of it (e.g. in mappings or FunctionGraph Features).
  • Introduce new dummy Ops (e.g. FreeRV, ObservedRV) that wrap the special variables and mark them directly in the graph.

The benefit of the first is that rewrites don't need to reason about newly invented Ops, since they can work with whatever was already in the computational graph. The benefit of the second approach is that the new Ops can include extra information that conditions the shapes and value variables of the resulting RVs (e.g. #111 includes the dimension information in the Ops directly).

There is still a lot of extra information that we need to carry around with us: the mappings, the configuration and the scopes. All of these could potentially be included through Features that are appended to the FunctionGraph.
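
One hypothetical shape for that, attached to the fgraph from the sketch above (ModelMetaFeature is an invented name; the attributes are just examples of the bookkeeping listed earlier):

```python
from pytensor.graph.features import Feature

class ModelMetaFeature(Feature):
    """Invented example: stash model metadata on the FunctionGraph."""

    def __init__(self, coords, model_name, check_bounds=True):
        self.coords = coords              # dim coordinate values or lengths
        self.model_name = model_name      # name prepended to new entities
        self.check_bounds = check_bounds  # an example compile setting

    def on_attach(self, fgraph):
        # Make the metadata reachable from the graph itself
        fgraph.model_meta = self

fgraph.attach_feature(ModelMetaFeature(coords={"obs": range(2)}, model_name=""))
```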


Replies: 1 comment

ricardoV94:

@lucianopaz thanks for the write-up. My original approach did not introduce the new dummy Ops, but I came to think we actually want them. Let me try to explain.

Value variables

For others reading the discussion: The value variables define the conditioning points of the logp graph and, possibly, the input variables of a random graph (do-operator, posterior predictive based on traced values). The goal obviously matters, but in general, I think we want to reason explicitly about the "placement" of value variables in our rewrites.

To give a concrete example, note that a graph like:

```python
with pm.Model() as m:
    x = pm.Normal.dist()
    y = pm.Normal.dist() + x
    m.register_rv(x, "x")
    m.register_rv(y, "y")
```

is very different from the following, for the purposes of logp evaluation / MCMC sampling:

```python
with pm.Model() as m:
    x = pm.Normal.dist()
    y = pm.Normal.dist()
    z = x + y
    m.register_rv(x, "x")
    m.register_rv(y, "y")
    m.add_named_variable(z, "z")  # Deterministic
```

which is also different from the following (whose logp / MCMC sampling is currently unsupported by PyMC):

```python
with pm.Model() as m:
    x = pm.Normal.dist()
    y = pm.Normal.dist()
    z = x + y
    m.register_rv(z, "z")
```

The new FreeRV / ObservedRV Ops naturally force us to treat those 3 graphs differently. In general, I think that's what we would want to do anyway.
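
To picture these markers, a minimal sketch of such a dummy Op could look like the following (illustrative only; the actual design lives in #111):

```python
from pytensor.graph.basic import Apply
from pytensor.graph.op import Op

class FreeRV(Op):
    """Identity-like marker: labels a variable as a free RV and ties it
    to its value variable, without changing any computation."""

    view_map = {0: [0]}  # the output is just a view of the first input

    def make_node(self, rv, value):
        return Apply(self, [rv, value], [rv.type()])

    def perform(self, node, inputs, outputs):
        outputs[0][0] = inputs[0]
```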

Other times, our rewrites may just be about changing the conditioning points, without altering anything in the random graph:

```
FreeRV(Cumsum(Normal(0, 1, size=10), axis=0)) -> Cumsum(FreeRV(Normal(0, 1, size=10)), axis=0)
```

Again, it helps that they are an explicit part of the graph: we can use the same "language" to do these rewrites, as in the sketch below.
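
For instance, the Cumsum example could be phrased as an ordinary PyTensor node rewrite (a sketch building on the illustrative FreeRV Op above; remapping the value variable to the differenced space is deliberately elided):

```python
from pytensor.graph.rewriting.basic import node_rewriter
from pytensor.tensor.extra_ops import CumOp

@node_rewriter([FreeRV])
def local_lift_freerv_through_cumsum(fgraph, node):
    rv, value = node.inputs
    if rv.owner is not None and isinstance(rv.owner.op, CumOp):
        # Move the conditioning point inside the Cumsum. NOTE: a real
        # rewrite would also remap `value` to the differenced space.
        inner_free = FreeRV()(rv.owner.inputs[0], value)
        return [rv.owner.op(inner_free)]
    return None
```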

It may be worthwhile to note that this type of marker Op was introduced in the IR representation of Aeppl recently, because we always needed to check that the "source of measurability" was not already being conditioned on. This is a more specific reason related to the logp rewrites, but I think it shows how these markers may be generally useful: aesara-devs/aeppl#78

Some rewrites require manipulating the value variables themselves. Examples: splitting observed/missing components; splitting the initial and innovation steps of a time-series so that they can be sampled separately; removing a value variable during marginalization.

Having the value variables directly as inputs to these dummy Ops gives us a very natural hook to manipulate them. Old Aeppl and current PyMC had to add an update_rv_maps method to be able to do this sort of manipulation. It feels much cleaner to use the same native PyTensor rewrite features as we do for changes in the random graphs, which requires having value variables explicitly in the graph.

https://github.com/pymc-devs/pymc/blob/f96594bb215b44197615c695130c9d60e1bf9601/pymc/logprob/rewriting.py#L121-L150

Potentials

I also think it makes a lot of sense to label Potentials, because those correspond to expressions that exist in logp space and have nothing to do with the random space. We usually don't want to mess with them when we manipulate "random" graphs.

Deterministics

The exception here is Deterministics! Initially I didn't add a dummy Op for them, and they were just an "un-wrapped" output. I ended up adding one just because it looked cleaner, but I think my initial hunch was correct: Deterministics shouldn't constrain our rewrites at all.

I think we should add them as new "copy" outputs, and leave them out of the main random graph. So the following user-defined graph:

```python
with pm.Model() as m:
    x = pm.Normal("x")
    exp_x = pm.Deterministic("exp_x", pm.math.exp(x))
    y = pm.Normal("y", exp_x)
```

Should be represented internally as the following:

```python
with pm.Model() as m:
    x = pm.Normal("x")
    exp_x = pm.math.exp(x)
    y = pm.Normal("y", exp_x)
    pm.Deterministic("exp_x", exp_x.copy())
```

We can still add the dummy Deterministic Op when we put exp_x.copy() as one of the outputs of the FunctionGraph, just for ease of labelling, but this label should never show up in the graph between y and x. I will update the PR soon with this change.
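
In FunctionGraph terms, that can be as simple as appending the labeled copy as an extra output (a one-liner sketch reusing the fgraph and names from the earlier sketches):

```python
# exp_x feeds y directly inside the random graph; the Deterministic is
# only a labeled copy hanging off the side as an extra output.
fgraph.add_output(exp_x.copy(name="exp_x"))
```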

Category: Ideas
Labels: none yet
2 participants: @lucianopaz, @ricardoV94
