NotificationsYou must be signed in to change notification settings
Fork301
Star358

WIP ZeroSumNormal example notebook#210

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Jump to bottom

Draft

drbenvincent wants to merge24 commits intopymc-devs:main

base:main

Choose a base branch

fromdrbenvincent:ZeroSumNormal

Draft

WIP ZeroSumNormal example notebook#210

drbenvincent wants to merge24 commits intopymc-devs:mainfromdrbenvincent:ZeroSumNormal

+4,225 −0

Conversation

Copy link

Contributor

drbenvincent commentedAug 16, 2021•
edited
Loading

Here is the first draft of a notebook which showcases a new PyMC distribution,ZeroSumNormal created by@aseyboldt.

A few things to note:

this runs under v4 and it importsZeroSumNormal from a moduleZeroSumNormal.py. The idea is that after this pull requestAdded ZeroSumNormal Distribution pymc#4776 is done we can get rid of this import and just callpm. ZeroSumNormal.
this runs under v3 and it importsZeroSumNormal from a moduleZeroSumNormal.py. The idea is that after this pull requestAdded ZeroSumNormal Distribution pymc#4776 is done we can get rid of this import and just callpm. ZeroSumNormal.
the end of the notebook is purposefully vague as I believe@aseyboldt has some suggestions to make.
Adrian: I've tried to make sure credit and attribution is very clear but do let me know if you'd like anything changed.

To be clear, I'm not anticipating this pull request to be merged untilpymc-devs/pymc#4776 is done and the notebook can callpm.ZeroSumNormal natively.

Open to comments and feedback.

TODO

Revise introduction. Greater emphasis on categorical coding and differences in the Frequentist and Bayesian approach
What is the intuition behindZeroSumNormal doing better than the manual sum to zero constraint?
Add 2x3 ANOVA style example
(maybe) Add multinomial/ordinal regression example
"How does ZeroSumNormal differ from the manually implementation in a way that it solves the problems above? Does this differ mathematically from the previous model?" from@tomicapretto
"I'm curious what this does different than the manual approach above that leads to better convergence?" from@twiecki

Copy link

review-notebook-appbot commentedAug 16, 2021

Check out this pull request on

See visual diffs & provide feedback on Jupyter Notebooks.

Powered byReviewNB

drbenvincent mentioned this pull request

Aug 16, 2021

Added ZeroSumNormal Distributionpymc-devs/pymc#4776

Closed

Copy link

review-notebook-appbot commentedAug 18, 2021•
edited
Loading

View / edit / reply tothis conversation on ReviewNB

mjhajharia commented on 2021-08-18T00:14:14Z
----------------------------------------------------------------

this is great!! I also really like the lucid writing style in the notebook :))

drbenvincent commented on 2021-08-18T13:51:51Z
----------------------------------------------------------------

Thanks

Copy link

review-notebook-appbot commentedAug 18, 2021•
edited
Loading

View / edit / reply tothis conversation on ReviewNB

mjhajharia commented on 2021-08-18T00:14:15Z
----------------------------------------------------------------

theano: not installed we wouldn't really want it in v4 right? (might be wrong, just curious)

drbenvincent commented on 2021-08-19T08:32:26Z
----------------------------------------------------------------

Have changed to the latestZeroSumNormal code now (which is currently only compatible with v3. So I've left in Theano for now

Copy link

review-notebook-appbot commentedAug 18, 2021•
edited
Loading

View / edit / reply tothis conversation on ReviewNB

twiecki commented on 2021-08-18T10:38:05Z
----------------------------------------------------------------

deflectsions typo

drbenvincent commented on 2021-08-19T08:28:55Z
----------------------------------------------------------------

fixed

Copy link

review-notebook-appbot commentedAug 18, 2021•
edited
Loading

View / edit / reply tothis conversation on ReviewNB

twiecki commented on 2021-08-18T10:38:06Z
----------------------------------------------------------------

god -> good

Copy link

review-notebook-appbot commentedAug 18, 2021•
edited
Loading

View / edit / reply tothis conversation on ReviewNB

twiecki commented on 2021-08-18T10:38:06Z
----------------------------------------------------------------

Why does this look so terrible?

drbenvincent commented on 2021-08-18T10:48:04Z
----------------------------------------------------------------

We're at the limits of floating point precision? I'm not sure this plot is strictly required. I could switch it out for anis_close assertion. Speaking to Adrian later today, so will discuss.

drbenvincent commented on 2021-08-19T08:30:55Z
----------------------------------------------------------------

Have removed this and replaced withnp.allclose(trace_centered.posterior.β.stack(sample=("chain", "draw")).sum("groups").values, 0.0)

OriolAbril commented on 2021-08-23T12:24:57Z
----------------------------------------------------------------

I think np.allclose can take dataarrays directly and that it doesn't care if the input is 1d or multidimensional so I believenp.allclose(trace_centered.posterior.β.sum("groups"), 0.0) will work

drbenvincent commented on 2021-09-05T10:55:42Z
----------------------------------------------------------------

Yes - have switched to that. It will appear in an upcoming commit.

Copy link

review-notebook-appbot commentedAug 18, 2021•
edited
Loading

View / edit / reply tothis conversation on ReviewNB

twiecki commented on 2021-08-18T10:38:07Z
----------------------------------------------------------------

I'm curious what this does different than the manual approach above that leads to better convergence?

drbenvincent commented on 2021-08-19T08:31:40Z
----------------------------------------------------------------

Might have to refer to Adrian for an answer. Have added this question to a list of points to address in updates

Copy link

ContributorAuthor

drbenvincent commentedAug 18, 2021

We're at the limits of floating point precision? I'm not sure this plot is strictly required. I could switch it out for anis_close assertion. Speaking to Adrian later today, so will discuss.

View entire conversation on ReviewNB

Copy link

ContributorAuthor

drbenvincent commentedAug 18, 2021

Thanks

View entire conversation on ReviewNB

Copy link

ContributorAuthor

drbenvincent commentedAug 18, 2021

Thanks for the feedback so far 🙏🏻 Have addressed typos, switched to using the latestZeroSumNormal code (which only works under v3 at the moment), and introduced a random seed for the sampling for reproducibility.

No more small scale comments needed for the moment - am working on some major updates after a discussion with@aseyboldt. Will shout when done 🙂

Copy link

review-notebook-appbot commentedAug 18, 2021•
edited
Loading

View / edit / reply tothis conversation on ReviewNB

chiral-carbon commented on 2021-08-18T19:43:09Z
----------------------------------------------------------------

here would it be possible to usecoords anddims instead ofshape ? Something like

with pm.Model(coords={'obs_id': np.arange(3)}) as naive_model:

and

   β = pm.Normal("β", 0, 10, dims='obs_id')

it's not a major change but the best practice would be to usecoords anddims wherever possible.

drbenvincent commented on 2021-08-19T08:29:33Z
----------------------------------------------------------------

Good point. Did it in the first model, but then lapsed from best practice. Now fixed.

Copy link

review-notebook-appbot commentedAug 18, 2021•
edited
Loading

View / edit / reply tothis conversation on ReviewNB

chiral-carbon commented on 2021-08-18T19:43:10Z
----------------------------------------------------------------

same comment about usingcoords anddims instead ofshape as above.

drbenvincent commented on 2021-08-19T08:30:03Z
----------------------------------------------------------------

done

Copy link

ContributorAuthor

drbenvincent commentedAug 19, 2021

fixed

View entire conversation on ReviewNB

Copy link

ContributorAuthor

drbenvincent commentedAug 19, 2021

Good point. Did it in the first model, but then lapsed from best practice. Now fixed.

View entire conversation on ReviewNB

Copy link

ContributorAuthor

drbenvincent commentedAug 19, 2021

done

View entire conversation on ReviewNB

Copy link

ContributorAuthor

drbenvincent commentedAug 19, 2021

Have removed this and replaced withnp.allclose(trace_centered.posterior.β.stack(sample=("chain", "draw")).sum("groups").values, 0.0)

View entire conversation on ReviewNB

Copy link

ContributorAuthor

drbenvincent commentedAug 19, 2021

Might have to refer to Adrian for an answer. Have added this question to a list of points to address in updates

View entire conversation on ReviewNB

Copy link

ContributorAuthor

drbenvincent commentedAug 19, 2021

Have changed to the latestZeroSumNormal code now (which is currently only compatible with v3. So I've left in Theano for now

View entire conversation on ReviewNB

Copy link

Member

OriolAbril commentedAug 23, 2021

I think np.allclose can take dataarrays directly and that it doesn't care if the input is 1d or multidimensional so I believenp.allclose(trace_centered.posterior.β.sum("groups"), 0.0) will work

View entire conversation on ReviewNB

Copy link

review-notebook-appbot commentedAug 23, 2021•
edited
Loading

View / edit / reply tothis conversation on ReviewNB

OriolAbril commented on 2021-08-23T12:25:05Z
----------------------------------------------------------------

I find the notation a bit confusing. We are usingbeta_0 as the intercept here, but then we are usingintercept in the code andbeta_0 corresponds to thebeta_1 here. We should stick to something and always use the same in both math and code. I personally prefer usingintercept even in the formula instead of having the0 index be "different" and excluded from the zerosumnormal. Between 0 or 1 indexing for betas I don't have any preferences though, I think having the "all betas sum to 0" is clear enough on its own, and coordinate values can be anything so there is no issue with math/code compatibility either.

drbenvincent commented on 2021-09-12T15:14:11Z
----------------------------------------------------------------

Things are a bit different now with the new introduction. For that, I've kept the beta notation because it seems to work. For the intercept + deflections model, I've switched tointercept anddelta parameters. Hopefully this works.

Copy link

review-notebook-appbot commentedAug 23, 2021•
edited
Loading

View / edit / reply tothis conversation on ReviewNB

OriolAbril commented on 2021-08-23T12:25:06Z
----------------------------------------------------------------

All variables are namedbeta here. I remember this bug being fixed already, can you try latest ArviZ and see if it's enough? Otherwise I'd use the development version

drbenvincent commented on 2021-09-12T15:14:34Z
----------------------------------------------------------------

This is labelling properly now.

Copy link

review-notebook-appbot commentedAug 23, 2021•
edited
Loading

View / edit / reply tothis conversation on ReviewNB

OriolAbril commented on 2021-08-23T12:25:07Z
----------------------------------------------------------------

I love this plot! Very minor nit is I would label the arrow corresponding to the intercept.

drbenvincent commented on 2021-09-12T15:14:48Z
----------------------------------------------------------------

Done

Copy link

review-notebook-appbot commentedAug 23, 2021•
edited
Loading

View / edit / reply tothis conversation on ReviewNB

OriolAbril commented on 2021-08-23T12:25:07Z
----------------------------------------------------------------

I have mixed feelings about the bullet points which extend a bit to the pair and posterior plots. First and foremost, the model is a complete failure because it doesn't converge (and because it won't ever converge). I don't see correlations by themselves as a failure too though, but as the main reason for this lack of convergence. And given the model is not converging, we can't trust its results, so it is expected for parameter recovery to fail, but it might not always be the case. We could have a model with no mixing and therefore no convergence but that by chance or because the initial points turned out to be close to the true parameters, it gets stuck there or very close and does "recover" the true parameters. Both models are as much a failure though even if this second hypothetical one "recovers" the parameters.

I think the main factor in deciding how to address this is defining the target audience for this notebook. Do we already expect readers to know about MCMC sampling and convergence?

drbenvincent commented on 2021-09-12T15:15:58Z
----------------------------------------------------------------

I've change the wording around this model now. I think it makes more sense now in the context of the new introduction.

Copy link

review-notebook-appbot commentedAug 23, 2021•
edited
Loading

View / edit / reply tothis conversation on ReviewNB

OriolAbril commented on 2021-08-23T12:25:08Z
----------------------------------------------------------------

Line #9.        β = pm.Deterministic("β", _β - m)

I would be explicit and setdims="groups" here too even though it's not needed to set the shape, I think it's valuable info. This will have no effect on any of the later plots as of now, but it would have if we use 1 based indexing for betas when matching math/code notation

I also don't have any suggestion so this feels a bit empty, but it would be great to try and find some more descriptive names for the_b and_intercept variables, something that gives some insight on the differences between these variables and the constrained ones.

drbenvincent commented on 2021-09-12T15:24:41Z
----------------------------------------------------------------

added thedims="groups"

I've used the_unconstrained suffix. It's more verbose, but maybe clearer.

Copy link

ContributorAuthor

drbenvincent commentedSep 5, 2021

Yes - have switched to that. It will appear in an upcoming commit.

View entire conversation on ReviewNB

Copy link

ContributorAuthor

drbenvincent commentedSep 12, 2021

View entire conversation on ReviewNB

Copy link

ContributorAuthor

drbenvincent commentedSep 12, 2021

This is labelling properly now.

View entire conversation on ReviewNB

Copy link

ContributorAuthor

drbenvincent commentedSep 12, 2021

Done

View entire conversation on ReviewNB

Copy link

ContributorAuthor

drbenvincent commentedSep 12, 2021

I've change the wording around this model now. I think it makes more sense now in the context of the new introduction.

View entire conversation on ReviewNB

Copy link

ContributorAuthor

drbenvincent commentedSep 12, 2022

@tomicapretto Do you think we should revisit this oncepymc-devs/pymc#6121 is closed?

Copy link

tomicapretto commentedSep 12, 2022

@tomicapretto Do you think we should revisit this oncepymc-devs/pymc#6121 is closed?

Yes!

drbenvincent marked this pull request as draft

October 9, 2022 18:27

Copy link

ContributorAuthor

drbenvincent commentedOct 9, 2022

Seeing aspymc-devs/pymc#6121 is now merged, let's pick this up. I'll rebase this and update the notebook to conform to the current style.

drbenvincentand others added24 commits

October 9, 2022 20:33

ZeroSumNormal: initial commit

9c1a603

typos + swap histogram for np.allclose

ecb4bce

latest ZeroSumNormal code, pymc3 v3, random seed for sampling

445cd70

use of coords/dims for all models now

32199a1

Added introduction, framing it in the context of categorical coding.

9940a56

added dims="groups" in a few places

7f5748a

add _unconstrained suffix in manual model

9feff77

create truncated regression example

5e3af8d

delete truncated regression example from main branch

33fc141

fix typo

5d6af4d

ZeroSumNormal: initial commit

61eb3f8

typos + swap histogram for np.allclose

dc2e3a8

latest ZeroSumNormal code, pymc3 v3, random seed for sampling

0ee27f5

use of coords/dims for all models now

05e87b0

added dims="groups" in a few places

0a34ea0

add _unconstrained suffix in manual model

b0692bc

create truncated regression example

3219207

delete truncated regression example from main branch

4a26d81

fix typo

3aa5bd7

re-ran notebook after rebase - do tests pass now?

5e0b00c

make tests pass - formatting of ZeroSumNormal.py

788e909

add Tomás Capretto as a co-author

51c6fad

testing if I can commit to the PR

981f600

Explain something about where zero sum encoding comes from

b9070c1

drbenvincent force-pushed theZeroSumNormal branch from1647415 tob9070c1Compare

October 9, 2022 18:45

Copy link

Member

fonnesbeck commentedDec 16, 2024

Dead or alive?

Labels

None yet

5 participants

Movatterモバイル変換

WIP ZeroSumNormal example notebook#210

Are you sure you want to change the base?

WIP ZeroSumNormal example notebook#210

Conversation

drbenvincent commentedAug 16, 2021• editedLoading Uh oh!There was an error while loading.Please reload this page.

Uh oh!

Uh oh!

review-notebook-appbot commentedAug 16, 2021

Uh oh!

review-notebook-appbot commentedAug 18, 2021• editedLoading Uh oh!There was an error while loading.Please reload this page.

Uh oh!

Uh oh!

review-notebook-appbot commentedAug 18, 2021• editedLoading Uh oh!There was an error while loading.Please reload this page.

Uh oh!

Uh oh!

review-notebook-appbot commentedAug 18, 2021• editedLoading Uh oh!There was an error while loading.Please reload this page.

Uh oh!

Uh oh!

review-notebook-appbot commentedAug 18, 2021• editedLoading Uh oh!There was an error while loading.Please reload this page.

Uh oh!

Uh oh!

review-notebook-appbot commentedAug 18, 2021• editedLoading Uh oh!There was an error while loading.Please reload this page.

Uh oh!

Uh oh!

review-notebook-appbot commentedAug 18, 2021• editedLoading Uh oh!There was an error while loading.Please reload this page.

Uh oh!

Uh oh!

drbenvincent commentedAug 18, 2021

Uh oh!

drbenvincent commentedAug 18, 2021

Uh oh!

drbenvincent commentedAug 18, 2021

Uh oh!

review-notebook-appbot commentedAug 18, 2021• editedLoading Uh oh!There was an error while loading.Please reload this page.

Uh oh!

Uh oh!

review-notebook-appbot commentedAug 18, 2021• editedLoading Uh oh!There was an error while loading.Please reload this page.

Uh oh!

Uh oh!

drbenvincent commentedAug 19, 2021

Uh oh!

drbenvincent commentedAug 19, 2021

Uh oh!

drbenvincent commentedAug 19, 2021

Uh oh!

drbenvincent commentedAug 19, 2021

Uh oh!

drbenvincent commentedAug 19, 2021

Uh oh!

drbenvincent commentedAug 19, 2021

Uh oh!

OriolAbril commentedAug 23, 2021

Uh oh!

review-notebook-appbot commentedAug 23, 2021• editedLoading Uh oh!There was an error while loading.Please reload this page.

Uh oh!

Uh oh!

review-notebook-appbot commentedAug 23, 2021• editedLoading Uh oh!There was an error while loading.Please reload this page.

Uh oh!

Uh oh!

review-notebook-appbot commentedAug 23, 2021• editedLoading Uh oh!There was an error while loading.Please reload this page.

Uh oh!

Uh oh!

review-notebook-appbot commentedAug 23, 2021• editedLoading Uh oh!There was an error while loading.Please reload this page.

Uh oh!

Uh oh!

review-notebook-appbot commentedAug 23, 2021• editedLoading Uh oh!There was an error while loading.Please reload this page.

Uh oh!

Uh oh!

drbenvincent commentedSep 5, 2021

Uh oh!

drbenvincent commentedSep 12, 2021

Uh oh!

drbenvincent commentedSep 12, 2021

Uh oh!

drbenvincent commentedSep 12, 2021

Uh oh!

drbenvincent commentedSep 12, 2021

Uh oh!

drbenvincent commentedSep 12, 2022

drbenvincent commentedAug 16, 2021•
edited
Loading

review-notebook-appbot commentedAug 18, 2021•
edited
Loading

review-notebook-appbot commentedAug 18, 2021•
edited
Loading

review-notebook-appbot commentedAug 18, 2021•
edited
Loading

review-notebook-appbot commentedAug 18, 2021•
edited
Loading

review-notebook-appbot commentedAug 18, 2021•
edited
Loading

review-notebook-appbot commentedAug 18, 2021•
edited
Loading

review-notebook-appbot commentedAug 18, 2021•
edited
Loading

review-notebook-appbot commentedAug 18, 2021•
edited
Loading

review-notebook-appbot commentedAug 23, 2021•
edited
Loading

review-notebook-appbot commentedAug 23, 2021•
edited
Loading

review-notebook-appbot commentedAug 23, 2021•
edited
Loading

review-notebook-appbot commentedAug 23, 2021•
edited
Loading

review-notebook-appbot commentedAug 23, 2021•
edited
Loading