Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

MultiNorm class#29876

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Open
trygvrad wants to merge5 commits intomatplotlib:main
base:main
Choose a base branch
Loading
fromtrygvrad:multivariate-plot-prapare

Conversation

trygvrad
Copy link
Contributor

PR summary

This PR continues the work of#28658 and#28454, aiming toclose#14168. (Feature request: Bivariate colormapping)

This is part one of the former PR,#29221. Please see#29221 for the previous discussion

Featuresincluded in this PR:

  • AMultiNorm class. This is a subclass ofcolors.Normalize and holdsn_variate norms.
  • Testing of theMultiNorm class

Featuresnot included in this PR:

  • changes to colorizer.py needed to expose the MultiNorm class
  • Exposes the functionality provided byMultiNorm together withBivarColormap andMultivarColormap to the plotting functionsaxes.imshow(...),axes.pcolor, and `axes.pcolormesh(...)
  • Testing of the new plotting methods
  • Examples in the docs

This commit introduces the MultiNorm calss to prepare for the introduction of multivariate plotting methods
@@ -2320,6 +2320,16 @@ def __init__(self, vmin=None, vmax=None, clip=False):
self._scale = None
self.callbacks = cbook.CallbackRegistry(signals=["changed"])

@property
def n_input(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Perhaps input_dims, output_dims for consistency with Transforms?

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Dimensions is probably somewhat confusing term in this context, because it typically means the number of axes in an array, i.e.np.zeros((4,5,6)) has three dimensions, but the correspondingn_input would be four [if shown as an image of size (5,6)].

What do you think ofinput_variates andoutput_variates, orinput_vars andoutput_vars?

Copy link
Contributor

@anntzeranntzerApr 7, 2025
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Sure, although transforms use the same terminology? No strong opinion there.
However this does raise another issue, which is that you seem to put the variates as thefirst dimension. Intuitively I'd rather put them as thelast dimension (just like if you pass a 2D array to plot, thelast dimension are the different variates). I'm sure this design choice must have been discussed somewhere, can you link that discussion?

Copy link
ContributorAuthor

@trygvradtrygvradApr 7, 2025
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

We had a brief discussion on this here#14168 (comment), but we followed that up on a weekly meeting 14 September 2023https://hackmd.io/@matplotlib/Skaz1lrh6#Mutlivariate-colormapping .

The argument is not formulated well in the log, but basically(V, N, M) because there areV cmaps and norms, and in this way the input for the data mirrors that of the vmin, vmax and norm keywords, i.e.:
ax.imshow((data0, data1), vmin=(vmin0, vmin1), cmap='BiOrangeBlue')
or

ax.imshow((preassure, temperature),            vmin=(vmin_preassure, vmin_temperature),            cmap='BiOrangeBlue')

Quite often the data is qualitatively different, like the example at the bottom of the page here:https://trygvrad.github.io/mpl_docs/Multivariate%20colormaps.html whereGDP_per_capita andAnnual_growth are plotted together. I find it is not very intuitive to wrap these in a single array.

(also: this allows the different variates to have different data types)

Copy link
Contributor

@anntzeranntzerApr 10, 2025
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Sure, looks like I already lost that discussion once so let's not relitigate it :)
I'll just ping@timhoffm who may have some idea as to the best name for n_input/input_dims (or some other name).

trygvrad reacted with thumbs up emoji
Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Thank you@timhoffm :D

Why do we have to distinguish input and output? They are both the same here. Do you anticipate that there can be norms with different inputs and outputs? What would be an example for that? - I'm asking because that decides whether we need the input/output parts in the name.

The thinking was that there is no requirement that they be equal, so as to be future-proof they are here implemented as separate. I do believe this was mentioned at a meeting at one point, but I have spent the last day trying to imagine a use case where it would be beneficial to have them separate. I can imagine some use cases where one might want to map a single scalar to two colorbars, such as height above and below sea level on separate colorbars, but even as I stretch my mind to find a use case where one scalar maps to multiple colorbars, I cannot find a use case where it is more intuitive to integrate that into the norm than to handle the additional complexity a separate pre-processing step and use a norm where the number of inputs and outputs are the same.

With that in mind I think we should change the names from n_input/n_output to a single variable name.

In my mind the following are good options:

  • n_variates
  • n_channels
  • n_dims
    Or we could go even shorted and just usevariates,channels ordims.

I have no strong opinion now, but this has been my logic on this topic previously:
Whenn_variates was implemented inMultivariateColormap my thinking was that we have the ability to define our own vocabulary in this case, and whatever word we choose, it is better if we are consistent, thus we ended up withMultivariateColormap andn_variates, but we could alternatively haveMultichannelColormap andn_channels. If we choose to usen_variates also forMultiNorm it would give an indication to that this relates toMultivariateColormap. Similarly, the word 'dims' is used in the context of transform, while the word 'channel' is weakly connected to the concept of an alpha-channel. My hunch has been that using a less-used word [variates] gives us more power to imbue that word with meaning, and that makes it easier to be specific in our vocabulary going forward.

How often will these properties be used by end users? - I'm asking, because if they are rarely used, we may affort longer names for more clarity.

I don't intend for users to use them. The number is implicit from the creation of aMultiNorm, i.e.MultiNorm(['log', None]) or evenplt.imshow([a, b], cmap='2AddA', norm=['log', None]).

Their importance is only for error-handling, to make sure that the data, norm and colormap are compatible, and give the user a reasonable error.

@anntzer I do not hold strong opinion, so if you let me know what you prefer, I will update this PR accordingly.
As this is not something we expect users to interact with much, the main concern should be the maintainers :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

  1. I agree that a single variable is sufficient. It's not the responsibility of the Norm to create or condense information. So it can only beN -> N. It could possibly even be1 -> 1 if we allow a list of Norm instances for theN -> N case. Advantage would be that we don't need to create any new norms and the logic stays simpler. The only thing we'd loose is the ability to couple the variables, but I don't see that that's needed. - We could always later introduce MultiNorm. To be checked: Maybe it's favorable internally to collect these into a single instance, but that would not need to be public. What do you think?

  1. I asked ChatGPT "would you use variate as a standalone term?":

Excellent question — and honestly,"variate" isn’t super commonly used by itself in casual conversation or even in a lot of applied work, but itdoes exist as a proper term in stats.

Technically:

  • Avariate is a random variable, or a measurable quantity that can take on different values.
    Example: If you're studying people's heights, each person’s height would be a value of the "height variate."

In practice:

  • You’re way more likely to hear "variable" or "random variable" than "variate."
  • "Variate" mostly shows up in more formal, old-school, or academic statistical texts.
    Example: "The sample contains 100 observations of the variate X."

It really shines in compound terms like:

  • Univariate → one variable
  • Bivariate → two variables
  • Multivariate → multiple variables

But you wouldn’t usually say, “Let’s analyze this variate.” You’d just say, “variable.”

  1. I don't intend for users to usen_variates. Then (if we still want to expose MultiNorm), let's just call thisn_variables and keep it private.

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Lets go forn_variables then, it's a good name :)

So it can only be N -> N. It could possibly even be 1 -> 1 if we allow a list of Norm instances for the N -> N case. Advantage would be that we don't need to create any new norms and the logic stays simpler. The only thing we'd loose is the ability to couple the variables, but I don't see that that's needed. - We could always later introduce MultiNorm. To be checked: Maybe it's favorable internally to collect these into a single instance, but that would not need to be public. What do you think?

We have discussed this before in#28428 where I initially implemented multivariate color mapping with list of norms instead of MultiNorm#28428 (comment) . This was then later changed to MultiNorm.

However, this was before the introduction of theColorizer as a container for norm→colormap.

The choice of having a list of norms or a MultiNorm object is now somewhat a question of where we put the additional complexity for supporting norms for multivariate/bivariate colormaps. Do we put the complexity in A: Colorizer or B: MultiNorm. My hypothesis is that having MultiNorm as a separate class simplifies the colorizer class, as we can call methods on theColorizer.norm regardless of if the data is multivariate or not. In my mind this will make it easier to maintain.

The top-level plotting functions should be largely unaffected by this choice, as they should operate through the colorizer interface, and should not use the norm directly.

If we choose to have a MultiNorm class, it is then a separate question if MultiNorm should be private. To me, this boils down to what should the user recieve if requesting the norm after plotting a bivariate/multivariate image:

im = plt.imshow((a, b), cmap='BiOrangeBlue')im.norm # ← what should this be?

My hypothesis here has been that it is easier for the userìm.norm has type stability, i.e. it is always a subclass ofcolors.Normalize, and therefore we make MultiNorm public.

@timhoffm Let me know if you want me to implement a version with a list of norms instead of MultiNorm, it will have to be prototyped as a complete solution (with working top-level plotting methods) similar to#29221 , but if we go for that solution I could then break it into smaller PRs for review.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Well, you don't really have type stability. While MultiNorm is formally a subclass of Normalize, it violates LSP (*), the call you need to do onim.norm depends on the type.

(*) I'm not a LSP nazi, it's sometimes ok to violate it. But you can't argue for that with type stability.

My hypothesis is that having MultiNorm as a separate class simplifies the colorizer class

This is a valid argument and we should at least have MultiNorm internally.

To me, this boils down to what should the user recieve if requesting the norm after plotting a bivariate/multivariate image

Yes. Technically that's a separate question, and I don't have a strong preference here. But since the class exist, it may be ok to expose it.

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

All true, I was a bit inaccurate in my comment.
I have changed from n_input/n_output to n_variables as agreed :)

timhoffm reacted with thumbs up emoji
Comment on lines 4103 to 4105
in the case where an invalid string is used. This cannot use
`_api.check_getitem()`, because the norm keyword accepts arguments
other than strings.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I'm confused because this function is only called forisinstance(norm, str).

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

So this function exists because the norm keyword acceptsNormalize objects in addition to strings.
This is fundamentally the same error you get if you give an invalid norm to aColorizer object.

In main, the@norm.setter oncolorizer.Colorizer reads:

    @norm.setter    def norm(self, norm):        _api.check_isinstance((colors.Normalize, str, None), norm=norm)        if norm is None:            norm = colors.Normalize()        elif isinstance(norm, str):            try:                scale_cls = scale._scale_mapping[norm]            except KeyError:                raise ValueError(                    "Invalid norm str name; the following values are "                    f"supported: {', '.join(scale._scale_mapping)}"                ) from None            norm = _auto_norm_from_scale(scale_cls)()    ...

The_get_scale_cls_from_str() exists in this PR because this functionality is now needed by bothcolorizer.Colorizer.norm() andcolors.MultiNorm.
Note this PR does not include changes tocolorizer.Colorizer.norm() so that it makes use of_get_scale_cls_from_str(). These changes follow in the next PR:#29877 .

Thank you@QuLogicCo-authored-by: Elliott Sales de Andrade <quantum.analyst@gmail.com>
@trygvradtrygvradforce-pushed themultivariate-plot-prapare branch from55b85e3 tof42d65bCompareApril 17, 2025 15:18

def scaled(self):
"""Return whether both *vmin* and *vmax* are set on all constituent norms"""
return all([(n.vmin is not None and n.vmax is not None) for n in self.norms])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

A bit too many parentheses, not that it really matters.

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I'm changing this to:

        return all([n.scaled for n in self.norms])

which should be more readable

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

This wasn't changed?

Copy link
Contributor

@anntzeranntzer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Just some minor points, plus re-pinging@timhoffm in case he has an opinion re: n_input / input_dims naming?

@trygvrad
Copy link
ContributorAuthor

Thank you for the feedback@anntzer !
Hopefully we can hear if@timhoffm has any thoughts on n_input / input_dims naming within the coming week.

@timhoffm
Copy link
Member

See#29876 (comment)

@trygvrad
Copy link
ContributorAuthor

Thank you@timhoffm
The PR should now be as we agreed (#29876 (comment)) :)

"""
Parameters
----------
norms : List of strings or `Normalize` objects
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

None seems to be left out here:

Suggested change
norms :List ofstrings or`Normalize` objects
norms :list of(str,`Normalize`, or None)


"""

if isinstance(norms, str) or not np.iterable(norms):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

This is equivalent, I think?

Suggested change
ifisinstance(norms, str) or not np.iterable(norms):
ifcbook.is_scalar_or_string(norms):

# Convert the list of norms to a tuple to make it immutable.
# If there is a use case for swapping a single norm, we can add support for
# that later
self._norms = tuple(norms)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Does this not need a check that the length matchesself.n_variables?


Returns
-------
Data
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Single returns should be the type, not a name.

Comment on lines +3404 to +3405
Data, must be of length `n_variables` or be an np.ndarray type with
`n_variables` fields.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

This seems to be the only docstring for a data parameter that mentionsnp.ndarray; is that required or not? The others should be made consistent either way.


def scaled(self):
"""Return whether both *vmin* and *vmax* are set on all constituent norms"""
return all([(n.vmin is not None and n.vmax is not None) for n in self.norms])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

This wasn't changed?


Returns
-------
list of np.ndarray
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Dedent.

n.autoscale(a)
self._changed()

def autoscale_None(self, A):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Seems to be missing a test?

Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers

@QuLogicQuLogicQuLogic left review comments

@anntzeranntzeranntzer left review comments

@timhoffmtimhoffmtimhoffm left review comments

@oscargusoscargusoscargus left review comments

At least 1 approving review is required to merge this pull request.

Assignees
No one assigned
Projects
None yet
Milestone
No milestone
Development

Successfully merging this pull request may close these issues.

Feature request: Bivariate colormapping
5 participants
@trygvrad@timhoffm@QuLogic@anntzer@oscargus

[8]ページ先頭

©2009-2025 Movatter.jp