Refactor hist for less numerical errors #22773


Closed
oscargus wants to merge 1 commit into matplotlib:main from oscargus:refactorhist

Conversation

@oscargus
Member
commented Apr 3, 2022

PR Summary

Should help with #22622

The idea is to do the computation on the edges rather than the widths and then do the diff on the result. This may be numerically better (or not...). Or rather, it is probably numerically worse, but will give visually better results...

Probably the alternative approach of providing a flag to bar/barh that makes sure adjacent bars are actually exactly adjacent may be better, but I wanted to see what comes out of this first...
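The rounding effect motivating this PR can be sketched with plain NumPy. This is only an illustration of centre/width arithmetic in float16, not matplotlib's actual code path; the edge values are chosen by hand so that the bin centres fall exactly between two representable float16 numbers.

```python
import numpy as np

# Hand-picked float16 edges spaced by one float16 step near 1.0 (assumption:
# illustrative values, not taken from the linked issue).
edges = np.array([1.0, 1.0 + 2**-10, 1.0 + 2**-9], dtype=np.float16)

width = np.diff(edges)                 # widths derived from the edges
centers = edges[:-1] + 0.5 * width     # rounded to float16
lefts = centers - 0.5 * width          # left edges recomputed from centres

# True wherever the recomputed left edge no longer matches the original edge.
mismatch = lefts != edges[:-1]
print(lefts.dtype, mismatch)
```

Keeping the edges themselves and taking the diff only at the end avoids relying on this round trip, which is the direction the PR explores.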

PR Checklist

Tests and Styling

  • Has pytest style unit tests (and pytest passes).
  • Is Flake 8 compliant (install flake8-docstrings and run flake8 --docstring-convention=all).

Documentation

  • New features are documented, with examples if plot related.
  • New features have an entry in doc/users/next_whats_new/ (follow instructions in README.rst there).
  • API changes documented in doc/api/next_api_changes/ (follow instructions in README.rst there).
  • Documentation is sphinx and numpydoc compliant (the docs should build without error).

@jklymak
Member

Is the numerical problem the diff? Would it make sense to just convert the numpy bin edges to float64 before the diff?

@oscargus
Member, Author

> Is the numerical problem the diff?

Hard to say. The problem is that quite a few computations are involved, and at some stage rounding errors lead to overlaps or gaps between edges. Postponing the diff reduces the risk of this happening. (On the other hand, one may get cancellation as a result, but I do not think that will happen more often now, since the only things we add here are of about the same order of magnitude.)

> Would it make sense to just convert the numpy bin edges to float64 before the diff?

Yes, or even float32, but as argued in the issue, one tends to use float16 in memory-limited environments, so it is not clear whether one can afford it.

Here, I am primarily trying to see the effect of it. As not all of the computations involved are handled here (some are also in bar/barh), the better approach may be to use a flag, "fill" or something similar, that makes sure all edges are adjacent if set. (I'm quite sure a similar problem can arise when feeding bar edges in float16 as well.)
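A rough sketch of what such a flag could do, in plain NumPy. The function `bar_extents` and its `fill` parameter are hypothetical names made up for this illustration; nothing like this exists in bar/barh as far as this thread shows.

```python
import numpy as np

def bar_extents(lefts, widths, fill=False):
    """Return (left, right) edges per bar.

    Hypothetical helper: `fill` is NOT an existing matplotlib parameter.
    """
    lefts = np.asarray(lefts)
    rights = lefts + np.asarray(widths)   # new array, may carry rounding error
    if fill:
        # Snap each right edge to the next bar's left edge so that
        # neighbouring bars touch exactly.
        rights[:-1] = lefts[1:]
    return lefts, rights

lefts = np.array([0.1, 0.2, 0.3, 0.4], dtype=np.float16)
widths = np.full(4, 0.1, dtype=np.float16)

_, plain = bar_extents(lefts, widths)
_, snapped = bar_extents(lefts, widths, fill=True)

# Gap (positive) or overlap (negative) between neighbouring bars.
gaps_plain = lefts[1:].astype(float) - plain[:-1].astype(float)
gaps_snapped = lefts[1:].astype(float) - snapped[:-1].astype(float)
print(gaps_plain, gaps_snapped)
```

With `fill=True` the gaps are zero by construction, whatever the input dtype; the last bar still uses its own width.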

@oscargus
Member, Author

It seems like we do not have any test images that are negatively affected by this at least... But it may indeed not be the best solution to the problem.

@oscargus
Member, Author

Ahh, but even if the data to hist is float16, the actual histogram array doesn't have to be... And that is probably much smaller than the data. So probably a simpler fix is to change the data type of the histogram data before starting to process it...
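To put rough numbers on the size argument, here is a small sketch using np.histogram directly. The data size and bin count are arbitrary assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random(100_000).astype(np.float16)   # large input kept in float16

counts, bins = np.histogram(x, bins=50)      # only 51 bin edges come back
bins64 = bins.astype(np.float64)             # upcast just the edges

# Bytes used by the data versus bytes used by the upcast edges.
print(x.nbytes, bins64.nbytes)
```

The input stays in float16, and only the short edge array is promoted before any width or offset arithmetic.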


@jklymak
Member

I think you just want another type catch here (I guess I'm not sure of the difference between float and "float64"), or at least that fixes the problem for me.

diff --git a/lib/matplotlib/axes/_axes.py b/lib/matplotlib/axes/_axes.py
index f1ec9406ea..88d90294a3 100644
--- a/lib/matplotlib/axes/_axes.py
+++ b/lib/matplotlib/axes/_axes.py
@@ -6614,6 +6614,7 @@ such objects
             m, bins = np.histogram(x[i], bins, weights=w[i], **hist_kwargs)
             tops.append(m)
         tops = np.array(tops, float)  # causes problems later if it's an int
+        bins = np.array(bins, float)  # causes problems if float16!
         if stacked:
             tops = tops.cumsum(axis=0)
             # If a stacked density plot, normalize so the area of all the
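A quick way to see why this cast can help, again as a sketch outside matplotlib: with float16 edges promoted via np.array(bins, float), the centre/width round trip reproduces the original edges for these hand-picked values. The arithmetic below only mimics the kind of computation done later in hist and bar.

```python
import numpy as np

# Hand-picked float16 edges (illustrative values, not from the issue).
bins16 = np.array([1.0, 1.0 + 2**-10, 1.0 + 2**-9], dtype=np.float16)
bins = np.array(bins16, float)   # same cast as the added line in the diff

width = np.diff(bins)
centers = bins[:-1] + 0.5 * width
lefts = centers - 0.5 * width    # left edges recomputed from centres
rights = centers + 0.5 * width   # right edges recomputed from centres

print(bins.dtype, np.array_equal(lefts, bins[:-1]), np.array_equal(rights, bins[1:]))
```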

@timhoffm
Member
commented Apr 5, 2022

> I guess I'm not sure the difference between float and "float64"

NumPy accepts built-in Python types and maps them to NumPy types:

https://numpy.org/doc/stable/reference/arrays.dtypes.html#specifying-and-constructing-data-types
(scroll a bit to "Built-in Python types").

The mapping can be platform specific. E.g. int maps to np.int64 on Linux but np.int32 on Windows.
float maps to np.float64 on x86 Linux and Windows. But I don't know if that's true on ARM etc.
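The mapping can be checked on any given machine with np.dtype. A minimal sketch; the integer result depends on the platform and NumPy version, so it is printed rather than assumed.

```python
import numpy as np

float_dtype = np.dtype(float)   # dtype NumPy picks for the built-in float
int_dtype = np.dtype(int)       # platform and version dependent

# Casting float16 data with the built-in type uses the same mapping.
cast_dtype = np.array([1.0], dtype=np.float16).astype(float).dtype

print(float_dtype, int_dtype, cast_dtype)
```

The NumPy documentation linked above lists float as mapping to float64, which is why np.array(bins, float) and an explicit "float64" behave the same in practice.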
