Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

default hash savefig format#160

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Open
bjlittle wants to merge2 commits intomatplotlib:main
base:main
Choose a base branch
Loading
frombjlittle:default-savefig-format

Conversation

@bjlittle
Copy link
Contributor

@bjlittlebjlittle commentedMay 21, 2022
edited
Loading

This PR addresses issue#152, by ensuring that a defaultformat='png' is injected to thesavefig kwargs when generating a hash.

Closes#152

@bjlittle
Copy link
ContributorAuthor

Okay, wowzers... can anyone explain thetest failure behaviour ofpy310 withmpl35 across all platforms ? 😕

@ConorMacBride
Copy link
Member

Very strange. It seems that passingformat='png' to savefig changes the hashes. I wonder if a default has changed in 3.5?

@ConorMacBride
Copy link
Member

The hashes are different onmain also (testing py310-test-mpl35 on macOS locally) so not related to this PR. Matplotlib 3.5.2 was released 19 days ago so my guess is that something has changed there.

bjlittle reacted with thumbs up emoji

@ConorMacBride
Copy link
Member

Yes, the tests pass on MPL 3.5.1. Can you pin this to3.5.1 rather than3.5.*?

mpl35:matplotlib==3.5.*

Hopefully this is something that the perceptual hashes can solve, as we really should be testing against the latest bug fixes from matplotlib!

bjlittle reacted with thumbs up emoji

@bjlittle
Copy link
ContributorAuthor

Awesome detective work 🕵️

@bjlittle
Copy link
ContributorAuthor

bjlittle commentedMay 21, 2022
edited
Loading

@ConorMacBride If thesha hashes are sensitive to some change frommpl>3.5.1, what does that mean for default hash testing? Switch tophash?

What are the implications to the community if we bank this new defaultformat behaviour in this PR? Isn't this a breaking change in behaviour?

mpl33:matplotlib==3.3.*
mpl34:matplotlib==3.4.*
mpl35:matplotlib==3.5.*
mpl35:matplotlib==3.5.1
Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

@ConorMacBride We should make an issue to unpin this, right?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

We no longer need to change this.

@ConorMacBride
Copy link
Member

@ConorMacBride If thesha hashes are sensitive to some change frommpl>3.5.1, what does that mean for default hash testing? Switch tophash?

sha hashes are definitely too sensitive. It would be great if such a constrained testing environment wasn't necessary as that would make the package much easier to use. I would need to experiment more withphash to get an understanding about the sensitivity. Although it would be up to the maintainers (astrofrog, Cadair and dopplershift) to decide.

I presume the default hash size, high frequency factor and hamming tolerance can be tuned to a value that minimises, what an average user would consider, a test passing incorrectly? (I would expect most of thephash exampleshere should result in a failed hash comparison.)

I suspectphash may not be appropriate for tests that need to be particularly colour sensitive? All images areconverted to 8-bit grayscale so, for example,this image would be similar to an image where all the values have been multiplied by-1 (based onwhat thebwr colormap looks like in grayscale).

What are the implications to the community if we bank this new defaultformat behaviour in this PR? Isn't this a breaking change in behaviour?

From reading thesavefig documentation, the only way I can see this being a breaking change is if someone has a different value set insavefig.format rcparams or they are using a different backend (by passing abackend kwarg to the pytest-mpl decorator, which is an undocumented feature).

backend=compare.kwargs.get('backend','agg')

I've added some code below. In the default case, both hashes are the same so no breaking change. However, if a custom backend has been specified, passingformat="png" seems to forceAgg which would be a breaking change. However,svg changes hashes every time (due to random ids referencing objects), andps has similar behaviour. So it might be unlikely that someone is changing the backend while using hash comparison.

So, if the user asks for a different backend, settingformat="png" would have the effect of ignoring that when doing hash comparison. Forphash, the image data needs to be in a format thatpillow/PIL can read although I think it should work for more than justpng?

To not interfere with the requested backend, I think we should only setformat="png" if abackend kwarg isn't set. Or maybe we should not be settingformat="png" as that's the default unless the user has requested something different?

Click for code
importioimporthashlibimportmatplotlibimportmatplotlib.pyplotaspltdef_hash_file(in_stream):"""    Hashes an already opened file.    """in_stream.seek(0)buf=in_stream.read()hasher=hashlib.sha256()hasher.update(buf)returnhasher.hexdigest()defhash(**kwargs):fig=plt.figure()ax=fig.add_subplot(1,1,1)ax.plot([1,3,2,4])imgdata=io.BytesIO()fig.savefig(imgdata,**kwargs)plt.close()return_hash_file(imgdata)# Default (pytest-mpl sets agg)matplotlib.use("agg")print(hash())# b7d7754ed51a78949e119e3900b72b4123fcc3674fd6a5750b9a93f7091293abprint(hash(format="png"))# b7d7754ed51a78949e119e3900b72b4123fcc3674fd6a5750b9a93f7091293ab# User passes backend="svg" to pytest-mpl decoratormetadata= {"Creator":None,"Date":None,"Format":None,"Type":None}matplotlib.use("svg")print(hash(metadata=metadata))# 81910e47aa39455d96c41893f73793c65b5405af14af7919dbe8a7172e004fcfprint(hash(metadata=metadata))# 0bf13ec69c253c6a8d3b6b004aa4855e343e5f40a79106e87fee72c208f7de89print(hash(format="png"))# b7d7754ed51a78949e119e3900b72b4123fcc3674fd6a5750b9a93f7091293ab

@bjlittle
Copy link
ContributorAuthor

sha hashes are definitely too sensitive. It would be great if such a constrained testing environment wasn't necessary as that would make the package much easier to use. I would need to experiment more withphash to get an understanding about the sensitivity. Although it would be up to the maintainers (astrofrog, Cadair and dopplershift) to decide.

Agreed.

@bjlittle
Copy link
ContributorAuthor

bjlittle commentedMay 23, 2022
edited
Loading

I presume the default hash size, high frequency factor and hamming tolerance can be tuned to a value that minimises, what an average user would consider, a test passing incorrectly? (I would expect most of thephash exampleshere should result in a failed hash comparison.)

That's pretty much my experience so far. We've opted for ahash_size=16 (which gives a 256-bit hash)highfreq_factor=4 (imagehash.phash default) and a hamming distance tolerance of<=2 for a pass withphash, and that suits our needs.

@bjlittle
Copy link
ContributorAuthor

bjlittle commentedMay 23, 2022
edited
Loading

I suspectphash may not be appropriate for tests that need to be particularly colour sensitive? All images areconverted to 8-bit grayscale so, for example,this image would be similar to an image where all the values have been multiplied by-1 (based onwhat thebwr colormap looks like in grayscale).

I've re-ran and updated a notebook that I wrote to initially investigate and play withimagehash, which is availablehere.

To create a conda environment to run it, simplyconda create -n imagehash -c conda-forge jupyter iris imagehash, and download the notebook gist.

Interestingly, changing the colormap on a plot is detected byphash. But yeah, play around and get a feel for what it's offering 👍

@bjlittle
Copy link
ContributorAuthor

Overall, I wait for you guys to come to a consensus and advise on what to do next here with this PR.

@Cadair
Copy link
Contributor

I think matplotlib puts metadata into the png including the version iirc. I think we can turn that off.

See the metadata keyword argument here:https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.savefig.html and more info on png here:https://matplotlib.org/stable/api/backend_agg_api.html#matplotlib.backends.backend_agg.FigureCanvasAgg.print_png

@ConorMacBride
Copy link
Member

Given how effective removing the metadata version number was in#163 I think we should definitely remove it by default and make it a breaking change (forv1.0.0 😉).

bjlittle reacted with rocket emoji

This was referencedMay 28, 2022
Copy link
Member

@ConorMacBrideConorMacBride left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I don't think we need to set this for the reasons in the second half of#160 (comment). Should we just close this PR?

mpl33:matplotlib==3.3.*
mpl34:matplotlib==3.4.*
mpl35:matplotlib==3.5.*
mpl35:matplotlib==3.5.1

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

We no longer need to change this.

Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment

Reviewers

@ConorMacBrideConorMacBrideConorMacBride left review comments

Assignees

No one assigned

Labels

None yet

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

default savefig format

3 participants

@bjlittle@ConorMacBride@Cadair

[8]ページ先頭

©2009-2025 Movatter.jp