Uh oh!
There was an error while loading.Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork7.9k
MAINT: Deterministic SVG and PDF tests#7748
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Uh oh!
There was an error while loading.Please reload this page.
Conversation
And document the standard keys in PdfPages. Also insert thematplotlib version in the default value of Producer as wellas Creator, to retain a debugging clue in case the user overridesone of them.
Remove the metadata entries that depend on the date andthe exact version.
codecov-io commentedJan 5, 2017
Current coverage is 62.12% (diff: 100%)@@ master #7748 diff @@========================================== Files 174 174 Lines 56028 56028 Methods 0 0 Messages 0 0 Branches 0 0 ==========================================+ Hits 34805 34806 +1+ Misses 21223 21222 -1 Partials 0 0
|
@@ -298,7 +298,12 @@ def compare(self, idx, baseline, extension): | |||
remove_ticks_and_titles(fig) | |||
actual_fname = os.path.join(self.result_dir, baseline) + '.' + extension | |||
fig.savefig(actual_fname, **self.savefig_kwargs) | |||
kwargs = self.savefig_kwargs.copy() | |||
if extension == 'pdf': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Doesn't this break the test ofSOURCE_DATE_EPOCH
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
No, that test starts a separate subprocess to write the output and doesn't use the image_comparison decorator at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Ah, 🐑
the pgf tests roll their own image comparison code. attn@astrofrog this may need to be ported to |
@@ -136,6 +136,10 @@ def set_font_settings_for_testing(): | |||
rcParams['text.hinting_factor']=8 | |||
defset_reproducibility_for_testing(): | |||
rcParams['svg.hashsalt']='matplotlib' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
This will also break, in a way, the determinism test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Can you explain in more detail? The_test_determinism_save
function intest_backend_svg.py
file also sets this rc parameter to a constant value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Probably along the same lines as@tacaswell; I was thinking itwouldn't fail ifsvg.hashsalt
was broken. However, I see this test doesn't use the decorator anyway, so it is likely to still fail correctly.
There is a cache of png files keyed by the MD5 hashes of correspondingsvg and pdf files, which helps reduce test suite running times for svgand pdf files that stay exactly the same from one run to the next.This patch enables caching of test results, not only expectedresults, which is only useful if the tests are mostly deterministic(seematplotlib#7748). It adds reporting of cache misses, which can be helpfulin getting tests to stay deterministic, and expiration since thetest results are going to change more often than the expectedresults.
There is a cache of png files keyed by the MD5 hashes of correspondingsvg and pdf files, which helps reduce test suite running times for svgand pdf files that stay exactly the same from one run to the next.This patch enables caching of test results, not only expectedresults, which is only useful if the tests are mostly deterministic(seematplotlib#7748). It adds reporting of cache misses, which can be helpfulin getting tests to stay deterministic, and expiration since thetest results are going to change more often than the expectedresults.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Some minor corrections, but otherwise fine.
The reproducibility of the output from the PS and PDF backends has so | ||
far been tested using various plot elements but only default values of | ||
options such as ``{ps,pdf}.fonttype`` that can affect the output at a | ||
low level, and not with the mathtext or usetex features. When | ||
matplotlib calls external tools (such as PS distillers or LaTeX) their | ||
versions need to be kept constant for reproducibility, and they may | ||
add sources of nondeterminism outside the control of matplotlib. | ||
For SVG output, the ``svg.hashsalt`` rc parameter has been added in an | ||
earlier release. In can be used to change some random id values in the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
In -> It
For SVG output, the ``svg.hashsalt`` rc parameter has been added in an | ||
earlier release. In can be used to change some random id values in the | ||
output to be deterministic, at the cost that including multiple such | ||
svg files in one document can lead to collisions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
svg -> SVG
output to be deterministic, at the cost that including multiple such | ||
svg files in one document can lead to collisions. | ||
These features are now enabled in the tests for the pdf and svg |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
pdf -> PDF
svg -> SVG
The ``SOURCE_DATE_EPOCH`` environment variable can now be used to set | ||
the timestamp value in the PS and PDF outputs. See | ||
https://reproducible-builds.org/specs/source-date-epoch/ | ||
Alternatively, calling ``savefig`` with ``metadata={creationDate=None}`` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Thisdict
literal is a bit broken; should bedict(creationDate=None)
or{'creationDate': None}
.
@@ -484,11 +483,13 @@ def __init__(self, filename, metadata=None): | |||
self.infoDict= { | |||
'Creator':'matplotlib %s, http://matplotlib.org'%__version__, | |||
'Producer':'matplotlib pdf backend%s'%revision, | |||
'Producer':'matplotlib pdf backend%s'%__version__, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Do we even need the version here? It's in theCreator
tag.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I thought including the version in both would help with debugging when the user has overridden just one of these. If you use Matplotlib as a component of some larger application, you might want to override Creator and leave Producer pointing to the PDF backend.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Fair enough.
The reproducibility of the output from the PS and PDF backends has so | ||
far been tested using various plot elements but only default values of | ||
options such as ``{ps,pdf}.fonttype`` that can affect the output at a | ||
low level, and not with the mathtext or usetex features. When | ||
matplotlib calls external tools (such as PS distillers or LaTeX) their | ||
versions need to be kept constant for reproducibility, and they may | ||
add sources of nondeterminism outside the control of matplotlib. | ||
For SVG output, the ``svg.hashsalt`` rc parameter has been added in an | ||
earlier release. It can be used to change some random id values in the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I don't understand the following sentence:
"It can be used to change some random id values in the output to be deterministic, at the cost thatincluding multiple such SVG files in one document can lead to collisions."
Isn't there an grammar problem?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I'll try to rephrase.
Thanks! This looks good to me. |
SVG files in one document can lead to collisions. | ||
earlier release. This parameter changes some random identifiers in the | ||
SVG file to be deterministic. The downside of this setting is that if | ||
more than one file is generated using with deterministic identifiers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
using with -> using.
There is a cache of png files keyed by the MD5 hashes of correspondingsvg and pdf files, which helps reduce test suite running times for svgand pdf files that stay exactly the same from one run to the next.This patch enables caching of test results, not only expectedresults, which is only useful if the tests are mostly deterministic(seematplotlib#7748). It adds reporting of cache misses, which can be helpfulin getting tests to stay deterministic, and expiration since thetest results are going to change more often than the expectedresults.
There is a cache of png files keyed by the MD5 hashes of correspondingsvg and pdf files, which helps reduce test suite running times for svgand pdf files that stay exactly the same from one run to the next.This patch enables caching of test results, not only expectedresults, which is only useful if the tests are mostly deterministic(seematplotlib#7748). It adds reporting of cache misses, which can be helpfulin getting tests to stay deterministic, and expiration since thetest results are going to change more often than the expectedresults.
There is a cache of png files keyed by the MD5 hashes of correspondingsvg and pdf files, which helps reduce test suite running times for svgand pdf files that stay exactly the same from one run to the next.This patch enables caching of test results, not only expectedresults, which is only useful if the tests are mostly deterministic(seematplotlib#7748). It adds reporting of cache misses, which can be helpfulin getting tests to stay deterministic, and expiration since thetest results are going to change more often than the expectedresults.
There is a cache of png files keyed by the MD5 hashes of correspondingsvg and pdf files, which helps reduce test suite running times for svgand pdf files that stay exactly the same from one run to the next.This patch enables caching of test results, not only expectedresults, which is only useful if the tests are mostly deterministic(seematplotlib#7748). It adds reporting of cache misses, which can be helpfulin getting tests to stay deterministic, and expiration since thetest results are going to change more often than the expectedresults.
There is a cache of png files keyed by the MD5 hashes of correspondingsvg and pdf files, which helps reduce test suite running times for svgand pdf files that stay exactly the same from one run to the next.This patch enables caching of test results, not only expectedresults, which is only useful if the tests are mostly deterministic(seematplotlib#7748). It adds reporting of cache misses, which can be helpfulin getting tests to stay deterministic, and expiration since thetest results are going to change more often than the expectedresults.
There has been plenty of work (at least#4434,#5671,#5766,#6597) to make it possible to have deterministic svg and pdf output, or at least closer to deterministic than before. This change adds one related feature: callers can remove the bits of pdf metadata that contain the date and matplotlib version, so that new releases don't necessarily change the result of every test. Furthermore, this change enables all the optional pdf and svg determinism features in tests.
I am developing some improvements to test-result caching on another branch, and in my tests only the following test files still change from run to run after this patch (although in my environment I get KNOWNFAIL=27, SKIP=8 so some more are possible):