Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Type42 subsetting in PS/PDF#20391

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged
jkseppan merged 29 commits intomatplotlib:masterfromaitikgupta:pyftsubset-fonttools
Jul 26, 2021

Conversation

aitikgupta
Copy link
Contributor

@aitikguptaaitikgupta commentedJun 8, 2021
edited
Loading

PR Summary

This PR is a fresh rebase from#18143.

  • Adds a dependency:fonttools to handle font subsetting for us
    (we already have an external ttconv dependency, which does not handle subsetting)
  • Interfaces agetSubset utility to get file-like objects containing subsetted font data

Possiblyfixes#11303 (large file sizes)
Fixes#18191.

PR Checklist

  • Has pytest style unit tests (andpytest passes).
  • IsFlake 8 compliant (runflake8 on changed files to check).
  • New features are documented, with examples if plot related.
  • Documentation is sphinx and numpydoc compliant (the docs shouldbuild without error).
  • Conforms to Matplotlib style conventions (installflake8-docstrings and runflake8 --docstring-convention=all).
  • New features have an entry indoc/users/next_whats_new/ (follow instructions in README.rst there).
  • API changes documented indoc/api/next_api_changes/ (follow instructions in README.rst there).

lianghongzhuo reacted with thumbs up emojilianghongzhuo reacted with hooray emoji
@jklymakjklymak marked this pull request as draftJune 8, 2021 21:31
@lumberbot-app
Copy link

I'm Mr. Meeseek,@aitikgupta, Look at meee !

aitikgupta reacted with laugh emoji

@aitikguptaaitikgupta changed the titleType42 subsetting in PDFType42 subsetting in PS/PDFJun 13, 2021
@aitikgupta
Copy link
ContributorAuthor

aitikgupta commentedJun 13, 2021
edited
Loading

The last commit subsets and embeds Type 42 PS/EPS, here's the size difference in outputs:

nosub-dejavu.ps                ----> 1.2 MBsub-dejavu.ps                  ----> 8.5 kB

(To test it out on your machine, apply this patch fornosub-dejavu.ps:)

diff --git a/lib/matplotlib/backends/backend_ps.py b/lib/matplotlib/backends/backend_ps.pyindex 2a3ab64a1c..ed0af4ea85 100644--- a/lib/matplotlib/backends/backend_ps.py+++ b/lib/matplotlib/backends/backend_ps.py@@ -997,12 +997,8 @@ class FigureCanvasPS(FigureCanvasBase):                             ) as tmp:                                 tmp.write(fontdata)                                 tmp.seek(0, 0)-                                font = FT2Font(tmp.name)-                                glyph_ids = [-                                    font.get_char_index(c) for c in chars-                                ]                                 convert_ttf_to_ps(-                                    os.fsencode(tmp.name),+                                    os.fsencode(font_path),                                     fh,                                     fonttype,                                     glyph_ids,

The script:

importmatplotlib.pyplotaspltplt.rcParams["ps.fonttype"]=42plt.figtext(0.5,0.5,"hello")# plt.savefig('sub-dejavu.ps')                    # before applying the patch# plt.savefig('nosub-dejavu.ps')                  # after applying the patch

@aitikguptaaitikgupta marked this pull request as ready for reviewJune 15, 2021 16:57
@jklymakjklymak added this to thev3.5.0 milestoneJun 17, 2021
@jklymakjklymak marked this pull request as draftJune 17, 2021 14:40
@aitikgupta
Copy link
ContributorAuthor

aitikgupta commentedJun 18, 2021
edited
Loading

#20391 (comment): This appears to require testing.

I think it's difficult to testget_glyphs_subset, since it really depends on font files, and if we chose certain characters to test with a fixed font (lets say), there's no guarantee that the subset will be the same for different versions of the library.

Is there any other way to test this?

edit:@jklymak this isn't a draft anymore, just needed to address review comments 😄

@jklymak
Copy link
Member

You can mark as ready-to-review at any point. I just kick things to Draft so they don't show up in the queue if the PR requires action on the part of the author. (this one is still not passing the tests).

@aitikgupta
Copy link
ContributorAuthor

(this one is still not passing the tests).

Yeah, CI isn't installing fonttools, which is whyNoModuleFound is all over the place (even after adding it tominvers.txt:#20391 (comment))

@QuLogic
Copy link
Member

Install is done with--no-deps, so hard dependencies also need to be listed here:https://github.com/matplotlib/matplotlib/blob/master/.github/workflows/tests.yml#L148minver.txt is only aconstraint on versions.

aitikgupta reacted with thumbs up emoji

@QuLogic
Copy link
Member

I think it's difficult to testget_glyphs_subset, since it really depends on font files, and if we chose certain characters to test with a fixed font (lets say), there's no guarantee that the subset will be the same for different versions of the library.

We ship DejaVu ourselves, so that should always be available for testing.

Is there any other way to test this?

A file with just 'A' embedded should be smaller than one with the full alphabet, say?
Also, a subsetted PDF should appear the same as one without subsetting (assuming Ghostscript doesn't happen to substitute the same glyphs.)

@aitikgupta
Copy link
ContributorAuthor

A file with just 'A' embedded should be smaller than one with the full alphabet, say?

Oh, if we just test the subset being 'smaller' yeah we can do it, but not with the exact number glyphs, since that will definitely vary. (we also setrecommeded_glyphs=True)

But even so, isn't that the 'full-time job' of the fonttools library itself? or in other words wouldn't testing theget_glyphs_subset function be the same as testing the library itself (that it does reduce the number of glyphs)?

@aitikguptaaitikgupta marked this pull request as ready for reviewJune 19, 2021 19:13
@jkseppan
Copy link
Member

Oh, and one more thing (sorry for not remembering this earlier): thePDF specification has some extra requirements for font subsets in section 9.6.4. The PostScript name of subsetted fonts needs to have a prepended tag of six random uppercase letters and a plus sign, e.g.EOODIA+Poetica if the original font name isPoetica. This is relevant in the case that multiple Matplotlib plots are combined in the same document, e.g. in a LaTeX paper with several figures. The random tags prevent collisions between the different versions of the same font.

@aitikgupta
Copy link
ContributorAuthor

extra requirements for font subsets ... PostScript name of subsetted fonts needs to have a prepended tag of six random uppercase letters and a plus sign

I think we don't do this even for Type 3 subsetting?
Since wealways try to subset those, I can probably just add a function to modify the name for Type 3 and Type 42 both.

@aitikgupta
Copy link
ContributorAuthor

This breaks thetest_determinism_check:

deftest_determinism_check(objects,fmt,usetex):
"""
Output three times the same graphs and checks that the outputs are exactly
the same.

This is for obvious reasons, since we have random string inside font postscript name. (due to the specification)

@jkseppan
Copy link
Member

Then the string shouldn't be random, but perhaps something derived from the subset of glyphs? Takehash(frozenset(glyphs)), convert it to base 26 and take the first six letters, or something like that.

The PDF specification only requires the tag for Type 1 and TrueType fonts. It probably doesn't hurt with Type 3 fonts either.

@aitikgupta
Copy link
ContributorAuthor

Then the string shouldn't be random, but perhaps something derived from the subset of glyphs?

That totally makes sense! Let me try this out 👍🏼

@jklymakjklymak marked this pull request as draftJuly 22, 2021 13:42
@jklymak
Copy link
Member

@sauerburger can you suggest a test for that, or is it already in#20633?

@aitikgupta
Copy link
ContributorAuthor

can you suggest a test for that, or is it already in#20633?

The test is already in, and that is exactly what is breaking the CI for this PR.
I'll rebase and push a commit, fixing the error..

jklymak and sauerburger reacted with thumbs up emoji

@aitikguptaaitikgupta marked this pull request as ready for reviewJuly 22, 2021 16:34
@jkseppan
Copy link
Member

It seems to me that the requested changes have been made.

Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers

@jkseppanjkseppanjkseppan approved these changes

@QuLogicQuLogicQuLogic left review comments

@anntzeranntzeranntzer left review comments

@sauerburgersauerburgersauerburger left review comments

@tacaswelltacaswelltacaswell approved these changes

Assignees
No one assigned
Projects
None yet
Milestone
v3.5.0
7 participants
@aitikgupta@jklymak@QuLogic@jkseppan@sauerburger@anntzer@tacaswell

[8]ページ先頭

©2009-2025 Movatter.jp