Proof of concept: Type42 subsetting in pdf #18143


Closed
jkseppan wants to merge 4 commits into matplotlib:master from jkseppan:subset-type42

Conversation

jkseppan
Member

PR Summary

Use fonttools to subset TrueType fonts when embedding them in Type42 format. This is a somewhat hacky proof of concept, but it seems to work:

    import matplotlib
    from matplotlib import pyplot as plt

    matplotlib.rcParams['pdf.fonttype'] = 42
    plt.plot([3, 1, 4, 1, 5, 9, 2])
    plt.title(r'$\pi$')
    plt.text(1, 5, 'Hellø World! ()℻ǘ ⇐⇑⇒⇓←↑→↓↴↵≀')
    plt.savefig('foo.pdf')

outputs

    SUBSET /Users/jks/matplotlib/lib/matplotlib/mpl-data/fonts/ttf/DejaVuSans-Oblique.ttf characters: π
    SUBSET /Users/jks/matplotlib/lib/matplotlib/mpl-data/fonts/ttf/DejaVuSans-Oblique.ttf 633840 -> 3052
    SUBSET /Users/jks/matplotlib/lib/matplotlib/mpl-data/fonts/ttf/DejaVuSans.ttf characters: ←↑→↓ !()0123456789↴℻↵≀H⇐⇑⇒⇓Wǘdelorø
    SUBSET /Users/jks/matplotlib/lib/matplotlib/mpl-data/fonts/ttf/DejaVuSans.ttf 756072 -> 11340

and produces the attached file foo.pdf, which looks fine in at least Preview.app. The debug output shows the size reduction from the original font file to the subset (before compression).

Do people think this would be worth pursuing? The fonttools library would be a new dependency, but it has been around for a long time and seems to be under development. It does raise a DeprecationWarning that seems quite pointless (you can just comment out the problematic import with no effect) but we could probably send them a PR to fix that. The library can also read and subset OpenType fonts and read Type-1 fonts (but it doesn't seem to include subsetting support for those).
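For readers unfamiliar with fonttools, the subsetting step described above might look roughly like the sketch below. It is not the code from this PR: the helper names `used_codepoints` and `subset_font` are made up for illustration, and the choice of `Options` flags is an assumption. Only the `fontTools.subset` calls themselves (`Options`, `load_font`, `Subsetter.populate`, `Subsetter.subset`) are part of the library.

```python
from io import BytesIO


def used_codepoints(texts):
    """Return the sorted Unicode code points that occur in the given strings."""
    return sorted({ord(ch) for text in texts for ch in text})


def subset_font(font_path, codepoints):
    """Return the bytes of a copy of the font reduced to the given code points.

    Hypothetical helper; requires fontTools, which is imported lazily so that
    used_codepoints() works without it.
    """
    from fontTools import subset

    # Assumed option choices: keep glyph names and the recommended glyphs
    # (such as .notdef) in the subset.
    options = subset.Options(glyph_names=True, recommended_glyphs=True)
    font = subset.load_font(font_path, options)
    subsetter = subset.Subsetter(options=options)
    subsetter.populate(unicodes=codepoints)
    subsetter.subset(font)

    buffer = BytesIO()
    font.save(buffer, reorderTables=False)
    return buffer.getvalue()
```

A call such as `subset_font(path_to_dejavu, used_codepoints(['Hellø World!']))` would then return the reduced font data, whose length can be compared with the size of the original file.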

PR Checklist

  • Has Pytest style unit tests
  • Code is Flake 8 compliant
  • New features are documented, with examples if plot related
  • Documentation is sphinx and numpydoc compliant
  • Added an entry to doc/users/next_whats_new/ if major new feature (follow instructions in README.rst there)
  • Documented in doc/api/next_api_changes/* if API changed in a backward-incompatible way

@jklymak
Member

Looks fine in Acrobat.

I'm not an authority on extra dependencies, but this one certainly looks reasonable so long as it pip installs on most machines. Looks like it's all Python?

Does this come at a huge speed hit in creating the files? i.e. is it something the user may want to toggle?

@jkseppan
Member, Author

> I'm not an authority on extra dependencies, but this one certainly looks reasonable so long as it pip installs on most machines. Looks like it's all Python?

Yes, it's pure Python. Some related projects are in C++, at least compreffor (something for reducing the size of tables in CFF fonts).

> Does this come at a huge speed hit in creating the files? i.e. is it something the user may want to toggle?

I didn't measure, but on the command line it felt pretty fast.

This would have to be toggleable on a per-font basis, because font subsetting seems to be a bit of an arcane art. Font specifications have evolved over the years and there are many old font files and many PDF consuming applications out there, so I would not be surprised if subsetting some specific font causes some specific PDF viewer to fail to display it.
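A per-font toggle could be as simple as a deny-list consulted before subsetting. The sketch below is purely illustrative: `NO_SUBSET_FONTS`, `should_subset` and `font_bytes_for_embedding` are hypothetical names, not existing matplotlib settings or functions, and `subset_func` stands for whatever subsetting routine is in use.

```python
import os

# Hypothetical deny-list: file names of fonts known to break when subsetted.
NO_SUBSET_FONTS = {"ProblemFont.ttf"}


def should_subset(font_path, deny_list=NO_SUBSET_FONTS):
    """Return True unless the font's file name is on the deny-list."""
    return os.path.basename(font_path) not in deny_list


def font_bytes_for_embedding(font_path, codepoints, subset_func,
                             deny_list=NO_SUBSET_FONTS):
    """Return subsetted font data, or the whole file for deny-listed fonts."""
    if should_subset(font_path, deny_list):
        return subset_func(font_path, codepoints)
    with open(font_path, "rb") as fh:
        return fh.read()
```

Keying the list on the file name keeps the sketch short; a real setting might prefer the font's family name instead.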

@anntzer
Contributor

fonttools seems like a reasonable dependency. I don't know how much we want to have type-42 subsetting (as in, is type-3 subsetting really not sufficient?), but I agree that if we do, we more or less have to bring fonttools in.

@jkseppan
Member, Author

I know that some publishers run a quality check on pdf files and reject them if there are any Type 3 fonts. I think this is because for a long time dvipdf/pdfTeX produced poor-quality Type 3 fonts, basically just TeX Metafonts rendered as bitmaps (since the conversion from Metafont to PostScript is not trivial). Eventually good-quality Type-1 versions of the TeX fonts became available but TeX systems had to be configured to use them, so requiring Type 1 instead of Type 3 was a simple way to ensure acceptable-quality fonts.

These days there probably is little reason for publishers not to accept files with Type 3 fonts, but when you have established that kind of quality check, it's hard to go back. Also I think I've heard that there are some uses of pdf files where Type 42 is actually better than Type 3, although I can't recall any details. Perhaps Asian language support? I'm sure there's some reason that both kinds of embeddings have been implemented.

anntzer reacted with thumbs up emoji

@QuLogic
Member

So is the only thing holding this up verifying whether it might break something? Or is there some more implementation to be done?

@@ -17,6 +17,8 @@ def pytest_configure(config):
("markers", "baseline_images: Compare output against references."),
("markers", "pytz: Tests that require pytz to be installed."),
("filterwarnings", "error"),
("filterwarnings",
"ignore:.*The py23 module has been deprecated:DeprecationWarning"),
Member, Author


This is probably not needed any more: see fonttools/fonttools#2035
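For context, the two added lines in the hunk tell pytest to ignore one specific DeprecationWarning while every other warning remains an error. A rough standard-library equivalent is sketched below; `install_test_filters` is a hypothetical name, and the message pattern is copied from the diff.

```python
import warnings


def install_test_filters():
    """Turn warnings into errors, except the py23 deprecation message."""
    warnings.simplefilter("error")
    # Filters added later take precedence, so this exception wins over "error".
    warnings.filterwarnings(
        "ignore",
        message=".*The py23 module has been deprecated",
        category=DeprecationWarning,
    )
```

If the upstream fix removes the warning, the extra filter simply never matches and can be dropped.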

with tempfile.NamedTemporaryFile(suffix='.ttf') as tmp:
    tmp.write(fontdata)
    tmp.seek(0, 0)
    font = FT2Font(tmp.name)
Member, Author


Reloading the FT2Font object is a bit ugly, and I think it is only needed here to get the glyph widths, the cid to gid map and the unicode mapping. These could probably be obtained otherwise. On the other hand, reusing the old code makes this patch smaller.
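One way the widths and mappings might be obtained without reloading through a temporary file is to read them from the subsetted data with fonttools directly. This is only a sketch under that assumption, not what the PR does: `read_metrics` and `to_pdf_units` are hypothetical helpers, and the result maps each code point to a glyph ID and a width in the 1/1000-em units that PDF width arrays use.

```python
from io import BytesIO


def to_pdf_units(advance, units_per_em):
    """Scale an advance width in font units to 1/1000 of an em."""
    return round(advance * 1000 / units_per_em)


def read_metrics(fontdata):
    """Map code point -> (glyph id, PDF width) for in-memory font data.

    Hypothetical helper; requires fontTools (imported lazily).
    """
    from fontTools.ttLib import TTFont

    font = TTFont(BytesIO(fontdata))
    units_per_em = font["head"].unitsPerEm
    cmap = font.getBestCmap()  # code point -> glyph name
    hmtx = font["hmtx"]        # glyph name -> (advance width, left side bearing)
    return {
        codepoint: (font.getGlyphID(name),
                    to_pdf_units(hmtx[name][0], units_per_em))
        for codepoint, name in cmap.items()
    }
```

Whether this is preferable to reusing the FT2Font code path depends on how much of the existing embedding code relies on FT2Font objects.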

''.join(chr(c) for c in characters)
)
print(f'SUBSET {filename} {os.stat(filename).st_size}'
f' ↦ {len(fontdata)}')
Member, Author


These should obviously be log calls at the debug level.

@aitikgupta mentioned this pull request on Jun 8, 2021
@tacaswell
Member

Moved to #20391

5 participants
@jkseppan, @jklymak, @anntzer, @QuLogic, @tacaswell
