Simplify definition of mathtext symbols & correctly end tokens in mathtext parsing #22950

Merged

tacaswell merged 3 commits into matplotlib:main from anntzer:mathtextsymbols

Aug 3, 2022

Conversation

Contributor

anntzer commented May 1, 2022

PR Summary

First commit: Simplify definition of mathtext symbols.

Use a single regex that handles both single_symbol (a single character)
and symbol_name (\knowntexsymbolname), and also slightly simplify the
"end-of-symbol-name" regex.

This parsing element comes up extremely often, and removing one
layer of indirection shaves ~3-4% off drawing all the current mathtext
tests, i.e.

MPLBACKEND=agg python -c 'import time; from pylab import *; from matplotlib.tests.test_mathtext import math_tests; fig = figure(figsize=(3, 10)); fig.text(0, 0, "\n".join(filter(None, math_tests)), size=6); start = time.perf_counter(); [fig.canvas.draw() for _ in range(10)]; print((time.perf_counter() - start) / 10)'
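As a rough illustration of the "single regex" idea (not the actual mathtext grammar, which is built with pyparsing), the sketch below uses one pattern with two alternatives: a backslash followed by letters, or a single ordinary character. `KNOWN_SYMBOLS`, `SYMBOL` and `read_symbol` are hypothetical names invented for this example.

```python
import re

# Hypothetical, tiny stand-in for the table of known TeX symbol names.
KNOWN_SYMBOLS = {"alpha", "doteq", "dot", "sin"}

# One pattern, two alternatives:
#   - a backslash followed by letters (a symbol name), or
#   - a single character that is not a backslash, brace, dollar or whitespace.
SYMBOL = re.compile(r"\\[A-Za-z]+|[^\\{}$\s]")


def read_symbol(text, pos=0):
    """Return (token, end_position), or None if no symbol starts at pos."""
    m = SYMBOL.match(text, pos)
    if m is None:
        return None
    tok = m.group()
    # Symbol names are validated after matching, in one place.
    if tok.startswith("\\") and tok[1:] not in KNOWN_SYMBOLS:
        raise ValueError(f"Unknown symbol: {tok}")
    return tok, m.end()
```

With a single pattern, each position in the input is tested once rather than once per alternative parser element, which is the kind of indirection the first commit removes.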

Second commit: Correctly end tokens in mathtext parsing.

This avoids parsing \sinx as \sin x (it now raises an error
instead), and removes the need for accentprefixed (because \doteq
is treated as a single token now, instead of \dot{eq}). This also
means that \doteq (and friends) are now correctly treated as relations
(per _relation_symbols, thus changing the spacing around them); hence
the change in baseline images. Only keep the x \doteq y baseline
(and adjust the test string to undo the spacing), to avoid regen'ing
baselines.

Also shaves ~2% off drawing all the current mathtext tests, i.e.

MPLBACKEND=agg python -c 'import time; from pylab import *; from matplotlib.tests.test_mathtext import math_tests; fig = figure(figsize=(3, 10)); fig.text(0, 0, "\n".join(filter(None, math_tests)), size=6); start = time.perf_counter(); [fig.canvas.draw() for _ in range(10)]; print((time.perf_counter() - start) / 10)'

(including adjustment for the two removed test cases), probably because
accentprefixed was previously extremely commonly checked, being at the
top of the placeable list; however, performance wasn't really the main
goal here.
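To make the "correctly end tokens" change concrete, here is a minimal sketch using plain `re` (again, not the real pyparsing grammar). It contrasts a loose pattern that accepts a command name as a prefix with a strict one that requires the name to end at a non-letter. `NAMES`, `LOOSE`, `STRICT` and `match_command` are assumptions made up for this example.

```python
import re

# Hypothetical command list; the real parser knows many more names.
NAMES = ["sin", "dot", "doteq"]

# Longest names first, so that "doteq" is tried before "dot".
alternation = "|".join(sorted(map(re.escape, NAMES), key=len, reverse=True))

# Loose: the command name may be a mere prefix of what follows the backslash.
LOOSE = re.compile(rf"\\(?:{alternation})")

# Strict: the name must not be followed by another letter.
STRICT = re.compile(rf"\\(?:{alternation})(?![A-Za-z])")


def match_command(pattern, text):
    """Return the matched command at the start of text, or None."""
    m = pattern.match(text)
    return m.group() if m else None
```

Under the strict pattern, `\sinx` and `\dota` no longer match any command, while `\sin x`, `\sin{x}` and `\doteq y` still do, so a special case for accent-prefixed names is no longer needed.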

PR Checklist

Tests and Styling

Has pytest style unit tests (and pytest passes).
Is Flake 8 compliant (install flake8-docstrings and run flake8 --docstring-convention=all).

Documentation

New features are documented, with examples if plot related.
New features have an entry in doc/users/next_whats_new/ (follow instructions in README.rst there).
API changes documented in doc/api/next_api_changes/ (follow instructions in README.rst there).
Documentation is sphinx and numpydoc compliant (the docs should build without error).

anntzer added the Performance, topic: text/mathtext, and PR: bugfix labels

May 1, 2022

Member

oscargus commented May 14, 2022 (edited)

Considering the doc-build failure: maybe one should also add some test in the main test suite for accents of the types \", \~, etc.? As far as I can see, only \acute and so on are tested.

anntzer force-pushed the mathtextsymbols branch from 1eb8b2c to 483dce2

May 15, 2022 18:28

Contributor, Author

anntzer commented May 15, 2022

Ah, good catch, fixed and added test.

oscargus approved these changes

May 17, 2022

Member

tacaswell commented May 17, 2022

I would prefer if we re-gen the test images here.

I think the '\ddots' symbol was one of the glyphs that was flat out wrong but still passing image tests with a tolerance.

tacaswell added this to the v3.6.0 milestone

May 17, 2022

Contributor, Author

anntzer commented May 17, 2022

Actually this reveals another bug: I deleted the ddots (etc.) test because they are now recognized as relation operators and extra spaces got added around them, but such spaces should actually not be there because the test string is r'$\dotplus$ $\doteq$ $\doteqdot$ $\ddots$', i.e. the relation operator is at (both) extremities of the dollar-enclosed part, in which case tex does not introduce a space. For a simpler example, consider figtext(.5, .5, "a$=b$", size=24) with or without usetex. With usetex, there's no space between "a" and "=" (whereas there's one between "=" and "b"), whereas mathtext introduces a space on both sides of the "=".

Fixing this bug (which probably involves reusing something like the "Binary operators at start of string should not be spaced" part of the code in def symbol()) should allow keeping the old ddots test, so I'll look into that...
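The rule described above can be sketched with a toy model (a deliberate simplification of TeX's inter-atom spacing, and not how mathtext is implemented): a relation symbol gets a space only on sides where it has a neighbour inside the math list. `RELATIONS`, `SPACE` and `space_relations` are hypothetical names for this illustration.

```python
# Hypothetical small set of relation symbols.
RELATIONS = {"=", "<", ">"}
# Stand-in for the space inserted around a relation.
SPACE = " "


def space_relations(tokens):
    """Pad relation symbols with a space, except at the edges of the math list."""
    out = []
    last = len(tokens) - 1
    for i, tok in enumerate(tokens):
        if tok in RELATIONS:
            # No space before a relation that starts the list,
            # and none after a relation that ends it.
            before = SPACE if i > 0 else ""
            after = SPACE if i < last else ""
            out.append(f"{before}{tok}{after}")
        else:
            out.append(tok)
    return "".join(out)
```

For the math list of "a$=b$", i.e. the tokens "=" and "b", this yields "= b": no space on the left edge, one space between "=" and "b", matching the usetex behaviour described in the comment.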

anntzer marked this pull request as draft

May 17, 2022 21:04

anntzer force-pushed the mathtextsymbols branch from 483dce2 to 2eb3040

June 11, 2022 21:48

Contributor, Author

anntzer commented Jun 11, 2022

I went for the easier path of just adding \hspace{-0.2} as needed to fix the images.

anntzer marked this pull request as ready for review

June 11, 2022 21:49

anntzer force-pushed the mathtextsymbols branch from 2eb3040 to dfb2db7

June 11, 2022 21:52

anntzer mentioned this pull request

Jun 11, 2022

Add support for more accents in mathtext #23189

Draft

10 tasks

oscargus added the status: needs review label

Jun 12, 2022

Simplify definition of mathtext symbols.

56a5153

Use a single regex that handles both single_symbol (a single character)
and symbol_name (`\knowntexsymbolname`), and also slightly simplify the
"end-of-symbol-name" regex.

This parsing element comes up extremely often, and removing one
layer of indirection shaves ~3-4% off drawing all the current mathtext
tests, i.e.

```
MPLBACKEND=agg python -c 'import time; from pylab import *; from matplotlib.tests.test_mathtext import math_tests; fig = figure(figsize=(3, 10)); fig.text(0, 0, "\n".join(filter(None, math_tests)), size=6); start = time.perf_counter(); [fig.canvas.draw() for _ in range(10)]; print((time.perf_counter() - start) / 10)'
```

anntzer force-pushed the mathtextsymbols branch from dfb2db7 to 32f1fa2

June 16, 2022 22:17

QuLogic mentioned this pull request

Jun 21, 2022

Relation operator in mathtext should not be spaced when at end #23315

Open

jklymak requested a review from tacaswell

June 23, 2022 08:08

Member

jklymak commented Jun 23, 2022

@tacaswell can you re-review to be sure your concerns are met?

jklymak removed the request for review from tacaswell

June 30, 2022 09:28

Member

jklymak commented Jun 30, 2022

@anntzer it looks like you have still removed a bunch of baseline images. Are we sure those are still tested?

anntzer force-pushed the mathtextsymbols branch from 32f1fa2 to d58306f

July 3, 2022 18:34

Contributor, Author

anntzer commented Jul 3, 2022

Yes I am sure, this is only removing test 77, which checks that "accentprefixed" commands are correctly interpreted (e.g. \doteq is not interpreted as \dot eq), but this is essentially also covered by the r'$\dotplus$ $\doteq$ $\doteqdot$ $\ddots$' test just above, and by the \sinx test I added below to check, more generally, that spaces (or braces) are required after operators now (I also added a similar \dota test for good measure).

Member

tacaswell commented Jul 4, 2022

Is it worth an API change note on the spacing?

Contributor, Author

anntzer commented Jul 4, 2022

Changelog entry added; also added dotminus to the spaced operators, as it was clearly missing before.
Also moved dotplus and dotminus from "relational operators" to "binary operators" (see e.g. https://mirrors.ircam.fr/pub/CTAN/macros/unicodetex/latex/unicode-math/unimath-symbols.pdf). This actually has no effect on our rendering because we use the same amount of space for both, even though that is a simplification over tex's algorithm (https://tex.stackexchange.com/a/38986).

anntzer added 2 commits

July 5, 2022 10:16

Correctly end tokens in mathtext parsing.

3efad3b

This avoids parsing `\sinx` as `\sin x` (it now raises an error
instead), and removes the need for `accentprefixed` (because `\doteq`
is treated as a single token now, instead of `\dot{eq}`).  This also
means that `\doteq` (and friends) are now correctly treated as relations
(per `_relation_symbols`, thus changing the spacing around them); hence
the change in baseline images.  Adjust test strings accordingly to undo
the spacing, to avoid regen'ing baselines.

Also shaves ~2% off drawing all the current mathtext tests, i.e.

```
MPLBACKEND=agg python -c 'import time; from pylab import *; from matplotlib.tests.test_mathtext import math_tests; fig = figure(figsize=(3, 10)); fig.text(0, 0, "\n".join(filter(None, math_tests)), size=6); start = time.perf_counter(); [fig.canvas.draw() for _ in range(10)]; print((time.perf_counter() - start) / 10)'
```

(including adjustment for the removed test case), probably because
accentprefixed was previously extremely commonly checked, being at the
top of the placeable list; however, performance wasn't really the main
goal here.

Make dotplus, dotminus binary operators.

b94addd

anntzer force-pushed the mathtextsymbols branch from a1ec777 to b94addd

July 5, 2022 08:16