Uh oh!
There was an error while loading.Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork7.9k
Fix error with pyparsing 3 for 3.5.x#21454
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Uh oh!
There was an error while loading.Please reload this page.
Conversation
Ping@anntzer. Your proposed fix doesn't seem to work. |
anntzer commentedOct 24, 2021 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
Sorry, I initially missed that the error message (that we're matching against) slightly changed as well: diff --git a/lib/matplotlib/tests/test_mathtext.py b/lib/matplotlib/tests/test_mathtext.pyindex 75bc6af9b6..39f2242c85 100644--- a/lib/matplotlib/tests/test_mathtext.py+++ b/lib/matplotlib/tests/test_mathtext.py@@ -250,7 +250,9 @@ def test_fontinfo(): (r'$\leftF$', r'Expected a delimiter'), (r'$\rightF$', r'Unknown symbol: \rightF'), (r'$\left(\right$', r'Expected a delimiter'),- (r'$\left($', r'Expected "\right"'),+ # PyParsing 2 uses double quotes, PyParsing 3 uses single quotes and an+ # extra backslash.+ (r'$\left($', re.compile(r'Expected ("|\'\\)\\right["\']')), (r'$\dfrac$', r'Expected \dfrac{num}{den}'), (r'$\dfrac{}{}$', r'Expected \dfrac{num}{den}'), (r'$\overset$', r'Expected \overset{annotation}{body}'),@@ -281,8 +283,8 @@ def test_fontinfo(): ) def test_mathtext_exceptions(math, msg): parser = mathtext.MathTextParser('agg')-- with pytest.raises(ValueError, match=re.escape(msg)):+ match = re.escape(msg) if isinstance(msg, str) else msg+ with pytest.raises(ValueError, match=match): parser.parse(math) (it's basically the second commit of#21448) |
- Code suggestion taken frommatplotlib#21448- Removed all pyparsing pinning. If this runs through CI, the fix is proven to be working.
@@ -2044,7 +2044,7 @@ def __init__(self): | |||
p.accentprefixed <<= Suppress(p.bslash) + oneOf(self._accentprefixed) | |||
p.symbol_name <<= ( | |||
Combine(p.bslash + oneOf(list(tex2uni))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Could this also have been fixed by changingoneOf(list(tex2uni))
tooneOf(list(tex2uni), asKeyword=True)
and removing the FollowedBy altogether? This has the benefit of reducing the number of pyparsing terms to be matched, which should translate to faster parsing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Ithink the problem of asKeyword is that "word boundary" needs a custom definition here (e.g. underscores and digits also separate words). I this this is configurable on pyparsing's side, but I'd rather just write our own regex (something like(?![a-zA-Z])
) and be done with it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
This sounds like a better approach than trying to do custom keywords.
If youreally want to collapse this down to a singleRegex
, then you could useoneOf
to build aRegex
for you of just the word choices, but then extract the generated regex pattern into a newRegex
, something likeRegex(r"\\(" + oneOf(text2uni).pattern + ")(?![a-zA-Z])")
. (untested, ymmv, etc.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
tex symbols should not have any metacharacters, so the middle part is probably even just"|".join(tex2uni)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Beware of this -oneOf
also takes care of reordering longer entries before shorter in the event the shorter entry is a leading subset, and also deduping. Youcould just reverse sort the keys intex2uni
by length, but if there are known frequencies to some symbols vs others, and more frequent entries were placed first,oneOf
would only reorder them to avoid masking entries, whereas sorting by longest-to-shortest would lose some of this priority ordering.
You can test this for yourself:
re.match(r"ab|abb", "abb")
will return "ab", not the longer and more desirable match "abb". And thereare several cases in tex2uni of these kind of masking pairs.
Maybe keep it simple first, just do"|".join(sorted(set(tex2uni), key=len, reverse=True))
and then get clever with the frequency-based ordering as a later experiment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Ah, good point, thanks for mentioning that.
v3.5.x
directly. The main branch will be fixed throughUse named groups in mathtext parser. #21448 (and then only needs the unpinning as well).