Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

FIX: Remove type checking for strings in '_validate_linestyle'#8165

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged

Conversation

afvincent
Copy link
Contributor

This PRfixes#8145 .

The type checking of strings was raising errors when non Unicode strings were passed as arguments to_validate_linestyle (that was introduced in#8040), which can easily occur under Python 2.7 for example. This PR replaces this type checking with a simpletry: … except: … approach. It also adds a couple of test cases to explicitly test the situation of non Unicode arguments.

@afvincentafvincent added this to the2.0.1 (next bug fix release) milestoneFeb 27, 2017
@@ -926,7 +930,7 @@ def _validate_linestyle(ls):
# (called inside the instance of validate_nseq_float).
pass

raise ValueError("linestyle must be a string or " +
raise ValueError("linestyle must be avalidstring or " +
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Remove the "+" here?

Copy link
ContributorAuthor

@afvincentafvincentFeb 27, 2017
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I'll do that.Edit: done in278913d.

@afvincent
Copy link
ContributorAuthor

Well Travis is really angry at me for theb'dotted' test case ^^! (Too bad, the tests pass locally on my computer with Python 2.7…) I am going to investigate this.

@afvincent
Copy link
ContributorAuthor

Ok, it looks like Python 2 (, which I am working with) is less strict about string-like object comparisons:

# Python 2.7.12 |Continuum Analytics, Inc.| (default, Jul  2 2016, 17:42:40)In [1]:b'dotted'in [u'dotted']Out[1]:TrueIn [2]:'dotted'.encode('ascii')in [u'dotted']Out[2]:True

than Python 3:

# Python 3.6.0 |Continuum Analytics, Inc.| (default, Dec 23 2016, 12:22:00)In [1]:b'dotted'in ['dotted']Out[1]:FalseIn [2]:b'dotted'.decode()in ['dotted']Out[2]:TrueIn [3]:'dotted'.encode('ascii').decode()in ['dotted']Out[3]:True

The commit973dd5e should fix the previous failures, by using thedecode method if it is available.

@afvincent
Copy link
ContributorAuthor

Hum, Travis seems a bit happier with Python 3, but I just tried locally_validate_linestyle('dotted'.encode('utf-16') on Python 3, and now I understand better@tacaswell's comment in#8145 😞… It raises a niceUnicodeDecodeError as the enconding is not UTF-8.

What is the usual way to deal with this kind of encoding problem?

PS: I'll take take of the PEP8 issue, do not worry.

@afvincent
Copy link
ContributorAuthor

Ok, I had a look at howmatplotlib.text.Text handles the problem of encoding. It accepts “string or anything printable with '%s' conversion”. So actually, I wonder if it would not be better to do something similar in_validate_linestyle if we want to avoid avoid raising aUnicodeDecodeError exception when_validate_linestyle is given a “string” with an “exotic” encoding. For example

try:return_validate_named_linestyle("{:s}".format(ls))exceptKeyError:# ls is not a valid name of line style.pass

should avoid raising aUnicodeDecodeError with UTF-16 & Co. Or would it be better to really raise aUnicodeDecodeErrorexception? In this case,

try:return_validate_named_linestyle(six.text_type(ls))exceptKeyError:# ls is not a valid name of line style.pass

may be more consistent with the other string checking done inrcsetup rather than trying to call thedecode method if it is available (like in973dd5e).

@afvincent
Copy link
ContributorAuthor

Removing theneed_review flag for the moment because handling encoding has appear to be a bit harder than I expected and the proper way to do it has still to be chosen.

Copy link
Member

@tacaswelltacaswell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Remove attempts to decode.

@@ -342,6 +342,8 @@ def generate_validator_testcases(valid):
('', ''), (' ', ' '),
('None', 'none'), ('none', 'none'),
('DoTtEd', 'dotted'),
(b'dotted', 'dotted'), # binary string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

These should only pass on python2 (where str and bytes are conflated). In python 3 this is correctly failing asb'dotted' != 'dotted' because to do the comparison you would have to either decode the bytes (which we can't do because we don't know the encoding) or encode the string into the same encoding as the bytes (which we can't do because we don't know the encoding).

I suggest putting this in another test with a skip if python3 mark?

@tacaswell
Copy link
Member

Just pass the user input through, in 3, users passing bytes in is wrong and should fail, on 2 it happens to work due to the warts of string v unicode on python2.

@afvincent
Copy link
ContributorAuthor

Ok thanks for the comment@tacaswell . I'll do this in a moment.

Note to myself: go back to before the decode commit, and play withsix.PY3 in the test.

@afvincentafvincentforce-pushed thefix_issue_8145_ls_validation branch from973dd5e to9ae7265CompareMarch 1, 2017 10:35
@afvincent
Copy link
ContributorAuthor

Rebased! I squashed some similar commits and dropped the former one that attempted to decode the arguments.

@tacaswell No more attempt to decode the arguments, and I modified the tests to take care of the difference between Python 2 and Python 3. TBH, I struggled a bit to find a clean way to do the latter: I hope the style of the new version of the tests is fine. On my local computer with Python 2, the tests are passing (and the doc is building): hopefully Travis with Python 3 will agree.

@anntzer
Copy link
Contributor

The new approach may not play nicely (as in, cause a warning) due to the same issue as#7658? I haven't checked, but you may want to have a look.

@afvincent
Copy link
ContributorAuthor

@anntzer Thanks, I'll have a look this weekend.

@afvincentafvincent mentioned this pull requestMar 8, 2017
@afvincent
Copy link
ContributorAuthor

Based on@anntzer's remark, it appears that indeed

np.array([1,2])in ['dotted','solid']

raises aFutureWarningbecause one day the comparison that is done under the hood byin will be element-wise (the warning will then become aValueError exception, which does not belong to the ones that are currently captured in this PR). Actually, in the precise case of this PR, thisFutureWarning is not raised because we ignore the case, which calls.lower() before anyin operation, and thus raises anAttributeError error that is correctly caught if a Numpy array is given as on-off sequence.

With97234b5, I reintroduce some early type checking to avoid any comparison of a Numpy array with strings. I agree it is kind of a nuke for a problem that does not even currently exist, but some (super) thin reasons to support this may be:

  • it makes the exception catching clearer (one can remove theAttributeError that may be raised by.lower())
  • if one day one wantnot to ignore the case of the argument, it will still work properly… (eventhough this API break is not very likely to happen ;) )
  • well, it works as well as the other solution (at least on Python 2 the tests pass locally, I hope Travis will agree on Python 3 ^^).

@afvincent
Copy link
ContributorAuthor

@tacaswell Ping if you want to have a look when Travis will have run (TL; DR:decode is no more 🐑)

@anntzer I hope you will notice the effort ofnot usingis_string_like, despite the urge of doing so on first thought 😁. (ref:#8011)

if isinstance(ls, six.string_types):
try:
return _validate_named_linestyle(ls)
except (KeyError):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

no parentheses

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Indeed…844115f takes care of it.

@anntzer
Copy link
Contributor

@afvincent Congratulations, you've earned un bon point. (Dix points = une image. Dropping Python2 support = many many points :-))

@afvincent
Copy link
ContributorAuthor

@anntzer You know, I had a few teachers that were doing so in primary school 😄.

Abouy dropping Python2, well… Let's say that I made a “mistake” when I started to learn Python at the beginning of my Ph.D. But I promise, I'll switch for my next big research project (2020 is coming close anyway…). We'll see if the bounty points lasts until then ;). If it doesn't, at least I'll have access to all the wonderful things like @, f-strings & Co!

@anntzer
Copy link
Contributor

You know, I grew up in France... (in fact, pretty close to where you did your PhD...)

NelleV reacted with laugh emoji

@afvincent
Copy link
ContributorAuthor

Rahh, Travis does not seem to be happy (@anntzer you may want to take yourbon point back). I think I understand why (if I am right, I was lucky about exceptions catching before…): I'll push a commit when I will have a fix.

@afvincentafvincentforce-pushed thefix_issue_8145_ls_validation branch from844115f tod4bd0cfCompareMarch 9, 2017 20:33
@afvincent
Copy link
ContributorAuthor

Rebased (and squashed) with a new version! The former spirit is still there but now the exception catching should be more robust, both on Python 2 (a local pytest runs smoothly withtest_rcparams.py and the docs are building) and on Python 3 (manual testing in an interactive session, I hope Travis will agree with me this time).

For the record, previously on both Python versions, the exception handling was flaky or error-prone. For example on Python 2, aUnicodeDecodeError was not caught with arguments like'dotted'.encode('utf-16'), causing a short exit that was not expected. Unfortunately, this exception was then caught by pytest, which was expecting aValueError and was thus marking the test as fine… On Python 3, there was (at least) a funny doubly wrong behavior: byte-likels argument that were of even-length were passed to the instance ofvalidate_nseq_float, which was converting them into funny on-off ink sequences (for examplevalidate_nseq_float()('dotted'.encode('utf-16')) corresponds to[255.0, 254.0, 100.0, 0.0, 111.0, 0.0, 116.0, 0.0, 116.0, 0.0, 101.0, 0.0, 100.0, 0.0] 😄 )

@afvincent
Copy link
ContributorAuthor

Ok, now Travis stays calm :)! Ping to@tacaswell if you want to have a look, now that Travis is peacefull. And@anntzer because there have been quite a few changes since you had a look yesterday (sorry for that 🐑 ).

Copy link
Contributor

@anntzeranntzer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Minor fix.

# 'solid'.encode('utf-16'), may raise a unicode error.
raise ValueError("the linestyle string is not a valid string.")

if hasattr(ls, 'decode'):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

isinstance(ls, bytes) seems to be more explicit? (in the Py2 case this will already have been covered by the case above).

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Then I guess it should beisinstance(ls, (bytes, bytearray)) if one want to avoid jokes (and keep an identical behavior). I'll try to test that during the day.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

It'smemoryview on Py3, and it doesn't have adecode attribute. I think we should just not worry about that case (people can always pass in objects with arbitrarily messed up__len__/__iter__/...)

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Huh, on Python 3.6, I get

In [16]:hello=bytearray("Hello".encode('utf-16'))In [17]:hasattr(hello,'decode')Out[17]:True

so this looks like bytearray instancesdo have a decode method, don't they?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Oops sorry, was confusing with buffer/memoryview. Ignore what I said :-)

except (UnicodeDecodeError, KeyError):
# On Python 2, string-like *ls*, like for example
# 'solid'.encode('utf-16'), may raise a unicode error.
raise ValueError("the linestyle string is not a valid string.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

For debugging purpose of our users, could you add here what the value passed is?
raise ValueError("the linestyle string %r is not a valid string." % ls) should work.

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I am not sure@anntzer will support the idea of using old string formatting ^^. Joke apart, the idea to return argument that was passed seems OK to me: I'll do that during the day too.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Wait doesn't everyone usef"the linestyle string {ls} is not a valid string." these days? :-)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

👍 to this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

however you prefer the string formatting to be :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

When we let master branch go 3-only we probably can target 3.6+ if we go back to supporting the 'last 2' python minor versions and are targeting a 3 only release for 2018-07 (py3.7 is scheduled for 2018-06-15).....

@NelleVNelleV changed the titleFIX: Remove type checking for strings in '_validate_linestyle'[MRG+1] FIX: Remove type checking for strings in '_validate_linestyle'Mar 11, 2017
@afvincentafvincentforce-pushed thefix_issue_8145_ls_validation branch fromd4bd0cf toed04d93CompareMarch 11, 2017 10:42
@afvincent
Copy link
ContributorAuthor

Re-rebased: squashed to 2 single commits and reordered them (1. code changes, 2. update of the related test).

The only difference with the previous commits should be:

  • hasattr(ls, 'decode') <-isinstance(ls, (bytes, bytearray)), because “Explicit is better than implicit;”
  • more useful exception messages for debugging by adding the “representation” of thels argument (done with"Yada yada {!r} yada".format(ls), because 2020 is far away and I likeformat :) ).

@NelleV
Copy link
Member

This looks good to me.

@NelleVNelleV changed the title[MRG+1] FIX: Remove type checking for strings in '_validate_linestyle'[MRG+2] FIX: Remove type checking for strings in '_validate_linestyle'Mar 11, 2017
@anntzeranntzer merged commit3c77c28 intomatplotlib:masterMar 12, 2017
@anntzer
Copy link
Contributor

Thanks!

@QuLogicQuLogic changed the title[MRG+2] FIX: Remove type checking for strings in '_validate_linestyle'FIX: Remove type checking for strings in '_validate_linestyle'Mar 12, 2017
@afvincent
Copy link
ContributorAuthor

Thanks to all who reviewed the PR :). It was a windy journey, but an instructive one from my side ^^.

@tacaswell
Copy link
Member

You have hit on both why the bytes / unicode changes were made in python3 and why the transition has been a bit rough 😈 .

@QuLogic
Copy link
Member

There does not appear to be a_validate_linestyle inv2.0.x;@afvincent can you verify if this needs backporting?

@afvincent
Copy link
ContributorAuthor

@QuLogic Indeed, backporting does not seem required.#8040 (which is fixed by this PR) was not backported but milestoned for 2.1 (by@tacaswell) instead: that is why there is no_validate_linestyle in v2.0.x branch ⇒ removing theneed_backport flag.

Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers

@tacaswelltacaswelltacaswell approved these changes

@NelleVNelleVNelleV approved these changes

@anntzeranntzeranntzer approved these changes

Assignees
No one assigned
Projects
None yet
Milestone
v2.0.1
Development

Successfully merging this pull request may close these issues.

Warning treated as error while generating docs
5 participants
@afvincent@tacaswell@anntzer@NelleV@QuLogic

[8]ページ先頭

©2009-2025 Movatter.jp