Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Fix check 1d#22141

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged
tacaswell merged 3 commits intomatplotlib:mainfromjklymak:fix-check-1d
Mar 3, 2022
Merged

Fix check 1d#22141

tacaswell merged 3 commits intomatplotlib:mainfromjklymak:fix-check-1d
Mar 3, 2022

Conversation

jklymak
Copy link
Member

@jklymakjklymak commentedJan 7, 2022
edited
Loading

PR Summary

Closes#22125 and#22330

This was an issue with handling

x = pd.Series(x, dtype="Float32")

in the logic of our plot argument parsing.

x.values returns aFloatingArray. But it doesn't behave like a numpy array, in thatx.values[:, None] returns aValueError: values must be a 1D array. I actually think this is a Pandas bug, but I'll leave that to someone with more pandas knowledge to inform them. However, we can get around it by usingx.to_numpy() which seems to work fine.

PR Checklist

Tests and Styling

  • Has pytest style unit tests (andpytest passes).
  • IsFlake 8 compliant (installflake8-docstrings and runflake8 --docstring-convention=all).

Documentation

  • New features are documented, with examples if plot related.
  • New features have an entry indoc/users/next_whats_new/ (follow instructions in README.rst there).
  • API changes documented indoc/api/next_api_changes/ (follow instructions in README.rst there).
  • Documentation is sphinx and numpydoc compliant (the docs shouldbuild without error).

Comment on lines 1752 to 1753
for x in [pd.Series([1, 2], dtype="float64"),
pd.Series([1, 2], dtype="Float32")]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Is this specifically for "Float32" or would the test also do with "Float64"? If it's only the custom type and precision does not matter, I'd got with "Float64" to communicate that we're primarily testing the custom pandas type.

Side-note: It seems the capital "Float" types are not yet documented. There's onlya v1.2.0 change note and the GH issues linked therein.

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

As far as I can tell its undocumented. Maybe we should hold of on support, or maybe this PR can go in, but without the test?

I actually disagree with Pandas having a new type here - it seems people want this for pedantic reasons, but I think 99.999% of the world doesn't care if NaN means a failed computation or missing data. However, if they feel strongly, they should get numpy onboard, and then everyone will have this flag rather than making a new data type.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I think any of their capital F floats will work.

@tacaswelltacaswell added this to thev3.5.2 milestoneJan 11, 2022
@@ -1649,7 +1650,7 @@ def index_of(y):
The x and y values to plot.
"""
try:
return y.index.values, y.values
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Do we need both this change and the additional exception handling above?

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I don't know, but I think we should be in the habit of usingto_numpy() when possible?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

to_numpy came in via pandan 0.24 on Jan 25, 2019 so we can rely on it being there andhttps://stackoverflow.com/a/54508052/380231 makes an arguement in favor ofno_numpy.

I think this change may be un-related (or fix this by chance) but I do not think it will avoid needing the other one and does no harm.

tacaswell
tacaswell previously approved these changesJan 21, 2022
@jklymak
Copy link
MemberAuthor

@tacaswell, thanks for your review, but dismissing as this approach is completely different, and attempts to remove the pandas-ness of the data right at the beginning.

@jklymakjklymak dismissedtacaswell’sstale reviewJanuary 27, 2022 12:41

Approach now is completely different, so requires a re-review

@jklymakjklymak added the Release criticalFor bugs that make the library unusable (segfaults, incorrect plots, etc) and major regressions. labelJan 27, 2022
Copy link
Member

@tacaswelltacaswell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I have some concerns (mostly because any time we change complicated code we find that the complexity was there to handle some weird corner case that failed to get a test), but overall am 👍 on this as I think this will also fix other data containers. I think there is a risk that something subtle will break, but we are already dealing with something subtle breaking and I am optimistic that the cure will not be worse than the malady.

If you think of a DF as a 2D array with columns, then this is consistent with how we handle 2D arrays being passed intoplot.

@jklymak
Copy link
MemberAuthor

jklymak commentedJan 27, 2022
edited
Loading

This gets used in_plot_args, usually viaindex_of (which adds a y if it is missing).

iflen(xy)==2:
x=_check_1d(xy[0])
y=_check_1d(xy[1])
else:
x,y=index_of(xy[-1])
ifself.axes.xaxisisnotNone:
self.axes.xaxis.update_units(x)
ifself.axes.yaxisisnotNone:
self.axes.yaxis.update_units(y)

and it gets calledbefore we check units. So I guess what I am proposing here would strip an object that has ato_numpy method but somehow was being used by somebody for units.

This does give me pause that we have maybe done the wrong thing here, and are using_check_1d to both coerce a singleton to an arrayand to convert pandas series and dataframes to arrays. Maybe theright thing to do is just return the DataFrame, run through the converter, andthen coerce to numpy if it is not already.

Meh, it has done this conversion for a long time.

@jklymak
Copy link
MemberAuthor

@tacaswell, this simplified a bit more since your approval - if you wanted to double check, that would be appreciated.

@QuLogicQuLogic mentioned this pull requestFeb 25, 2022
2 tasks
@oscargus
Copy link
Member

In#22560 I addedcbook._unpack_pandas (so_unpack_pandas for this purpose) which basically doesto_numpy(), but also with a fallback tovalues. I did not touch the lines you changed here though.

I guess it can make sense to have a coordinates merge of this and that PR. If this is merged first, I'll update my PR. If my PR is merged first, it can make sense to use that function here.

@jklymak
Copy link
MemberAuthor

Well first it's not only pandas that has to_numpy so that is a bit of a misnomer. But also, why have a separate method at all?

@oscargus
Copy link
Member

It was suggested to use a separate function. Right now, slightly different approaches are used at different locations in the code. Sometimes a fallback tovalues, sometimes not. A benefit is possibly that it is enough to modify it in a single location (and that old Pandas versions are still supported with this fallback). Also, if we want to support new libraries that with some other name, it will be easy to do that. Well, all the standard reasons to factor out a piece of common code...

Regarding naming, I considered that, but I do not know which other libraries support that function. However, I do not really see the name neither written in stone nor something that should prohibit using a single function, there will be a name that is correct enough. (And as you can see, there are explicit comments mentioning pandas at all the other locations where it was used.)

@oscargus
Copy link
Member

I then assume that we merge this first, find a good name for the function and, if you want to strongly object using the function here, you can do that in#22560.

@jklymak
Copy link
MemberAuthor

I meant why have a separate method than check_1d? Our problem is inconsistent duckttping so if we can have it all in one spot that would be very helpful. If check1d does more than duck type pandas then sure it could call the ducktype converter.

I do actually wonder if all of this should just be part of the unit conversion machinery rather than cbook calls

@oscargus
Copy link
Member

I meant why have a separate method than check_1d?

I do not have an enough overview of the code base to see if one should/could have used check_1d (or check_2d?) instead. But if possible, that is of course even better.

@tacaswelltacaswell merged commit0359832 intomatplotlib:mainMar 3, 2022
meeseeksmachine pushed a commit to meeseeksmachine/matplotlib that referenced this pull requestMar 3, 2022
QuLogic added a commit that referenced this pull requestMar 3, 2022
…141-on-v3.5.xBackport PR#22141 on branch v3.5.x (Fix check 1d)
@jklymakjklymak deleted the fix-check-1d branchMarch 4, 2022 07:41
@mhvkmhvk mentioned this pull requestMay 4, 2022
6 tasks
Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers

@timhoffmtimhoffmtimhoffm left review comments

@tacaswelltacaswelltacaswell approved these changes

@dstansbydstansbydstansby approved these changes

Assignees
No one assigned
Labels
Release criticalFor bugs that make the library unusable (segfaults, incorrect plots, etc) and major regressions.third-party integration: pandas
Projects
None yet
Milestone
v3.5.2
5 participants
@jklymak@oscargus@tacaswell@timhoffm@dstansby

[8]ページ先頭

©2009-2025 Movatter.jp