Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

[RFC] Speed up Axes.bar#20092

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Closed
ivirshup wants to merge3 commits intomatplotlib:mainfromivirshup:faster-bar
Closed

Conversation

ivirshup
Copy link

@ivirshupivirshup commentedApr 27, 2021
edited
Loading

PR Summary

Currently Axes.bar callsself.add_patch for every bar. This can be quite slow for dense histograms. This PR instead puts the patches into a collection and adds that. This seems to give a large performance increase.

Details

Evidence of a speed up

frommatplotlibimportpyplotaspltimportnumpyasnpN=10_000x=np.arange(N)y=np.random.randn(N)%timeplt.bar(x,y)

Onmaster:

CPU times: user 4.97 s, sys: 64.2 ms, total: 5.03 sWall time: 5.06 s

On this branch:

CPU times: user 1.42 s, sys: 27.1 ms, total: 1.44 sWall time: 1.47 s

More thoughts

I don't think this counts as an API change. Of the failing tests, all the image comparison ones I've looked at so far don't look different by eye. Some tests fail because they are checking for artists which are now in a different place.

I thought it would be good to get feedback on whether this was a reasonable approach before making more changes. Any thoughts?

TODO

Based on the failing tests

  • Color patches correctly
  • Get legends working (not sure how patches get labels attached)
  • Figure out what's going on with misaligned axis in test_default_edges (bar plot is just floating off the bottom axis)
  • Fix log scaling inhist

PR Checklist

  • Has pytest style unit tests (andpytest passes).
  • IsFlake 8 compliant (runflake8 on changed files to check).
  • New features are documented, with examples if plot related.
  • Documentation is sphinx and numpydoc compliant (the docs shouldbuild without error).
  • Conforms to Matplotlib style conventions (installflake8-docstrings and runflake8 --docstring-convention=all).
  • New features have an entry indoc/users/next_whats_new/ (follow instructions in README.rst there).
  • API changes documented indoc/api/next_api_changes/ (follow instructions in README.rst there).

Copy link

@github-actionsgithub-actionsbot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Thank you for opening your first PR into Matplotlib!

If you have not heard from us in a while, please feel free to ping@matplotlib/developers or anyone who has commented on the PR. Most of our reviewers are volunteers and sometimes things fall through the cracks.

You can also join uson gitter for real-time discussion.

For details on testing, writing docs, and our review process, please seethe developer guide

We strive to be a welcoming and open project. Please follow ourCode of Conduct.

@jklymak
Copy link
Member

Thanks! I think this idea has been discussed before. I'm not sure of the technical issues, but in general it seems reasonable.

Some of your image changes are quite large according to the tests so it would be good to see before and after and decide what is wrong. The way to do that is to check in the new images as part of the PR (in tests/baseline_images) and we can see the difference.

@mwaskom
Copy link

FWIW this will also break all of the seaborn tests that assert the correctness of bar/hist plots by checking the attributes of the patch artists.

I don't know if there is some way to keep a "dummy" reference to artists in their original place that does not trigger a duplicate draw ... could be useful?

@ivirshup
Copy link
Author

@jklymak thanks for the quick look! I've added a TODO to the top of the PR with the problems I've noticed in the test plots.

Currently looking at legends forhist. Not totally sure how to approach it, but am currently thinking of modifying this part ofAxes.hist

# If None, make all labels None (via zip_longest below); otherwise,# cast each element to str, but keep a single str as it.labels= []iflabelisNoneelsenp.atleast_1d(np.asarray(label,str))forpatch,lblinitertools.zip_longest(patches,labels):ifpatch:p=patch[0]p.update(kwargs)iflblisnotNone:p.set_label(lbl)

And instead adding a new something (Artist I guess?) to the plot from the patches that would have been assigned labels.

example
labels= []iflabelisNoneelsenp.atleast_1d(np.asarray(label,str))legend_patches= []forpatch,lblinitertools.zip_longest(patches,labels):ifpatch:p=patch[0]p.update(kwargs)iflblisnotNone:p.set_label(lbl)legend_patches.append(p)forpinpatch[1:]:p.update(kwargs)p.set_label('_nolegend_')self.add_somehow(legend_patches)

Is there a more elegant way to do this?


@mwaskom, is it possible to expand the collections into lists of patches? Maybe a helper around this?

@tacaswell
Copy link
Member

Changing the return type is definitively an API change. Any user who is trying to do something with the returned is going to have their code broken.

You might be able to sub-class the Collection and make it look enough like the current return type, but it will be a lot of work as things like

bars = ax.bar(....)bars[1].set_color('green')bars[2].set_visible(False)bars[3].set_linewidth(15)*_, last, two = bars

all should keep working.


If your problem is primarily histograms@ivirshup you may do better either passinghisttype='step' or using the new in mpl34ax.stairs method. If you really want the bar look on histograms adding a new histtype (and accepting a bit more opt-in type instability inhist) may be easier to get done than changing the return type of allbar calls.

@jklymak
Copy link
Member

maybe we could haveax.bar(..., quick=True)? Thoughax.histbar isn't a bad idea.

@ivirshup
Copy link
Author

@tacaswell, my thing was mainly that I didn't want to have to consider the size of my data and choose the keyword arguments. Particularly as a library author, having changes to the plots around arbitrary data size cutoffs is not going to be fun to maintain. Also I think people generally are expecting something that looks like bars when the bins are wide enough to be able to tell.

I think your point is totally fair. I wasn't really sure what was considered to be key parts of the API though, since arguably any changes to public attributes of the object is breaking. I wasn't planning on changing the return type, just it arguably wouldn't be a particularly useful value. Basically, my thinking was if changing the contents ofax.containers isn't breaking, is changing the behaviour ofres = ax.bar(...); res[0].set_facecolor(...) breaking? But also, if the first is breaking, it would be difficult to make any improvements to this library.

TBH, if this got rejected here, my plan was to see if@mwaskom would be up for replacing the use ofax.bar inseaborn with someCollections, since initial tests make it look like this will cut plot time for large histograms by at least 50%.


From looking at the code a bit, I was actually wondering if the plan was to deprecate thecontainers module, since it's only used in a small subset of the library (specifically for bar and stem charts, I think). A little confusingly, some things that are annotated asContainer actually expect subclasses ofToolContainerBase which is unrelated.

container : Container
`backend_bases.ToolContainerBase` object that will get the tools added.

My point here being, maybe this would be worth it for 4.0? It can be quite slow at the moment, and the machinery for making it fast is all here. Using a Collection instead of a Container would also bring this function more in line with the rest of package.

@timhoffm
Copy link
Member

Every change that can break or change the result of currently valid user code is a breaking change to us.

if the first is breaking, it would be difficult to make any improvements to this library.

Unfortunately, it is. This is the downside of a very large user base. We have to be extermely careful with these.

Using a Collection for the bars is reasonable. Probably the way forward is similar tohist(..., histtype=...): Add a parameter to make the Artist configurable. The default stays on isolated bars, using a collection is opt-in. The default can be switched later.

A clever in between way may be returning a proxy incontainer.patch, that will resolve the PatchCollection into individual rectangles when trying to access individual elements. If the user does not mess with individual rectangles, he'll get the speedup, otherwise he'll get the current behavior. This is also not trivial to get right, but seems more managable than trying to write a Rectangle adapter mapping to the collection.

@QuLogicQuLogic modified the milestones:v3.5.0,v3.6.0Aug 23, 2021
@timhoffm
Copy link
Member

@ivirshup are you still interested in working on this (under the constraints of backward compatibility and the paths laid out in my last comment)?

@timhoffmtimhoffm modified the milestones:v3.6.0,unassignedMay 1, 2022
@ivirshup
Copy link
Author

@timhoffm, I don't think so. Many of the solutions suggested above got fairly involved, and I don't think I'm familiar enough with matplotlib internals to do this in a reasonable timeframe.

Feel free to close, though I think the conversation here definitely has value.

@timhoffm
Copy link
Member

@ivirshup thanks for the feedback. I've created issue#22976 with a concise description of the suggested solution. Therefore, I close this thread.

Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers

@github-actionsgithub-actions[bot]github-actions[bot] left review comments

Assignees
No one assigned
Projects
None yet
Milestone
No milestone
Development

Successfully merging this pull request may close these issues.

6 participants
@ivirshup@jklymak@mwaskom@tacaswell@timhoffm@QuLogic

[8]ページ先頭

©2009-2025 Movatter.jp