Description
Bug summary
There are apparently 3 problems which combine to make `savefig` slow: (1) the use of many sub-plots, (2) the use of `bbox_inches='tight'`, and (3) the use of `sharex='col'`. Unfortunately I need to use all three.
Code for reproduction
```python
%matplotlib inline
from io import BytesIO

import numpy as np
from matplotlib.figure import Figure

# Random Number Generator.
rng = np.random.default_rng()

# Constants.
figsize = (10, 6)
ncols = 3
nrows = 10
size = 100
size_total = ncols * nrows * size

# Figure with many subplots.
fig_many = Figure(figsize=figsize)
axs_many = fig_many.subplots(ncols=ncols, nrows=nrows)

# Figure with many subplots and sharex='col'.
fig_many_sharex = Figure(figsize=figsize)
axs_many_sharex = fig_many_sharex.subplots(ncols=ncols, nrows=nrows, sharex='col')

# Figure with a single axes.
fig_single = Figure(figsize=figsize)
ax_single = fig_single.subplots()


# Helper-function: Generate random line-plots in the many subplots.
def generate_fig_many(axs):
    for row in range(nrows):
        for col in range(ncols):
            ax = axs[row, col]
            x = rng.normal(loc=row + 1, scale=col + 1, size=size)
            y = rng.normal(loc=col + 1, scale=row + 1, size=size)
            x = np.sort(x)
            ax.plot(x, y)
            ax.set_yticks([])


# Generate fig_many
generate_fig_many(axs=axs_many)
fig_many.tight_layout()

# Generate fig_many_sharex
generate_fig_many(axs=axs_many_sharex)
fig_many_sharex.tight_layout()

# Generate fig_single
x = rng.normal(size=size_total)
y = rng.normal(size=size_total)
x = np.sort(x)
ax_single.plot(x, y)
fig_single.tight_layout()

# The following code-chunks were run in individual Jupyter cells.

%%timeit
stream = BytesIO()
fig_single.savefig(stream, format='svg')
s = stream.getvalue()
# 29.2 ms ± 168 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%%timeit
stream = BytesIO()
fig_single.savefig(stream, format='svg', bbox_inches='tight')
s = stream.getvalue()
# 102 ms ± 6.03 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%%timeit
stream = BytesIO()
fig_many.savefig(stream, format='svg')
s = stream.getvalue()
# 374 ms ± 4.17 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%%timeit
stream = BytesIO()
fig_many.savefig(stream, format='svg', bbox_inches='tight')
s = stream.getvalue()
# 1.4 s ± 12.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%%timeit
stream = BytesIO()
fig_many_sharex.savefig(stream, format='svg')
s = stream.getvalue()
# 565 ms ± 5.58 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%%timeit
stream = BytesIO()
fig_many_sharex.savefig(stream, format='svg', bbox_inches='tight')
s = stream.getvalue()
# 2.22 s ± 20.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%%timeit
stream = BytesIO()
fig_many_sharex.savefig(stream, format='jpg', bbox_inches='tight')
s = stream.getvalue()
# 2.17 s ± 21 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%%timeit
stream = BytesIO()
fig_many_sharex.savefig(stream, format='png', bbox_inches='tight')
s = stream.getvalue()
# 2.19 s ± 31.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
```
Actual outcome
The test results are summarized in this table. All timings are in milliseconds and are for the SVG format. A few tests were also made above for the JPG and PNG formats, and the results are similar.
| Figure | no bbox | `bbox_inches='tight'` | `layout='constrained'` | `layout='tight'` |
|---|---|---|---|---|
| fig_single | 29 | 102 | 30 | 30 |
| fig_many | 374 | 1,400 | 1,410 | 1,340 |
| fig_many_sharex | 565 | 2,220 | 2,220 | 2,110 |
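
To make the relative cost easier to read, the slowdown factors implied by the table can be worked out with a few lines of Python. This is only arithmetic on the timings reported above; the dictionary keys `plain` and `bbox_tight` are names chosen for this sketch.

```python
# Timings in milliseconds, copied from the table above (SVG output).
# The key names 'plain' and 'bbox_tight' are just labels for this sketch.
timings_ms = {
    'fig_single': {'plain': 29, 'bbox_tight': 102},
    'fig_many': {'plain': 374, 'bbox_tight': 1400},
    'fig_many_sharex': {'plain': 565, 'bbox_tight': 2220},
}

# Slowdown factor from adding bbox_inches='tight' to savefig.
slowdown = {
    name: t['bbox_tight'] / t['plain'] for name, t in timings_ms.items()
}

for name, factor in slowdown.items():
    print(f"{name}: bbox_inches='tight' is about {factor:.1f}x slower")
```

So `bbox_inches='tight'` costs roughly 3.5x to 4x in every case, on top of the extra time already caused by the many sub-plots and by `sharex='col'`.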
Edit: Added time-usage for setting either `layout='constrained'` or `'tight'` when creating the `Figure` objects.
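
The exact code for the two `layout` columns is not shown in the reproduction script. A minimal sketch of what such a test could look like is below, assuming that the `layout` keyword is passed to the `Figure` constructor, that the explicit `tight_layout()` call is dropped, and that `savefig` is called without `bbox_inches`. The variable name `fig_constrained` is made up for this sketch.

```python
from io import BytesIO

from matplotlib.figure import Figure

figsize = (10, 6)

# Assumption: the layout engine is chosen when the Figure is created,
# instead of calling fig.tight_layout() afterwards.
fig_constrained = Figure(figsize=figsize, layout='constrained')
axs = fig_constrained.subplots(ncols=3, nrows=10, sharex='col')

# Assumption: no bbox_inches argument is passed in this variant.
stream = BytesIO()
fig_constrained.savefig(stream, format='svg')
svg_bytes = stream.getvalue()
```

Replacing `'constrained'` with `'tight'` gives the variant for the last column of the table.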
Expected outcome
I would like it to run like this (you asked for a visual example).
Additional information
Thanks for making Matplotlib, I've been using it for many open-source projects in the past!
I am currently building a web-app where Matplotlib will be generating many SVG plots on a server that is running in the cloud. My own functions for generating the data are very fast, but unfortunately the plotting itself is very slow. For example, a figure with 3 columns and 10 rows of sub-plots takes 7 seconds to run `savefig`, even though most of the sub-plots only contain a simple text-string such as "Same as previous", and the few other sub-plots are either line-plots or `fill_between` plots that are generated from just 100 data-points each.
I have tried simulating this problem in the sample code above, where `fig_many` has many sub-plots, and `fig_single` has a single plot with the same total number of data-points. I also tried using a profiler on this code, but it would take me forever to try and understand what the problem is in Matplotlib's code, and whether it's even fixable.
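
For anyone who wants to repeat the profiling, here is a minimal sketch using Python's standard `cProfile` and `pstats` modules. It is not necessarily the profiler setup used for this report, and it builds a simplified version of `fig_many_sharex` rather than the exact figure from the reproduction code.

```python
import cProfile
import pstats
from io import BytesIO, StringIO

import numpy as np
from matplotlib.figure import Figure

# Simplified stand-in for fig_many_sharex (an assumption of this sketch).
rng = np.random.default_rng()
fig = Figure(figsize=(10, 6))
axs = fig.subplots(ncols=3, nrows=10, sharex='col')
for ax in axs.flat:
    ax.plot(np.sort(rng.normal(size=100)), rng.normal(size=100))

# Profile only the savefig call.
profiler = cProfile.Profile()
profiler.enable()
fig.savefig(BytesIO(), format='svg', bbox_inches='tight')
profiler.disable()

# Collect the 15 entries with the highest cumulative time.
report = StringIO()
stats = pstats.Stats(profiler, stream=report)
stats.sort_stats('cumulative').print_stats(15)
print(report.getvalue())
```

Sorting by cumulative time shows which high-level Matplotlib calls dominate `savefig`, which is a reasonable starting point before digging into individual functions.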
Please tell me if it might be possible to improve the speed, or if it's not possible then please explain the technical reason, and whether there is a work-around.
Thanks!
Operating system
Kubuntu 22
Matplotlib Version
3.7.1
Matplotlib Backend
module://matplotlib_inline.backend_inline
Python version
3.9.12
Jupyter version
6.4.12 (through VSCode)
Installation
pip