Description
Problem
- Violin plots for log scaled data currently require transforming the data by log-scaling and then modifying the ticks inversely to match. This becomes problematic if you are plotting multiple data forms, etcetera.
- Seaborn implemented a `log_scale` capability in version 0.13, and I think matplotlib could do the same.
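For reference, the workaround described in the first bullet looks roughly like the sketch below. The data is synthetic and the tick placement at whole powers of ten is only one possible choice; both are assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
# Synthetic, strictly positive data spanning several decades (illustrative only)
data = rng.lognormal(mean=2.0, sigma=1.0, size=500)

fig, ax = plt.subplots()

# Step 1: log-transform the data by hand before plotting
ax.violinplot([np.log10(data)], showmedians=True)

# Step 2: relabel the (still linear) axis so ticks read as original values
exponents = np.arange(np.floor(np.log10(data.min())),
                      np.ceil(np.log10(data.max())) + 1)
ax.set_yticks(exponents)
ax.set_yticklabels([f"$10^{{{int(e)}}}$" for e in exponents])
```

The axis here is still a linear axis showing exponents, so anything else drawn on the same axes has to be log-transformed by hand as well.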
Proposed solution
I believe all that is needed is to allow for a log version of the KDE.
In log scale, that would look like:
    def _kde_method(X, coords):
        # Unpack in case of e.g. Pandas or xarray object
        X = matplotlib.cbook._unpack_to_numpy(X)
        # fallback gracefully if the vector contains only one value
        if np.all(X[0] == X):
            return (X[0] == coords).astype(float)
        X_log = np.log(X)  # Transform to log space
        kde = matplotlib.mlab.GaussianKDE(X_log, bw_method)
        # Return KDE evaluated at log-transformed coordinates
        coords_log = np.log(coords)  # Transform to log space
        return kde.evaluate(coords_log)
So in `matplotlib/lib/matplotlib/axes/_axes.py:violinplot()`, the `scale` could just be passed as an option to `_kde_method` to allow switching between the current form and this one.
You would probably also want to pass the `scale` into `matplotlib/lib/matplotlib/cbook.py:violin_stats()`, so that the coords, rather than being `coords = np.linspace(min_val, max_val, points)`, are instead `coords = np.geomspace(min_val, max_val, points)`.
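To make the two changes concrete, here is a minimal standalone sketch of the idea, not a patch against matplotlib. The `scale` argument and the function names are hypothetical, and the hand-written Gaussian KDE (Scott's rule) is only a stand-in for `matplotlib.mlab.GaussianKDE` so that the example needs nothing but NumPy.

```python
import numpy as np


def gaussian_kde_1d(samples, coords):
    """Plain 1-D Gaussian KDE with Scott's rule (stand-in, not matplotlib's class).

    The single-value fallback from the snippet above is omitted for brevity.
    """
    n = samples.size
    bandwidth = samples.std(ddof=1) * n ** (-1 / 5)
    z = (coords[:, None] - samples[None, :]) / bandwidth
    return np.exp(-0.5 * z**2).sum(axis=1) / (n * bandwidth * np.sqrt(2 * np.pi))


def violin_density(samples, points=100, scale="linear"):
    """Return (coords, density). `scale` is a hypothetical option."""
    samples = np.asarray(samples, dtype=float)
    min_val, max_val = samples.min(), samples.max()
    if scale == "log":
        if min_val <= 0:
            raise ValueError("log scale requires strictly positive data")
        # Geometrically spaced coords are evenly spaced on a log axis
        coords = np.geomspace(min_val, max_val, points)
        # Estimate and evaluate the KDE entirely in log space
        return coords, gaussian_kde_1d(np.log(samples), np.log(coords))
    coords = np.linspace(min_val, max_val, points)
    return coords, gaussian_kde_1d(samples, coords)


rng = np.random.default_rng(0)
data = rng.lognormal(mean=2.0, sigma=1.0, size=1000)  # synthetic example data
coords_lin, dens_lin = violin_density(data, scale="linear")
coords_log, dens_log = violin_density(data, scale="log")
```

Note that in the `"log"` branch the returned values are a density over `log(x)`, which is what gives a symmetric violin for log-normal data when drawn on a log axis.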
I've checked and this does create reasonable plots, and I think this is all that would need to be done.
All other stats like quantiles remain unchanged, though perhaps one could debate whether `mean` should be calculated in log scale or not.
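The question about the mean can be illustrated with a small sketch on synthetic log-normal data (an assumption for illustration): averaging in log space gives the geometric mean, which differs from the arithmetic mean, whereas the median is essentially the same either way.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.lognormal(mean=2.0, sigma=1.0, size=1000)  # synthetic example data

# Mean computed on the raw values
arithmetic_mean = data.mean()
# Mean computed in log space, then mapped back: the geometric mean
geometric_mean = np.exp(np.log(data).mean())

# The median survives the monotonic log transform (up to the
# interpolation between the two middle samples)
median_linear = np.median(data)
median_via_log = np.exp(np.median(np.log(data)))

print(f"arithmetic mean: {arithmetic_mean:.3f}")
print(f"geometric mean:  {geometric_mean:.3f}")
print(f"median:          {median_linear:.3f} (via log: {median_via_log:.3f})")
```

For skewed positive data the geometric mean is always below the arithmetic mean, so the choice visibly moves the mean marker on the violin.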