Improve speed in some modules #22678
base: main
Conversation
I'm actually quite surprised (but remember being surprised by this before) that raising to a fractional power is pretty fast:

```
$ ipython
Python 3.11.0a6+ (heads/main:29624e769c, Mar 18 2022, 18:36:12) [GCC 11.2.0]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.2.0.dev -- An enhanced Interactive Python. Type '?' for help.

In [1]: import numpy as np

In [2]: %timeit np.sqrt(52)
476 ns ± 1.14 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

In [3]: %timeit 52 ** (1/2)
3.12 ns ± 0.018 ns per loop (mean ± std. dev. of 7 runs, 100,000,000 loops each)

In [4]: import math

In [5]: %timeit math.sqrt(52)
25.4 ns ± 0.125 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

In [6]: s = math.sqrt

In [7]: %timeit s(52)
25.4 ns ± 0.0733 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

In [8]: %timeit s(52.0)
14.9 ns ± 0.0463 ns per loop (mean ± std. dev. of 7 runs, 100,000,000 loops each)

In [9]: %timeit 52.0 ** (1/2)
3.15 ns ± 0.00981 ns per loop (mean ± std. dev. of 7 runs, 100,000,000 loops each)
```
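A likely explanation for the ~3 ns result (my reading, not stated explicitly in the thread): CPython folds constant expressions such as `52 ** (1/2)` at compile time, so `%timeit` only measures loading a precomputed constant, never an actual power operation. A minimal check with the stdlib `dis` module:

```python
import dis

# CPython's compile-time constant folding turns `52 ** (1/2)` into a
# single stored float, which would explain why the %timeit numbers above
# are an order of magnitude below even math.sqrt: the loop body never
# performs a power operation at runtime.
code = compile("52 ** (1/2)", "<demo>", "eval")

# On recent CPython the folded value sits directly in the constant table.
print(code.co_consts)
print([ins.opname for ins in dis.get_instructions(code)])
```

Note this only applies to literal operands; `N ** (1 / 2)` with a runtime `N` still pays for the power operation.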
Speaking of, can we write them as
```diff
@@ -1150,7 +1150,7 @@ def boxplot_stats(X, whis=1.5, bootstrap=None, labels=None,
     def _bootstrap_median(data, N=5000):
         # determine 95% confidence intervals of the median
         M = len(data)
-        percentiles = [2.5, 97.5]
+        percentiles = (2.5, 97.5)
```
I don't think this matters wrt speed (does it?); also I prefer a list here, as it matches the return type of np.percentile below more closely (an ndarray is closer to a list than to a tuple).
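For what it's worth, `np.percentile` accepts either sequence type and returns the same ndarray either way, so the list/tuple choice is stylistic rather than functional; a quick sketch:

```python
import numpy as np

data = np.arange(1, 101, dtype=float)

# List vs tuple of percentiles: np.percentile treats them identically
# and returns an ndarray in both cases, so the choice is purely stylistic.
from_list = np.percentile(data, [2.5, 97.5])
from_tuple = np.percentile(data, (2.5, 97.5))

assert isinstance(from_list, np.ndarray)
assert np.array_equal(from_list, from_tuple)
print(from_list)
```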
```diff
@@ -1221,7 +1222,8 @@ def _compute_conf_interval(data, med, iqr, bootstrap):
        stats['mean'] = np.mean(x)

        # medians and quartiles
-       q1, med, q3 = np.percentile(x, [25, 50, 75])
+       percentiles = (25, 50, 75)
+       q1, med, q3 = np.percentile(x, percentiles)
```
ditto, leave as before.
```diff
@@ -1169,8 +1169,9 @@ def _compute_conf_interval(data, med, iqr, bootstrap):
        else:
            N = len(data)
-           notch_min = med - 1.57 * iqr / np.sqrt(N)
-           notch_max = med + 1.57 * iqr / np.sqrt(N)
+           half_height = 1.57 * iqr / (N ** (1 / 2))
```
I guess you could inline len(data) into the computation here.
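A minimal sketch of what that inlining might look like; `_compute_notches` is a hypothetical stand-in for the surrounding function, not matplotlib's actual code:

```python
# Hypothetical sketch of the reviewer's suggestion: inline len(data) so
# the intermediate N is not needed. The 1.57 factor matches the diff above.
def _compute_notches(data, med, iqr):
    half_height = 1.57 * iqr / len(data) ** 0.5
    return med - half_height, med + half_height

# Toy input: len(data) == 4, so the denominator is exactly 2.
lo, hi = _compute_notches([1.0, 2.0, 3.0, 4.0], med=2.5, iqr=1.5)
print(lo, hi)  # approximately 1.3225 and 3.6775
```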
`vertices = np.array([[0.0, -1.0],`
I guess I would just write this as `vertices = np.array([...])` without even bothering with the `dtype=float`, which is unneeded; this also avoids a temporary variable (mostly avoids having to keep track of it mentally). Ditto below.
```diff
@@ -677,7 +677,7 @@ def _set_pentagon(self):
            self._path = polypath
        else:
            verts = polypath.vertices
-           y = (1 + np.sqrt(5)) / 4.
+           y = (1 + 5 ** (1 / 2)) / 4.
```
I'm not convinced of all these sqrt micro-optimizations. IMHO they make the code less readable, and the performance gain is minimal. For example here, the sqrt is less than 2% of the time of the immediate surrounding code in the function:

```
In [21]: %%timeit
    ...: Affine2D().scale(0.5)
    ...: polypath = Path.unit_regular_polygon(5)
    ...: verts = polypath.vertices
    ...: top = Path(verts[[0, 1, 4, 0]])
    ...: bottom = Path(verts[[1, 2, 3, 4, 1]])
    ...: left = Path([verts[0], verts[1], verts[2], [0, -y], verts[0]])
    ...: right = Path([verts[0], verts[4], verts[3], [0, -y], verts[0]])
    ...:
61.1 µs ± 228 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

In [22]: %timeit np.sqrt(5)
2.36 µs ± 35 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
```
and that function, `_set_pentagon()`, again is only used to create a new marker object, which in itself is quite rare. So I bet the performance gain is not measurable in any real example.
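To put rough numbers on "not measurable" without an IPython session, here is a small stdlib-only timing sketch along the lines of the comparisons in this thread (results are machine-dependent and purely illustrative):

```python
import timeit

# Machine-dependent sketch comparing the two scalar-sqrt spellings
# discussed in the thread. Note `5 ** 0.5` is constant-folded by
# CPython, so it mostly measures loop overhead rather than a sqrt.
n = 1_000_000
t_pow = timeit.timeit("5 ** 0.5", number=n)
t_math = timeit.timeit("math.sqrt(5)", setup="import math", number=n)

print(f"5 ** 0.5     : {t_pow / n * 1e9:.1f} ns/loop")
print(f"math.sqrt(5) : {t_math / n * 1e9:.1f} ns/loop")
```

Either way, nanoseconds per call are dwarfed by the tens of microseconds the surrounding path-construction code takes.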
timhoffm commented Dec 15, 2022 (edited)
@oscargus Thanks for trying to speed the code up. However, as said before: unless we can show there is a measurable performance gain in real situations, I propose to close this.
@oscargus is it OK to close this? I agree that some of the optimizations are marginal at best... In either case, moving to draft ;-)
PR Summary
Enable pre-computation during compilation and remove redundant computations.
Before:
After:
PR Checklist

Tests and Styling
- Has pytest style unit tests (and pytest passes).
- Is Flake 8 compliant (install flake8-docstrings and run flake8 --docstring-convention=all).

Documentation
- New features are documented, with examples if plot related.
- New features have an entry in doc/users/next_whats_new/ (follow instructions in README.rst there).
- API changes documented in doc/api/next_api_changes/ (follow instructions in README.rst there).