Description
Feature or enhancement
Proposal:
Currently, the sum() builtin lacks any specialization for complex numbers, yet it is usually still faster than pure-Python alternatives.
benchmark: sum() vs. pure-Python versions
# a.py
from random import random, seed

seed(1)
data = [complex(random(), random()) for _ in range(10)]

def msum(xs):
    it = iter(xs)
    res = next(it)
    for z in it:
        res += z
    return res

def sum2(xs):
    return complex(sum(_.real for _ in xs), sum(_.imag for _ in xs))
$ ./python -m timeit -r11 -unsec -s 'from a import data, msum' 'sum(data)'
500000 loops, best of 11: 963 nsec per loop
$ ./python -m timeit -r11 -unsec -s 'from a import data, msum' 'msum(data)'
200000 loops, best of 11: 1.31e+03 nsec per loop

Using sum() component-wise is hardly an option:
$ ./python -m timeit -r11 -unsec -s 'from a import data, sum2' 'sum2(data)'
50000 loops, best of 11: 8.56e+03 nsec per loop

Unfortunately, using this builtin directly in numeric code doesn't make much sense, as the results are (usually) inaccurate. It's not too hard to do the summation component-wise with math.fsum(), but that is slow and there might be a better way.
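For reference, a rough sketch of the component-wise math.fsum() approach mentioned above (the helper name fsum_complex is purely illustrative, not an existing API). It keeps each component correctly rounded, at the cost of materializing the data and making two extra passes over it:

# Sketch only: accurate but slow component-wise summation via math.fsum().
from math import fsum

def fsum_complex(xs):
    xs = list(xs)          # allow a single-use iterator to be traversed twice
    return complex(fsum(z.real for z in xs), fsum(z.imag for z in xs))

# Example where naive left-to-right summation loses the low-order bits:
#   sum([1e16+1j, 1+1e16j, -1e16-1e16j])          -> 0j
#   fsum_complex([1e16+1j, 1+1e16j, -1e16-1e16j]) -> (1+1j)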
In #100425 a simple algorithm using compensated summation was implemented in sum() for floats. I propose to (1) add a specialization in sum() for complex numbers, and (2) reuse the #100425 code to implement accurate summation of complex numbers.
(1) is simple and straightforward, yet it will give a measurable performance boost.
benchmark: sum() on main vs. with the added specialization for complex
diff --git a/Python/bltinmodule.c b/Python/bltinmodule.c
index 6e50623caf..da0eed584a 100644
--- a/Python/bltinmodule.c
+++ b/Python/bltinmodule.c
@@ -2691,6 +2691,59 @@ builtin_sum_impl(PyObject *module, PyObject *iterable, PyObject *start)
             }
         }
     }
+
+    if (PyComplex_CheckExact(result)) {
+        Py_complex c_result = PyComplex_AsCComplex(result);
+        Py_SETREF(result, NULL);
+        while (result == NULL) {
+            item = PyIter_Next(iter);
+            if (item == NULL) {
+                Py_DECREF(iter);
+                if (PyErr_Occurred())
+                    return NULL;
+                return PyComplex_FromCComplex(c_result);
+            }
+            if (PyComplex_CheckExact(item)) {
+                Py_complex x = PyComplex_AsCComplex(item);
+                c_result.real += x.real;
+                c_result.imag += x.imag;
+                Py_DECREF(item);
+                continue;
+            }
+            if (PyLong_Check(item)) {
+                long value;
+                int overflow;
+                value = PyLong_AsLongAndOverflow(item, &overflow);
+                if (!overflow) {
+                    c_result.real += (double)value;
+                    c_result.imag += 0.0;
+                    Py_DECREF(item);
+                    continue;
+                }
+            }
+            if (PyFloat_Check(item)) {
+                double value = PyFloat_AS_DOUBLE(item);
+                c_result.real += value;
+                c_result.imag += 0.0;
+                Py_DECREF(item);
+                continue;
+            }
+            result = PyComplex_FromCComplex(c_result);
+            if (result == NULL) {
+                Py_DECREF(item);
+                Py_DECREF(iter);
+                return NULL;
+            }
+            temp = PyNumber_Add(result, item);
+            Py_DECREF(result);
+            Py_DECREF(item);
+            result = temp;
+            if (result == NULL) {
+                Py_DECREF(iter);
+                return NULL;
+            }
+        }
+    }
 #endif
     for(;;) {
main:
$ ./python -m timeit -r11 -unsec -s 'from a import data, msum' 'sum(data)'
500000 loops, best of 11: 963 nsec per loop

with specialization:
$ ./python -m timeit -r11 -unsec -s 'from a import data, msum' 'sum(data)'
500000 loops, best of 11: 606 nsec per loop

(2) also seems to be a straightforward task: a simple refactoring of the PyFloat specialization should allow us to use the same core for the PyComplex specialization.
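To illustrate what that shared core would compute for complex inputs, here is a rough pure-Python sketch of compensated (Neumaier-style) summation applied independently to the real and imaginary components; the function name and structure are illustrative only, not the actual C implementation:

# Sketch: compensated (Neumaier) summation, applied component-wise.
def compensated_complex_sum(xs):
    def add(hi, lo, x):
        # hi holds the running sum, lo accumulates the rounding error.
        t = hi + x
        if abs(hi) >= abs(x):
            lo += (hi - t) + x   # low-order bits of x lost when adding to hi
        else:
            lo += (x - t) + hi   # low-order bits of hi lost when adding to x
        return t, lo

    re_hi = re_lo = im_hi = im_lo = 0.0
    for z in xs:
        z = complex(z)           # accept ints and floats as well
        re_hi, re_lo = add(re_hi, re_lo, z.real)
        im_hi, im_lo = add(im_hi, im_lo, z.imag)
    return complex(re_hi + re_lo, im_hi + im_lo)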
If there are no objections, I'll work on a patch.
Has this already been discussed elsewhere?
This is a minor feature, which does not need previous discussion elsewhere
Links to previous discussion of this feature:
No response