Description
I've been benchmarking complex matrix multiplication in NumPy and noticed that a fairly simple optimization (computing the complex product with three real matmuls instead of four) hasn't been implemented. Is there a design-related reason for that, or can I try to contribute it?
Reproducing code example:
```python
import numpy as np

m, n, k = (2048, 4096, 2048)
A = np.random.uniform(size=(m, n)) + 1j * np.random.uniform(size=(m, n))
B = np.random.uniform(size=(n, k)) + 1j * np.random.uniform(size=(n, k))

C = A @ B
# CPU times: user 13.8 s, sys: 84.6 ms, total: 13.9 s
# Wall time: 7.45 s

# Same product via three real matmuls instead of four
C1 = A.real @ B.real
C2 = A.imag @ B.imag
C3 = (A.real + A.imag) @ (B.real + B.imag)
C = (C1 - C2) + 1j * (C3 - C1 - C2)
# CPU times: user 7.4 s, sys: 178 ms, total: 7.57 s
# Wall time: 3.37 s

# Check that relative error is fine
np.linalg.norm(C - A @ B) / np.linalg.norm(A @ B)
# 3.9738720532213243e-16
```
(Execution time is measured with Jupyter's `%%time` magic command.)
Numpy/Python version information:
```
1.13.1
3.6.1 (default, Apr 4 2017, 09:40:51)
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.42.1)]
```