@@ -9,7 +9,7 @@ SIMD optimizations. StructsOfArrays implements the classic structure of arrays
 optimization. The contents of a given field for all objects is stored linearly
 in memory, and different fields are stored in different arrays. This permits
 SIMD optimizations in more cases and can also save a bit of memory if the object
-contains padding.
+contains padding. It is especially useful for arrays of complex numbers.
 
 ## Benchmark
 
@@ -18,28 +18,35 @@ using StructsOfArrays
 regular = complex(randn(1000000), randn(1000000))
 soa = convert(StructOfArrays, regular)
 
+function f(x, a)
+    s = zero(eltype(x))
+    @simd for i in 1:length(x)
+        @inbounds s += x[i] * a
+    end
+    s
+end
+
 using Benchmarks
-@benchmark sum(regular)
-@benchmark sum(soa)
+@benchmark f(regular, 0.5+0.5im)
+@benchmark f(soa, 0.5+0.5im)
 ```
 
-The time for `sum(regular)` is:
+The time for `f(regular, 0.5+0.5im)` is:
 
 ```
-Average elapsed time: 1.018 ms
-  95% CI for average: [887.090 μs, 1.149 ms]
+Average elapsed time: 1.244 ms
+  95% CI for average: [1.183 ms, 1.305 ms]
+Minimum elapsed time: 1.177 ms
 ```
 
-and for `sum(soa)`:
+and for `f(soa, 0.5+0.5im)`:
 
 ```
-Average elapsed time: 754.942 μs
-  95% CI for average: [688.003 μs, 821.880 μs]
+Average elapsed time: 832.198 μs
+  95% CI for average: [726.349 μs, 938.048 μs]
+Minimum elapsed time: 713.730 μs
 ```
 
+In this case, StructsOfArrays are about 1.5x faster than ordinary arrays.
 Inspection of generated code demonstrates that `sum(soa)` uses SIMD
 instructions, while `sum(regular)` does not.
-
-(This is not necessarily the best benchmark, since it should be possible to
-vectorize both sums, but at present Julia can only vectorize with the SoA
-optimization.)