Commit 8870917

Apply auto-vectorization to the inner loop of numeric multiplication.
Compile numeric.c with -ftree-vectorize where available, and adjust the innermost loop of mul_var() so that it is amenable to being auto-vectorized. (Mainly, that involves making it process the arrays left-to-right not right-to-left.)

Applying -ftree-vectorize actually makes numeric.o smaller, at least with my compiler (gcc 8.3.1 on x86_64), and it's a little faster too. Independently of that, fixing the inner loop to be vectorizable also makes things a bit faster. But doing both is a huge win for multiplications with lots of digits. For me, the numeric regression test is the same speed to within measurement noise, but numeric_big is a full 45% faster.

We also looked into applying -funroll-loops, but that makes numeric.o bloat quite a bit, and the additional speed improvement is very marginal.

Amit Khandekar, reviewed and edited a little by me

Discussion: https://postgr.es/m/CAJ3gD9evtA_vBo+WMYMyT-u=keHX7-r8p2w7OSRfXf42LTwCZQ@mail.gmail.com
1 parent 695de5d, commit 8870917
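
The loop change described in the commit message can be illustrated outside PostgreSQL with a minimal sketch. The function names below are hypothetical, and `int16_t` is an assumed stand-in for the digit type; only the two loop shapes are taken from the diff. Both functions add the same products to the same slots of `dig`, so they should leave the array in the same state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Right-to-left accumulation, shaped like the loop this commit removes.
 * "last" is the highest index of var2digits to include.
 */
static void
accumulate_right_to_left(int *dig, int i1, int var1digit,
                         const int16_t *var2digits, int last)
{
    int i2;
    int i;

    assert(dig != NULL && var2digits != NULL);

    for (i2 = last, i = i1 + i2 + 2; i2 >= 0; i2--)
        dig[i--] += var1digit * var2digits[i2];
}

/*
 * Left-to-right accumulation, shaped like the loop this commit adds.
 * The base pointer is hoisted so the loop body is a plain
 * "dst[k] += scalar * src[k]" over ascending k.
 */
static void
accumulate_left_to_right(int *dig, int i1, int var1digit,
                         const int16_t *var2digits, int last)
{
    int *dig_i1_2;
    int i2;

    assert(dig != NULL && var2digits != NULL);

    dig_i1_2 = &dig[i1 + 2];
    for (i2 = 0; i2 <= last; i2++)
        dig_i1_2[i2] += var1digit * var2digits[i2];
}
```

To check whether a given compiler vectorizes the second form, one option with GCC is to build with `-O2 -ftree-vectorize -fopt-info-vec` and read the optimization report; results will vary by compiler version and target.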

2 files changed: +15 -3 lines changed

src/backend/utils/adt/Makefile

Lines changed: 3 additions & 0 deletions
@@ -125,6 +125,9 @@ clean distclean maintainer-clean:

 like.o: like.c like_match.c

+# Some code in numeric.c benefits from auto-vectorization
+numeric.o: CFLAGS += ${CFLAGS_VECTORIZE}
+
 varlena.o: varlena.c levenshtein.c

 include $(top_srcdir)/src/backend/common.mk

‎src/backend/utils/adt/numeric.c‎

Lines changed: 12 additions & 3 deletions
@@ -8191,6 +8191,7 @@ mul_var(const NumericVar *var1, const NumericVar *var2, NumericVar *result,
     int         res_weight;
     int         maxdigits;
     int        *dig;
+    int        *dig_i1_2;
     int         carry;
     int         maxdig;
     int         newdig;
@@ -8327,10 +8328,18 @@ mul_var(const NumericVar *var1, const NumericVar *var2, NumericVar *result,
          *
          * As above, digits of var2 can be ignored if they don't contribute,
          * so we only include digits for which i1+i2+2 <= res_ndigits - 1.
+         *
+         * This inner loop is the performance bottleneck for multiplication,
+         * so we want to keep it simple enough so that it can be
+         * auto-vectorized.  Accordingly, process the digits left-to-right
+         * even though schoolbook multiplication would suggest right-to-left.
+         * Since we aren't propagating carries in this loop, the order does
+         * not matter.
          */
-        for (i2 = Min(var2ndigits - 1, res_ndigits - i1 - 3), i = i1 + i2 + 2;
-             i2 >= 0; i2--)
-            dig[i--] += var1digit * var2digits[i2];
+        i = Min(var2ndigits - 1, res_ndigits - i1 - 3);
+        dig_i1_2 = &dig[i1 + 2];
+        for (i2 = 0; i2 <= i; i2++)
+            dig_i1_2[i2] += var1digit * var2digits[i2];
     }

     /*