Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Commit9428c00

Browse files
committed
Speed up numeric division by always using the "fast" algorithm.
Formerly there were two internal functions in numeric.c to performnumeric division, div_var() and div_var_fast(). div_var() performeddivision exactly to a specified rscale using Knuth's long divisionalgorithm, while div_var_fast() used the algorithm from the "FM"library, which approximates each quotient digit using floating-pointarithmetic, and computes a truncated quotient with DIV_GUARD_DIGITSextra digits. div_var_fast() could be many times faster thandiv_var(), but did not guarantee correct results in all cases, and wastherefore only suitable for use in transcendental functions, wheresmall errors are acceptable.This commit merges div_var() and div_var_fast() together into a singlefunction with an extra "exact" boolean parameter, which can be set tofalse if the caller is OK with an approximate result. The new functionuses the faster algorithm from the "FM" library, except that when"exact" is true, it does not truncate the computation withDIV_GUARD_DIGITS extra digits, but instead performs the full-precisioncomputation, subtracting off complete multiples of the divisor foreach quotient digit. However, it is able to retain most of theperformance benefits of div_var_fast(), by delaying the propagation ofcarries, allowing the inner loop to be auto-vectorized.Since this may still lead to an inaccurate result, when "exact" istrue, it then inspects the remainder and uses that to adjust thequotient, if necessary, to make it correct. In practice, the quotientrarely needs to be adjusted, and never by more than one in the finaldigit, though it's difficult to prove that, so the code allows forlarger adjustments, just in case.In addition, use base-NBASE^2 arithmetic and a 64-bit dividend array,similar to mul_var(), so that the number of iterations of the outerloop is roughly halved. Together with the faster algorithm, this makesdiv_var() up to around 20 times as fast as the old Knuth algorithmwhen "exact" is true, and up to 2 or 3 times as fast as the olddiv_var_fast() function when "exact" is false.Dean Rasheed, reviewed by Joel Jacobson.Discussion:https://postgr.es/m/CAEZATCVHR10BPDJSANh0u2+Sg6atO3mD0G+CjKDNRMD-C8hKzQ@mail.gmail.com
1 parent4dd3087 commit9428c00

File tree

3 files changed

+371
-474
lines changed

3 files changed

+371
-474
lines changed

0 commit comments

Comments
 (0)

[8]ページ先頭

©2009-2025 Movatter.jp