arduino/ArduinoCore-APIPublic

NotificationsYou must be signed in to change notification settings
Fork137
Star271

Test Stream::parseFloat() with many input digits#133

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Jump to bottom

Merged

aentinger merged 7 commits intoarduino:masterfromedgar-bonet:many-digits

Jan 25, 2021

Merged

Test Stream::parseFloat() with many input digits#133

aentinger merged 7 commits intoarduino:masterfromedgar-bonet:many-digits

Jan 25, 2021

Conversation

Copy link

Contributor

edgar-bonet commentedDec 10, 2020

As a followup to issue#129: “Stream::parseFloat() fails if it reads too many digits”, this pull request adds a failing test to test_parseFloat.cpp that reveals the problem. I ran into a couple of issues when writing the test:

Ensuring correct rounding when parsing a decimal representation of a number is not trivial, andStream::parseFloat() does not attempt to provide such guarantee. Testing for exact equality to the expected result is thus prone to spurious failures.
The existing tests in test_parseFloat.cpp do test for exact equality, and it seems to work. However, the more digits are processed, the greater the chances that rounding errors pile up and the end result differs from the correctly rounded one. Since this test requires processing many decimal digits, I had to use fuzzy floating point comparison in order to avoid the test failing for the wrong reason.
It would seem these tests are meant to be compiled and run on 64-bit Linux hosts, where along is 64-bits wide. Most (all?) Arduino platforms, in contrast, have 32-bitlongs. Since this test is meant to reveal the effect of along variable overflowing, I had to add many more digits than would be needed to trigger the overflow on an Arduino. As shown in issueStream::parseFloat() fails if it reads too many digits #129, 10 digits of π are enough to makeStream::parseFloat() fail badly on an actual Arduino.

Test Stream::parseFloat() with many digits

129ae52

edgar-bonet mentioned this pull request

Dec 10, 2020

Stream::parseFloat() fails if it reads too many digits#129

Closed

Copy link

Collaborator

matthijskooijman commentedDec 10, 2020

is probably yet another argument to switch official APIs to use e.g.int32_t instead oflong, see alsoDocumentation should use more specific types Arduino#4525 for related discussion.

Copy link

ContributorAuthor

edgar-bonet commentedDec 10, 2020

@matthijskooijman: The overflow affects a local variable, not exposed to the API.

Copy link

Contributor

aentinger commentedDec 11, 2020•
edited
Loading

is probably yet another argument to switch official APIs to use e.g.int32_t instead oflong, see alsoarduino/Arduino#4525 for related discussion.

While I technically agree with@matthijskooijman I've just yesterday participated in a discussion with@tigoe in which he expressed his dismay on some prototype code usinguint8_t (instead ofbyte) and general usage ofuintXX_t types (Hisopinion voiced in 2016 seems to be constant over time 😉 ). Let me also add that this my last word on this topicwithin this PR, if you feel it's something we should discuss in more depth please open a new issue within this repository.

@edgar-bonet Thank you very much for providing this first step on fixing your issue (A failing test is a great was to start!). Considering floating point comparison you can use theCatch2 providedApprox method, e.g.

REQUIRE(mock.parseFloat() == Approx(-3.141592654f);

Here you can find more documentation how to perform such floating point comparisons.

Use Approx() macro for fuzzy comparison

8eb4013

Copy link

ContributorAuthor

edgar-bonet commentedDec 11, 2020

@aentinger: Thanks for the info! I did not know about this library. TheApprox() macro definitely makes both the test code and the error message more readable. I just pushed a commit to use this macro.

Now, I have a question for you.

If the stream provides an integer larger thanLONG_MAX but smaller thanFLT_MAX, the variablevalue also overflows. Whereas this is not exactly the same issue as reading too many decimal digits after the radix point, it is strongly related, and its fix involves the same lines of code as the current issue. My question is: should this “parse large integer” problem be considered another aspect of this issue, or should we handle it as a separate issue?

In the first case, I would add the relevant test to this PR, and then submit a PR than handles both situations. In the second case, I would wait for this issue to be sorted out before moving to the “parse large integer” problem.

Copy link

Contributor

aentinger commentedDec 11, 2020

Thank you integratingApprox, this indeed makes the test code so much easier. I'd say theparseInt should be handled separately, could you please also create an issue and the smallest failing test?

Copy link

Collaborator

matthijskooijman commentedDec 11, 2020

@matthijskooijman: The overflow affects a local variable, not exposed to the API.

Right, that would make it even easier to change fromlong toint32_t. IMHO that would be good to change, since it makes the code behave more consistent across platforms. And since it's already 32-bits, I don't think any of the "use int because that's usually fastest on the current platform" arguments apply here. But this is a little off-topic for this PR, maybe.

Copy link

ContributorAuthor

edgar-bonet commentedDec 11, 2020

@aentinger wrote:

I'd say theparseInt should be handled separately

Sorry, I was not clear enough. It is not aboutparseInt. The issue isparseFloat overflowing a local variable when it reads too many digits. And it can be split into two sub-issues:

Too many digitsafter the decimal point → can be fixed by ignoring the extra digits.
Too many digitsbefore the decimal point (e.g. a large integer) → the extra digits cannot simply be ignored.

My question is whether these two points should be considered parts of the same issue or separate issues.

@matthijskooijman: I do think it is off-topic.

Copy link

Contributor

aentinger commentedDec 14, 2020

Good morning@edgar-bonet 👋 ☕
Thank you very much for clarifying this point 👍 Since those issues also concern issues withinparseFloat in combination with a large number of digits I think it's okay to handle them within this PR. Could you please add additional failing tests for both too many digits after and before the decimal point?

Test Stream::parseFloat() with a large number

5445db3

Copy link

ContributorAuthor

edgar-bonet commentedDec 14, 2020

Hi@aentinger, thanks for your answer.

I just added a failing test case with a number that has too many digits before the decimal point. This is in addition to the previous one that has too many digits after the decimal point.

Should I squash the three commits together and force-push?

Exchanging the type of 'value' from 'long' to 'double' prevents overf…

fae13e5

…low when parsing float values.However, we still need to ensure against too large values contained in streams. This should be possible because the maximum length of a float value pre-comma is known to be 38 digits (FLT_MAX_10_EXP).

Copy link

Contributor

aentinger commentedDec 14, 2020•
edited
Loading

Should I squash the three commits together and force-push?

Please no. I already added a first possible solution on how to address this issue. Please to a pull first before your next commit or you've got disentangle it with the commit I pushed 😉

Copy link

codecov-io commentedDec 14, 2020•
edited
Loading

Codecov Report

Merging#133 (6197511) intomaster (78f3f41) willdecrease coverage by0.34%.
The diff coverage is100.00%.

@@            Coverage Diff             @@##           master     #133      +/-   ##==========================================- Coverage   96.41%   96.06%   -0.35%==========================================  Files          14       14                Lines         837      839       +2     ==========================================- Hits          807      806       -1- Misses         30       33       +3

Impacted Files	Coverage Δ
api/Stream.cpp	`91.09% <100.00%> (-0.07%)`	⬇️
api/String.cpp	`97.69% <0.00%> (-0.77%)`	⬇️
api/String.h	`90.90% <0.00%> (+2.02%)`	⬆️

Continue to review full report at Codecov.

Legend -Click here to learn more
Δ = absolute <relative> (impact),ø = not affected,? = missing data
Powered byCodecov. Last update78f3f41...6197511. Read thecomment docs.

edgar-bonet commented

Dec 14, 2020

View reviewed changes

api/Stream.cpp Outdated

		double value =0.0;
		int c;
		float fraction =1.0;
		unsignedint digits_post_comma =0;

Copy link

ContributorAuthor

edgar-bonetDec 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Ifvalue is a floating point number, there is no need for this extra variable. It could be folded intovalue.

Copy link

Contributor

aentingerDec 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

That's a good question here. What's more computing time expensive -pow or repeatedly doing a double multiplication?

Copy link

Contributor

aentingerDec 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

However, a further argument againstpow is that there's a question of availability (and with what argument types) across all supported platforms.

edgar-bonet commented

Dec 14, 2020

View reviewed changes

api/Stream.cpp Outdated

		returnvalue* fraction;
		else
		return value;
		value/=pow(10, digits_post_comma);

Copy link

ContributorAuthor

edgar-bonetDec 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

pow() involves a logarithm and an exponential, which is very expensive on MMU-less processors.

Copy link

ContributorAuthor

edgar-bonet commentedDec 14, 2020

I had considered parsing straight into a float, but I then assumed the original code used an integer for performance reasons. My tests show parsing into a float takes about 50% more processing time an AVR.

Replacing computational expensive pow call with result of accumulated…

1266b08

… multiplication.

Copy link

Contributor

aentinger commentedDec 14, 2020•
edited
Loading

My tests show parsing into a float takes about 50% more processing time an AVR.

Right nowArduinoCore-API is not used onArduinoCore-avr. There are plans for doing so but the timeline is more than a bit fuzzy. Also, as far as I can see all future Arduino platforms will be based on 32-Bit ARM MCUs or comuting-equivalent MCUs.

EDIT:ArduinoCore-API is used withArduinoCore-megaavr so there is a 8-Bit AVR core which will be affected by this change.

edgar-bonet commented

Dec 14, 2020

View reviewed changes

api/Stream.cpp Outdated

		elseif (c =='.')
		isFraction =true;
		elseif(c >='0' && c <='9') {// is c a digit?
		value = value *10 + c -'0';

Copy link

ContributorAuthor

edgar-bonetDec 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

value may overflow toINFINITY on AVR if there are more than 38 digits (maybe that's so many we do not care about that case?), event with a number smaller thanFLT_MAX. I suggest instead:

if(isFraction) {    fraction *=0.1;    value = value + fraction * (c -'0');  }else {    value = value *10 + c -'0';  }

Note that on something other than AVR one would need more than 308 digits to demonstrate the problem.

Copy link

Contributor

aentingerDec 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I like this form, however in that case we've got to commit to using adouble forvalue. I personally would not mind since we are talking about stream parsing after all and the onlyStream channels currently available are Serial and various networking streams and which are slow even compared to floating point multiplication.

I can further support my argument pro floating point implementation that if you want to run performance-critical/floating-point applications on an 8-Bit architecture you've selected the wrong MCU alltogether.

@facchinm /@cmaglie what's your take on this?

Copy link

Contributor

aentingerJan 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Since I got no feedback over quite a long period of time (unfortunate but not unexpected given what we are currently swamped in) I'm moving forward with merging this PR.

Copy link

Contributor

aentingerJan 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Let's hold on for a second though@edgar-bonet can you integrate your last change suggested above by yourself:

if(isFraction) {    fraction *=0.1;    value = value + fraction * (c -'0');  }else {    value = value *10 + c -'0';  }

Copy link

ContributorAuthor

edgar-bonetJan 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

@aentinger: OK, I'll do it today.

edgar-bonet added2 commits

January 25, 2021 13:48

Test Stream::parseFloat() with a huge number of digits

91c5e29

Give a 311-digit number to Stream::parseFloat(). This makes the localvariable `value' overflow to infinity. With so many digits, the numbercannot be parsed into an integer, not even into an integer stored as a`double'.Note that 40 digits would be enough to unveil this issue on AVR.

Avoid overflowing parseFloat()'s internal 'value'

6197511

If more than 309 digits are provided to Stream::parseFloat() (more than39 on AVR), the internal variable 'value' would overflow to infinity. Weavoid this by not storing the parsed number as an integer-in-a-float.

Copy link

ContributorAuthor

edgar-bonet commentedJan 25, 2021

As requested, I just pushed my proposed change. Prior to this, I took the liberty to modify the test “A float is provided with too many digits after the decimal point” by adding more digits, in order to evidence the issue the last commit is fixing.

aentinger approved these changes

Jan 25, 2021

View reviewed changes

Copy link

Contributor

aentinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

LGTM 👍 Thank you@edgar-bonet 🚀

aentinger merged commit2af4a9c intoarduino:master

Jan 25, 2021

Labels

None yet

Movatterモバイル変換

Uh oh!

Test Stream::parseFloat() with many input digits#133

Test Stream::parseFloat() with many input digits#133

Uh oh!

Conversation

edgar-bonet commentedDec 10, 2020

Uh oh!

matthijskooijman commentedDec 10, 2020

Uh oh!

edgar-bonet commentedDec 10, 2020

Uh oh!

aentinger commentedDec 11, 2020• editedLoading Uh oh!There was an error while loading.Please reload this page.

Uh oh!

Uh oh!

edgar-bonet commentedDec 11, 2020

Uh oh!

aentinger commentedDec 11, 2020

Uh oh!

matthijskooijman commentedDec 11, 2020

Uh oh!

edgar-bonet commentedDec 11, 2020

Uh oh!

aentinger commentedDec 14, 2020

Uh oh!

edgar-bonet commentedDec 14, 2020

Uh oh!

aentinger commentedDec 14, 2020• editedLoading Uh oh!There was an error while loading.Please reload this page.

Uh oh!

Uh oh!

codecov-io commentedDec 14, 2020• editedLoading Uh oh!There was an error while loading.Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

edgar-bonet commentedDec 14, 2020

Uh oh!

aentinger commentedDec 14, 2020• editedLoading Uh oh!There was an error while loading.Please reload this page.

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

edgar-bonet commentedJan 25, 2021

Uh oh!

aentinger left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

aentinger commentedDec 11, 2020•
edited
Loading

aentinger commentedDec 14, 2020•
edited
Loading

codecov-io commentedDec 14, 2020•
edited
Loading

aentinger commentedDec 14, 2020•
edited
Loading