Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Strings without null-termination#97

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged
aentinger merged 9 commits intomasterfromstringnonull
Jan 25, 2021
Merged

Strings without null-termination#97

aentinger merged 9 commits intomasterfromstringnonull
Jan 25, 2021

Conversation

@matthijskooijman
Copy link
Collaborator

When working with the Arduino String class, I've found that I couldn't efficiently combine it with some external libraries that explicitely pass char* and length around, without nul-terminating their strings. This prompted me to modify and expose the concat (const char* cstr, unsigned int length) method, add a new String(const char* cstr, unsigned int length) constructor. While I was going over the string class, I found some other minor improvements, which are included here.

This is a port ofarduino/Arduino#1936. The commits are identical, except for some improved commit messages and one commit was dropped since that change was already made by someone else in the meantime.

I've provided some testcases by updating the string examples:arduino/Arduino#9239

If this is ok to merge, I'll also provide a PR for the reference documentation.

@cvwillegen
Copy link

Can this please be merged? I'm running into problems with this, as I would like to use string with embedded NUL chars in them. When copying a String with a NUL char in it, the memory after the NUL bytes is not being copied, so the rest of that String is uninitialised memory.

Here's sample code showing the error:

String sGlobal;void dumpString(const String &s){  Serial.print("Got a string of length ");  Serial.println(s.length());  Serial.print(">");  for (size_t t = 0; t < s.length(); ++t) {    if (s.charAt(t) != '\0' && isascii(s.charAt(t))) {      Serial.print(s.charAt(t));    } else if (s.charAt(t) == '\0') {      Serial.print("\\0");    } else {      Serial.print("\\x");      Serial.print(s.charAt(t), HEX);    }  }  Serial.print("<");  Serial.println();}void encode(String &s){  while (s.length() < 12)  {    s += ' ';  }  Serial.println("s in encode is, after filling with spaces:");  dumpString(s);  s.setCharAt(11, '!');  s.setCharAt(10, 'd');  s.setCharAt(9, 'l');  s.setCharAt(8, 'r');  s.setCharAt(7, 'o');  s.setCharAt(6, 'w');  s.setCharAt(5, '\0');  s.setCharAt(4, 'o');  s.setCharAt(3, 'l');  s.setCharAt(2, 'l');  s.setCharAt(1, 'e');  s.setCharAt(0, 'H');    Serial.println("s in encode is, after setting its chars:");  dumpString(s);}void test() {  String sLocal;  Serial.println("sLocal in test is, after init:");  dumpString(sLocal);  Serial.println("sGlobal in test is, after init:");  dumpString(sGlobal);    encode(sLocal);  Serial.println("sLocal in test is, after encode:");  dumpString(sLocal);  Serial.println("sGlobal in test is, after encode:");  dumpString(sGlobal);  sGlobal = sLocal;  Serial.println("sGlobal in test is, after assignment:");  dumpString(sGlobal);}void setup(){  Serial.begin(115200);  test();}void loop(){}

Output of this sample:

Local in test is, after init:Got a string of length 0><sGlobal in test is, after init:Got a string of length 0><s in encode is, after filling with spaces:Got a string of length 12>            <s in encode is, after setting its chars:Got a string of length 12>Hello\0world!<sLocal in test is, after encode:Got a string of length 12>Hello\0world!<sGlobal in test is, after encode:Got a string of length 0><sGlobal in test is, after assignment:Got a string of length 12>Hello\0�n\xFFFFFFEF\xFFFFFFD6\xFFFFFFFF\<

@cvwillegen
Copy link

@matthijskooijman How can I build this and run it on my own Arduino's? As you might recall, I added#112 and would like to create a patch to fix it. But then I need to compile this code myself in order to fix it. How does one do that?

@matthijskooijman
Copy link
CollaboratorAuthor

@cvwillegen You should be able to clone this repository (and then the right branch) into~/Arduino/hardware/arduino-git/avr (so you'd get a~/Arduino/hardware/arduino-git/avr/platform.txt file, adapt~/Arduino if you modified your sketchbook directory or are running on Windows).

Then, it is probably useful to change thename inplatform.txt file to allow distinguishing your git clone'd Arduino core from the original one (until IDE 1.8.13 is released, containingarduino/Arduino#10007).

If you then restart the Arduino IDE, you should the AVR boards twice in the boards menu. If you select the right one, it will use the code from git rather than the official release.

Is that enough info?

@aentinger
Copy link
Contributor

@matthijskooijman can you please rebase? I'd like to see the CI unit tests run, probably the tests need some fixing.

@codecov-io
Copy link

Codecov Report

Merging#97 (9a604e7) intomaster (2ca15ad) willdecrease coverage by0.34%.
The diff coverage is85.00%.

Impacted file tree graph

@@            Coverage Diff             @@##           master      #97      +/-   ##==========================================- Coverage   96.41%   96.07%   -0.35%==========================================  Files          14       14                Lines         837      840       +3     ==========================================  Hits          807      807- Misses         30       33       +3
Impacted FilesCoverage Δ
api/String.h90.90% <ø> (+2.02%)⬆️
api/String.cpp97.69% <85.00%> (-0.77%)⬇️

Continue to review full report at Codecov.

Legend -Click here to learn more
Δ = absolute <relative> (impact),ø = not affected,? = missing data
Powered byCodecov. Last update2ca15ad...9a604e7. Read thecomment docs.

@matthijskooijman
Copy link
CollaboratorAuthor

@aentinger I finally got around to having another look at this, I see you already did a rebase. I just did another force push, which:

With that, the testcases succeed again.

I think this could be merged as-is (after squashing the fixup), but I also want to look at:

  1. Adding some testcases with embedded nul bytes to verify that those actually work and stay working in the future (and increase coverage).
  2. FixingString::equals (and related methods) with embedded nul characters (probably required to write proper testcases). This is alsoString == String does not work with embedded NUL characters #112.
  3. Fixing some more places where embedded nul bytes still cause problems (in particular certain operations will not work since they use library functions that assume nul termination).

Ideally, we'd do 1. and 2. before merging this, 3. is probably more tricky and might be left for later.

@cvwillegen
Copy link

There are still strcmp() calls in the source, instead of memcmp(), so it's not going to work as-is...

@matthijskooijman
Copy link
CollaboratorAuthor

There are still strcmp() calls in the source, instead of memcmp(), so it's not going to work as-is...

Yeah, that's point 2. on my list above. But for some purposes (such as passing around buffers with embedded nuls) the PR is already useful as it stands (as long as you do not need comparison or some other operations). But as I said, ideally we'd fix at least the comparisons before merging this.

@cvwillegen
Copy link

Yeah, that's point 2. on my list above. But for some purposes (such as passing around buffers with embedded nuls) the PR is already useful as it stands (as long as you do not need comparison or some other operations). But as I said, ideally we'd fix at least the comparisons before merging this.

I actually ran into this when doing an == on 2 NUL-containing strings :-)

Indeed, I created#112 and you pointed me to this PR...

Copy link
Contributor

@aentingeraentinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Hi@matthijskooijman 👋 and a happy new year 🎆

Thank you very much for all your good work 👍! I agree that both

  1. Adding some testcases with embedded nul bytes to verify that those actually work and stay working in the future (and increase coverage).

and

  1. Fixing String::equals (and related methods) with embedded nul characters (probably required to write proper testcases). This is also#112.

will be necessary for merging this PR. Maybe if you could provide test cases in the appropriate files forString::equals andString::compareTo? I don't see a need to provide embedded-nul-string test code for operators ==, !=, <, >, <=, >= as they are basically wrappers around the two functioned mentioned before.

@aentinger
Copy link
Contributor

I assume a rebase is in order@matthijskooijman after my merging of#137?

Instead of calling strlen in a dozen places, instead just callString::concat(s), which will then call strlen. This shrinks the codesize of these calls significantly, the StringAppendOperator example onthe Arduino Uno shrinks by 72 bytes. This change does incur a slightruntime cost, because there is one extra function call.
Previously, these methods took a nul-terminated string and appended itto the current buffer. The length argument (or length of the passedString object) was used to allocate memory, but was not used whenactually copying the string. This meant that: - If the passed length was too short, or the string passed in was not   nul-terminated, the buffer would overflow. - If the string passed in contained embedded nul characters (i.e,   among the first length characters), any characters after the embedded   nul would not be copied (but len would be updated to indicate they   were).In practice, neither of the above would occur, since the length passed isalways the known length of the string, usually as returned by strlen.However, to make this method public, and to allow using this concatmethod to pass in strings that are not nul-terminated, it should bechanged to be more robust.This commit changes the method to use memcpy instead of strcpy, copyingexactly the number of bytes passed in. For the current calls to thismethod, which pass a nul-terminated string, without embedded nulcharacters and a correct length, this behaviour should not change.However, this concat method can now also be used in the two casesmentioned above. Non-nul-terminated strings now work as expected and forstrings with embedded newlines the entire string is copied as-is,instead of leaving uninitialized memory after the embedded nul byte.Note that a lot of operations will still only see the string up to theembedded nul byte, but that's not really fixable unless we reimplementfunctions like strcmp.
Now concat(const char*, unsigned int) no longer requires anul-terminated string, we can simplify the concat(char) method to justpass the address of the single character instead of having to createbuffer with nul-termination.
This method is useful when receiving data from external sources thatpass an explicit length instead of a NUL-terminated string.
This constructor allows converting a buffer containing anon-nul-terminated string to a String object, by explicitely passing thelength.
Before, substring would (temporarily) add a \0 in the original string atthe end of the substring, so the substring could be copied into a newString object. However, now that the String::copy() method can work withnon-nul-terminated strings (by passing an explicit length), this trickis not needed and we can just call the copy method instead.
This just calls the char* version, but allows calling the method with auint8_t* as well (which is not uncommon for buffers).
This allows creating a String from a uint8_t[] or uint8_t* as well,without having to add explicit casts.
@matthijskooijman
Copy link
CollaboratorAuthor

I assume a rebase is in order@matthijskooijman after my merging of#137?

Yup, done.

Copy link
Contributor

@aentingeraentinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

LGTM 👍 Thank you@matthijskooijman 🚀

@aentingeraentinger merged commitfdccb19 intomasterJan 25, 2021
@aentingeraentinger deleted the stringnonull branchJanuary 25, 2021 09:29
@matthijskooijman
Copy link
CollaboratorAuthor

Yay for merging this, but I didn't make the changes mentioned in#97 (review) yet... And there was still a fixup commit in here that should have been squashed before merging...

Regardless, I believe that the changes in this PR are already useful and worth including, though support for strings with NUL is now not entirely complete and lacks testcases, so that still needs to be fixed (no time for that right now, though).

mysterywolf added a commit to RTduino/RTduino that referenced this pull requestSep 15, 2024
mysterywolf pushed a commit to RTduino/RTduino that referenced this pull requestSep 15, 2024
When working with the Arduino String class, I've found that I couldn'tefficiently combine it with some external libraries that explicitelypass char* and length around, without nul-terminating their strings.This prompted me to modify and expose the concat (const char* cstr,unsigned int length) method, add a new String(const char* cstr, unsignedint length) constructor. While I was going over the string class, Ifound some other minor improvements, which are included here.This is a port ofarduino/Arduino#1936. The commits are identical,except for some improved commit messages and one commit was droppedsince that change was already made by someone else in the meantime.I've provided some testcases by updating the string examples:arduino/Arduino#9239If this is ok to merge, I'll also provide a PR for the referencedocumentation.cherry-pick from:arduino/ArduinoCore-API#97
mysterywolf pushed a commit to RTduino/RTduino that referenced this pull requestSep 15, 2024
When working with the Arduino String class, I've found that I couldn'tefficiently combine it with some external libraries that explicitelypass char* and length around, without nul-terminating their strings.This prompted me to modify and expose the concat (const char* cstr,unsigned int length) method, add a new String(const char* cstr, unsignedint length) constructor. While I was going over the string class, Ifound some other minor improvements, which are included here.This is a port ofarduino/Arduino#1936. The commits are identical,except for some improved commit messages and one commit was droppedsince that change was already made by someone else in the meantime.I've provided some testcases by updating the string examples:arduino/Arduino#9239If this is ok to merge, I'll also provide a PR for the referencedocumentation.cherry-pick from:arduino/ArduinoCore-API#97
@codecov-commenter
Copy link

Codecov Report

❌ Patch coverage is84.21053% with3 lines in your changes missing coverage. Please review.
✅ Project coverage is 96.07%. Comparing base (d5790a0) to head (d6fd84f).
⚠️ Report is 174 commits behind head on master.

Files with missing linesPatch %Lines
api/String.cpp84.21%3 Missing⚠️
Additional details and impacted files
@@            Coverage Diff             @@##           master      #97      +/-   ##==========================================- Coverage   96.42%   96.07%   -0.36%==========================================  Files          14       14                Lines         839      840       +1     ==========================================- Hits          809      807       -2- Misses         30       33       +3

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report?Share it here.

🚀 New features to boost your workflow:
  • ❄️Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
pennam reacted with confused emoji

Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment

Reviewers

@facchinmfacchinmfacchinm approved these changes

+1 more reviewer

@aentingeraentingeraentinger approved these changes

Reviewers whose approvals may not affect merge requirements

Assignees

No one assigned

Labels

None yet

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

6 participants

@matthijskooijman@cvwillegen@aentinger@codecov-io@codecov-commenter@facchinm

[8]ページ先頭

©2009-2025 Movatter.jp