Uh oh!
There was an error while loading.Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork7k
Strings without null-termination#1936
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Uh oh!
There was an error while loading.Please reload this page.
Conversation
matthijskooijman commentedMar 17, 2014
I just added one more commit: "Add String::concat(const uint8_t *, unsigned int) version". |
PaulStoffregen commentedMar 17, 2014
If a concat(pointer, len) is going to be added, perhaps "const void *" would be better? |
matthijskooijman commentedApr 25, 2014
@PaulStoffregen Hmm, that might make things easier. On the other hand, having a void* version might complicate overloading, or make it easier for people to accidentally pass a different kind of pointer without the compiler warning them about that. These might not be important for the Does this make sense to you, or do you see things differntly? |
matthijskooijman commentedNov 13, 2014
@PaulStoffregen, would be great if you could respond to my last comment here. |
matthijskooijman commentedJan 19, 2015
@cmaglie, any chance you get merge this for the next release? I just ran into trouble again by not having a |
matthijskooijman commentedJun 12, 2015
I've just rebased this PR, so it is again current. I also realized this changed the AVR WString code, but not SAM, so I updated each commit to apply the same changes to both. |
ffissore commentedJun 12, 2015
@ArduinoBot build this please |
ArduinoBot commentedJun 12, 2015
matthijskooijman commentedJun 15, 2015
I tested this PR against all examples in the repository, which all still work as before (78% even compiles to the same binary). Some sketches become bigger, some sketches became smaller (on average all sketches became 3 bytes bigger. Limiting to only the sketches that actually changed, so the sketches using String in some way, these are 14 bytes bigger on average). All size increases seem to be caused because |
Chris--A commentedJul 9, 2015
The features here would combine well with my PR here:#3096 In the future, rather than relying on the len input (diff), This would resolve problems if someone was to copy or concatenate a buffer which has trailing nulls. Whether it be a logical error (buffer isn't as full as expected), or intention (nulls mean something else),
|
matthijskooijman commentedJul 9, 2015
One of the goals of this PR is to allow a String to store embedded NUL characters as well (e.g. to store a binary packet in it). IIUC, your suggestion would exactly break that possibility. Having a |
Instead of calling strlen in a dozen places, instead just callString::concat(s), which will then call strlen. This shrinks the codesize of these calls significantly, the StringAppendOperator example onthe Arduino Uno shrinks by 72 bytes. This change does incur a slightruntime cost, because there is one extra function call.
Previously, these methods took anul-terminated string and appended it to the current buffer. The lengthargument (or length of the passed String object) was used to allocatememory, but was not used when actually copying the string. This meantthat: - If the length was not correct, or the string passed in was not nul-terminated, the buffer would overflow. - If the string passed in contained embedded nul characters (e.g, in among the first length characters), any characters after the embedded nul would not be copied (but len would be updated to indicate they were).In practice, neither of the above would occur, since the length passed isalways the known length of the string, usually as returned by strlen.However, to allow using this concat method to pass in strings that arenot nul-terminated, its behaviour should change.This commit changes the method to use memcpy instead of strcpy, copyingexactly the number of bytes passed in. For the current calls to thismethod, which pass a nul-terminated string, without embedded nulcharacters and a correct length, this behaviour should not change.However, this concat method can now also be used in the two casesmentioned above. Non-nul-terminated strings now work as expected and forstrings with embedded newlines the entire string is copied as-is,instead of leaving uninitialized memory after the embedded nul byte.Note that a lot of operations will still only see the string up to theembedded nul byte, but that's not really fixable unless we reimplementfunctions like strcmp.
Before, it used strncpy, but that has undefined behaviour when thesource and destination strings overlap. memove is guaranteed to workcorrectly in this case.Note that memmove simply copies all bytes, whereas strncpy stopped atthe first nul-byte. This means this commit also makes remove work forstrings that contain embedded nul bytes.
Now concat(const char*, unsigned int) no longer requires anul-terminated string, we can simplify the concat(char) method to justpass the address of the single character instead of having to createbuffer with nul-termination.
This method is useful when receiving data from external sources thatpass an explicit length instead of a NUL-terminated string.
This constructor allows converting a buffer containing anon-nul-terminated string to a String object, by explicitely passing thelength.
Before, substring would (temporarily) add a \0 in the original string atthe end of the substring, so the substring could be copied into a newString object. However, now that the String::copy() method can work withnon-nul-terminated strings (by passing an explicit length), this trickis not needed and we can just call the copy method instead.
This just calls the char* version, but allows calling the method with auint8_t* as well (which is not uncommon for buffers).
This allows creating a String from a uint8_t[] or uint8_t* as well,without having to add explicit casts.
f821d03 todeadaa2Comparematthijskooijman commentedAug 4, 2015
Rebased this PR again, and added one more commit to add a |
Chris--A commentedAug 12, 2016
I agree, it should be an optional. With regards to this PR, as it doesn't seem to change any existing functionality, and only add features, this can have a benefit (saw the issue you referenced above).
This seems to be a different problem to the issue. By combine, do you mean something like this: String s ="Hello";s += Obj.func();I'm not sure I understand what difficulties you are having. |
matthijskooijman commentedJul 25, 2017 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
For example, consider a library that calls a callback function when data is received, passing in a (non-nul-terminated) Which should just copy the data into the String's buffer, without needing another intermediate copy. For another (pretty much the same) example, see #6546. |
facchinm commentedSep 18, 2019
@matthijskooijman do you think it still makes sense to port this PR to API repo? |
matthijskooijman commentedSep 18, 2019
I think these changes are still useful. They probably need a bit of rebasing and cleaning up. I'm willing to have a look at resubmitting this, if you (or someone else) will review and merge it? |
facchinm commentedSep 18, 2019
I can't assure anything about merging but for sure we'll review them (I think it's a good patch so the merge chances are high). |
matthijskooijman commentedSep 19, 2019
Yeah, that's what I meant. Thanks! Pullrequest created atarduino/ArduinoCore-API#97, so I'm closing this one. |
When working with the Arduino String class, I've found that I couldn'tefficiently combine it with some external libraries that explicitelypass char* and length around, without nul-terminating their strings.This prompted me to modify and expose the concat (const char* cstr,unsigned int length) method, add a new String(const char* cstr, unsignedint length) constructor. While I was going over the string class, Ifound some other minor improvements, which are included here.This is a port ofarduino/Arduino#1936. The commits are identical,except for some improved commit messages and one commit was droppedsince that change was already made by someone else in the meantime.I've provided some testcases by updating the string examples:arduino/Arduino#9239If this is ok to merge, I'll also provide a PR for the referencedocumentation.cherry-pick from:arduino/ArduinoCore-API#97
When working with the Arduino String class, I've found that I couldn'tefficiently combine it with some external libraries that explicitelypass char* and length around, without nul-terminating their strings.This prompted me to modify and expose the concat (const char* cstr,unsigned int length) method, add a new String(const char* cstr, unsignedint length) constructor. While I was going over the string class, Ifound some other minor improvements, which are included here.This is a port ofarduino/Arduino#1936. The commits are identical,except for some improved commit messages and one commit was droppedsince that change was already made by someone else in the meantime.I've provided some testcases by updating the string examples:arduino/Arduino#9239If this is ok to merge, I'll also provide a PR for the referencedocumentation.cherry-pick from:arduino/ArduinoCore-API#97
When working with the Arduino String class, I've found that I couldn't efficiently combine it with some external libraries that explicitely pass
char*and length around, without nul-terminating their strings. This prompted me to modify and expose theconcat (const char* cstr, unsigned int length)method, add a newString(const char* cstr, unsigned int length)constructor. While I was going over the string class, I found some other minor improvements, which are included here.