Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Strings without null-termination#1936

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Conversation

@matthijskooijman
Copy link
Collaborator

When working with the Arduino String class, I've found that I couldn't efficiently combine it with some external libraries that explicitely passchar* and length around, without nul-terminating their strings. This prompted me to modify and expose theconcat (const char* cstr, unsigned int length) method, add a newString(const char* cstr, unsigned int length) constructor. While I was going over the string class, I found some other minor improvements, which are included here.

@matthijskooijman
Copy link
CollaboratorAuthor

I just added one more commit: "Add String::concat(const uint8_t *, unsigned int) version".

@PaulStoffregen
Copy link
Contributor

If a concat(pointer, len) is going to be added, perhaps "const void *" would be better?

@matthijskooijman
Copy link
CollaboratorAuthor

@PaulStoffregen Hmm, that might make things easier. On the other hand, having a void* version might complicate overloading, or make it easier for people to accidentally pass a different kind of pointer without the compiler warning them about that. These might not be important for theconcat(void*, len) case, but if we do this here, we should (be prepared to) also apply this in other cases where we want to accept a (string) buffer, where this might be more relevant.

Does this make sense to you, or do you see things differntly?

@matthijskooijmanmatthijskooijman added Component: CoreRelated to the code for the standard Arduino API Version: 1.5.x feature requestA request to make an enhancement (not a bug fix) labelsSep 10, 2014
@matthijskooijman
Copy link
CollaboratorAuthor

@PaulStoffregen, would be great if you could respond to my last comment here.

@matthijskooijman
Copy link
CollaboratorAuthor

@cmaglie, any chance you get merge this for the next release? I just ran into trouble again by not having aconcat(str, len) function...

@matthijskooijman
Copy link
CollaboratorAuthor

I've just rebased this PR, so it is again current. I also realized this changed the AVR WString code, but not SAM, so I updated each commit to apply the same changes to both.

@ffissore
Copy link
Contributor

@ArduinoBot build this please

@matthijskooijman
Copy link
CollaboratorAuthor

I tested this PR against all examples in the repository, which all still work as before (78% even compiles to the same binary). Some sketches become bigger, some sketches became smaller (on average all sketches became 3 bytes bigger. Limiting to only the sketches that actually changed, so the sketches using String in some way, these are 14 bytes bigger on average).

All size increases seem to be caused becauseString::concat(const char*, unsigned int) now callsmemcpy instead ofstrcopy, making it remember thelength parameter, causing the compiler to store data on the stack instead of just in registers, which has a little overhead. All size decrease is becausestrcpy() can be left out, becauseString::concat(char) is simplifed and because a bunch ofstrlen() calls are removed. Nothing out of the ordinary or unexpected stood out.

@Chris--A
Copy link
Contributor

The features here would combine well with my PR here:#3096

In the future, rather than relying on the len input (diff),validate() could be called to ensure the String internals reflect its contents as a null-terminated string, rather than what was copied directly in.

This would resolve problems if someone was to copy or concatenate a buffer which has trailing nulls.

char buff[10] = {'d','a','t','a'};String s;//User could easily use:s.concat( buff );//But now has the option to use:s.concat( buff, sizeof(buff) );

Whether it be a logical error (buffer isn't as full as expected), or intention (nulls mean something else),validate() will ensures.len reflects:

  • a null terminated string.
  • what is output with Print members.
  • and when used in other c-string function usings.c_str();, or memory functions requiring a length.

@matthijskooijman
Copy link
CollaboratorAuthor

In the future, rather than relying on the len input (diff), validate() could be called to ensure the String internals reflect its contents as a null-terminated string, rather than what was copied directly in.

One of the goals of this PR is to allow a String to store embedded NUL characters as well (e.g. to store a binary packet in it). IIUC, your suggestion would exactly break that possibility.

Having avalidate() function sounds reasonable, but I don't think the String class should be calling it automatically - the user should be in control when he wants to fix the length, I think.

Instead of calling strlen in a dozen places, instead just callString::concat(s), which will then call strlen. This shrinks the codesize of these calls significantly, the StringAppendOperator example onthe Arduino Uno shrinks by 72 bytes. This change does incur a slightruntime cost, because there is one extra function call.
Previously, these methods took anul-terminated string and appended it to the current buffer. The lengthargument (or length of the passed String object) was used to allocatememory, but was not used when actually copying the string. This meantthat: - If the length was not correct, or the string passed in was not   nul-terminated, the buffer would overflow. - If the string passed in contained embedded nul characters (e.g, in   among the first length characters), any characters after the embedded   nul would not be copied (but len would be updated to indicate they   were).In practice, neither of the above would occur, since the length passed isalways the known length of the string, usually as returned by strlen.However, to allow using this concat method to pass in strings that arenot nul-terminated, its behaviour should change.This commit changes the method to use memcpy instead of strcpy, copyingexactly the number of bytes passed in. For the current calls to thismethod, which pass a nul-terminated string, without embedded nulcharacters and a correct length, this behaviour should not change.However, this concat method can now also be used in the two casesmentioned above. Non-nul-terminated strings now work as expected and forstrings with embedded newlines the entire string is copied as-is,instead of leaving uninitialized memory after the embedded nul byte.Note that a lot of operations will still only see the string up to theembedded nul byte, but that's not really fixable unless we reimplementfunctions like strcmp.
Before, it used strncpy, but that has undefined behaviour when thesource and destination strings overlap. memove is guaranteed to workcorrectly in this case.Note that memmove simply copies all bytes, whereas strncpy stopped atthe first nul-byte. This means this commit also makes remove work forstrings that contain embedded nul bytes.
Now concat(const char*, unsigned int) no longer requires anul-terminated string, we can simplify the concat(char) method to justpass the address of the single character instead of having to createbuffer with nul-termination.
This method is useful when receiving data from external sources thatpass an explicit length instead of a NUL-terminated string.
This constructor allows converting a buffer containing anon-nul-terminated string to a String object, by explicitely passing thelength.
Before, substring would (temporarily) add a \0 in the original string atthe end of the substring, so the substring could be copied into a newString object. However, now that the String::copy() method can work withnon-nul-terminated strings (by passing an explicit length), this trickis not needed and we can just call the copy method instead.
This just calls the char* version, but allows calling the method with auint8_t* as well (which is not uncommon for buffers).
This allows creating a String from a uint8_t[] or uint8_t* as well,without having to add explicit casts.
@matthijskooijmanmatthijskooijmanforce-pushed thestringnonull branch 2 times, most recently fromf821d03 todeadaa2CompareAugust 4, 2015 08:10
@matthijskooijman
Copy link
CollaboratorAuthor

Rebased this PR again, and added one more commit to add aString(const uint8_t *, unsigned int) constructor.

@Chris--A
Copy link
Contributor

Having a validate() function sounds reasonable, but I don't think the String class should be calling it automatically - the user should be in control when he wants to fix the length, I think.

I agree, it should be an optional.

With regards to this PR, as it doesn't seem to change any existing functionality, and only add features, this can have a benefit (saw the issue you referenced above).

When working with the Arduino String class, I've found that I couldn't efficiently combine it with some external libraries that explicitely pass char* and length around

This seems to be a different problem to the issue. By combine, do you mean something like this:

String s ="Hello";s += Obj.func();

I'm not sure I understand what difficulties you are having.

@facchinmfacchinm added the Print and Stream classThe Arduino core library's Print and Stream classes labelJan 24, 2017
@matthijskooijman
Copy link
CollaboratorAuthor

matthijskooijman commentedJul 25, 2017
edited
Loading

When working with the Arduino String class, I've found that I couldn't efficiently combine it with some external libraries that explicitely pass char* and length around

For example, consider a library that calls a callback function when data is received, passing in a (non-nul-terminated)char* buffer and a length. If I want to accumulate the data in a String object, now I have to allocate memory for a copy of the buffer with added nul-termination, and then concat that to the existing String. Instead, I would like to write something like:

String accum;void onReceive(const char*buf, unsigned len) {     accum.concat(buf, len);}

Which should just copy the data into the String's buffer, without needing another intermediate copy.

For another (pretty much the same) example, see #6546.

@facchinm
Copy link
Member

@matthijskooijman do you think it still makes sense to port this PR to API repo?

@facchinmfacchinm added the Waiting for feedbackMore information must be provided before we can proceed labelSep 18, 2019
@matthijskooijman
Copy link
CollaboratorAuthor

I think these changes are still useful. They probably need a bit of rebasing and cleaning up. I'm willing to have a look at resubmitting this, if you (or someone else) will review and merge it?

@facchinm
Copy link
Member

I can't assure anything about merging but for sure we'll review them (I think it's a good patch so the merge chances are high).

@matthijskooijman
Copy link
CollaboratorAuthor

I can't assure anything about merging but for sure we'll review them (I think it's a good patch so the merge chances are high).

Yeah, that's what I meant. Thanks! Pullrequest created atarduino/ArduinoCore-API#97, so I'm closing this one.

@matthijskooijmanmatthijskooijman deleted the stringnonull branchMarch 28, 2020 16:25
mysterywolf pushed a commit to RTduino/RTduino that referenced this pull requestSep 15, 2024
When working with the Arduino String class, I've found that I couldn'tefficiently combine it with some external libraries that explicitelypass char* and length around, without nul-terminating their strings.This prompted me to modify and expose the concat (const char* cstr,unsigned int length) method, add a new String(const char* cstr, unsignedint length) constructor. While I was going over the string class, Ifound some other minor improvements, which are included here.This is a port ofarduino/Arduino#1936. The commits are identical,except for some improved commit messages and one commit was droppedsince that change was already made by someone else in the meantime.I've provided some testcases by updating the string examples:arduino/Arduino#9239If this is ok to merge, I'll also provide a PR for the referencedocumentation.cherry-pick from:arduino/ArduinoCore-API#97
mysterywolf pushed a commit to RTduino/RTduino that referenced this pull requestSep 15, 2024
When working with the Arduino String class, I've found that I couldn'tefficiently combine it with some external libraries that explicitelypass char* and length around, without nul-terminating their strings.This prompted me to modify and expose the concat (const char* cstr,unsigned int length) method, add a new String(const char* cstr, unsignedint length) constructor. While I was going over the string class, Ifound some other minor improvements, which are included here.This is a port ofarduino/Arduino#1936. The commits are identical,except for some improved commit messages and one commit was droppedsince that change was already made by someone else in the meantime.I've provided some testcases by updating the string examples:arduino/Arduino#9239If this is ok to merge, I'll also provide a PR for the referencedocumentation.cherry-pick from:arduino/ArduinoCore-API#97
Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment

Reviewers

No reviews

Assignees

@cmagliecmaglie

Labels

Component: CoreRelated to the code for the standard Arduino APIfeature requestA request to make an enhancement (not a bug fix)Print and Stream classThe Arduino core library's Print and Stream classesWaiting for feedbackMore information must be provided before we can proceed

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

7 participants

@matthijskooijman@PaulStoffregen@ffissore@ArduinoBot@Chris--A@facchinm@cmaglie

[8]ページ先頭

©2009-2025 Movatter.jp