In Go, astring
is a primitive type, which means it is read-only, and every manipulation of it will create a new string.
So if I want to concatenate strings many times without knowing the length of the resulting string, what's the best way to do it?
The naive way would be:
var s stringfor i := 0; i < 1000; i++ { s += getShortStringFromSomewhere()}return s
but that does not seem very efficient.
- 12
- 1Note: This question and most answers seem to have been written before
append()
came into the language, which is a good solution for this. It will perform fast likecopy()
but will grow the slice first even if that means allocating a new backing array if the capacity isn't enough.bytes.Buffer
still makes sense if you want its additional convenience methods or if the package you're using expects it.thomasrutter– thomasrutter2017-08-10 01:29:53 +00:00CommentedAug 10, 2017 at 1:29 - 10It doesn't just "seem very inefficient"; it has a specific problem that every new non-CS hire we have ever gotten runs into in the first few weeks on the job. It's quadratic - O(n*n). Think of the number sequence:
1 + 2 + 3 + 4 + ...
. It'sn*(n+1)/2
, the area of a triangle of basen
. You allocate size 1, then size 2, then size 3, etc when you append immutable strings in a loop. This quadratic resource consumption manifests itself in more ways than just this.Rob– Rob2017-11-16 15:22:15 +00:00CommentedNov 16, 2017 at 15:22
20 Answers20
New Way:
From Go 1.10 there is astrings.Builder
type,please take a look at this answer for more detail.
Old Way:
Use thebytes
package. It has aBuffer
type which implementsio.Writer
.
package mainimport ( "bytes" "fmt")func main() { var buffer bytes.Buffer for i := 0; i < 1000; i++ { buffer.WriteString("a") } fmt.Println(buffer.String())}
This does it in O(n) time.
10 Comments
buffer := bytes.NewBufferString("")
, you can dovar buffer bytes.Buffer
. You also don't need any of those semicolons :).In Go 1.10+ there isstrings.Builder
,here.
A Builder is used to efficiently build a string using Write methods. It minimizes memory copying. The zero value is ready to use.
Example
It's almost the same withbytes.Buffer
.
package mainimport ( "strings" "fmt")func main() { // ZERO-VALUE: // // It's ready to use from the get-go. // You don't need to initialize it. var sb strings.Builder for i := 0; i < 1000; i++ { sb.WriteString("a") } fmt.Println(sb.String())}
Click to see this on the playground.
Supported Interfaces
strings.Builder
's methods are being implemented with the existing interfaces in mind so that you can switch to the newBuilder
type easily in your code.
Method Signature | Interface | Description |
---|---|---|
Grow(int) | bytes.Buffer | Grows the buffer's capacity by the specified amount. Seebytes.Buffer#Grow for more information. |
Len() int | bytes.Buffer | Returns the number of bytes in the buffer. Seebytes.Buffer#Len for more information. |
Reset() | bytes.Buffer | Resets the buffer to be empty. Seebytes.Buffer#Reset for more information. |
String() string | fmt.Stringer | Returns the contents of the buffer as a string. Seefmt.Stringer for more information. |
Write([]byte) (int, error) | io.Writer | Writes the given bytes to the buffer. Seeio.Writer for more information. |
WriteByte(byte) error | io.ByteWriter | Writes the given byte to the buffer. Seeio.ByteWriter for more information. |
WriteRune(rune) (int, error) | bufio.Writer orbytes.Buffer | Writes the given rune to the buffer. Seebufio.Writer#WriteRune orbytes.Buffer#WriteRune for more information. |
WriteString(string) (int, error) | io.stringWriter | Writes the given string to the buffer. Seeio.stringWriter for more information. |
Differences from bytes.Buffer
- It can only grow or reset.
- It has a
copyCheck
mechanism built-in that prevents accidentally copying it. Inbytes.Buffer
, one can access the underlying bytes like this:(*Buffer).Bytes()
.strings.Builder
prevents this problem. Sometimes, this is not a problem, though, and is desired instead. For example:For the peeking behavior when the bytes are passed to anio.Reader
etc. bytes.Buffer.Reset()
rewinds and reuses the underlying buffer whereas thestrings.Builder.Reset()
does not, it detaches the buffer.
Note
- Do not copy a
strings.Builder
value as it caches the underlying data. - If you want to share a
strings.Builder
value, use a pointer to it.
Check out its source code for more details,here.
7 Comments
strings.Builder
implements its methods using a pointer receiver, which threw me for a moment. As a result, I would probably create one usingnew
.strings.Builder.Reset()
sets the underling slice tonil
(no memory reuse). Wherebytes.Buffer.Reset()
sets the[]bytes
to zero length, keeping the underlying array allocated. This bit me when reusingstrings.Builder
in async.Pool
, which appeared to be completely useless.If you know the total length of the string that you're going to preallocate then the most efficient way to concatenate strings may be using the builtin functioncopy
. If you don't know the total length before hand, do not usecopy
, and read the other answers instead.
In my tests, that approach is ~3x faster than usingbytes.Buffer
and much much faster (~12,000x) than using the operator+
. Also, it uses less memory.
I've createda test case to prove this and here are the results:
BenchmarkConcat 1000000 64497 ns/op 502018 B/op 0 allocs/opBenchmarkBuffer 100000000 15.5 ns/op 2 B/op 0 allocs/opBenchmarkCopy 500000000 5.39 ns/op 0 B/op 0 allocs/op
Below is code for testing:
package mainimport ( "bytes" "strings" "testing")func BenchmarkConcat(b *testing.B) { var str string for n := 0; n < b.N; n++ { str += "x" } b.StopTimer() if s := strings.Repeat("x", b.N); str != s { b.Errorf("unexpected result; got=%s, want=%s", str, s) }}func BenchmarkBuffer(b *testing.B) { var buffer bytes.Buffer for n := 0; n < b.N; n++ { buffer.WriteString("x") } b.StopTimer() if s := strings.Repeat("x", b.N); buffer.String() != s { b.Errorf("unexpected result; got=%s, want=%s", buffer.String(), s) }}func BenchmarkCopy(b *testing.B) { bs := make([]byte, b.N) bl := 0 b.ResetTimer() for n := 0; n < b.N; n++ { bl += copy(bs[bl:], "x") } b.StopTimer() if s := strings.Repeat("x", b.N); string(bs) != s { b.Errorf("unexpected result; got=%s, want=%s", string(bs), s) }}// Go 1.10func BenchmarkStringBuilder(b *testing.B) { var strBuilder strings.Builder b.ResetTimer() for n := 0; n < b.N; n++ { strBuilder.WriteString("x") } b.StopTimer() if s := strings.Repeat("x", b.N); strBuilder.String() != s { b.Errorf("unexpected result; got=%s, want=%s", strBuilder.String(), s) }}
11 Comments
buffer.Write
(bytes) is 30% faster thanbuffer.WriteString
. [useful if you can get the data as[]byte
]b.N
, and so you're not comparing the execution time of the same task to be carried out (e.g. one function might append1,000
strings, another one might append10,000
which can make a big difference in the average time of 1 append, inBenchmarkConcat()
for example). You should use the same append count in each case (certainly notb.N
), and do all the concatenation inside the body of thefor
ranging tob.N
(that is, 2for
loops embedded).If you have a string slice that you want to efficiently convert to a string then you can use this approach. Otherwise, take a look at the other answers.
There is a library function in the strings package calledJoin
:http://golang.org/pkg/strings/#Join
A look at the code ofJoin
shows a similar approach to Append function Kinopiko wrote:https://golang.org/src/strings/strings.go#L420
Usage:
import ( "fmt"; "strings";)func main() { s := []string{"this", "is", "a", "joined", "string\n"}; fmt.Printf(strings.Join(s, " "));}$ ./test.binthis is a joined string
1 Comment
I just benchmarked the top answer posted above in my own code (a recursive tree walk) and the simple concat operator is actually faster than theBufferString
.
func (r *record) String() string { buffer := bytes.NewBufferString(""); fmt.Fprint(buffer,"(",r.name,"[") for i := 0; i < len(r.subs); i++ { fmt.Fprint(buffer,"\t",r.subs[i]) } fmt.Fprint(buffer,"]",r.size,")\n") return buffer.String()}
This took 0.81 seconds, whereas the following code:
func (r *record) String() string { s := "(\"" + r.name + "\" [" for i := 0; i < len(r.subs); i++ { s += r.subs[i].String() } s += "] " + strconv.FormatInt(r.size,10) + ")\n" return s}
only took 0.61 seconds. This is probably due to the overhead of creating the newBufferString
.
Update: I also benchmarked thejoin
function and it ran in 0.54 seconds.
func (r *record) String() string { var parts []string parts = append(parts, "(\"", r.name, "\" [" ) for i := 0; i < len(r.subs); i++ { parts = append(parts, r.subs[i].String()) } parts = append(parts, strconv.FormatInt(r.size,10), ")\n") return strings.Join(parts,"")}
3 Comments
buffer.WriteString("\t");
buffer.WriteString(subs[i]);
package mainimport ( "fmt")func main() { var str1 = "string1" var str2 = "string2" out := fmt.Sprintf("%s %s ",str1, str2) fmt.Println(out)}
3 Comments
This is the fastest solution that does not requireyou to know or calculate the overall buffer size first:
var data []bytefor i := 0; i < 1000; i++ { data = append(data, getShortStringFromSomewhere()...)}return string(data)
By mybenchmark, it's 20% slower than the copy solution (8.1ns perappend rather than 6.72ns) but still 55% faster than using bytes.Buffer.
Comments
You could create a big slice of bytes and copy the bytes of the short strings into it using string slices. There is a function given in "Effective Go":
func Append(slice, data[]byte) []byte { l := len(slice); if l + len(data) > cap(slice) { // reallocate // Allocate double what's needed, for future growth. newSlice := make([]byte, (l+len(data))*2); // Copy data (could use bytes.Copy()). for i, c := range slice { newSlice[i] = c } slice = newSlice; } slice = slice[0:l+len(data)]; for i, c := range data { slice[l+i] = c } return slice;}
Then when the operations are finished, usestring ( )
on the big slice of bytes to convert it into a string again.
Note added in 2018
From Go 1.10 there is astrings.Builder
type,please take a look at this answer for more detail.
Pre-201x answer
The benchmark code of @cd1 and other answers are wrong.b.N
is not supposed to be set in benchmark function. It's set by the go test tool dynamically to determine if the execution time of the test is stable.
A benchmark function should run the same testb.N
times and the test inside the loop should be the same for each iteration. So I fix it by adding an inner loop. I also add benchmarks for some other solutions:
package mainimport ( "bytes" "strings" "testing")const ( sss = "xfoasneobfasieongasbg" cnt = 10000)var ( bbb = []byte(sss) expected = strings.Repeat(sss, cnt))func BenchmarkCopyPreAllocate(b *testing.B) { var result string for n := 0; n < b.N; n++ { bs := make([]byte, cnt*len(sss)) bl := 0 for i := 0; i < cnt; i++ { bl += copy(bs[bl:], sss) } result = string(bs) } b.StopTimer() if result != expected { b.Errorf("unexpected result; got=%s, want=%s", string(result), expected) }}func BenchmarkAppendPreAllocate(b *testing.B) { var result string for n := 0; n < b.N; n++ { data := make([]byte, 0, cnt*len(sss)) for i := 0; i < cnt; i++ { data = append(data, sss...) } result = string(data) } b.StopTimer() if result != expected { b.Errorf("unexpected result; got=%s, want=%s", string(result), expected) }}func BenchmarkBufferPreAllocate(b *testing.B) { var result string for n := 0; n < b.N; n++ { buf := bytes.NewBuffer(make([]byte, 0, cnt*len(sss))) for i := 0; i < cnt; i++ { buf.WriteString(sss) } result = buf.String() } b.StopTimer() if result != expected { b.Errorf("unexpected result; got=%s, want=%s", string(result), expected) }}func BenchmarkCopy(b *testing.B) { var result string for n := 0; n < b.N; n++ { data := make([]byte, 0, 64) // same size as bootstrap array of bytes.Buffer for i := 0; i < cnt; i++ { off := len(data) if off+len(sss) > cap(data) { temp := make([]byte, 2*cap(data)+len(sss)) copy(temp, data) data = temp } data = data[0 : off+len(sss)] copy(data[off:], sss) } result = string(data) } b.StopTimer() if result != expected { b.Errorf("unexpected result; got=%s, want=%s", string(result), expected) }}func BenchmarkAppend(b *testing.B) { var result string for n := 0; n < b.N; n++ { data := make([]byte, 0, 64) for i := 0; i < cnt; i++ { data = append(data, sss...) } result = string(data) } b.StopTimer() if result != expected { b.Errorf("unexpected result; got=%s, want=%s", string(result), expected) }}func BenchmarkBufferWrite(b *testing.B) { var result string for n := 0; n < b.N; n++ { var buf bytes.Buffer for i := 0; i < cnt; i++ { buf.Write(bbb) } result = buf.String() } b.StopTimer() if result != expected { b.Errorf("unexpected result; got=%s, want=%s", string(result), expected) }}func BenchmarkBufferWriteString(b *testing.B) { var result string for n := 0; n < b.N; n++ { var buf bytes.Buffer for i := 0; i < cnt; i++ { buf.WriteString(sss) } result = buf.String() } b.StopTimer() if result != expected { b.Errorf("unexpected result; got=%s, want=%s", string(result), expected) }}func BenchmarkConcat(b *testing.B) { var result string for n := 0; n < b.N; n++ { var str string for i := 0; i < cnt; i++ { str += sss } result = str } b.StopTimer() if result != expected { b.Errorf("unexpected result; got=%s, want=%s", string(result), expected) }}
Environment is OS X 10.11.6, 2.2 GHz Intel Core i7
Test results:
BenchmarkCopyPreAllocate-8 20000 84208 ns/op 425984 B/op 2 allocs/opBenchmarkAppendPreAllocate-8 10000 102859 ns/op 425984 B/op 2 allocs/opBenchmarkBufferPreAllocate-8 10000 166407 ns/op 426096 B/op 3 allocs/opBenchmarkCopy-8 10000 160923 ns/op 933152 B/op 13 allocs/opBenchmarkAppend-8 10000 175508 ns/op 1332096 B/op 24 allocs/opBenchmarkBufferWrite-8 10000 239886 ns/op 933266 B/op 14 allocs/opBenchmarkBufferWriteString-8 10000 236432 ns/op 933266 B/op 14 allocs/opBenchmarkConcat-8 10 105603419 ns/op 1086685168 B/op 10000 allocs/op
Conclusion:
CopyPreAllocate
is the fastest way;AppendPreAllocate
is pretty close to No.1, but it's easier to write the code.Concat
has really bad performance both for speed and memory usage. Don't use it.Buffer#Write
andBuffer#WriteString
are basically the same in speed, contrary to what @Dani-Br said in the comment. Consideringstring
is indeed[]byte
in Go, it makes sense.- bytes.Buffer basically use the same solution as
Copy
with extra book keeping and other stuff. Copy
andAppend
use a bootstrap size of 64, the same as bytes.BufferAppend
use more memory and allocs, I think it's related to the grow algorithm it use. It's not growing memory as fast as bytes.Buffer
Suggestion:
- For simple task such as what OP wants, I would use
Append
orAppendPreAllocate
. It's fast enough and easy to use. - If need to read and write the buffer at the same time, use
bytes.Buffer
of course. That's what it's designed for.
Comments
My original suggestion was
s12 := fmt.Sprint(s1,s2)
But above answer usingbytes.Buffer - WriteString() is the most efficient way.
My initial suggestion uses reflection and a type switch.See(p *pp) doPrint
and(p *pp) printArg
There is no universal Stringer() interface for basic types, as I had naively thought.
At least though, Sprint()internally uses a bytes.Buffer. Thus
`s12 := fmt.Sprint(s1,s2,s3,s4,...,s1000)`
is acceptable in terms of memory allocations.
=> Sprint() concatenation can be used for quick debug output.
=> Otherwise use bytes.Buffer ... WriteString
3 Comments
Expanding on cd1's answer:You might use append() instead of copy().append() makes ever bigger advance provisions, costing a little more memory, but saving time.I addedtwo more benchmarks at the top of yours.Run locally with
go test -bench=. -benchtime=100ms
On my thinkpad T400s it yields:
BenchmarkAppendEmpty 50000000 5.0 ns/opBenchmarkAppendPrealloc 50000000 3.5 ns/opBenchmarkCopy 20000000 10.2 ns/op
Comments
This is actual version of benchmark provided by @cd1 (Go 1.8
,linux x86_64
) with the fixes of bugs mentioned by @icza and @PickBoy.
Bytes.Buffer
is only7
times faster than direct string concatenation via+
operator.
package performance_testimport ( "bytes" "fmt" "testing")const ( concatSteps = 100)func BenchmarkConcat(b *testing.B) { for n := 0; n < b.N; n++ { var str string for i := 0; i < concatSteps; i++ { str += "x" } }}func BenchmarkBuffer(b *testing.B) { for n := 0; n < b.N; n++ { var buffer bytes.Buffer for i := 0; i < concatSteps; i++ { buffer.WriteString("x") } }}
Timings:
BenchmarkConcat-4 300000 6869 ns/opBenchmarkBuffer-4 1000000 1186 ns/op
5 Comments
b.N
is a public variable?b.N
dynamically, you'll wind up with a strings of a different length in different test-cases. Seecomment func JoinBetween(in []string, separator string, startIndex, endIndex int) string { if in == nil { return "" } noOfItems := endIndex - startIndex if noOfItems <= 0 { return EMPTY } var builder strings.Builder for i := startIndex; i < endIndex; i++ { if i > startIndex { builder.WriteString(separator) } builder.WriteString(in[i]) } return builder.String()}
Comments
Since go1.20 you can copy strings data into bytes slice and then create the string without copying using unsafe (for older versions seehttps://stackoverflow.com/a/59210739/1723695 answer):
import "unsafe"func concat(ss []string) string { var n int for _, v := range ss { n += len(v) } b := make([]byte, n) var i int for _, v := range ss { i += copy(b[i:], v) } return unsafe.String(unsafe.SliceData(b), n)}
But actually it's almost the same asstrings.Join()
:
func Join(elems []string, sep string) string { // ... var b Builder b.Grow(n) b.WriteString(elems[0]) for _, s := range elems[1:] { b.WriteString(sep) b.WriteString(s) } return b.String()}
Wherestrings.Builder
returns the string the same way:
// String returns the accumulated string.func (b *Builder) String() string { return unsafe.String(unsafe.SliceData(b.buf), len(b.buf))}
So you can use:
strings.Join
- the most simple and fast solutionstrings.Builder
withGrow()
- more flexible and fast, but more codecopy
+unsafe
- the most flexible and fast but much more code
All implementation has about the same performance and one memory allocation per operation.
If you don't know the total length of strings ahead, you can change implementation fromcopy()
toappend()
or usestrings.Builder
maybe with some approximate length to grow for better performance.
Comments
I do it using the following :-
package mainimport ( "fmt" "strings")func main (){ concatenation:= strings.Join([]string{"a","b","c"},"") //where second parameter is a separator. fmt.Println(concatenation) //abc}
2 Comments
package mainimport ("fmt")func main() { var str1 = "string1" var str2 = "string2" result := make([]byte, 0) result = append(result, []byte(str1)...) result = append(result, []byte(str2)...) result = append(result, []byte(str1)...) result = append(result, []byte(str2)...) fmt.Println(string(result))}
1 Comment
Simple and easy to digest solution. Details in the comments.Copy overwrites the elements of slice. We are slicing single-single element and overwriting it.
package mainimport ( "fmt")var N int = 100000func main() { slice1 := make([]rune, N, N) //Efficient with fast performance, Need pre-allocated memory //We can add a check if we reached the limit then increase capacity //using append, but would be fined for data copying to new array. Also append happens after the length of current slice. for i := 0; i < N; i++ { copy(slice1[i:i+1], []rune{'N'}) } fmt.Println(slice1) //Simple but fast solution, Every time the slice capacity is reached we get a fine of effort that goes //in copying data to new array slice2 := []rune{} for i := 0; i <= N; i++ { slice2 = append(slice2, 'N') } fmt.Println(slice2)}
Comments
benchmark result with memory allocation statistics. check benchmark code atgithub.
use strings.Builder to optimize performance.
go test -bench . -benchmemgoos: darwingoarch: amd64pkg: github.com/hechen0/goexp/expsBenchmarkConcat-8 1000000 60213 ns/op 503992 B/op 1 allocs/opBenchmarkBuffer-8 100000000 11.3 ns/op 2 B/op 0 allocs/opBenchmarkCopy-8 300000000 4.76 ns/op 0 B/op 0 allocs/opBenchmarkStringBuilder-8 1000000000 4.14 ns/op 6 B/op 0 allocs/opPASSok github.com/hechen0/goexp/exps 70.071s
2 Comments
strings.Join()
from the "strings" package
If you have a type mismatch(like if you are trying to join an int and a string), you do RANDOMTYPE (thing you want to change)
EX:
package mainimport ( "fmt" "strings")var intEX = 0var stringEX = "hello all you "var stringEX2 = "people in here"func main() { s := []string{stringEX, stringEX2} fmt.Println(strings.Join(s, ""))}
Output :
hello all you people in here
3 Comments
strings.Join()
takes only 2 parameters: a slice and a separatorstring
.s := fmt.Sprintf("%s%s", []byte(s1), []byte(s2))
2 Comments
[]byte(s1)
conversion. Comparing it with other solutions posted, can you name a single advantage of your solution?Explore related questions
See similar questions with these tags.