Movatterモバイル変換


[0]ホーム

URL:


Skip to content
DEV Community
Log in Create account

DEV Community

Abhijit Hota
Abhijit Hota

Posted on • Originally published atabhijithota.me on

     

Nerd-sniping myself over Go range expressions

Originally published onabhijithota.me

The background

I stumbled across a situation while writing a Golang project where I needed to loop over a slice returned by a function.
The function beingstrings.Split. No big deal. I wrote the following:

for_,id:=rangestrings.Split(path,"/"){// Something()}
Enter fullscreen modeExit fullscreen mode

There's another way of writing the same thing:

splitPath:=strings.Split(path,"/")for_,id:=rangesplitPath{// Something()}
Enter fullscreen modeExit fullscreen mode

To the novice-eyed, like me, it would seem the latter was anoptimized version as the function is only called once before the loop and hence moreperformant.

I did not care about the negligible performance gain I would get as thepath variable passed to the split function would be limited in length by application constraints.

But what if it wasn't?

tldr; It doesn't matter how you write it. They are both the same.
From theGolang spec,

The range expression x is evaluated once before beginning the loop, with one exception: if at most one iteration variable is present and len(x) is constant, the range expression is not evaluated.

This seems obvious once you know about it because it's such a low-hanging fruit for the compiler. It's not even compiler optimization. It's compiler duty.

From here, I'm going to write about how Inerd-sniped myself into finding the answer to this and naturally, spent around 4 hours benchmarking, trying to figure out assembly, checking the compiler optimization wiki, some trivial experimentation that confirmed my theory and one line in the spec that made it all come together.

Before we proceed into this, have some context about the original scenario. The function should split a separated string calledpath with/ as delimiter and extract all integers into an array like this:

funcExtractInts(pathstring)[]int{pathIds:=[]int{}for_,v:=rangestrings.Split(path,"/"){ifv!=""{val,_:=strconv.Atoi(v)pathIds=append(pathIds,val)}}returnpathIds}
Enter fullscreen modeExit fullscreen mode

Benchmarking the two functions

I put together a quick small project to bench the thing consisting ofmain.go andmain_test.go:

main.go

packagemainimport("strconv""strings")funcWithOptim(pathstring)int{pathIds:=[]int{}pathIdsStr:=strings.Split(path,"/")for_,v:=rangepathIdsStr{ifv!=""{val,_:=strconv.Atoi(v)pathIds=append(pathIds,val)}}returnlen(pathIds)}funcWithoutOptim(pathstring)int{pathIds:=[]int{}for_,v:=rangestrings.Split(path,"/"){ifv!=""{val,_:=strconv.Atoi(v)pathIds=append(pathIds,val)}}returnlen(pathIds)}
Enter fullscreen modeExit fullscreen mode

main_test.go

packagemainimport("strings""testing")varpath=strings.Repeat("734/956/831/811/",100_000)varresultWithOptimintfuncBenchmarkWithOptim(b*testing.B){varrintforn:=0;n<b.N;n++{r=WithOptim(path)}resultWithOptim=r}varresultWithoutOptimintfuncBenchmarkWithoutOptim(b*testing.B){varrintforn:=0;n<b.N;n++{r=WithoutOptim(path)}resultWithoutOptim=r}
Enter fullscreen modeExit fullscreen mode

The variablesr,resultWithOptim andresultWithoutOptim are needed to escape compiler optimizations and get reliable benchmarks.

Check outthis section from theawesome benchmarking article by Dave Cheney.

Results

I ran the benchmark a couple of times.1

In most cases, it was neck-and-neck:

BenchmarkWithoutOptim-8     63   18067614 ns/opBenchmarkWithOptim-8        64   17833483 ns/op3.282sBenchmarkWithoutOptim-8     64   17879142 ns/opBenchmarkWithOptim-8        66   17943536 ns/op3.287sBenchmarkWithoutOptim-8     66   17915000 ns/opBenchmarkWithOptim-8        66   17567501 ns/op3.292sBenchmarkWithoutOptim-8     56   18344452 ns/opBenchmarkWithOptim-8        57   17649186 ns/op2.090s
Enter fullscreen modeExit fullscreen mode

In some cases the one withoptimizations performed better:

BenchmarkWithoutOptim-8     64   18446009 ns/opBenchmarkWithOptim-8        73   18054308 ns/op3.548sBenchmarkWithoutOptim-8     63   18466071 ns/opBenchmarkWithOptim-8        70   18600002 ns/op3.422sBenchmarkWithOptim-8        64   18917323 ns/opBenchmarkWithoutOptim-8     58   18092279 ns/op3.215s
Enter fullscreen modeExit fullscreen mode

But in some cases the one without anyoptimizations performed faster!

BenchmarkWithOptim-8        55   18848131 ns/opBenchmarkWithoutOptim-8     68   19239865 ns/op2.395sBenchmarkWithOptim-8        57   18323290 ns/opBenchmarkWithoutOptim-8     63   17598465 ns/op2.206s
Enter fullscreen modeExit fullscreen mode

Digging into the why

The first resource I came across was theCompiler And Runtime Optimizations wiki which stated nothing about such behaviour.

I also tried converting the code into Assembly usingCompiler Explorer but couldn't understand any of it.

Was the compiler calling the function only once? Are the 2 ways of writing same? I had no way of knowing. Then I thought of something so trivial I felt dumb.

The trivial experiment

Consider the following code:

funcNums()[]int{print("Nums called\n")return[]int{1,2,3,4,5}}funcLoop(){for_,v:=rangeNums(){print(v)}}funcmain(){Loop()}
Enter fullscreen modeExit fullscreen mode

And surely enough,Nums called is only printed once toSTDOUT.
This felt so obvious once I realised it.

Eureka + Facepalm moment

Googling"go range internals" gave methis article by Robbie V which sent me to theGo language specification where I found this:

The range expression x is evaluated once before beginning the loop, with one exception: if at most one iteration variable is present and len(x) is constant, the range expression is not evaluated.

Conclusion

There's nothing much to conclude but there's a lesson to learn. In Robbie V's article, the Step 1 was toRTFM.Unlike that, I dove right into benchmarking which was like comparing two sum functions which returneda + b vs.b + a.

I discovered a lot but this shouldn't have taken so much of my time. It's a reminder on how we can take some slightly interesting things, like the one I talked about here, for granted and never realise the machinery of it.

As to which function is better to write, that would be an opinion.

Other languages

I tried the same thing in a few other languages I've coded in because I never realised this:

JavaScript

functionnums(){console.debug('Nums called');return[1,2,3,4,5];}functionloop(){for(constvofnums()){console.debug(v);}}loop();
Enter fullscreen modeExit fullscreen mode

Python

defnums():print("Nums called")return[1,2,3,4,5]defloop():forlinnums():print(l)loop()
Enter fullscreen modeExit fullscreen mode

  1. Usinggo test -bench=. -run=xxx 

    Specs:
    goos: linux
    goarch: amd64
    cpu: Intel(R) Core(TM) i5-8250U CPU @ 1.60GHz

    Order of function execution was changed a few times.

Top comments(0)

Subscribe
pic
Create template

Templates let you quickly answer FAQs or store snippets for re-use.

Dismiss

Are you sure you want to hide this comment? It will become hidden in your post, but will still be visible via the comment'spermalink.

For further actions, you may consider blocking this person and/orreporting abuse

I love almost everything about computers, specifically software. Diving deep into Web development and cloud and loving it! 💛👨‍💻
  • Location
    India
  • Education
    Indian Institute of Technology, Madras
  • Joined

Trending onDEV CommunityHot

DEV Community

We're a place where coders share, stay up-to-date and grow their careers.

Log in Create account

[8]ページ先頭

©2009-2025 Movatter.jp