redefining for loop variable semantics #56010
-
Update 2023-06-06: Go 1.21 is expected to support

We have been looking at what to do about the for loop variable problem (#20733), gathering data about what a change would mean and how we might deploy it. This discussion aims to gather early feedback about this idea, to understand concerns and aspects we have not yet considered. Thanks for keeping this discussion respectful and productive!

To recap #20733 briefly, the problem is that loops like this one don’t do what they look like they do:
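A minimal sketch of the kind of loop meant here, assuming a hypothetical `Item` type and hypothetical function names. So that the sketch behaves the same under either semantics, the buggy version declares the loop variable once before the loop, which is what per-loop semantics amount to; the second function applies the usual manual copy.

```go
package main

import "fmt"

// Item is a hypothetical element type used only for illustration.
type Item struct{ ID int }

// collectShared mimics per-loop semantics: item is declared once, so
// &item is the same address on every iteration.
func collectShared(items []Item) []*Item {
	var all []*Item
	var item Item
	for _, item = range items {
		all = append(all, &item)
	}
	return all
}

// collectCopied applies the manual fix: a fresh copy per iteration.
func collectCopied(items []Item) []*Item {
	var all []*Item
	for _, item := range items {
		item := item // per-iteration copy
		all = append(all, &item)
	}
	return all
}

func main() {
	items := []Item{{1}, {2}, {3}}
	for _, p := range collectShared(items) {
		fmt.Println("shared:", p.ID) // every pointer sees the last item
	}
	for _, p := range collectCopied(items) {
		fmt.Println("copied:", p.ID)
	}
}
```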
That is, this code has a bug. After this loop executes,
This bug also often happens in code with closures that capture the address of item implicitly, like:
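A sketch of the closure case, with hypothetical function names. As before, the shared variable is declared before the loop so the result does not depend on which loop semantics the toolchain implements.

```go
package main

import "fmt"

// callAll invokes each closure and collects the results.
func callAll(fns []func() int) []int {
	out := make([]int, 0, len(fns))
	for _, f := range fns {
		out = append(out, f())
	}
	return out
}

// capturedShared mimics per-loop semantics: all closures capture the
// same v and read it only after the loop has finished.
func capturedShared() []int {
	var fns []func() int
	var v int
	for _, v = range []int{1, 2, 3} {
		fns = append(fns, func() int { return v })
	}
	return callAll(fns)
}

// capturedCopied adds the manual v := v, so each closure has its own v.
func capturedCopied() []int {
	var fns []func() int
	for _, v := range []int{1, 2, 3} {
		v := v
		fns = append(fns, func() int { return v })
	}
	return callAll(fns)
}

func main() {
	fmt.Println(capturedShared()) // [3 3 3]
	fmt.Println(capturedCopied()) // [1 2 3]
}
```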
This code prints 3, 3, 3, because all the closures print the same v, and at the end of the loop, v is set to 3. Note that there is no explicit &v to signal a potential problem. Again the fix is the same: add v := v.

Goroutines are also often involved, although as these examples show, they need not be. See also the Go FAQ entry.

We have talked for a long time about redefining these semantics, to make loop variables per-iteration instead of per-loop. That is, the change would effectively be to add an implicit “x := x” at the start of every loop body for each iteration variable x, just like people do manually today. Making this change would remove the bugs from the programs above.

In the Go 2 transitions document we gave the general rule that language redefinitions like what I just described are not permitted. I believe that is the right general rule, but I have come to also believe that the for loop variable case is strong enough to motivate a one-time exception to that rule. Loop variables being per-loop instead of per-iteration is the only design decision I know of in Go that makes programs incorrect more often than it makes them correct. Since it is the only such design decision, I do not see any plausible candidates for additional exceptions.

To make the breakage completely user controlled, the way the rollout would work is to change the semantics based on the go line in each package’s go.mod file, the same line we already use for enabling language features (you can only use generics in packages whose go.mod says “go 1.18” or later). Just this once, we would use the line for changing semantics instead of for adding a feature or removing a feature.
If we hypothetically made the change in go 1.30, then modules that say “go 1.30” or later get the per-iteration variables, while modules with earlier versions get the per-loop variables:

In a given code base, the change would be “gradual” in the sense that each module can update to the new semantics independently, avoiding a bifurcation of the ecosystem.

The specific semantics of the redefinition would be that both range loops and three-clause for loops get per-iteration variables. So in addition to the program above being fixed, this one would be fixed too:
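A sketch of the three-clause case, with hypothetical function and variable names. The first function declares the counter before the loop to mimic per-loop semantics. The second writes out by hand one way to get a per-iteration variable, with a copy at the start of the body and a copy back at the end; it is an illustration of the idea, not the compiler's actual rewrite.

```go
package main

import "fmt"

// callAll invokes each closure and collects the results.
func callAll(fns []func() int) []int {
	out := make([]int, 0, len(fns))
	for _, f := range fns {
		out = append(out, f())
	}
	return out
}

// sharedCounter mimics per-loop semantics: every closure captures the
// single i and sees the value it holds once the loop has ended.
func sharedCounter() []int {
	var fns []func() int
	i := 1
	for ; i <= 3; i++ {
		fns = append(fns, func() int { return i })
	}
	return callAll(fns)
}

// perIterationCounter copies the loop counter into a fresh variable at
// the top of the body and copies it back at the bottom, so the
// increment still sees any change the body made.
func perIterationCounter() []int {
	var fns []func() int
	for loopI := 1; loopI <= 3; loopI++ {
		i := loopI // copy in
		fns = append(fns, func() int { return i })
		loopI = i // copy out
	}
	return callAll(fns)
}

func main() {
	fmt.Println(sharedCounter())       // [4 4 4]
	fmt.Println(perIterationCounter()) // [1 2 3]
}
```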
In the 3-clause form, the start of the iteration body copies the per-loop

Adjusting the 3-clause form may seem strange to C programmers, but the same capture problems that happen in range loops also happen in three-clause for loops. Changing both forms eliminates that bug from the entire language, not just one place, and it keeps the loops consistent in their variable semantics. That consistency means that if you change a loop from using range to using a 3-clause form or vice versa, you only have to think about whether the iteration visits the same items, not whether a subtle change in variable semantics will break your code. It is also worth noting that JavaScript is using per-iteration semantics for 3-clause for loops using let, with no problems.

I think the semantics are a smaller issue than the idea of making this one-time gradual breaking change. I’ve posted this discussion to gather early feedback on the idea of making a change here at all, because that’s something we’ve previously treated as off the table.

I’ve outlined the reasons I believe this case merits an exception below. I’m hoping this discussion can surface concerns, good ideas, and other feedback about the idea of making the change at all (not as much the semantics).

I know that C# 5 made this change as well, but I’ve been unable to find any retrospectives about how it was rolled out or how it went. If anyone knows more about how the C# transition went or has links to that information, please post that too. Thanks!

The case for making the change:

A decade of experience shows the cost of the current semantics

I talked at Gophercon once about how we need agreement about the existence of a problem before we move on to solutions. When we examined this issue in the run up to Go 1, it did not seem like enough of a problem. The general consensus was that it was annoying but not worth changing.

Since then, I suspect every Go programmer in the world has made this mistake in one program or another.
I certainly have done it repeatedly over the past decade, despite being the one who argued for the current semantics and then implemented them. (Sorry!)

The current cures for this problem are worse than the disease.

I ran a program to process the git logs of the top 14k modules, from about 12k git repos and looked for commits with diff hunks that were entirely “x := x” lines being added. I found about 600 such commits. On close inspection, approximately half of the changes were unnecessary, done probably either at the insistence of inaccurate static analysis, confusion about the semantics, or an abundance of caution. Perhaps the most striking was this pair of changes from different projects:
One of these two changes is unnecessary and the other is a real bug fix, but you can’t tell which is which without more context. (In one, the loop variable is an interface value, and copying it has no effect; in the other, the loop variable is a struct, and the method takes a pointer receiver, so copying it ensures that the receiver is a different pointer on each iteration.) And then there are changes like this one, which is unnecessary regardless of context (there is no opportunity for hidden address-taking):
This kind of confusion and ambiguity is the exact opposite of the readability we are aiming for in Go. People are clearly having enough trouble with the current semantics that they choose overly conservative tools and add “x := x” lines by rote in situations not flagged by tools, preferring that to debugging actual problems. This is an entirely rational choice, but it is also an indictment of the current semantics.

We’ve also seen production problems caused in part by these semantics, both inside Google and at other companies (for example, this problem at Let’s Encrypt). It seems likely to me that, world-wide, the current semantics have easily cost many millions of dollars in wasted developer time and production outages.

Old code is unaffected, compiling exactly as before

The go lines in go.mod give us a way to guarantee that all old code is unaffected, even in a build that also contains new code. Only when you change your go.mod line do the packages in that module get the new semantics, and you control that. In general this one reason is not sufficient, as laid out in the Go 2 transitions document. But it is a key property that contributes to the overall rationale, with all the other reasons added in.

Changing the semantics is usually a no-op, and when it’s not, it fixes buggy code far more often than it breaks correct code

We built a toolchain with the change and tested a subset of Google’s Go tests and analyzed the resulting failures. The rate of new test failures was approximately 1 in 2,000, but nearly all were previously undiagnosed actual bugs. The rate of spurious test failures (correct code actually broken by the change) was 1 in 50,000.

To start, there were only 58 failures out of approximately 100,000 tests executed, covering approximately 1.3M for loops.
Of the failures, 36 (62%) were tests not testing what they looked like they tested because of bad interactions with t.Parallel: the new semantics made the tests actually run correctly, and then the tests failed because they found actual latent bugs in the code under test. The next most common mistake was appending &v on each iteration to a slice, which makes a slice of N identical pointers. The rest were other kinds of bugs canceling out to make tests pass incorrectly.

We found only 2 instances out of the 58 where code correctly depended on per-loop semantics and was actually broken by the change. One involved a handler registered using once.Do that needed access to the current iteration’s values on each invocation. The other involved low-level code running in a context when allocation is disallowed, and the variable escaped the loop (but not the function), so that the old semantics did not allocate while the new semantics did. Both were easily adjusted.

Of course, there is always the possibility that Google’s tests may not be representative of the overall ecosystem’s tests in various ways, and perhaps this is one of them. But there is no indication from this analysis of any common idiom at all where per-loop semantics are required. The git log analysis points in the same direction: parts of the ecosystem are adopting tools with very high false positive rates and doing what the tools say, with no apparent problems.

There is also the possibility that while there’s no semantic change, existing loops would, when updated to the new Go version, allocate one variable per iteration instead of once per loop. This problem would show up in memory profiles and is far easier to track down than the silent corruption we get when things go wrong with today’s semantics. Benchmarking of the public “bent” bench suite showed no statistically significant performance difference over all, so we expect most programs to be unaffected.
Good tooling can help users identify exactly the loops that need the most scrutiny during the transition

Our experience analyzing the failures in Google’s Go tests shows that we can use compiler instrumentation (adjusted -m output) to identify loops that may be compiling differently, because the compiler thinks the loop variables escape. Almost all the time, this identifies a very small number of loops, and one of those loops is right next to the failure. That experience can be wrapped up into a good tool for directing any debugging sessions.

Another possibility is a compilation mode where the compiled code consults an array of bits to decide during execution whether each loop gets old or new semantics. Package testing could provide a mode that implements binary search on that array to identify exactly which loops cause a test to fail. So if a test fails, you run the “loop finding mode” and then it tells you: “applying the semantic change to these specific loops causes the failure”. All the others are fine.

Static analysis is not a viable alternative

Whether a particular loop is “buggy” due to the current behavior depends on whether the address of an iteration value is taken and then that pointer is used after the next iteration begins. It is impossible in general for analyzers to see where the pointer lands and what will happen to it. In particular, analyzers cannot see clearly through interface method calls or indirect function calls. Different tools have made different approximations. Vet recognizes a few definitely bad patterns, and we are adding a new one checking for mistakes using t.Parallel in Go 1.20. To avoid false positives, it also has many false negatives. Other checkers in the ecosystem err in the other direction. The commit log analysis showed some checkers were producing over 90% false positive rates in real code bases.
(That is, when the checker was added to the code base, the “corrections” submitted at the same time were not fixing actual problems over 90% of the time in some commits.)

There is no perfect way to catch these bugs statically. Changing the semantics, on the other hand, eliminates them all.

Changing loop syntax entirely would cause unnecessary churn

We have talked in the past about introducing a different syntax for loops (for example, #24282), and then giving the new syntax the new semantics while deprecating the current syntax. Ultimately this would cause a very significant amount of churn disproportionate to the benefit: the vast majority of existing loops are correct and do not need any fixes. It seems like an extreme response to force an edit of every for loop that exists today while invalidating all existing documentation and then having two different for loops that Go programmers need to understand, especially compared to changing the semantics to match what people overwhelmingly expect when they write the code.

My goal for this discussion is to gather early feedback on the idea of making a change here at all, because that’s something we’ve previously treated as off the table, as well as any feedback on expected impact and what would help users most in a roll-out strategy. Thanks!
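The “loop finding mode” described earlier in this post, a binary search over a per-loop array of bits, could be sketched roughly as follows. Everything here is an assumption made for illustration: the name `findCulprit`, the `fails` callback standing in for "run the test with these loops switched to the new semantics", and the simplification that exactly one loop is responsible and that there is at least one loop.

```go
package main

import "fmt"

// findCulprit returns the index of the single loop whose switch to the
// new semantics makes the test fail. fails is a hypothetical hook that
// runs the test with newSemantics[i] == true meaning loop i uses the new
// semantics. The sketch assumes n >= 1 and exactly one culprit.
func findCulprit(n int, fails func(newSemantics []bool) bool) int {
	lo, hi := 0, n // invariant: the culprit lies in [lo, hi)
	for hi-lo > 1 {
		mid := lo + (hi-lo)/2
		bits := make([]bool, n)
		for i := lo; i < mid; i++ {
			bits[i] = true // enable the new semantics for the lower half only
		}
		if fails(bits) {
			hi = mid // culprit is in the half that was enabled
		} else {
			lo = mid // culprit is in the half that was left alone
		}
	}
	return lo
}

func main() {
	const culprit = 7 // stand-in for the loop that breaks the test
	fails := func(bits []bool) bool { return bits[culprit] }
	fmt.Println(findCulprit(10, fails)) // 7
}
```

With n loops this needs about log2(n) test runs. Handling several interacting loops would need a more elaborate search than this sketch.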
Replies: 50 comments 241 replies
-
I work on the C# team and can offer perspective here. The C# 5 rollout unconditionally changed the

This change was not taken lightly. It had been discussed internally for several years, blogs were written about it, lots of analysis of customer code, upper management buy off, etc ... In the end, though, the change was rather anticlimactic. Yes, it did break a small number of customers, but it was smaller than expected. The customers impacted responded positively to our justifications and accepted the proposed code fixes to move forward.

I'm one of the main people who does customer feedback triage as well as someone who helps customers migrating to newer versions of the compiler that stumble onto unexpected behavior changes. That gives me a good sense of what pain points exist for tooling migration. This was a small blip when it was first introduced but quickly faded. Even as recently as a few years ago I was helping large code bases upgrade from C# 4. While they do hit other breaking changes we've had, they rarely hit this one. I'm honestly struggling to remember the last time I worked with a customer hitting this.

It's been ~10 years since this change was taken to the language and a lot has changed in that time. Projects have a property
This separation has been very successful for us and allowed us to make changes that would not have been possible in the past. If we were doing this change today we'd almost certainly tie the break to a |
-
Thank you very much @jaredpar for sharing this perspective!
-
@jaredpar Thank you very much! That was very informative and insightful. Do you have any further perspective to share on the decision to change Internally within the Go compiler team, this issue has repeatedly been discussed as: (1) Go's The compiler team has been leaning towards changing both, mostly motivated to make sure users can safely switch between
-
For what it's worth, I'm satisfied with the answers you and Russ provided elsethread. As long as the solution can be narrowly tailored to fix the closure-in-a-loop problem without otherwise changing
-
It was considered but rejected. One issue is that The other thought was that in
We haven't had any serious discussions about it that I remember. A bit of context that may not be as obvious from the outside. At that time the C# team was extremely sensitive to breaking changes. Given we couldn't leverage
-
Are there any valid serious use cases preferring the current semantics? If there are none, the C# way is better than the go.mod way in my opinion. Most code readers don't care about the go version set in go.mod at all. The fact that the same piece of code behaves differently is a bad user experience and a potential security risk for many projects. |
-
IIRC, I think the meaning of an absent go.mod or absent
A somewhat minor detail, but as part of this discussion, it might be nice to explicitly state either that assumed version won’t change again, or that if this range variable change was hypothetically in go 1.30, then a missing go.mod or missing |
-
Yes, we kept the implied version for an absent go line / go.mod file bumping forward for as long as the releases were completely compatible (or close enough). That stopped at Go 1.17, as you note. Since then, all releases assume that code without a go line gets Go 1.16 semantics. That is set in stone now and will never change. |
-
We actually had an in-depth discussion about precisely this problem this morning in the Go Brasil telegram group (we are now joking you guys are spying on us hehe).

One thing to notice in this discussion is that even after having this problem explained multiple times by different people, multiple developers were still having trouble understanding exactly what caused it and how to avoid it, and even the people that understood the problem in one context often failed to notice it would affect other types of for loops.

So we could argue that the current semantics make the learning curve for Go steeper.

PS: I have also had problems with this multiple times, once in production, so I am very much in favor of this change even considering the breaking aspect of it.
-
This exactly matches my experience. It's relatively easy to understand the first example (taking the same address each time), but somewhat trickier to understand in the closure/goroutine case. And even when you do understand it, one forgets (apparently even Russ forgets!). In addition, issues with this often don't show up right away, and then when debugging an issue, I find it always takes a while to realize that it's "that old loop variable issue again". |
-
I also find there to be a frustrating split of “workarounds” for closures. Some people recommend redefinition, and others assignment through a function parameter:
vs.
I’ve tried to explain that the first option, while seemingly non-functional, is the better option: it uses type inference so a change in type name does not require a change in this loop; it shadows the per-loop

In the end, both patterns “solve” the problem, but neither of them in a wholly satisfactory way. The per-iteration redefinition provides the most concise and robust code, but really calls for documentation every time, because this may be the first time a reader has come across the pattern and they might think it superfluous.

All in all, I’m pretty happy to see this semantic change. Having to work around it for a more efficient only-once allocation is a welcome burden compared with having to explain this extremely common bug every time it pops up. The only-once allocation also becomes far clearer as a result, and no longer looks unnecessary:
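For readers who have not met them, the two workarounds contrasted in this comment might look like the following sketch. The function names and the goroutine bookkeeping (`sync.WaitGroup`, mutex, sorting the results) are assumptions added so the example is self-contained; only the `v := v` line and the function-parameter form are the patterns under discussion.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// viaRedeclaration shadows the loop variable with a per-iteration copy.
func viaRedeclaration(vals []int) []int {
	var (
		mu  sync.Mutex
		wg  sync.WaitGroup
		got []int
	)
	for _, v := range vals {
		v := v // redefinition: type is inferred, name is unchanged
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			defer mu.Unlock()
			got = append(got, v)
		}()
	}
	wg.Wait()
	sort.Ints(got) // goroutines finish in no particular order
	return got
}

// viaParameter passes the current value as an argument instead.
func viaParameter(vals []int) []int {
	var (
		mu  sync.Mutex
		wg  sync.WaitGroup
		got []int
	)
	for _, v := range vals {
		wg.Add(1)
		go func(v int) { // the parameter type must be spelled out
			defer wg.Done()
			mu.Lock()
			defer mu.Unlock()
			got = append(got, v)
		}(v)
	}
	wg.Wait()
	sort.Ints(got)
	return got
}

func main() {
	fmt.Println(viaRedeclaration([]int{1, 2, 3})) // [1 2 3]
	fmt.Println(viaParameter([]int{1, 2, 3}))     // [1 2 3]
}
```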
-
I'm totally on board with changing The case for changing the 3-clause for loop is less clear to me. I'm worried about code which intentionally modifies the loop variable in the loop body. E.g.

    for i := 0; i < len(a); i++ {
        if a[i] == "" {
            a = append(a[:i], a[i+1:]...)
            i--
        }
    }

Maybe not the best example -- but the point is that such code, while rare, almost certainly exists and could be silently broken by this change. Note that this argument does not apply to I suppose the fix in this case is to move the variable declaration out of the for loop, though that seems almost as inscrutable a change as adding

    -for i := 0; i < len(a); i++ {
    +i := 0
    +for ; i < len(a); i++ {

In general, 3-clause for loops are what you reach for when doing something "tricky". The exception, of course, is when you want a simple loop over a range of integers, in which case the 3-clause form is the only option. If there were a dedicated
-
@DeedleFake This discussion is only about changing loops that explicitly declare variables. Loops that do not declare variables, like your example, are unaffected |
-
Oh, duh. After all, without a declaration, how would the compiler know which variable to copy and update on each iteration? |
-
I think the argument in #56010 (reply in thread) is a stronger reason to prefer leaving 3-argument for as-is, than the correctness arguments here. It seems that the 3-argument for loop construction looking identical to many other languages, but being in fact slightly different, would be something difficult to teach and difficult for new learners to retain (since it generally 'just works' regardless).
-
So would the 3 argument for loop looking similar to the range loop and working slightly differently |
-
The proposal that I read (unless rsc updated it between your comment and me reading it), is clear that in the 3 argument Thus, your presented code should still work fine:
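A sketch of why a loop that adjusts its own counter keeps working, assuming (as elsewhere in this thread) that the 3-clause form copies the variable at the start and end of each iteration. `removeEmpty` is the loop from the parent comment wrapped in a hypothetical function; `removeEmptyDesugared` writes the copy in and copy out by hand to show that the `i--` still reaches the increment.

```go
package main

import "fmt"

// removeEmpty deletes empty strings using the loop as originally written.
func removeEmpty(in []string) []string {
	a := append([]string(nil), in...) // work on a copy of the input
	for i := 0; i < len(a); i++ {
		if a[i] == "" {
			a = append(a[:i], a[i+1:]...)
			i--
		}
	}
	return a
}

// removeEmptyDesugared is the same loop with the per-iteration copy
// written out explicitly (an illustration, not the compiler's output).
func removeEmptyDesugared(in []string) []string {
	a := append([]string(nil), in...)
	for loopI := 0; loopI < len(a); loopI++ {
		i := loopI // copy in at the start of the iteration
		if a[i] == "" {
			a = append(a[:i], a[i+1:]...)
			i--
		}
		loopI = i // copy out, so loopI++ sees the decrement
	}
	return a
}

func main() {
	in := []string{"a", "", "b", "", "", "c"}
	fmt.Println(removeEmpty(in))          // [a b c]
	fmt.Println(removeEmptyDesugared(in)) // [a b c]
}
```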
-
vendored dependencies

How would vendored dependencies be handled? My concern is that since
-
For modules at |
-
Thanks, @bcmills. I thought that was the case, but couldn't find it. I realize now that the reason I couldn't find it in How would loop variables be treated in those kinds of cases?
-
Un-Go-versioned dependencies are treated as being written for Go 1.16 already for certain module analyses. The same would apply here. |
-
If we do this, I would like us to be extra-loud about it. That is, if I'd also like a tool that removes all my |
-
Any message that does not break the build process is going to be overlooked by at least some people and all automation. There’s kind of the idea in Go that if something is worth warning about, then it’s worth breaking the build/automation. That would mean we would want to stop any |
-
This is what tests are for, not obscure warning messages to the final user that they then must go spelunking about because there might be a problem somewhere.
-
For what it's worth, I really like this change. I can easily teach to this and it strengthens the value semantic aspects of that loop. The ideas of how to use go.mod seem reasonable and valid. |
-
This is a good change and would make using table tests a lot easier to handle due to the per-instance rather than per-loop expectations. As a further enhancement, perhaps better discussed in a different topic, would it be possible for the Go compiler to flag when such upgrades will change the way the code works and inform the users about it? In your example:
What do you think? Edit: |
-
We can definitely provide tools for people to analyze their code to understand likely sources of changes, and we would absolutely do that. I'm not sure it makes sense to track "was this code ever compiled with Go 1.30 before?" and print information during an ordinary build. That question can't be answered reliably (what if the upgrade happened on a different machine and was checked into the repo and you just ran git pull?). |
-
That's true. Thinking a bit more about this, maybe a (standalone?) Go tool that takes the compiler, with all its flags and, for a given codebase, runs all the Go versions between "current" and "target" and produces the following output:
The tool could keep track of more than just compiler changes. Deprecations, library behavior changes for functions, etc, could all be part of this. Then the developers could run this tool and provide a clear upgrade path. This would make it easy to determine if/what work is required for developers to upgrade a codebase. Edit: Integration with editors/other tools would then help people discover these issues easier. |
-
There are two aspects of this that I'd like more details on, either as part of this discussion or in a future proposal:
-
The once.Do example would have been better written by hoisting the shared variable declarations outside the loop and doing the stubbing once, there. I thought it was very confusing code. With the current implementation, if there is no apparent capture, the loop is not changed. I say "apparent" because the detector has to run before escape analysis and thus it is over-conservative; it looks for mention within closures and address-taking (address-taking also occurs when passing a V to a *V method or constructing a method value). I can see how in the future we'll have a discussion about "technical debt in the compiler" because its internal representation of for/for-range will have old-style capture, but doing it this way ensures usual-case no-overhead and makes it easier to detect where the change occurs. If there is apparent capture in for-range, but escape analysis determines there is no escape, then it is very nearly the same loop. If there is escape, then there will tend to be a new allocation for each iteration, and if that is bad, hoist the declaration prior to the loop and use "=" in the range, as in:
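A sketch of the hoisting workaround just described, with hypothetical function names. It counts how many distinct variables the loop creates by collecting their addresses: the first function takes the address of a per-iteration copy (written explicitly so the result does not depend on the toolchain), the second hoists the declaration and uses "=" in the range clause so that a single variable is reused.

```go
package main

import "fmt"

// distinctPerIteration takes the address of a fresh variable on every
// iteration, so each iteration contributes its own variable.
func distinctPerIteration(vals []int) int {
	seen := make(map[*int]bool)
	for _, v := range vals {
		v := v // explicit per-iteration copy
		seen[&v] = true
	}
	return len(seen)
}

// distinctHoisted declares v before the loop and assigns with "=", so
// the same variable (and the same address) is reused on every iteration.
func distinctHoisted(vals []int) int {
	seen := make(map[*int]bool)
	var v int
	for _, v = range vals {
		seen[&v] = true
	}
	return len(seen)
}

func main() {
	vals := []int{10, 20, 30}
	fmt.Println(distinctPerIteration(vals)) // 3
	fmt.Println(distinctHoisted(vals))      // 1
}
```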
but escape analysis works well. 3-clause for loops with capture are much rarer (97% less common than range loops, over non-test Google code, and they didn't show up in tests at all), but the same treatment of escaping variables applies. The transformation we copied from JavaScript introduces a little branchy overhead that might not come out in optimization (but we might target that optimization in the future), but (1) if there is no increment clause, it can be omitted (that's already implemented) and (2) the same declaration-hoisting change that works for for-range also eliminates the 3-clause per-iteration variable, and thus eliminates the transformation and thus eliminates the branchy code. So the workaround is not hard in either case. There's also the possibility of a tool that could, if this change causes failure and you can't easily figure out where, pinpoint exactly the function and loop where it goes wrong, if we could figure out the right packaging for the tool. This would use the |
-
    var once sync.Once
    for _, tt := range testCases {
        once.Do(func() {
            http.HandleFunc("/handler", func(w http.ResponseWriter, r *http.Request) {
                w.Write(tt.Body)
            })
        })
        result := get("/handler")
        if result != tt.Body {
            ...
        }
    }

@rsc could you please explain this? Sorry for late (and possibly already answered) question.
-
@aklepatc I assume that's an error introduced by the reproduction from memory. You can either replace The explanation of that code is that when A cleaner way to do that would be

    var body string
    http.HandleFunc("/handler", func(w http.ResponseWriter, r *http.Request) {
        io.WriteString(w, body)
    })
    for _, tt := range testCases {
        body = tt.Body
        result := get("/handler")
        if result != tt.Body {
            ...
        }
    }
-
Ty @Merovius! Bytes equality check was the only thing that didn't make sense about this code snippet. The rest was reasonably clear.
-
I am glad to see this. Currently the However, for this transition, this won't be the case. Am I understanding it correctly? More specifically, in the example that hypothetically assumes the change is made in go 1.30, I think any attempt to compile or import the work module (with And non-module based build systems (e.g. bazel) will also need a plan to move forward.
-
I think we will want authors of modules to be able to upgrade their module to 1.30 without breaking their users (my understanding of forward compatibility). I do not feel we should consider this a breaking API change as it is an internal implementation detail. @rsc discussed forward compatible changes in #55092. It could be reasonable for tooling to start warning about "may escape for/for range vars" once you have a mixed version workspace. The basis would be that it is a readability challenge to switch back and forth while navigating code.
To facilitate upgrading from python 2 to 3, bazel has a python_version field for pybinary and srcs_version for pylibrary. Something similar can be added to go_library to facilitate large scale bazel migrations. |
-
This is discussed in #55092
-
Yes, I believe shipping #55092 first would make it safer to do this change.
-
@rsc: reading through that proposal, it sounds like it would only upgrade the toolchain to the Go version of the main module of the project. If the installed toolchain is Go 1.X, the main module's go.mod says |
-
This question puzzles me as well, and I can't think of a way to make it work without upgrading the toolchain from 1.X to 1.Z (which may not be possible). Go 1.X can't know about the semantic change in 1.Z, so it couldn't even emit a warning other than a generic version-mismatch. |
-
The proposal for the 3-clause form, with implied copies at the start and end of the iteration, seems inherently racy in the presence of goroutines that outlive the iteration. Is there any remedy for this, or do we live with the raciness of possibly changing the variable before the iteration proceeds?
-
😬 I don’t feel well with calling the proposed code “non-racy” just because it is not triggering the race-condition detection. I would in fact strongly assert that this is absolutely already racy code despite not being caught by the race detector. That one can design a precarious piece of code that is both racy but does not trigger the race detector is indisputable, but one kind of has to expect that subtle changes can always easily “unexpectedly” break such precarious code. |
-
@puellanivis, maybe you missed this detail, but the original is non-racy because the iteration only runs once ( Increment is on a goroutine started by the original goroutine, so initialization happens-before increment. As those are the only two accesses, it's fine. This change adds a third access at the end of the iteration, which races with the child goroutine. |
-
No, I did not miss that the code does not trigger the race detector because it’s secretly basically single-threaded. However, changing that This is specifically why I called the code “precarious”. It is inherently racy even though it does not trigger the race detector, which it does only because it has been specially designed to evade the race detector.
-
It doesn't trigger the race detector because there is no data race, not for a flaw in the race detector. That the example is contrived is explicitly stated, and acknowledged in the reactions. PS: adding an access to |
-
@puellanivis FWIW I agree with @ncruces and @mdempsky. The code is not racy. That it is easy to change it into racy code does not change that. Most race-free code can easily be modified into being racy. It's not good code and it's certainly precarious code, which is why @mdempsky said it's contrived. But it currently has well-defined semantics and does not contain a race and it would contain a race under the proposed change. However, when entertaining this change we are already accepting that it would break compatibility and trying to quantify that breakage. I hope we all agree that this example is sufficiently contrived not to measurably change "how breaking" changing loop-variables would be.
👍 3❤️ 2
-
If there's a desire to de-risk this against breakage to existing code, perhaps one option would be to make capturing a loop variable a compilation error for a version or two?
👍 1👎 2
-
Not all loop captures are bugs. The vast majority of loop captures today are fine. The "compilation error" phase would break all of today's correct code, causing unnecessary churn. |
👍 7👎 1
-
I like this change and think it would eliminate one of the biggest footguns in Go. Is it possible to mechanically rewrite old code so that it compiles, under the new semantics, to have exactly the pre-change behavior? Granted, this tool would need to be conservative in the cases where a static analysis tool can't be sure whether the change in For example,
is probably broken code. But suppose that we wanted to compile this under the proposed new
(assuming that
If such a rewrite tool existed, then a conservative workflow for transitioning a project to the new
Even better would be for the rewrite tool to delete any now-provably-unnecessary "foo := foo" lines, so that old code can benefit from this simplification. Such a tool would make it easy for people to find the places in their code that might be affected by the new semantics. Though I also see some danger that many people might just commit the rewritten code, thus perpetuating any pre-existing bugs that might otherwise have been fixed by the transition to the new semantics. Therefore, another option might be for the "rewrite" tool to insert commented-out code, or just add comments that highlight the places where the behavior might change under the new
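To make the rewrite idea concrete, here is a minimal sketch of the two forms such a tool would be choosing between. The function names are made up for illustration. Hoisting the declaration out of the loop pins the old per-loop behavior no matter which semantics the compiler applies, while the manual copy gives per-iteration behavior under either.

```go
package main

import "fmt"

// captureHoisted declares i once, outside the loop, so every closure shares
// the same variable. This is the shape a conservative rewrite could emit to
// keep the pre-change (per-loop) behavior under either semantics.
func captureHoisted(n int) []func() int {
	var fns []func() int
	var i int
	for i = 0; i < n; i++ {
		fns = append(fns, func() int { return i })
	}
	return fns
}

// capturePerIteration copies i into a fresh variable on every iteration
// (the manual "i := i" idiom), so each closure sees its own value.
func capturePerIteration(n int) []func() int {
	var fns []func() int
	for i := 0; i < n; i++ {
		i := i
		fns = append(fns, func() int { return i })
	}
	return fns
}

// results calls every closure and collects what it returns.
func results(fns []func() int) []int {
	out := make([]int, 0, len(fns))
	for _, f := range fns {
		out = append(out, f())
	}
	return out
}

func main() {
	fmt.Println(results(captureHoisted(3)))      // [3 3 3]
	fmt.Println(results(capturePerIteration(3))) // [0 1 2]
}
```

A diff full of hoisted declarations is exactly the "perpetuating pre-existing bugs" risk mentioned above, since `[3 3 3]` is rarely what the author wanted.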
👍 1
-
That tool can be written and probably will be. Given how exceedingly rarely the old behavior is the correct one, I am not 100% sure that using such a tool is the right way to approach a migration. The "git diff" is going to be error-prone and tedious. If you have good tests, testing and looking into failures is probably a better approach. But the tool will be important to have nonetheless.
-
To have a good implementation of such a tool it is important to have somewhat good escape analysis. You do not want all conversions to an interface or method calls to a pointer receiver to cause a "may escape" warning. If "(adjusted -m output)" happens like Russ mentioned, we have some preliminary evidence such a tool is likely to be reasonable for many people but not churn free though. (Take a look at the "Changing the semantics is usually a no-op" section for the evidence available.)
I had not yet considered creating a TODO list via comments. Such comments could be searchable if the text is somewhat unique. It can be applied pre-transition. This would not be churn free, but it could be tackled over time, given as an onboarding project, additional tooling, etc. It may be a nice alternative to hoisting the declaration before the loop, which could be forgotten (and is more likely to keep old bugs). Thanks for the idea.
👍 3
-
Not sure, but I am worried about whether this change will work correctly when using "go generate" between mixed versions. For example, if we generate Go code for an older version with this change in effect. Of course, it is the generator's responsibility to consider older versions, but we believe it must be considered.
👍 1
-
A fair point, and yet another reason for the general rule that we don't make breaking changes. Generators would need to emit code that works with either semantics, but given the very low rate of code that is correct today and incorrect with the changed semantics, I suspect the vast majority of generators are fine already. |
👍 4
-
It could be helpful to provide a link to this related prior work around determining the scope of the problem on GitHub. |
-
Thanks. I don't remember seeing that project before. I'm curious what analyzer it is using. I clicked on three issues at random from the first page of issues in rangeloop-pointer-findings, and only one of them is a real bug:
-
Yep! The goal of this project was to find all the instances with false positives and rely on crowdsourcing to filter through the false positives. That didn't pan out. IIRC, the analyzer is based on looppointer with some additional customizations. It's been a while since I worked on it. The initial pass didn't use the type-checker. I started bringing in type-checking before being pulled away by a new job.
-
I really do not see this as a useful change. These changes always have the best intentions, but the reality is that the language works just fine now. These well-intended changes slowly creep in over time, until you wind up with the C++ language yet again. If someone can't understand a relatively simple design decision like this one, they are not going to understand how to properly use channels and other language features of Go. Burying a change to the semantics of the language in go.mod is absolutely bonkers. It's supposed to be controlling modules, not the semantics of the language. If you really want to do this, then
👍 1👎 21
-
We already use |
👍 9
-
Honestly, this should just be a //go:newfor comment or a OR we can have an import that enables the semantics per file. I for one despise breaking changes to the language that aren't opt-in (I like opt-in so they can be reviewed at my own convenience without losing other new features of the language). Per file would be better if possible.
👍 1👎 20
-
This suggestion is addressed in the Go 2 transition document:
If I understand you correctly, this is addressed in the "Changing loop syntax entirely would cause unnecessary churn" section of the top post. If you have new data or new, so far not considered arguments, they would be welcome. |
-
Per file control is somewhat doable via build tags or a similar mechanism. That does not really solve what you are discussing though. My main concern about per file control is that this is not super compatible with the rest of the tooling and build ecosystem. (Kinda minor technical point but it does matter.) Per package controls would be quite a bit smoother to build today. That is a smaller unit than per module. But it is not as fine grained as per file or per loop. FWIW there are discussions of having a tool that would annotate/rewrite existing loops that may escape. This could be done before or during an update. Doing this step would be opt-in. This would let folks that want to review over time do so. I realize this is not quite what you are requesting, but it may cover some concerns.
-
If the author knows to use special annotations, they at least probably already know how to avoid this. It's a better outcome to make it be less surprising for developers less familiar with Go. |
👍 4
-
It seems like a good idea, but what about making some small syntactical change to for loops at the same time, such that the old syntax doesn't compile and the new syntax works after changing go.mod? Otherwise, when reading for loops in Go code (which might be in a context where we don't have quick access to go.mod - consider a code review or someone asking for help), we won't know which semantics apply: the old way or the new way? If there's a change in syntax (which can be made automatically at the same time as changing go.mod) then the old syntax will look increasingly dated and people will be aware that they're looking at code that's doing it the old way. Or at least, that something weird is going on, if they're only familiar with the new syntax. I don't have any specific suggestion for improving the syntax, though; the difficulty would be coming up with something people like. |
-
Please see the section "Changing loop syntax entirely would cause unnecessary churn" in the top post. |
-
I 100% support this. |
-
I am generally positive about this proposition.
    i := 0
    for ; i < x; i++ {
which is far from elegant. In the case that the Go team decides to keep the two types of loops consistent, I oppose this proposition.
👀 1
-
From the top post:
The loop variables are copied back-and-forth so nothing changes. You can still modify |
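For anyone who wants to see the copy-in/copy-out idea spelled out, here is a rough sketch that emulates a 3-clause loop with a fresh variable per iteration, written by hand. `loopPerIteration`, `carried`, and `skipAhead` are names invented for this illustration and say nothing about how the compiler would implement it.

```go
package main

import "fmt"

// loopPerIteration emulates "for i := 0; i < n; i++ { body }" with a fresh i
// per iteration. The value is copied in at the top of each iteration and
// copied back out before the post statement runs.
func loopPerIteration(n int, body func(i *int)) {
	carried := 0 // holds the value between iterations
	for {
		i := carried // copy in: a new variable for this iteration
		if !(i < n) {
			break
		}
		body(&i)
		carried = i // copy out: the body's changes to i are kept
		carried++   // the post statement
	}
}

// skipAhead runs a body that modifies i mid-loop and records what it saw.
func skipAhead() (visited []int, addrs []*int) {
	loopPerIteration(6, func(i *int) {
		visited = append(visited, *i)
		addrs = append(addrs, i)
		if *i == 1 {
			*i = 3 // jump forward, as a loop body may do today
		}
	})
	return visited, addrs
}

func main() {
	visited, addrs := skipAhead()
	fmt.Println(visited)              // [0 1 4 5]
	fmt.Println(addrs[0] != addrs[1]) // true: each iteration had its own i
}
```

The body still steers the loop by assigning to i, and only the identity of the variable changes from one iteration to the next.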
👍 3
-
Oh, I've missed that. |
👍 2
-
I'm only a casual user of Go so please take my comment with an appropriate grain of salt. I'd welcome a change in semantics in the range loop, but caution against a change in the three-clause form. I think there's an important difference between the range loop and the 3-clause loop: a non-expert reader of the range loop can't see "how often declaration happens", while in the 3-clause loop declaration clearly seems to happen just once. Take a range loop like this:
This loop can easily be read by non-experts either as (pseudocode!):
Or alternatively:
The syntax offers only weak hints to decide which interpretation is the right one. The reader of the range loop needs to know which interpretation is right before they can tell whether there is a bug. On the other hand, take a 3-clause loop like this:
This syntax, to me, implies that:
So to me, this loop reads unambiguously as (pseudocode):
This means the error in the 3-clause loop (under current semantics) is clear upon close inspection, if one knows how closures work in Go. Changing the semantics and adding an implicit new declaration of a separate Concretely, I'd caution against a change in the three-clause loop for 3 reasons:
-
    for _, pItem := range &items {
        all = append(all, pItem)
    }
😕 4
-
Is it too bad having 2 keywords for iterating?
    var all []*Item
    foreach _, item := range items {
        all = append(all, &item)
    }
-
Thanks for clarifying. I do disagree with how the arguments associated with that branch of the dichotomy apply to the example of W.r.t. undue proliferation of ways to express loops causing confusion, I do not feel that
-
I don't know what you mean by "functional". The generic iterator design does indeed add a new looping construct and thus does add the same level of overhead, yes. And it has to pay for that cost by demonstrating a commensurate benefit. Note that I also pointed out that your suggestion does not actually help solving the problem, as it still requires a programmer to know to use it - and if they know, they can already use
-
It would help me solve the problem, as outlined previously, by providing a convenient construct which encourages avoiding I disagree that the problem is requiring an understanding of allocations and scope -- I don't get that impression from the top doc either. Personally, I like that Go encourages understanding of scope and would discourage Both of these problems can be alleviated without changing semantics. Alleviating these problems would in turn reduce the backward incompatible impact of changing the loop semantics, should that occur. |
-
You can already do that with |
👍 2
-
@scott-cotton "or incur calculating a slice of pointers." was what I was alluding to. This is obviously not always appropriate, and Personally I actually worry a lot less about the "&v" cases. Those make you write an "&" to give you a hint that an address was just taken and to think about storage duration. The hard to locally reason about cases are closures, interface capture, and method receivers. Those just use FWIW an unfortunate characteristic of have some range statement that produces pointers like |
-
The argument from the top post is that this breaking change fixes more bugs than it causes, even for existing code. And this is an assertion based on having looked at code containing 1 million loops, and only finding 2 instances of current semantics being correctly depended upon, with Ian arguing that "for sure in one case, [probably] in the other case, that the code worked by accident." At this point, I think it would be very helpful if the people who disagree would post actual counterexamples of this from real code bases. That is: situations where the current semantics are being correctly depended upon (edit: and which the new semantics would break). Otherwise, we seem to be discussing how a certain rule prevents us from fixing a real bug.
👍 6
-
Do you have a real program which depends on the current semantics? That is the question of this thread. Your original answer was "a program which relies on the spec guaranteeing a copy not happening would", which is simply wrong. The spec makes no such guarantee. You can discuss that if you find it interesting, but I suggest doing it on a different forum.
-
You seem to be talking now about the number of declarations, whereas your original statement was about the number of copies. I think that is a source of confusion here. The spec currently guarantees that a program executes as if there is one copy from a ranged collection to the second range variable per execution of the loop, and that guarantee will not change with the proposal. The spec guarantees that a range loop declares its variables exactly once, which is exactly the thing proposed to be changed. So, your argument now reads as "a program that currently depends on there being one declaration per loop would be an example of a program that currently depends on there being one declaration per loop." |
👍 3
-
To maybe get back to the original question in the thread, I'll give an example of the sort of thing being requested:
This is an implementation that sums a slice of integers; it is correct under the current semantics and would be broken by the new semantics. However, this is a contrived example. Do folks have examples from real code (historical or current) that would transition from correct to broken?
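As an illustration only (not necessarily the code that was posted), a function of this kind could key a map by the address of the loop variable. The sketch below spells out both behaviors by hand so that it gives the same result under either semantics: `sumPerLoop` uses one shared variable, as a range loop does today, and `entriesPerIteration` uses a fresh variable each time round. Both names are hypothetical.

```go
package main

import "fmt"

// sumPerLoop keys a map by the address of a single shared variable x, which
// is how a range loop variable behaves under the current semantics. Every
// iteration hits the same key, so the one map entry accumulates the sum.
func sumPerLoop(list []int) int {
	m := make(map[*int]int)
	var x int
	for _, x = range list {
		m[&x] += x
	}
	for _, total := range m {
		return total
	}
	return 0
}

// entriesPerIteration does the same with a fresh x per iteration. Each &x is
// a different key, so the map ends up with one entry per element and no
// entry holds the sum any more.
func entriesPerIteration(list []int) int {
	m := make(map[*int]int)
	for _, v := range list {
		x := v
		m[&x] += x
	}
	return len(m)
}

func main() {
	fmt.Println(sumPerLoop([]int{1, 2, 3}))          // 6
	fmt.Println(entriesPerIteration([]int{1, 2, 3})) // 3
}
```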
👍 4
-
This is very impressive, IMHO. I didn't think a meaningful (however contrived) example like this was even possible. |
-
There's also another contrived example here and a real example, which I really liked, here. However, even the real example just worked accidentally, not because the author sat down and considered the implications of loop variable allocation and decided to do it this way. And yes, that was the point of this thread, AIUI, trying to find a case where an author fully intentionally and correctly relied on this behavior.
👍 2
-
One idea would be to release a GOEXPERIMENT flag to turn on compiling using the new semantics. This would allow for folks to run their own unit tests, integration tests, canaries, performance monitoring, etc. under the new semantics. This could be a source of additional data from the community if we think we need more data to make a decision. Later this could be helpful before making a transition for performance sensitive projects. There is some risk projects would start to rely on this, but this may be relatively small if it is appropriately marked as going away. |
👍 1
-
I think @timothy-king makes a good point that real data from more people would be worthwhile. In case anyone missed it, there is already a CL available for an early prototype from @dr2chase that is very easy to try out: (And sorry, posting "in case you missed it" comments in a long discussion usually just makes the discussion even longer and even harder to follow, but maybe worth it this one time 😅, including because those comments from @dr2chase are currently stuck inside the GitHub "hidden items" wormhole).
👍 1❤️ 1
-
One thing to watch for with the prototype is that it is a work in progress, and it may at some point acquire Go-version awareness. Because of inlining, if/when this happens, the compiler will need to know that this for loop is from version 1.19 and that for loop is from version 1.hasTheChange and compile them differently. IF/WHEN I do this, most likely I'll pretend that whatever the current dev version is has the change, though this is guaranteed to be a lie well into next year (cannot happen before 1.22, assuming proposal approved). Real data would be interesting, and do note that negative results are valuable here -- if we only hear from people with problems, we'll only have numerator, not denominator. (Negative results -- for the benchmarks in "bent", some CLs out from what's checked in now so it includes a couple more, no notable performance problems, no differences in test failures.)
-
Support it absolutely. But can we have a hint when using different Go versions?
-
I just unexpectedly ran into this problem today and figured I'd chime in with my example and my 2¢. Semantically, my code is registering callbacks to receive notifications on a bunch of different key/value updates from a database.

    type Subscription struct {
        key string
        fn  func(ctx context.Context, value string, ok bool) error
    }
    subscriptions := []Subscription{
        Subscription{key1, key1handler},
        Subscription{key2, key2handler},
        ...
    }
    for _, subscription := range subscriptions {
        database.Subscribe(subscription.key, func(k, v string) {
            // ... do some common stuff ...
            subscription.fn(k, v)
        })
    }

When I ran it, counter-intuitively, Not sure if this adds much to the discussion but it's a real world example where the change would have eliminated at least a couple of hours of debugging. Count me in as in favor of this proposal!
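For readers who want to reproduce this kind of bug without a database, here is a cut-down, self-contained sketch. The handler signature is simplified and a plain map stands in for the subscribe call, so every name below is an assumption made for the example. `subscribeShared` pins the per-loop behavior by declaring the variable outside the loop, and `subscribeCopied` applies the usual `sub := sub` fix.

```go
package main

import "fmt"

// subscription is a simplified stand-in for the Subscription type above.
type subscription struct {
	key string
	fn  func(value string) string
}

// subscribeShared registers callbacks that all capture one shared variable,
// which is what a range loop gives you under per-loop semantics.
func subscribeShared(subs []subscription) map[string]func(string) string {
	callbacks := make(map[string]func(string) string)
	var sub subscription
	for _, sub = range subs {
		callbacks[sub.key] = func(v string) string { return sub.fn(v) }
	}
	return callbacks
}

// subscribeCopied gives each callback its own copy via "sub := sub".
func subscribeCopied(subs []subscription) map[string]func(string) string {
	callbacks := make(map[string]func(string) string)
	for _, sub := range subs {
		sub := sub
		callbacks[sub.key] = func(v string) string { return sub.fn(v) }
	}
	return callbacks
}

// demoSubs builds two subscriptions whose handlers identify themselves.
func demoSubs() []subscription {
	return []subscription{
		{"key1", func(v string) string { return "key1handler:" + v }},
		{"key2", func(v string) string { return "key2handler:" + v }},
	}
}

func main() {
	fmt.Println(subscribeShared(demoSubs())["key1"]("x")) // key2handler:x
	fmt.Println(subscribeCopied(demoSubs())["key1"]("x")) // key1handler:x
}
```

In the shared version the callback registered under key1 ends up invoking the last handler in the slice, which is easy to misread as a bug in the database layer rather than in the loop.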
👍 1
-
This discussion has been very helpful. Thanks to everyone. I think everything people want to say has been said, as evidenced by recent repetition of older comments, so I'm going to close this discussion. Thanks again! |