blob: bf206fb6d4203accc533d82d1d1fc9f2742043db [file] [log] [blame] [view]
---
title: Rangefunc Experiment
---
This page originally described an experimental range-over-function
language feature.
The feature was [added to Go 1.23](/doc/go1.23#language).
There is a [blog post](/blog/range-functions) describing it.
This page answers a few frequently asked questions about the change.
### What is a simple example of how range over function runs?
Consider this function for iterating a slice backwards:
package slices
func Backward[E any](s []E) func(func(int, E) bool) {
return func(yield func(int, E) bool) {
for i := len(s)-1; i >= 0; i-- {
if !yield(i, s[i]) {
return
}
}
}
}
It can be invoked as:
s := []string{"hello", "world"}
for i, x := range slices.Backward(s) {
fmt.Println(i, x)
}
This program would translate inside the compiler to a program more like:
slices.Backward(s)(func(i int, x string) bool {
fmt.Println(i, x)
return true
})
The `return true` at the end of the body is the implicit `continue` at
the end of the loop body.
An explicit continue would translate to `return true` as well.
A break statement would translate to `return false` instead.
Other control structures are more complicated but still possible.
### What will idiomatic APIs with range functions look like?
We don't know yet, and that's really part of the eventual standard
library proposal.
One convention we've adopted is that a container's `All` method should
return an iterator:
func (t *Tree[V]) All() iter.Seq[V]
Specific containers might provide other iterator methods as well.
Maybe a list would provide backward iteration too:
func (l *List[V]) All() iter.Seq[V]
func (l *List[V]) Backward() iter.Seq[V]
These examples are meant to show that the library can be written in a
way that should make these kinds of functions readable and
understandable.
### How are more complicated loops implemented?
Beyond simple break and continue, other control flow (labeled break,
continue, goto out of the loop, return) requires setting a variable
that the code outside the loop can consult when the loop breaks.
For example a `return` might turn into something like `doReturn = true;
return false` where the `return false` is the `break` implementation,
and then when the loop finishes, other generated code would do
`if (doReturn) return`.
The full rewrite is explained at the top of
[cmd/compile/internal/rangefunc/rewrite.go](https://go.googlesource.com/go/+/c7ea20195a3415668047eebdc488a4af1f629f04/src/cmd/compile/internal/rangefunc/rewrite.go)
in the implementation.
### What if the iterator function ignores yield returning false?
For range-over-function loops, the yield function generated for the
body checks if it is called after it has returned false or after the
loop itself has exited.
In either case, it will panic.
### Why are yield functions limited to at most two arguments?
There has to be a limit; otherwise people file bugs against the
compiler when it rejects ridiculous programs.
If we were designing in a vacuum, perhaps we would say it was
unlimited but that implementations only had to allow up to 1000, or
something like that.
However, we are not designing in a vacuum: [go/ast](/pkg/go/ast) and
[go/parser](/pkg/go/parser) exist, and they can only represent and
parse up to two range values.
We clearly need to support two values to simulate existing range
usages.
If it were important to support three or more values, we could change
those packages, but there does not appear to be a terribly strong
reason to support three or more, so the simplest choice is to stop at
two and leave those packages unchanged.
If we find a strong reason for more in the future, we can revisit that
limit.
Another reason to stop at two is to have a more limited number of
function signatures for general code to define.
Today the (iter)[/pkg/iter] package can easily define names for
iterators:
package iter
type Seq[V any] func(yield func(V) bool) bool
type Seq2[K, V any] func(yield func(K, V) bool) bool
### What do stack traces look like in the loop body?
The loop body is called from the iterator function, which is called
from the function in which the loop body appears.
The stack trace will show that reality.
This will be important for debugging iterators, aligning with stack
traces in debuggers, and so on.
### What happens if the loop body defers a call? Or if the iterator function defers a call?
If a range-over-func loop body defers a call, it runs when the outer
function containing the loop returns, just as it would for any other
kind of range loop.
That is, the semantics of defer do not depend on what kind of value is
being ranged over.
It would be quite confusing if they did.
That kind of dependence seems unworkable from a design perspective.
Some people have proposed disallowing defer in a range-over-func loop
body, but this would be a semantic change based on the kind of value
being ranged over and seems similarly unworkable.
The loop body's defer runs exactly when it looks like it would if you
didn't know anything special was happening in range-over-func.
If an iterator function defers a call, the call runs when the iterator
function returns.
The iterator function returns when it runs out of values or is told to
stop by the loop body (because the loop body hit a `break` statement
which translated to `return false`).
This is exactly what you want for most iterator functions.
For example an iterator that returns lines from a file can open the
file, defer closing the file, and then yield lines.
The iterator function's defer runs exactly when it looks like it would
if you didn't know the function was being used in a range loop at all.
This pair of answers can mean the calls run in a different time order
than the defer statements executed, and here the goroutine analogy is
useful.
Think of the main function running in one goroutine and the iterator
running in another, sending values over a channel.
In that case, the defers can run in a different order than they were
created because the iterator returns before the outer function does,
even if the outer function loop body defers a call after the iterator
does.
### What happens if the loop body panics? Or if the iterator function panics?
The deferred calls run in the same order for panic that they would in
an ordinary return: first the calls deferred by the iterator, then the
calls deferred by the loop body and attached to the outer function.
It would be very surprising if ordinary returns and panics ran
deferred calls in different orders.
Again there is an analogy to having the iterator in its own
goroutine.
If before the loop started the main function deferred a cleanup of the
iterator, then a panic in the loop body would run the deferred cleanup
call, which would switch over to the iterator, run its deferred calls,
and then switch back to continue the panic on the main goroutine.
That's the same order the deferred calls run in an ordinary iterator,
even without the extra goroutine.
See [this
comment](https://github.com/golang/go/issues/61405#issuecomment-1641050065)
for a more detailed rationale for these defer and panic semantics.
### What happens if the iterator function recovers a panic in the loop body?
The compiler and runtime will detect this case and trigger a [run-time
panic](/ref/spec#Run_time_panics).
### Can range over a function perform as well as hand-written loops?
In principle, yes.
Consider the slices.Backward example again, which first translates to:
slices.Backward(s)(func(i int, x string) bool {
fmt.Println(i, x)
return true
})
The compiler can recognize that slices.Backward is trivial and inline
it, producing:
func(yield func(int, E) bool) bool {
for i := len(s)-1; i >= 0; i-- {
if !yield(i, s[i]) {
return false
}
}
return true
}(func(i int, x string) bool {
fmt.Println(i, x)
return true
})
Then it can recognize a function literal being immediately called and
inline that:
{
yield := func(i int, x string) bool {
fmt.Println(i, x)
return true
}
for i := len(s)-1; i >= 0; i-- {
if !yield(i, s[i]) {
goto End
}
}
End:
}
Then it can devirtualize yield:
{
for i := len(s)-1; i >= 0; i-- {
if !(func(i int, x string) bool {
fmt.Println(i, x)
return true
})(i, s[i]) {
goto End
}
}
End:
}
Then it can inline that func literal:
{
for i := len(s)-1; i >= 0; i-- {
var ret bool
{
i := i
x := s[i]
fmt.Println(i, x)
ret = true
}
if !ret {
goto End
}
}
End:
}
From that point the SSA backend can see through all the unnecessary
variables and treats that code the same as
for i := len(s)-1; i >= 0; i-- {
fmt.Println(i, s[i])
}
This looks like a fair amount of work, but it only runs for simple
bodies and simple iterators, below the inlining threshold, so the work
involved is small.
For more complex bodies or iterators, the overhead of the function
calls is insignificant.
In any given release the compiler may or may not implement this series
of optimizations.
We continue to improve the compiler with every release.
### Can you provide more motivation for range over functions?
The most recent motivation is the addition of generics, which we
expect will lead to custom containers such as ordered maps, and it
would be good for those custom containers to work well with range
loops.
Another equally good motivation is to provide a better answer for the
many functions in the standard library that collect a sequence of
results and return the whole thing as a slice.
If the results can be generated one at a time, then a representation
that allows iterating over them scales better than returning an entire
slice.
We do not have a standard signature for functions that represent this
iteration.
Adding support for functions in range would both define a standard
signature and provide a real benefit that would encourage its use.
For example, here are a few functions from the standard library that
return slices but probably merit forms that return iterators instead:
- strings.Split
- strings.Fields
- bytes variants of the above
- regexp.Regexp.FindAll and friends
There are also functions we were reluctant to provide in slices form
that probably deserve to be added in iterator form.
For example, there should be a strings.Lines(text) that iterates over
the lines in a text.
Similarly, iteration over lines in a bufio.Reader or bufio.Scanner is
possible, but you have to know the pattern, and the pattern is
different for those two and tends to be different for each
type.
Establishing a standard way to express iteration will help converge
the many different approaches that exist today.
For additional motivation for iterators, see [#54245](/issue/54245).
For additional motivation specifically for range over functions, see
[#56413](/issue/56413).
### Will Go programs using range over functions be readable?
We think they can be.
For example using slices.Backward instead of the explicit count-down
loop should be easier to understand, especially for developers who
don't see count-down loops every day and have to think carefully
through the boundary conditions to make sure they are correct.
It is true that the possibility of range over a function means that
when you see range x, if you don't know what x is, you don't know
exactly what code it will run or how efficient it will be.
But slice and map iteration are already fairly different as far as the
code that runs and how fast it is, not to mention channels.
Ordinary function calls have this problem too - in general we have no
idea what the called function will do - and yet we find ways to write
readable, understandable code, and even to build an intuition for
performance.
The same will certainly happen with range over functions.
We will build up useful patterns over time, and people will recognize
the most common iterators and know what they do.
### Why are the semantics not exactly like if the iterator function ran in a coroutine or goroutine?
Running the iterator in a separate coroutine or goroutine is more
expensive and harder to debug than having everything on one
stack.
Since we're going to have everything on one stack, that fact will
change certain visible details.
We saw the first above: stack traces show the calling function and the
iterator function interleaved, as well as showing the explicit yield
function that does not exist on the page in the program.
It can be helpful to think about running the iterator function in its
own coroutine or goroutine as an analogy or mental model, but in some
cases the mental model doesn't give the best answer, because it uses
two stacks, and the real implementation is defined to use one.