cmd/compile: optimize switch on strings

When compiling expression switches, we try to optimize runs of
constants into binary searches. The ordering used isn't visible to the
application, so it's unimportant as long as we're consistent between
sorting and searching.

For strings, it's much cheaper to compare string lengths than strings
themselves, so instead of ordering strings by "si <= sj", we currently
order them by "len(si) < len(sj) || len(si) == len(sj) && si <= sj"
(i.e., the lexicographical ordering on the 2-tuple (len(s), s)).

However, it's also somewhat cheaper to compare strings for equality
(i.e., ==) than for ordering (i.e., <=). And if there were two or
three string constants of the same length in a switch statement, we
might unnecessarily emit ordering comparisons.

For example, given:

    switch s {
    case "", "1", "2", "3": // ordered by length then content
        goto L
    }

we currently compile this as:

    if len(s) < 1 || len(s) == 1 && s <= "1" {
        if s == "" { goto L }
        else if s == "1" { goto L }
    } else {
        if s == "2" { goto L }
        else if s == "3" { goto L }
    }

This CL switches to using a 2-level binary search---first on len(s),
then on s itself---so that string ordering comparisons are only needed
when there are 4 or more strings of the same length. (4 being the
cut-off for when using binary search is actually worthwhile.)

So the above switch instead now compiles to:

    if len(s) == 0 {
        if s == "" { goto L }
    } else if len(s) == 1 {
        if s == "1" { goto L }
        else if s == "2" { goto L }
        else if s == "3" { goto L }
    }

which is better optimized by walk and SSA. (Notably, because there are
only two distinct lengths and no more than three strings of any
particular length, this example ends up falling back to simply using
linear search.)

Test case by khr@ from CL 195138.

Fixes #33934.

Change-Id: I8eeebcaf7e26343223be5f443d6a97a0daf84f07
Reviewed-on: https://go-review.googlesource.com/c/go/+/195340
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2 files changed
tree: eb22bbf44df9df35f27492ee6267ad2d32d0453a
  1. .github/
  2. api/
  3. doc/
  4. lib/
  5. misc/
  6. src/
  7. test/
  8. .gitattributes
  9. .gitignore
  10. AUTHORS
  11. CONTRIBUTING.md
  12. CONTRIBUTORS
  13. favicon.ico
  14. LICENSE
  15. PATENTS
  16. README.md
  17. robots.txt
  18. SECURITY.md
README.md

The Go Programming Language

Go is an open source programming language that makes it easy to build simple, reliable, and efficient software.

Gopher image Gopher image by Renee French, licensed under Creative Commons 3.0 Attributions license.

Our canonical Git repository is located at https://go.googlesource.com/go. There is a mirror of the repository at https://github.com/golang/go.

Unless otherwise noted, the Go source files are distributed under the BSD-style license found in the LICENSE file.

Download and Install

Binary Distributions

Official binary distributions are available at https://golang.org/dl/.

After downloading a binary release, visit https://golang.org/doc/install or load doc/install.html in your web browser for installation instructions.

Install From Source

If a binary distribution is not available for your combination of operating system and architecture, visit https://golang.org/doc/install/source or load doc/install-source.html in your web browser for source installation instructions.

Contributing

Go is the work of thousands of contributors. We appreciate your help!

To contribute, please read the contribution guidelines: https://golang.org/doc/contribute.html

Note that the Go project uses the issue tracker for bug reports and proposals only. See https://golang.org/wiki/Questions for a list of places to ask questions about the Go language.