| Go Dynamic Tools |
| Gophercon 2015, July 9, 2015 |
| |
| Dmitry Vyukov |
| Google |
| dvyukov@ |
| |
| |
| * Video |
| |
| A video of this talk was recorded at GopherCon in Denver. |
| |
| .link https://www.youtube.com/watch?v=a9xrxRsIbSU Watch the talk on YouTube |
| |
| |
| * About me |
| |
| Did a bunch of work on Go: |
| |
| - Scalable goroutine scheduler |
| - Integrated network poller |
| - Parallel GC, concurrent sweeping |
| - Memory allocator speed/space improvements |
| - Sync primitives |
| - Race detector |
| - Blocking profile |
| - 800+ commits, filed 500+ bugs |
| |
| But actually on dynamic testing tools team: |
| |
| - Thread/Address/MemorySanitizer |
| |
| * Agenda |
| |
| - Data race detector |
| - Go-fuzz, randomized testing system |
| - Execution tracer |
| |
| * Data race detector |
| |
| .image dynamic-tools/philosoraptor.png |
| |
| |
| * What is a data race? |
| |
| A data race occurs when two goroutines access the same variable concurrently and at least one of the accesses is a write. |
| |
| *All*bets*are*off*!* |
| |
| Any data race can destroy the memory/type-safety of a Go program. |
| |
| * There are no "benign" data races |
| |
| // goroutine 1 // goroutine 2 |
| m[k1] = v1 m[k2] = v2 |
| |
| Bad! |
| |
| // goroutine 1 // goroutine 2 |
| stat++ stat++ |
| |
| Also bad! |
| |
| Compilers assume race-free programs and do aggressive optimizations |
| based on that assumption (e.g. assume "ownership" over written-to variables). |
| |
| Races are non-deterministic and hard to debug. |
| |
| * Usage |
| |
| $ go test -race mypkg // to test the package |
| $ go run -race mysrc.go // to run the source file |
| $ go build -race mycmd // to build the command |
| $ go install -race mypkg // to install the package |
| |
| That's it! |
| |
| * Example |
| |
| package main |
| |
| func main() { |
| m := make(map[int]int) |
| go func() { |
| m[1] = 1 |
| }() |
| m[2] = 2 |
| } |
| |
| * Example report |
| |
| WARNING: DATA RACE |
| Write by goroutine 5: |
| runtime.mapassign1() |
| runtime/hashmap.go:411 +0x0 |
| main.main.func1() |
| race.go:6 +0x60 |
| |
| Previous write by main goroutine: |
| runtime.mapassign1() |
| runtime/hashmap.go:411 +0x0 |
| main.main() |
| race.go:8 +0xb6 |
| |
| Goroutine 5 (running) created at: |
| main.main() |
| race.go:7 +0x76 |
| |
| * Achievements |
| |
| - 70+ bugs in std lib |
| - 350+??? bugs in Google internal code base |
| - ??? bugs found in the wild |
| |
| * Instrumentation |
| |
| Compiler instrumentation pass enabled by -race. |
| |
| func foo(p *int) { |
| *p = 1 |
| } |
| |
| Becomes: |
| |
| func foo(p *int) { |
| runtime.funcenter(caller_pc) |
| runtime.racewrite(p) |
| *p = 1 |
| runtime.funcexit() |
| } |
| |
| * Run-time module |
| |
| Handles: |
| |
| - memory accesses (to catch racy accesses) |
| - synchronization (to not produce false reports) |
| - function calls (to collect stack traces) |
| - goroutine creation/exit (to keep track of live goroutines) |
| |
| Algorithm is based on dynamic modelling of happens-before relation: |
| |
| - no false positives |
| - false negatives are possible |
| |
| * Usage tips |
| |
| Dynamic tools are only as good as your tests are. |
| |
| - write good *concurrent* tests |
| - have continuous build with race detector |
| - run integration tests |
| - run race-enabled canaries in production |
| |
| * Go-fuzz |
| |
| .image dynamic-tools/go-fuzz.png |
| |
| * Randomized testing |
| |
| A different approach to testing that finds [lots of] bugs that other testing approaches do not. Intended mostly for programs that parse complex inputs. |
| |
| Generate random blob -> feed into program -> see if it crashes -> profit! |
| |
| - cheap to use |
| - does not have any bias |
| |
| Completely random blobs won't uncover lots of bugs. |
| |
| How can we generate diverse but meaningful inputs that will trigger |
| nil derefs, off-by-ones, etc? |
| |
| * Coverage-guided fuzzing |
| |
| Genetic algorithms to the rescue! |
| |
| Instrument program for code coverage |
| Collect initial corpus of inputs |
| for { |
| Randomly mutate an input from the corpus |
| Execute and collect coverage |
| If the input gives new coverage, add it to corpus |
| } |
| |
| * Example |
| |
| The following code wants "ABCD" input: |
| |
| if input[0] == 'A' { |
| if input[1] == 'B' { |
| if input[2] == 'C' { |
| if input[3] == 'D' { |
| panic("input must not be ABCD") |
| } |
| } |
| } |
| } |
| |
| Corpus progression: |
| |
| "" |
| "", "A" |
| "", "A", "AB" |
| "", "A", "AB", "ABC" |
| "", "A", "AB", "ABC", "ABCD" |
| |
| * Game over |
| |
| CRC32 checksum verification in `image/png/reader.go` |
| |
| func (d *decoder) verifyChecksum() error { |
| if binary.BigEndian.Uint32(d.tmp[:4]) != d.crc.Sum32() { |
| return FormatError("invalid checksum") |
| } |
| return nil |
| } |
| |
| Probability that random mutations will alter input in an interesting way and |
| guess CRC32 at the same time is basically ZERO. |
| |
| * Sonar |
| |
| Don't need to guess, program knows it! |
| |
| + v1 := binary.BigEndian.Uint32(d.tmp[:4]) |
| + v2 := d.crc.Sum32() |
| + __go_fuzz.Sonar(v1, v2) |
| if v1 != v2 { |
| return FormatError("invalid checksum") |
| } |
| |
| Then, find v1 in the input and replace it with v2. Done! |
| |
| * Game over 2 |
| |
| Mutations and sonar do low-level changes ("bit-flipping"): |
| |
| Original: |
| |
| `<item name="foo"><prop name="price">100</prop></item>` |
| |
| Mutated: |
| |
| `<item name="foo"><prop name="price">100</prop><<item>` |
| |
| Also want high-level changes! |
| |
| * Versifier |
| |
| Versifier reverse-engineers [text] protocol and learns its _structure_. |
| |
| abc -> alphanum token |
| 123, 1e-2 -> number |
| "..." -> quoted |
| [...] -> parenthesized |
| ...,...,... -> list |
| ...\n...\n -> lines |
| |
| Then, applies _structural_ mutations to inputs. |
| |
| * Versifier example |
| |
| Original: |
| |
| `<item name="foo"><prop name="price">100</prop></item>` |
| |
| Versified (all valid xml): |
| |
| <item name="rb54ana"><item name="foo"><prop name="price"></prop><prop/></item></item> |
| <item name=""><prop name="price">=</prop><prop/> </item> |
| <item name=""><prop F="">-026023767521520230564132665e0333302100</prop><prop/></item> |
| <item SN="foo_P"><prop name="_G_nx">510</prop><prop name="vC">-9e-07036514</prop></item> |
| <item name="foo"><prop name="c8">prop name="p"</prop>/}<prop name="price">01e-6</prop></item> |
| <item name="foo"><item name="foo"><prop JY="">100</prop></item>8<prop/></item> |
| |
| * Algorithm |
| |
| .image dynamic-tools/algo.png |
| |
| * Achievements |
| |
| - 115 bugs in std lib (66 fixed) |
| - 43 bugs in golang.org/x/... (24 fixed) |
| - 134 elsewhere |
| |
| * Achievements |
| |
| fmt.Sprintf("%.[]") |
| panic: runtime error: index out of range |
| |
| regexp.MustCompile("((0(0){0}))").ReplaceAllString("00000", "00$00") |
| panic: runtime error: slice bounds out of range |
| |
| ioutil.ReadAll(flate.NewReader(strings.NewReader("4LJNIMK\a\x00\x00\xff..\xff.....\xff"))) |
| runs forever |
| |
| var x = 1/"."[0] |
| crashes compiler |
| |
| archive/tar: hang |
| archive/zip: cap out of range |
| encoding/gob: stack overflow |
| encoding/asn1: index out of range |
| image/jpeg: Decode hangs |
| image/png: nil deref |
| math/big: incorrect string->Float conversion |
| crypto/x509: division by zero |
| ... |
| |
| * Usage |
| |
| - go get github.com/dvyukov/go-fuzz/... |
| - write test: |
| |
| func Fuzz(data []byte) int { |
| gob.NewDecoder(bytes.NewReader(data)) |
| return 0 |
| } |
| |
| - build |
| |
| $ go-fuzz-build github.com/dvyukov/go-fuzz/examples/gob |
| |
| - collect corpus |
| - run |
| |
| $ go-fuzz -bin=gob-fuzz.zip -workdir=examples/gob |
| |
| * Execution tracer |
| |
| .image dynamic-tools/tracer.png |
| |
| * Execution tracer |
| |
| Gives insight into dynamic execution of a program. |
| |
| Captures with nanosecond precision: |
| |
| - goroutine creation/start/end |
| - goroutine blocking/unblocking |
| - network blocking |
| - system calls |
| - GC events |
| |
| * Execution tracer |
| |
| .image dynamic-tools/trace.png 450 _ |
| |
| * Recap |
| |
| - race detector: always use for testing (-race) |
| - go-fuzz: parsing of complex inputs (github.com/dvyukov/go-fuzz) |
| - execution tracer: deep dive into execution (-trace) |
| |
| |