maintner: propagate and use internal invariants in finishProcessingCL

This change simplifies and cleans up finishProcessingCL. It documents
a precondition for finishProcessingCL that cl.Meta must be non-nil.

finishProcessingCL is only ever called on CLs that have been noted as
dirty via noteDirtyCL, and CLs are always given a non-nil Meta before
noteDirtyCL is called on them.

foreachCommitParent sounded like it called f on all parents of commit,
not including commit itself. In reality, it called f on commit as well.
Rename it to foreachCommit and improve its documentation to make this
more visible.

As a result, it becomes more clear that cl.Metas will contain at least
one element (e.g., at least mostRecentMetaCommit). Then, cl.Metas[0]
can be used to find the CL creation time.

Also change foreachCommit to return an error if a parent commit isn't
found. Otherwise, we'd be cutting the history walk short and the caller
wouldn't know. This shouldn't happen, but if it does, it'll be logged
(by finishProcessingCL).

Define a CL to be complete if its Meta and Commit fields are non-nil,
and the Metas slice contains at least 1 element. Use this to simplify
OwnerName and OwnerID methods and remove the no longer needed
firstMetaCommit method.

Finally, use the simplified finishProcessingCL to set cl.Created to
the actual creation time of the CL (rather than "last updated" time).
It's easy now, because we can deduce that len(cl.Metas) >= 1 is
guaranteed to be true, and therefore cl.Metas[0] will not panic.
There's no need to add guards or think about what to do if cl.Metas
is empty.

Document GerritCL.Created field, since it's exported.

Background

This change was started with the goal of fixing golang/go#24744.
It was hard to fix that bug with a short diff, because there were
too few guarantees inside finishProcessingCL that could be relied on.
I wanted to improve that as part of my fix.

We already established some guarantees for the user-facing CLs (i.e.,
that their Commit, Meta fields are non-nil and len(cl.Metas) >= 1).
I started propagating and using that property in a few more places
in internal code.

It allowed simplifications such as:

	 // OwnerName returns the name of the CL’s owner or an empty string on error.
	 func (cl *GerritCL) OwnerName() string {
	-	m := cl.firstMetaCommit()
	-	if m == nil {
	-		return ""
	-	}
	-	return m.Author.Name()
	+	if !cl.complete() {
	+		return ""
	+	}
	+	return cl.Metas[0].Commit.Author.Name()
	 }
	-
	-func (cl *GerritCL) firstMetaCommit() *GitCommit {
	-	m := cl.Meta
	-	if m == nil { // TODO: Can this actually happen, besides in one of the contrived tests? Remove?
	-		return nil
	-	}
	-	c := m.Commit
	-	for c != nil && len(c.Parents) > 0 {
	-		c = c.Parents[0] // Meta commits don’t have more than one parent.
	-	}
	-	return c
	-}

	 // complete reports whether cl is complete.
	 // A CL is considered complete if its Meta and Commit fields are non-nil,
	 // and the Metas slice contains at least 1 element.
	 func (cl *GerritCL) complete() bool {
	 	return cl.Meta != nil &&
	 		len(cl.Metas) >= 1 &&
	 		cl.Commit != nil
	 }

Also, previously, it was hard to be sure that the
gc, ok := ... line wouldn't panic:

	// called with Corpus.mu Locked
	func (gp *GerritProject) finishProcessingCL(cl *GerritCL) {

		// What happens if cl.Meta is nil?
		// Should there be a if cl.Meta == nil guard here?
		// What should it do if cl.Meta is indeed nil?
		gc, ok := c.gitCommit[cl.Meta.Commit.Hash]
		if !ok {
			log.Printf("WARNING: GerritProject(%q).finishProcessingCL failed to find CL %v hash %s",
				gp.ServerSlashProject(), cl.Number, cl.Meta.Commit.Hash)
			return
		}

By documenting that cl.Meta must be non-nil as a precondition of
finishProcessingCL, it became possible to know it won't panic:

	// finishProcessingCL fixes up invariants before the cl can be returned back to the user.
	// cl.Meta must be non-nil.
	//
	// called with Corpus.mu Locked
	func (gp *GerritProject) finishProcessingCL(cl *GerritCL) {

		// cl.Meta can't be nil due to precondition.
		gc, ok := c.gitCommit[cl.Meta.Commit.Hash]
		if !ok {
			log.Printf("WARNING: GerritProject(%q).finishProcessingCL failed to find CL %v hash %s",
				gp.ServerSlashProject(), cl.Number, cl.Meta.Commit.Hash)
			return
		}

(It would be a bug to call finishProcessingCL with a cl where
cl.Meta is nil, because it'd be violating the precondition.)

I think stating preconditions and propagating more internal invariants
is a good general direction, because it allows the code to be simpler
and easier to verify as being correct. Otherwise, one needs to keep
checking if fields are non-nil everywhere before using them and think
what to do if they are nil, and that's unpleasant.

Fixes golang/go#24744.

Change-Id: I07e362d52e30089a9ba03c30b04ad19b2c385722
Reviewed-on: https://go-review.googlesource.com/111877
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2 files changed
tree: 1cf69a6d36bb6dce902bdc3627d1441428f20045
  1. app/
  2. auth/
  3. autocertcache/
  4. buildenv/
  5. buildlet/
  6. cmd/
  7. dashboard/
  8. devapp/
  9. doc/
  10. env/
  11. envutil/
  12. gerrit/
  13. internal/
  14. kubernetes/
  15. livelog/
  16. maintner/
  17. pargzip/
  18. revdial/
  19. status/
  20. tarutil/
  21. types/
  22. vcs-test/
  23. vendor/
  24. version/
  25. .dockerignore
  26. AUTHORS
  27. build.go
  28. codereview.cfg
  29. CONTRIBUTING.md
  30. CONTRIBUTORS
  31. LICENSE
  32. PATENTS
  33. README.md
  34. update-deps.sh
  35. update-readmes.go
README.md

Go Build Tools

This subrepository holds the source for various packages and tools that support Go's build system and the development of the Go programming language.

Report Issues / Send Patches

This repository uses Gerrit for code changes. To contribute, see https://golang.org/doc/contribute.html.

The main issue tracker for the blog is located at https://github.com/golang/go/issues. Prefix your issue with “x/build/DIR: ” in the subject line.

Overview

The main components of the Go build system are:

  • The dashboard, in app/, serves https://build.golang.org/. It runs on App Engine and holds the state for which builds passed or failed, and stores the build failure logs for post-submit failures. (Trybot build failure logs are stored elsewhere). The dashboard does not execute any builds on its own.

  • The coordinator, in cmd/coordinator/, serves https://farmer.golang.org/. It runs on GKE and coordinates the whole build system. It finds work to do (both pre-submit “TryBot” work, and post-submit work) and executes builds, allocating machines to run the builds. It is the owner of all machines.

  • The Go package in buildenv/ contains constants for where the dashboard and coordinator run, for prod, staging, and local development.

  • The buildlet, in cmd/buildlet/, is the HTTP server that runs on each worker machine to execute builds on the coordinator's behalf. This runs on every possible GOOS/GOARCH value. The buildlet binaries are stored on Google Cloud Storage and fetched per-build, so we can update the buildlet binary independently of the underlying machine images. The buildlet is the most insecure server possible: it has HTTP handlers to read & write arbitrary content to disk, and to execute any file on disk. It also has an SSH tunnel handler. The buildlet must never be exposed to the Internet. The coordinator provisions buildlets in one of three ways:

    1. by creating VMs on Google Compute Engine (GCE) with custom images configured to fetch & run the buildlet on boot, listening on port 80 in a private network.

    2. by running Linux containers (on either Google Kubernetes Engine or GCE with the Container-Optimized OS image), with the container images configured to fetch & run the buildlet on start, also listening on port 80 in a private network.

    3. by taking buildlets out of a pool of connected, dedicated machines. The buildlet can run in either listen mode (as on GCE and GKE) or in reverse mode. In reverse mode, the buildlet connects out to https://farmer.golang.org/ and registers itself with the coordinator. The TCP connection is then logically reversed (using revdial and when the coordinator needs to do a build, it makes HTTP requests to the coordinator over the already-open TCP connection.

    These three pools can be viewed at the coordinator's http://farmer.golang.org/#pools

  • The env/ directory describes build environments. It contains scripts to create VM images, Dockerfiles to create Kubernetes containers, and instructions and tools for dedicated machines.

  • maintner in maintner/ is a library for slurping all of Go's GitHub and Gerrit state into memory. The daemon maintnerd in maintner/maintnerd/ runs on GKE and serves https://maintner.golang.org/. The daemon watches GitHub and Gerrit and apps to a mutation log whenever it sees new activity. The logs are stored on GCS and served to clients.

  • The godata package in maintner/godata/ provides a trivial API to let anybody write programs against Go's maintner corpus (all of our GitHub and Gerrit history), live up to the second. It takes a few seconds to load into memory and a few hundred MB of RAM after it downloads the mutation log from the network.

  • pubsubhelper in cmd/pubsubhelper/ is a dependency of maintnerd. It runs on GKE, is available at https://pubsubhelper.golang.org/, and runs an HTTP server to receive Webhook updates from GitHub on new activity and an SMTP server to receive new activity emails from Gerrit. It then is a pubsub system for maintnerd to subscribe to.

  • The gitmirror server in cmd/gitmirror/ mirrors Gerrit to GitHub, and also serves a mirror of the Gerrit code to the coordinator for builds, so we don't overwhelm Gerrit and blow our quota.

  • The Go gopherbot bot logic runs on GKE. The code is in cmd/gopherbot. It depends on maintner via the godata package.

  • The developer dashboard at https://dev.golang.org/ runs on GKE. Its code is in devapp/. It also depends on maintner via the godata package.

  • cmd/retrybuilds: a Go client program to delete build results from the dashboard

Adding a Go Builder

If you wish to run a Go builder, please email golang-dev@googlegroups.com first. There is documentation at https://golang.org/wiki/DashboardBuilders, but depending on the type of builder, we may want to run it ourselves, after you prepare an environment description (resulting in a VM image) of it. See the env directory.