sweet: force-kill CockroachDB if SIGTERM isn't working

This change falls back to SIGKILL when the benchmarks have finished
running but the CockroachDB cluster isn't responding to SIGTERM. At
that point it's fine to forcibly kill the server, but we also need to
kill all of the other instances, since there's no clean shutdown.

Currently, timeouts in the cockroachdb benchmark are causing loss of
data (a separate issue we should fix), but even if that weren't the
case, we'd still be losing data for CockroachDB. This CL fixes the
problem with the benchmark: locally I couldn't get it to succeed with
20 runs, but with this patch it has no problem finishing. We should
still investigate why CockroachDB isn't responding to SIGTERM and
whether there's a cleaner way to ensure a shutdown. The forced kill is
OK for now, and we may want to keep this behavior long-term anyway
(it's useful when benchmarking unvetted patches that cause a hang, for
example).

This CL also adds more logging to the benchmark runner.

Change-Id: I57cf27f35b71b6c69a8ca2ec38107e1c912a5167
Cq-Include-Trybots: luci.golang.try:x_benchmarks-gotip-linux-amd64-longtest,x_benchmarks-go1.22-linux-amd64-longtest
Reviewed-on: https://go-review.googlesource.com/c/benchmarks/+/594775
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
1 file changed