internal/upload: fix upload locking
The upload locking logic left a brief race that could lead to duplicate
uploads, as the existence of an uploaded report was checked before
locking, not after. Fix this.
As described in golang/go#67737, more races exist that could lead to
broken or partial uploads, but this should prevent overcounting uploads,
as was originally intended.
This closes golang/go#65970, as it is the last thing I feel comfortable
doing relative to the upload process. More refactoring will have to wait
until 1.24.
For golang/go#67737
Fixes golang/go#65970
Change-Id: Iadd58402bbe3fb32b4daf479d8d800eaef47c370
Reviewed-on: https://go-review.googlesource.com/c/telemetry/+/598036
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
Auto-Submit: Robert Findley <rfindley@google.com>
diff --git a/internal/upload/run_test.go b/internal/upload/run_test.go
index e788d9f..c2cb08e 100644
--- a/internal/upload/run_test.go
+++ b/internal/upload/run_test.go
@@ -586,7 +586,7 @@
for i, upload := range uploads {
var got telemetry.Report
if err := json.Unmarshal(upload, &got); err != nil {
- t.Fatal(err)
+ t.Fatalf("error unmarshalling uploaded report: %v\ncontents:%s", err, upload)
}
if got, want := len(got.Programs), 1; got != want {
t.Fatalf("got %d programs in upload #%d, want %d", got, i, want)
diff --git a/internal/upload/upload.go b/internal/upload/upload.go
index 2a3bf70..aa0f4f0 100644
--- a/internal/upload/upload.go
+++ b/internal/upload/upload.go
@@ -65,12 +65,6 @@
fdate = fdate[len(fdate)-len("2006-01-02"):]
newname := filepath.Join(u.dir.UploadDir(), fdate+".json")
- if _, err := os.Stat(newname); err == nil {
- // Another process uploaded but failed to clean up (or hasn't yet cleaned
- // up). Ensure that cleanup occurs.
- _ = os.Remove(fname)
- return false
- }
// Lock the upload, to prevent duplicate uploads.
{
@@ -84,6 +78,14 @@
defer os.Remove(lockname)
}
+ if _, err := os.Stat(newname); err == nil {
+ // Another process uploaded but failed to clean up (or hasn't yet cleaned
+ // up). Ensure that cleanup occurs.
+ u.logger.Printf("After acquire: report already uploaded")
+ _ = os.Remove(fname)
+ return false
+ }
+
endpoint := u.uploadServerURL + "/" + fdate
b := bytes.NewReader(buf)
resp, err := http.Post(endpoint, "application/json", b)