commit | 4837cb2a2c968d9af1aea5c5bd63eb0a9ef3d50a | [log] [tgz] |
---|---|---|
author | Julie Qiu <julie@golang.org> | Mon Apr 27 17:47:17 2020 -0400 |
committer | Julie Qiu <julieqiu@google.com> | Thu Apr 30 14:41:16 2020 +0000 |
tree | fd002cf0adf7f4d348581d998715383aa1012e52 | |
parent | 3c5ea2343246e593f11750c8bc9c0c6d1ddcc8a0 [diff] |
internal/postgres: change reprocessing logic

At the moment, we reprocess and requeue modules using the following logic:

1. Set all modules to be reprocessed to status=505.
2. Requeue modules with status=0 or status >= 500. Prioritize the following:
   - IsLatest: sorted by release vs prerelease modules
   - IsBig: hardcoded list of modules we know are big

This poses the following problems:

1. Requeue order is not idempotent: priority is given to categories of modules, but within each category, the order in which modules are queued can change each time requeue is called. This leads to many modules sitting in the task queue, and a lack of clarity as to how much progress we have made when looking at the logs.

2. Modules missing from isBig list: several large modules are missing from the isBig list, so they are not being accounted for. We deprioritize large modules because they take a really long time to process and can time out if too many are being processed at once, so we want to process them at a slower rate than other modules.

3. Alternative modules have the same priority as non-alternative modules: we usually don't care about alternative modules, and they will be deleted from search_documents once identified. These should be processed after the latest versions of non-alternative modules are processed, to prevent unnecessary deletes.

To address these issues, reprocessing / requeue now follows the following logic:

1. All modules are reprocessed with a 50x status code based on their last fetch status in module version states.

2. Modules are requeued in the following order (with the exception of large modules):
   - Latest version of modules previously with 20x status
   - Latest version of bad modules and alternative modules
   - Any version of modules previously with 20x status
   - Any version of bad modules and alternative modules
   - Any module with a status=0 or status=500 (we expect these to already be in the queue)

3. All large modules are queued last, since these take up a lot of time and need to be processed at a slower rate.

Within each category, modules are sorted as follows:

1. num_packages
2. version DESC
3. module_path

This keeps the order idempotent, and prioritizes smaller and newer modules. It also allows modules of similar sizes to be processed together.

Change-Id: I49580ed75bf60cc2698b756882bfdc906f72d935
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/725873
Reviewed-by: Jonathan Amsterdam <jba@google.com>
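The requeue order described in the commit message can be sketched in Go as a single comparison function. This is only an illustration, not the code from internal/postgres, which may well express the ordering in SQL. The `moduleState` type, its field names, the `category` and `requeueOrder` helpers, and the exact rules for what counts as a "bad" module are all assumptions made for the example. Versions are compared as plain strings here, which is a simplification and does not implement semantic-version ordering.

```go
package main

import (
	"fmt"
	"sort"
)

// moduleState is a hypothetical, simplified view of one module version.
// The field names are assumptions for this sketch, not the real schema.
type moduleState struct {
	ModulePath    string
	Version       string // compared as a plain string (simplification)
	NumPackages   int
	PrevStatus    int // last fetch status before reprocessing
	IsLatest      bool
	IsAlternative bool
	IsLarge       bool
}

// category maps a module to its requeue bucket; lower values are queued first.
// The bucket boundaries are an assumed reading of the commit message.
func category(m moduleState) int {
	good := m.PrevStatus >= 200 && m.PrevStatus < 300 && !m.IsAlternative
	switch {
	case m.IsLarge:
		return 5 // large modules are always queued last
	case m.PrevStatus == 0 || m.PrevStatus == 500:
		return 4 // expected to already be in the queue
	case m.IsLatest && good:
		return 0 // latest version, previously 20x
	case m.IsLatest:
		return 1 // latest version of bad or alternative modules
	case good:
		return 2 // any version, previously 20x
	default:
		return 3 // any version of bad or alternative modules
	}
}

// requeueOrder returns a sorted copy of mods: by category, then
// num_packages ascending, version descending, module_path ascending.
// Because the keys form a total order over distinct (path, version)
// pairs, the result does not depend on the input order.
func requeueOrder(mods []moduleState) []moduleState {
	out := append([]moduleState(nil), mods...)
	sort.Slice(out, func(i, j int) bool {
		a, b := out[i], out[j]
		if ca, cb := category(a), category(b); ca != cb {
			return ca < cb
		}
		if a.NumPackages != b.NumPackages {
			return a.NumPackages < b.NumPackages
		}
		if a.Version != b.Version {
			return a.Version > b.Version
		}
		return a.ModulePath < b.ModulePath
	})
	return out
}

func main() {
	// Made-up sample data for illustration only.
	mods := []moduleState{
		{ModulePath: "example.com/big", Version: "v1.0.0", NumPackages: 900, PrevStatus: 200, IsLatest: true, IsLarge: true},
		{ModulePath: "example.com/a", Version: "v1.2.0", NumPackages: 3, PrevStatus: 200, IsLatest: true},
		{ModulePath: "example.com/a", Version: "v1.1.0", NumPackages: 3, PrevStatus: 200},
		{ModulePath: "example.com/fork", Version: "v1.0.0", NumPackages: 1, PrevStatus: 491, IsLatest: true, IsAlternative: true},
		{ModulePath: "example.com/b", Version: "v2.0.0", NumPackages: 1, PrevStatus: 200, IsLatest: true},
		{ModulePath: "example.com/c", Version: "v0.1.0", NumPackages: 2, PrevStatus: 0, IsLatest: true},
	}
	for _, m := range requeueOrder(mods) {
		fmt.Printf("category %d: %s@%s\n", category(m), m.ModulePath, m.Version)
	}
}
```

Putting every tie-breaker into one comparison is what makes repeated requeue calls produce the same order, which is the property the commit message calls idempotent.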
Pkg.go.dev is a website for discovering and evaluating Go packages and modules.
Pkg.go.dev launched in November 2019, and is currently under active development by the Go team.
Our current goal is to work towards redirecting godoc.org traffic to pkg.go.dev, and ensure that we address users' needs in the process. Read more about our plans for pkg.go.dev in 2020.
We encourage everyone to begin using pkg.go.dev today for all of their needs and to file feedback! You can redirect all of your requests from godoc.org to pkg.go.dev by visiting godoc.org/?redirect=on. Details are in Go issue #37099.
If you are having issues with pkg.go.dev, please first check the known issues before following the troubleshooting guide. If that does not give you the information you need, reach out to us.
You can chat with us on the #tools channel on the Gophers Slack.
If you think you have an issue that needs fixing, or a feature suggestion, then please make sure you follow the steps to file an issue with the right information to allow us to address it.
We would love your help!
Our canonical Git repository is located at go.googlesource.com/discovery. There is a mirror of the repository at github.com/golang/discovery.
To contribute, please read our contributing guide.
Unless otherwise noted, the Go source files are distributed under the BSD-style license found in the LICENSE file.