| # Oscar, an open-source contributor agent architecture |
| |
| Oscar is a project aiming to improve open-source software development |
| by creating automated help, or “agents,” for open-source maintenance. |
| We believe there are many opportunities to reduce the |
| amount of toil involved with maintaining open-source projects |
| both large and small. |
| |
| The ability of large language models (LLMs) to do semantic analysis of |
| natural language (such as issue reports or maintainer instructions) |
| and to convert between natural language instructions and program code |
| creates new opportunities for agents to interact more smoothly with people. |
| LLMs will likely end up being only a small (but critical!) part of the picture; |
| the bulk of an agent's actions will be executing standard, deterministic code. |
| |
| Oscar differs from many development-focused uses of LLMs by not trying |
| to augment or displace the code writing process at all. |
| After all, writing code is the fun part of writing software. |
| Instead, the idea is to focus on the not-fun parts, like processing incoming issues, |
| matching questions to existing documentation, and so on. |
| |
| Oscar is very much an experiment. We don't know yet where it will go or what |
| we will learn. Even so, our first prototype, |
| the [@gabyhelp](https://github.com/gabyhelp) bot, has already had many |
| [successful interactions in the Go issue tracker](https://github.com/golang/go/issues?q=label%3Agabywins). |
| |
| For now, Oscar is being developed under the auspices of the Go project. |
| At some point in the future it may (or may not) be spun out into a separate project. |
| |
| The rest of this README explains Oscar in more detail. |
| |
| ## Goals |
| |
| The concrete goals for the Oscar project are: |
| |
| - Reduce maintainer effort to resolve issues |
| [note that resolve does not always mean fix] |
| - Reduce maintainer effort to resolve change lists (CLs) or pull requests (PRs) |
| [note that resolve does not always mean submit/merge] |
| - Reduce maintainer effort to resolve forum questions |
| - Enable more people to become productive maintainers |
| |
| It is a non-goal to automate away coding. |
| Instead we are focused on automating away maintainer toil. |
| |
| ## Approach |
| |
| Maintainer toil is not unique to the Go project, so we are aiming to build |
| an architecture that any software project can reuse and extend, |
| building their own agents customized to their project's needs. |
| Hence Oscar: _open-source contributor agent architecture_. |
| Exactly what that will mean is still something we are exploring. |
| |
| So far, we have identified three capabilities that will be important parts |
| of Oscar: |
| |
| 1. Indexing and surfacing related project context during |
| contributor interactions. |
| 2. Using natural language to control deterministic tools. |
| 3. Analyzing issue reports and CLs/PRs, to help improve them |
| in real time during or shortly after submission, |
| and to label and route them appropriately. |
| |
| It should make sense that LLMs have something to offer here, |
| because open-source maintenance is fundamentally about |
| interacting with people using natural language, and |
| natural language is what LLMs are best at. |
| So it's not surprising that all of these have an LLM-related component. |
| On the other hand, all of these are also backed by |
| significant amounts of deterministic code. |
| Our approach is to use LLMs for what they're good at—semantic |
| analysis of natural language and translation from |
| natural language into programs—and |
| rely on deterministic code to do the rest. |
| |
| The following sections look at each of those three important capabilities in turn. |
| Note that we are still experimenting, |
| and we expect to identify additional important capabilities as time goes on. |
| |
| ### Indexing and surfacing related project context |
| |
| Software projects are complex beasts. |
| Only at the very beginning can a maintainer expect |
| to keep all the important details and context in their head, |
| and even when that's possible, having them in one person's |
| head does not help when a new contributor arrives with |
| a bug report, a feature request, or a question. |
| To address this, maintainers write design documentation, |
| API references, FAQs, manual pages, blog posts, and so on. |
| Now, instead of providing context directly, a maintainer can |
| provide links to written context that already exists. |
| Even so, serving as a project search engine is still not the best use of |
| the maintainer's time. |
| Once a project grows even to modest size, any single maintainer |
| cannot keep track of all the context that might be relevant, |
| making it even harder to serve as a project search engine. |
| |
| On the other hand, LLMs turn out to be a great platform for |
| building a project search engine. |
| LLMs can analyze documents and produce _embeddings_, |
| which are high-dimensional (for example, 768-dimensional) |
| floating-point unit vectors with the property that documents |
| with similar semantic meaning are mapped to vectors that point in similar directions. |
| (For more about embeddings, see |
| [this blog post](https://cloud.google.com/blog/topics/developers-practitioners/meet-ais-multitool-vector-embeddings).) |
| Combined with a vector database to retrieve vectors similar |
| to an input vector, |
| LLM embeddings provide a very effective way to index |
| all of an open-source project's context, including |
| documentation, issue reports, CLs/PRs, and forum discussions. |
| When a new issue report arrives, an agent can use the LLM-based |
| project context index to identify highly related context, |
| such as similar previous issues or relevant project documentation. |
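
The lookup described above can be sketched in a few lines of Go. This is a minimal illustration, not Gaby's actual code: the `Vector` and `Result` types, the `search` function, and its `limit` and `minScore` parameters are all hypothetical names, and the tiny hand-written vectors stand in for real LLM embeddings. Because the vectors have unit length, their dot product equals their cosine similarity, so a larger score means the two documents point in more similar directions.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// Vector is an embedding. Real embeddings come from an LLM and have
// hundreds of dimensions; the ones below are hand-written stand-ins.
type Vector []float32

// normalize returns v scaled to unit length.
// A zero vector is returned unchanged.
func normalize(v Vector) Vector {
	var sum float64
	for _, x := range v {
		sum += float64(x) * float64(x)
	}
	if sum == 0 {
		return v
	}
	n := math.Sqrt(sum)
	out := make(Vector, len(v))
	for i, x := range v {
		out[i] = float32(float64(x) / n)
	}
	return out
}

// dot returns the dot product of a and b, which must have equal length.
// For unit vectors this is the cosine similarity.
func dot(a, b Vector) float64 {
	var s float64
	for i := range a {
		s += float64(a[i]) * float64(b[i])
	}
	return s
}

// Result is one match from the index.
type Result struct {
	ID    string
	Score float64
}

// search returns at most limit documents whose similarity to query is
// at least minScore, best match first. It scans every vector, which is
// only reasonable for a small in-memory index.
func search(index map[string]Vector, query Vector, limit int, minScore float64) []Result {
	var results []Result
	for id, v := range index {
		if len(v) != len(query) {
			continue // skip vectors from a different embedding model
		}
		if s := dot(query, v); s >= minScore {
			results = append(results, Result{id, s})
		}
	}
	sort.Slice(results, func(i, j int) bool {
		if results[i].Score != results[j].Score {
			return results[i].Score > results[j].Score
		}
		return results[i].ID < results[j].ID // stable order for ties
	})
	if len(results) > limit {
		results = results[:limit]
	}
	return results
}

func main() {
	index := map[string]Vector{
		"issue/1": normalize(Vector{1, 0, 0}),
		"issue/2": normalize(Vector{0.9, 0.1, 0}),
		"doc/faq": normalize(Vector{0, 1, 0}),
	}
	query := normalize(Vector{1, 0.05, 0})
	for _, r := range search(index, query, 10, 0.8) {
		fmt.Printf("%s %.3f\n", r.ID, r.Score)
	}
}
```

When nothing clears `minScore`, `search` returns an empty result, so the caller has nothing to post.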
| |
| Our prototype agent implements this functionality and replies to |
| new issues in the Go repository with a list of at most ten |
| highly related links that add context to the report. |
| (If the agent cannot find anything that looks related enough, |
| it stays quiet and does not reply at all.) |
| In the first few weeks we ran the agent, we identified the following |
| benefits of such an agent: |
| |
| 1. **The agent surfaces related context to contributors.** |
| |
| It is common for new issue reports to duplicate existing issue reports: |
| a new bug might be reported multiple times in a short time window, |
| or a non-bug might be reported every few months. |
| When an agent replies with a link to a duplicate report, |
| the contributor can close their new report and then watch that earlier issue. |
| When an agent replies with a link to a report that looks like a duplicate |
| but is not, the contributor can provide added context to distinguish their |
| report from the earlier one. |
| |
| For example, in [golang/go#68196](https://github.com/golang/go/issues/68196), |
| after the agent replied with a near duplicate, the original reporter commented: |
| |
| > Good bot :). Based on the discussion in this issue, I understand that |
| > it might not be possible to do what's being suggested here. |
| > If that's the case I'd still suggest to leave the issue open for a bit |
| > to see how many Go users care about this problem. |
| |
| As another example, on [golang/go#67986](https://github.com/golang/go/issues/67986), |
| after the agent replied with an exact duplicate, the original reporter commented: |
| |
| > Drats, I spent quite a bit of time searching existing issues. Not sure how I missed [that one]. |
| |
| 2. **The agent surfaces related context even to project maintainers.** |
| |
| Once a project reaches even modest size, no one person can remember all the context, |
| not even a highly dedicated project maintainer. |
| When an agent replies with a link to a related report, |
| that eliminates the time the maintainer must spend to find it. |
| If the maintainer has forgotten the related report entirely, |
| or never saw it in the first place (perhaps it was handled by someone else), |
| the reply is even more helpful, because it can point the maintainer |
| in the right direction and save them the effort of repeating the |
| analysis done in the earlier issue. |
| |
| For example, in [golang/go#68183](https://github.com/golang/go/issues/68183), |
| a project maintainer filed a |
| bug against the Go compiler for mishandling certain malformed identifiers. |
| The agent replied with a link to a similar report of the same bug, |
| filed almost four years earlier but triaged to low priority. |
| The added context allowed closing the earlier bug and |
| provided an argument for raising the priority of the new bug. |
| |
| As another example, in [golang/go#67938](https://github.com/golang/go/issues/67938), |
| a project maintainer filed a bug against the Go coverage tool |
| for causing the compiler to report incorrect sub-line position information. |
| The agent replied with a related issue (incorrect line numbers) |
| from a decade earlier |
| as well as a more recent issue about coverage |
| not reporting sub-line position information at all. |
| The first bug was important context, |
| and the second bug's “fix” was the root cause of the bug in the new report: |
| the sub-line position information added then was not added correctly. |
| Those links pinpointed the exact code where the bug was. |
| Once that was identified, it was also easy to determine the fix. |
| |
| 3. **The agent interacts with bug reporters immediately.** |
| |
| In all of the previous examples, the fact that the agent replied only a minute or two |
| after the report was filed meant that the reporter was still available and engaged |
| enough to respond in a meaningful way: adding details to clarify the suggestion, |
| closing the report as a duplicate, raising bug priority based on past reports, |
| or identifying a fix. |
| In contrast, if hours or days (or more) go by after the initial report, |
| the original reporter may no longer be available, interested, or able |
| to provide context or additional details. |
| The best time to engage the reporter and refine the report |
| is immediately after the bug report is filed. |
| Maintainers cannot be expected to be engaged in this work all the time, |
| but an agent can. |
| |
| Finally, note that surfacing project context is extensible, |
| so that projects can incorporate their context no matter what form it takes. |
| Our prototype's context sources are tailored to the Go project, |
| reading issues from GitHub, documentation from [go.dev](https://go.dev), |
| and (soon) code reviews from Gerrit, |
| but the architecture makes it easy to add additional sources. |
| |
| ### Using natural language to control deterministic tools |
| |
| The second important agent capability is using natural |
| language to control deterministic tooling. |
| As open-source projects grow, the number of helpful tools increases, |
| and it can be difficult to keep track of all of them and remember |
| how to use each one. |
| For example, our prototype includes a general facility |
| for editing GitHub issue comments to add or fix links. |
| We also envision facilities for adding labels to |
| an issue, or assigning or CC'ing people, |
| when it matches certain criteria. |
| If a maintainer does not know this functionality exists, |
| it might be difficult to find. |
| And even if they know it exists, perhaps they aren't familiar |
| with the specific API and don't want to take the time to learn it. |
| |
| On the other hand, LLMs are very good at translating between |
| intentions written in natural language |
| and executable forms of those intentions such as program code |
| or tool invocations. |
| We have done preliminary experiments with Gemini selecting from |
| and invoking available tools to satisfy natural language requests |
| made by a maintainer. |
| We don't have anything running for real yet, |
| but it looks like a promising approach. |
| |
| A different approach would be to rely more heavily on LLMs, |
| letting them edit code, issues, and so on entirely based on |
| natural language prompts with no deterministic tools. |
| This “magic wand” approach demands more of LLMs than they |
| are capable of today. |
| We believe it will be far more effective to use LLMs to convert |
| from natural language to deterministic tool use once |
| and then apply those deterministic tools automatically. |
| Our approach also limits the amount of “LLM supervision” needed: |
| a person can check that the tool invocation is correct |
| and then rely on the tool to operate deterministically. |
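
One way to picture this split is a small tool registry. In the sketch below, the LLM's only job is to produce a structured invocation; everything after that is ordinary Go. All names here (`Invocation`, `Tool`, `Registry`, `Describe`, the `add-label` tool, and the JSON shape) are assumptions made for illustration, not part of Oscar, and the LLM output is a hard-coded string standing in for a real model response.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// Invocation is the structured request an LLM would be asked to emit
// in response to a maintainer's natural language instruction.
type Invocation struct {
	Tool string            `json:"tool"`
	Args map[string]string `json:"args"`
}

// Tool is a deterministic action with a declared set of required arguments.
type Tool struct {
	Description string
	Required    []string
	Run         func(args map[string]string) (string, error)
}

// Registry maps tool names to tools.
type Registry map[string]Tool

// Parse decodes and validates the LLM output without running anything.
func (r Registry) Parse(llmOutput string) (Invocation, error) {
	var inv Invocation
	if err := json.Unmarshal([]byte(llmOutput), &inv); err != nil {
		return inv, fmt.Errorf("malformed invocation: %w", err)
	}
	tool, ok := r[inv.Tool]
	if !ok {
		return inv, fmt.Errorf("unknown tool %q", inv.Tool)
	}
	for _, name := range tool.Required {
		if inv.Args[name] == "" {
			return inv, fmt.Errorf("tool %q: missing argument %q", inv.Tool, name)
		}
	}
	return inv, nil
}

// Describe renders an invocation in a stable form for a person to review.
func Describe(inv Invocation) string {
	keys := make([]string, 0, len(inv.Args))
	for k := range inv.Args {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	parts := []string{inv.Tool}
	for _, k := range keys {
		parts = append(parts, fmt.Sprintf("%s=%q", k, inv.Args[k]))
	}
	return strings.Join(parts, " ")
}

// exampleRegistry returns a registry with one hypothetical tool.
// The tool only reports what it would do; it does not call GitHub.
func exampleRegistry() Registry {
	return Registry{
		"add-label": {
			Description: "add a label to an issue",
			Required:    []string{"issue", "label"},
			Run: func(args map[string]string) (string, error) {
				return fmt.Sprintf("would add label %q to issue %s", args["label"], args["issue"]), nil
			},
		},
	}
}

func main() {
	reg := exampleRegistry()
	// Stand-in for what an LLM might return for the request
	// "label issue 12345 as a performance problem".
	llmOutput := `{"tool": "add-label", "args": {"issue": "12345", "label": "Performance"}}`

	inv, err := reg.Parse(llmOutput)
	if err != nil {
		fmt.Println("rejected:", err)
		return
	}
	fmt.Println("proposed:", Describe(inv)) // a person can check this line
	out, err := reg[inv.Tool].Run(inv.Args)
	if err != nil {
		fmt.Println("failed:", err)
		return
	}
	fmt.Println(out)
}
```

The reviewable unit is the single line printed by `Describe`. Once a person accepts it, the tool's behavior no longer depends on the model.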
| |
| We have not built this part of Oscar yet, but when we do, |
| it will be extensible, so that projects can easily plug in their own tools. |
| |
| ### Analyzing issue reports and CLs/PRs |
| |
| The third important agent capability is analyzing issue reports |
| and CLs/PRs (change lists / pull requests). |
| Posting about related issues is a limited form of analysis, |
| but we plan to add other kinds of semantic analysis, |
| such as determining that an issue is primarily about performance |
| and should have a “performance” label added. |
| |
| We also plan to explore whether it is possible to analyze reports |
| well enough to identify whether more information is needed to |
| make the report useful. For example, if a report does not include |
| a link to a reproduction program on the [Go playground](https://go.dev/play), |
| the agent could ask for one. |
| And if there is such a link, the agent could make sure to inline the code |
| into the report to make it self-contained. |
| The agent could potentially also run a sandboxed execution tool |
| to identify which Go releases contain the bug and even use `git bisect` |
| to identify the commit that introduced the bug. |
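
The playground check is a case where plain deterministic code may be enough. The sketch below is only an illustration: the function names and the follow-up wording are hypothetical, and the pattern assumes share links of the form `https://go.dev/play/p/<id>` or the older `https://play.golang.org/p/<id>`.

```go
package main

import (
	"fmt"
	"regexp"
)

// playgroundLink matches Go playground share links, such as
// https://go.dev/play/p/AbC123 or the older https://play.golang.org/p/AbC123.
// A bare link to https://go.dev/play (no shared program) does not match.
var playgroundLink = regexp.MustCompile(`https?://(?:go\.dev/play|play\.golang\.org)/p/[A-Za-z0-9_-]+`)

// playgroundLinks returns every playground share link found in body.
func playgroundLinks(body string) []string {
	return playgroundLink.FindAllString(body, -1)
}

// followUp returns a question to post on the report,
// or the empty string if the report already links to a reproduction.
func followUp(body string) string {
	if len(playgroundLinks(body)) > 0 {
		return ""
	}
	return "Could you add a link to a small reproduction program on the Go playground (https://go.dev/play)?"
}

func main() {
	reports := []string{
		"The compiler crashes on this program: https://go.dev/play/p/AbC123",
		"The compiler crashes when I build my project.",
	}
	for _, body := range reports {
		if q := followUp(body); q != "" {
			fmt.Println("ask:", q)
		} else {
			fmt.Println("ok:", playgroundLinks(body))
		}
	}
}
```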
| |
| As discussed earlier, all of these analyses and resulting interactions |
| work much better when they happen immediately after the report |
| is filed, when the reporter is still available and engaged. |
| Automated agents can be on duty 24/7. |
| |
| We have not built this part of Oscar yet, but when we do, |
| it too will be extensible, so that projects can easily define their own |
| analyses customized to the reports they receive. |
| |
| ## Prototype |
| |
| Our first prototype to explore open-source contributor agents is called Gaby (for “Go AI bot”) |
| and runs in the [Go issue tracker](https://github.com/golang/go/issues), |
| posting as [@gabyhelp](https://github.com/gabyhelp). |
| The source code is in [internal/gaby](internal/gaby) in this repository. |
| The [gaby package's documentation](https://pkg.go.dev/golang.org/x/oscar/internal/gaby) |
| explains the overall structure of the code in the repository as well. |
| |
| So far, Gaby indexes Go issue content from GitHub |
| as well as Go documentation from [go.dev](https://go.dev) |
| and replies to new issues with relevant links. |
| We plan to add Gerrit code reviews in the near future. |
| |
| Gaby's structure makes it easy to run on any kind of hosting service, |
| using any LLM, any storage layer, and any vector database. |
| Right now, it runs on a local workstation, using Google's Gemini LLM, |
| [Pebble](https://github.com/cockroachdb/pebble) key-value storage files, |
| and an in-memory vector database. |
| |
| We plan to add support for a variety of other options, including |
| [Ollama](https://ollama.com/) for local LLMs |
| and [Google Cloud Firestore](https://firebase.google.com/docs/firestore) |
| for key-value storage and the vector database. |
| Firestore in particular will make it easy to run Gaby on hosted platforms |
| like [Cloud Run](https://cloud.google.com/run). |
| |
| Running on hosted platforms with their own URLs |
| (as opposed to a local workstation) |
| will enable subscribing to |
| [GitHub webhooks](https://docs.github.com/en/webhooks/about-webhooks), |
| so that Gaby can respond even more quickly to issues |
| and also carry on conversations. |
| |
| Our experience with all of this will inform the eventual generalized Oscar design. |
| |
| There is much work left to do. |
| |
| ## Relationship to Gopherbot |
| |
| The Go project has run its own completely deterministic agent, |
| [@gopherbot](https://github.com/gopherbot), for many years. |
| That agent is configured by writing, reviewing, and checking in Go code in the |
| [golang.org/x/build/cmd/gopherbot](https://pkg.go.dev/golang.org/x/build/cmd/gopherbot) |
| package. |
| Having the agent has been an incredible help to the Go project |
| and is part of the inspiration for Oscar. |
| At the same time, we are aiming for an even lighter-weight |
| way to configure new agent behaviors: using natural language |
| to control general behaviors. |
| Over time, our goal is to merge @gabyhelp back into @gopherbot |
| by re-building @gopherbot as an Oscar agent. |
| |
| ## Discussion and Feedback |
| |
| We are excited about the opportunities here, but we recognize |
| that we may be missing important concerns as well as |
| important opportunities to reduce open-source maintainer toil. |
| We have created [this GitHub discussion](https://github.com/golang/go/discussions/68490) to discuss |
| both concerns and new ideas for ways that Oscar-based agents can |
| help improve open-source maintenance. |
| Feedback there is much appreciated. |