RAG stands for Retrieval Augmented Generation
Demos of implementing a “RAG Server” in Go, using Google AI for embeddings and language models and Weaviate as a vector database.
The server we're developing is a standard Go HTTP server, listening on a local port. See the next section for the request schema for this server. It supports adding new documents to its context, and getting queries that would use this context.
Weaviate has the be installed locally; the easiest way to do so is by using docker-compose
as described in the Usage section.
/add/: POST {"documents": [{"text": "..."}, {"text": "..."}, ...]} response: OK (no body) /query/: GET {"content": "..."} response: model response as a string
ragserver
: uses the Google AI Go SDK directly for LLM calls and embeddings, and the Weaviate Go client library directly for interacting with Weaviate.cd tests
and run docker-compose up
; This will start the weaviate service in the foreground.GEMINI_API_KEY=... go run .
in the tested ragserver
directory.ragserver
. The following instructions are for terminal window 3.Run cd tests
; then we can clear out the weaviate DB with ./weaviate-delete-objects.sh
. To add documents to the DB through ragserver
, run ./add-documents.sh
. For a sample query, run ./query.sh
Adjust the contents of these scripts as needed.
SERVERPORT
: the port this server is listening on (default 9020)WVPORT
: the port Weaviate is listening on (default 9035)GEMINI_API_KEY
: API key for the Gemini service at https://ai.google.dev