internal/gcp/gemini: use gemini-1.5-pro by default

gemini-1.5-pro has a context window of 2 million tokens, so
we are much less likely to hit context-length limits with it
than with gemini-1.0-pro.
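For illustration only, here is a minimal sketch of how the model name
appears in the recorded request: in the endpoint path and in the
x-goog-request-params header. The helpers generateContentURL and
requestParams are hypothetical names made up for this note and are not
part of the gemini package. The constant mirrors the new default.

```go
package main

import (
	"fmt"
	"net/url"
)

// DefaultGenerativeModel mirrors the constant changed in this commit.
const DefaultGenerativeModel = "gemini-1.5-pro"

// generateContentURL is a hypothetical helper (not in the gemini package)
// that rebuilds the endpoint seen in the recorded trace for a model name.
// The query string from the trace is omitted here.
func generateContentURL(model string) string {
	return "https://generativelanguage.googleapis.com/v1beta/models/" + model + ":generateContent"
}

// requestParams is a hypothetical helper that reproduces the value of the
// x-goog-request-params header seen in the trace.
func requestParams(model string) string {
	return "model=" + url.QueryEscape("models/"+model)
}

func main() {
	fmt.Println(generateContentURL(DefaultGenerativeModel))
	fmt.Println(requestParams(DefaultGenerativeModel))
}
```

Because the model name is part of both the URL and the header, the
recorded httprr trace has to be regenerated whenever the default changes.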
Change-Id: I8907c38cca0556f2e185b471e9e1abd4905a7c00
Reviewed-on: https://go-review.googlesource.com/c/oscar/+/622837
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
diff --git a/internal/gcp/gemini/gemini.go b/internal/gcp/gemini/gemini.go
index 3b609a6..4f1063c 100644
--- a/internal/gcp/gemini/gemini.go
+++ b/internal/gcp/gemini/gemini.go
@@ -57,14 +57,14 @@
const (
DefaultEmbeddingModel = "text-embedding-004"
- DefaultGenerativeModel = "gemini-1.0-pro"
+ DefaultGenerativeModel = "gemini-1.5-pro"
)
// NewClient returns a connection to Gemini, using the given logger and HTTP client.
// It expects to find a secret of the form "AIza..." or "user:AIza..." in sdb
// under the name "ai.google.dev".
// The embeddingModel is the model name to use for embedding, such as text-embedding-004,
-// and the generativeModel is the model name to use for generation, such as gemini-1.0-pro.
+// and the generativeModel is the model name to use for generation, such as gemini-1.5-pro.
func NewClient(ctx context.Context, lg *slog.Logger, sdb secret.DB, hc *http.Client, embeddingModel, generativeModel string) (*Client, error) {
key, ok := sdb.Get("ai.google.dev")
if !ok {
diff --git a/internal/gcp/gemini/testdata/generatetext.httprr b/internal/gcp/gemini/testdata/generatetext.httprr
index 18e4232..574c9b5 100644
--- a/internal/gcp/gemini/testdata/generatetext.httprr
+++ b/internal/gcp/gemini/testdata/generatetext.httprr
@@ -1,39 +1,24 @@
httprr trace v1
-858 3622
-POST https://generativelanguage.googleapis.com/v1beta/models/gemini-1.0-pro:generateContent?%24alt=json%3Benum-encoding%3Dint HTTP/1.1
+858 3193
+POST https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-pro:generateContent?%24alt=json%3Benum-encoding%3Dint HTTP/1.1
Host: generativelanguage.googleapis.com
User-Agent: Go-http-client/1.1
Content-Length: 540
Content-Type: application/json
-x-goog-request-params: model=models%2Fgemini-1.0-pro
+x-goog-request-params: model=models%2Fgemini-1.5-pro
-{"model":"models/gemini-1.0-pro","contents":[{"parts":[{"text":"CanonicalHeaderKey returns the canonical format of the header key s. The canonicalization converts the first letter and any letter following a hyphen to upper case; the rest are converted to lowercase. For example, the canonical key for 'accept-encoding' is 'Accept-Encoding'. If s contains a space or invalid header field bytes, it is returned without modifications."},{"text":"When should I use CanonicalHeaderKey?"}],"role":"user"}],"generationConfig":{"candidateCount":1}}HTTP/2.0 200 OK
+{"model":"models/gemini-1.5-pro","contents":[{"parts":[{"text":"CanonicalHeaderKey returns the canonical format of the header key s. The canonicalization converts the first letter and any letter following a hyphen to upper case; the rest are converted to lowercase. For example, the canonical key for 'accept-encoding' is 'Accept-Encoding'. If s contains a space or invalid header field bytes, it is returned without modifications."},{"text":"When should I use CanonicalHeaderKey?"}],"role":"user"}],"generationConfig":{"candidateCount":1}}HTTP/2.0 200 OK
Alt-Svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
Cache-Control: private
Content-Type: application/json; charset=UTF-8
-Date: Thu, 24 Oct 2024 15:16:59 GMT
+Date: Mon, 28 Oct 2024 18:02:47 GMT
Server: scaffolding on HTTPServer2
-Server-Timing: gfet4t7; dur=2345
+Server-Timing: gfet4t7; dur=8601
Vary: Origin
Vary: X-Origin
Vary: Referer
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
-X-Google-Backends: unix:/tmp/esfbackend.1729722122.514958.2643309,/bns/ma/borg/ma/bns/genai-api/prod.genai-api/7,/bns/lclgaa/borg/lclgaa/bns/blue-layer1-gfe-prod-edge/prod.blue-layer1-gfe.lga34s38/40
-X-Google-Dos-Service-Trace: main:genai-api-api-prod,main:GLOBAL_all_non_cloud
-X-Google-Esf-Cloud-Client-Params: backend_service_name: "generativelanguage.googleapis.com" backend_fully_qualified_method: "google.ai.generativelanguage.v1beta.GenerativeService.GenerateContent"
-X-Google-Gfe-Handshake-Trace: GFE: /bns/lclgaa/borg/lclgaa/bns/blue-layer1-gfe-prod-edge/prod.blue-layer1-gfe.lga34s38/40,Mentat oracle: [2002:a05:6692:824:b0:89:f617:8002]:9801
-X-Google-Gfe-Request-Trace: aclgaff4:443,/bns/ma/borg/ma/bns/genai-api/prod.genai-api/7,aclgaff4:443
-X-Google-Gfe-Response-Body-Transformations: chunked
-X-Google-Gfe-Response-Code-Details-Trace: response_code_set_by_backend
-X-Google-Gfe-Service-Trace: genai-api-api-prod/gfespec_googleapis-generativelanguage_generativelanguage-url-map-global_generativelanguage-genai-api-api-prod
-X-Google-Gfe-Version: 2.900.2
-X-Google-Netmon-Label: /bns/ma/borg/ma/bns/genai-api/prod.genai-api/7
-X-Google-Security-Signals: FRAMEWORK=ONE_PLATFORM,ENV=borg,ENV_DEBUG=borg_user:genai-api;borg_job:prod.genai-api
-X-Google-Security-Signals: FRAMEWORK=HTTPSERVER2,BUILD=GOOGLE3,BUILD_DEBUG=cl:688683199,ENV=borg,ENV_DEBUG=borg_user:genai-api;borg_job:prod.genai-api
-X-Google-Service: genai-api-api-prod/gfespec_googleapis-generativelanguage_generativelanguage-url-map-global_generativelanguage-genai-api-api-prod
-X-Google-Session-Info: GgQYECgLIAE6IxIhZ2VuZXJhdGl2ZWxhbmd1YWdlLmdvb2dsZWFwaXMuY29t
-X-Google-Shellfish-Status: CA0gBEBG
X-Xss-Protection: 0
{
@@ -42,7 +27,7 @@
"content": {
"parts": [
{
- "text": "Use CanonicalHeaderKey when you need to canonicalize the header key s according to HTTP/2 requirements. Canonicalizing a header key means converting the first letter and any letter following a hyphen to upper case; the rest are converted to lowercase. For example, the canonical key for 'accept-encoding' is 'Accept-Encoding'. This function is useful in the context of HTTP/2, where header names are case-insensitive and must be canonicalized before being sent over the wire. By using CanonicalHeaderKey, you can ensure that header keys are properly formatted and can be easily compared and manipulated."
+ "text": "You should use `CanonicalHeaderKey` **when you need to compare or interact with HTTP headers in a case-insensitive way, ensuring compatibility across different systems and implementations.**\n\nHere's a breakdown of when to use it:\n\n**Situations where `CanonicalHeaderKey` is beneficial:**\n\n* **Comparing HTTP Headers:** Different systems might use varying capitalization for header keys (e.g., \"Content-Type\" vs. \"content-type\"). `CanonicalHeaderKey` provides a standardized format, ensuring consistent comparisons.\n* **Accessing Headers in Maps/Dictionaries:** Many programming languages use case-sensitive keys for maps/dictionaries. Using the canonical form as the key when storing or retrieving headers avoids issues arising from inconsistent capitalization.\n* **Interacting with HTTP Libraries/Frameworks:** Some libraries might have specific requirements or expectations regarding header key capitalization. Using `CanonicalHeaderKey` ensures compliance.\n* **Generating HTTP Messages:** When constructing HTTP requests or responses, using canonicalized header keys promotes consistency and adherence to standards.\n\n**Example:**\n\nImagine you're building an HTTP client library. You want to allow users to access headers case-insensitively:\n\n```python\ndef get_header(headers, key):\n canonical_key = CanonicalHeaderKey(key)\n return headers.get(canonical_key)\n\n# Example usage\nheaders = {'Content-Type': 'application/json'}\ncontent_type = get_header(headers, 'content-type') # Returns 'application/json'\n```\n\n**Points to Note:**\n\n* **Standard Compliance:** The canonicalization process in `CanonicalHeaderKey` typically follows the HTTP standard (RFC 7230) for header key formatting.\n* **Language/Library Specifics:** The availability and exact implementation of `CanonicalHeaderKey` might vary depending on the programming language or HTTP library you're using.\n\nIn summary, **use `CanonicalHeaderKey` whenever you need to handle HTTP header keys in a standardized, case-insensitive manner to ensure interoperability and avoid potential issues due to varying capitalization.** \n"
}
],
"role": "model"
@@ -66,23 +51,13 @@
"category": 10,
"probability": 1
}
- ],
- "citationMetadata": {
- "citationSources": [
- {
- "startIndex": 149,
- "endIndex": 319,
- "uri": "https://tachingchen.com/blog/pitfall-of-golang-header-operation/",
- "license": ""
- }
- ]
- }
+ ]
}
],
"usageMetadata": {
"promptTokenCount": 81,
- "candidatesTokenCount": 120,
- "totalTokenCount": 201
+ "candidatesTokenCount": 433,
+ "totalTokenCount": 514
},
- "modelVersion": "gemini-1.0-pro"
+ "modelVersion": "gemini-1.5-pro-001"
}