Google Vertex AI

Google Cloud Vertex AI provider with support for chat, embeddings, and image generation.

Setup

```bash
go get github.com/zendev-sh/goai@latest
```

```go
import "github.com/zendev-sh/goai/provider/vertex"
```

Authentication

Vertex AI supports three authentication methods, resolved in this priority order:

  1. Explicit options - WithTokenSource(ts) or WithAPIKey(key) passed directly take highest priority.
  2. Application Default Credentials (ADC) - auto-detected when a GCP project is configured. Uses gcloud CLI credentials, service account JSON, or GCE metadata.
  3. API key from environment - falls back to GOOGLE_API_KEY, GEMINI_API_KEY, or GOOGLE_GENERATIVE_AI_API_KEY env vars. Routes requests to the Gemini API endpoint instead of Vertex AI.

Environment Variables

| Variable | Description |
| --- | --- |
| GOOGLE_VERTEX_PROJECT | GCP project ID (also reads GOOGLE_CLOUD_PROJECT, GCLOUD_PROJECT) |
| GOOGLE_VERTEX_LOCATION | GCP region (also reads GOOGLE_CLOUD_LOCATION, defaults to us-central1) |
| GOOGLE_VERTEX_BASE_URL | Override the Vertex AI endpoint |
| GOOGLE_API_KEY | API key for Gemini API fallback |
| GEMINI_API_KEY | API key for Gemini API fallback (alternative) |
| GOOGLE_GENERATIVE_AI_API_KEY | API key for Gemini API fallback (alternative) |

Models

  • gemini-2.5-pro, gemini-2.5-flash, gemini-2.0-flash
  • gemini-3-flash-preview, gemini-3-pro-preview
  • text-embedding-004 (embeddings)
  • imagen-3.0-generate-002, imagen-4.0-fast-generate-001 (image generation)

Tested Models

Unit tested (mock HTTP server, 2026-03-15): gemini-2.5-pro, imagen-3.0-generate-002, text-embedding-004

Usage

Chat

```go
model := vertex.Chat("gemini-2.5-pro",
    vertex.WithProject("my-gcp-project"),
    vertex.WithLocation("us-central1"),
)

result, err := goai.GenerateText(ctx, model, goai.WithPrompt("Explain quantum computing"))
if err != nil {
    log.Fatal(err)
}
fmt.Println(result.Text)
```

Chat with API Key (Gemini API fallback)

When no project is configured, the provider automatically routes to the Gemini API if an API key is available:

```go
model := vertex.Chat("gemini-2.5-flash",
    vertex.WithAPIKey("your-api-key"),
)
```

Embeddings

```go
embedModel := vertex.Embedding("text-embedding-004",
    vertex.WithProject("my-gcp-project"),
)

result, err := goai.Embed(ctx, embedModel, "hello world")
```

Embedding provider options (pass under the "vertex" key in EmbedParams.ProviderOptions):

| Option | Type | Description |
| --- | --- | --- |
| outputDimensionality | int | Reduced dimension for the output embedding |
| taskType | string | SEMANTIC_SIMILARITY, CLASSIFICATION, CLUSTERING, RETRIEVAL_DOCUMENT, RETRIEVAL_QUERY, QUESTION_ANSWERING, FACT_VERIFICATION, CODE_RETRIEVAL_QUERY |
| title | string | Document title (only valid with the RETRIEVAL_DOCUMENT task) |
| autoTruncate | bool | Truncate input text if too long (default: true) |

Image Generation

```go
imgModel := vertex.Image("imagen-3.0-generate-002",
    vertex.WithProject("my-gcp-project"),
)

result, err := goai.GenerateImage(ctx, imgModel,
    goai.WithImagePrompt("A sunset over mountains"),
    goai.WithImageCount(1),
)
```

Image provider options (pass under the "vertex" key in ImageParams.ProviderOptions):

| Option | Type | Description |
| --- | --- | --- |
| negativePrompt | string | Text describing what to avoid |
| personGeneration | string | dont_allow, allow_adult, allow_all |
| safetySetting | string | block_low_and_above, block_medium_and_above, block_only_high, block_none |
| addWatermark | bool | Whether to add a watermark |
| sampleImageSize | string | 1K or 2K |
| seed | int | Seed for reproducible generation |

Options

| Option | Type | Description |
| --- | --- | --- |
| WithAPIKey(key) | string | Set a static API key (routes to the Gemini API endpoint) |
| WithTokenSource(ts) | provider.TokenSource | Set a dynamic token source (e.g., a service account) |
| WithProject(project) | string | Set the GCP project ID |
| WithLocation(location) | string | Set the GCP region (default: us-central1) |
| WithBaseURL(url) | string | Override the API endpoint |
| WithHeaders(h) | map[string]string | Set additional HTTP headers |
| WithHTTPClient(c) | *http.Client | Set a custom *http.Client |

Notes

  • For direct API key access without GCP, see the Google provider.
  • Chat uses the Vertex AI OpenAI-compatible endpoint. Embeddings and image generation use the native Vertex AI :predict endpoint.
  • Tool schemas are automatically sanitized to comply with Gemini schema restrictions (no additionalProperties, enum values must be strings, etc.).
  • Gemini-native provider options like thinkingConfig are stripped before sending to the OpenAI-compatible endpoint.
  • The ADCTokenSource() function is exported for direct use with other providers that need GCP credentials.
  • Max embedding batch size: 2048 values per call.

Released under the MIT License.