Google GenAI Embeddings¶

Using Google's google-genai package, LlamaIndex provides a GoogleGenAIEmbedding class that allows you to embed text using Google's GenAI models from both the Gemini and Vertex AI APIs.

If you're opening this Notebook on colab, you will probably need to install LlamaIndex 🦙.

In [ ]:

Copied!

%pip install llama-index-embeddings-google-genai
%pip install llama-index-embeddings-google-genai

In [ ]:

Copied!

import os

os.environ["GOOGLE_API_KEY"] = "..."
import os

os.environ["GOOGLE_API_KEY"] = "..."

Setup¶

GoogleGenAIEmbedding is a wrapper around the google-genai package, which means it supports both Gemini and Vertex AI APIs out of that box.

You can pass in the api_key directly, or pass in a vertexai_config to use the Vertex AI API.

Other options include embed_batch_size, model_name, and embedding_config.

The default model is text-embedding-004.

In [ ]:

Copied!





from llama_index.embeddings.google_genai import GoogleGenAIEmbedding
from google.genai.types import EmbedContentConfig

embed_model = GoogleGenAIEmbedding(
    model_name="text-embedding-004",
    embed_batch_size=100,
    # can pass in the api key directly
    # api_key="...",
    # or pass in a vertexai_config
    # vertexai_config={
    #     "project": "...",
    #     "location": "...",
    # }
    # can also pass in an embedding_config
    # embedding_config=EmbedContentConfig(...)
)
from llama_index.embeddings.google_genai import GoogleGenAIEmbedding
from google.genai.types import EmbedContentConfig

embed_model = GoogleGenAIEmbedding(
    model_name="text-embedding-004",
    embed_batch_size=100,
    # can pass in the api key directly
    # api_key="...",
    # or pass in a vertexai_config
    # vertexai_config={
    #     "project": "...",
    #     "location": "...",
    # }
    # can also pass in an embedding_config
    # embedding_config=EmbedContentConfig(...)
)

Usage¶

Sync¶

In [ ]:

Copied!

embeddings = embed_model.get_text_embedding("Google Gemini Embeddings.")
print(embeddings[:5])
print(f"Dimension of embeddings: {len(embeddings)}")
embeddings = embed_model.get_text_embedding("Google Gemini Embeddings.")
print(embeddings[:5])
print(f"Dimension of embeddings: {len(embeddings)}")

[0.031099992, 0.02192731, -0.06523498, 0.016788177, 0.0392835]
Dimension of embeddings: 768

In [ ]:

Copied!

embeddings = embed_model.get_query_embedding("Query Google Gemini Embeddings.")
print(embeddings[:5])
print(f"Dimension of embeddings: {len(embeddings)}")
embeddings = embed_model.get_query_embedding("Query Google Gemini Embeddings.")
print(embeddings[:5])
print(f"Dimension of embeddings: {len(embeddings)}")

[0.022199392, 0.03671178, -0.06874573, 0.02195774, 0.05475164]
Dimension of embeddings: 768

In [ ]:

Copied!





embeddings = embed_model.get_text_embedding_batch(
    [
        "Google Gemini Embeddings.",
        "Google is awesome.",
        "Llamaindex is awesome.",
    ]
)
print(f"Got {len(embeddings)} embeddings")
print(f"Dimension of embeddings: {len(embeddings[0])}")
embeddings = embed_model.get_text_embedding_batch(
    [
        "Google Gemini Embeddings.",
        "Google is awesome.",
        "Llamaindex is awesome.",
    ]
)
print(f"Got {len(embeddings)} embeddings")
print(f"Dimension of embeddings: {len(embeddings[0])}")

Got 3 embeddings
Dimension of embeddings: 768

Async¶

In [ ]:

Copied!

embeddings = await embed_model.aget_text_embedding("Google Gemini Embeddings.")
print(embeddings[:5])
print(f"Dimension of embeddings: {len(embeddings)}")
embeddings = await embed_model.aget_text_embedding("Google Gemini Embeddings.")
print(embeddings[:5])
print(f"Dimension of embeddings: {len(embeddings)}")

[0.031099992, 0.02192731, -0.06523498, 0.016788177, 0.0392835]
Dimension of embeddings: 768

In [ ]:

Copied!





embeddings = await embed_model.aget_query_embedding(
    "Query Google Gemini Embeddings."
)
print(embeddings[:5])
print(f"Dimension of embeddings: {len(embeddings)}")
embeddings = await embed_model.aget_query_embedding(
    "Query Google Gemini Embeddings."
)
print(embeddings[:5])
print(f"Dimension of embeddings: {len(embeddings)}")

[0.022199392, 0.03671178, -0.06874573, 0.02195774, 0.05475164]
Dimension of embeddings: 768

In [ ]:

Copied!





embeddings = await embed_model.aget_text_embedding_batch(
    [
        "Google Gemini Embeddings.",
        "Google is awesome.",
        "Llamaindex is awesome.",
    ]
)
print(f"Got {len(embeddings)} embeddings")
print(f"Dimension of embeddings: {len(embeddings[0])}")
embeddings = await embed_model.aget_text_embedding_batch(
    [
        "Google Gemini Embeddings.",
        "Google is awesome.",
        "Llamaindex is awesome.",
    ]
)
print(f"Got {len(embeddings)} embeddings")
print(f"Dimension of embeddings: {len(embeddings[0])}")

Got 3 embeddings
Dimension of embeddings: 768