DashVector Vector Store - Hybrid Search#
If you’re opening this Notebook on colab, you will probably need to install LlamaIndex 🦙.
!pip3 install llama-index dashvector dashtext
import logging
import sys
import os
logging.basicConfig(stream=sys.stdout, level=logging.INFO)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))
Create a DashVector Collection#
import dashvector
api_key = os.environ["DASHVECTOR_API_KEY"]
endpoint = os.environ["DASHVECTOR_ENDPOINT"]
client = dashvector.Client(api_key=api_key, endpoint=endpoint)
# delete if needed
# client.delete("quickstart")
# dimensions are for text-embedding-ada-002
# NOTE: Sparse Vector only supported for dotproduct metric
client.create("quickstart", dimension=1536, metric="dotproduct")
dashvector_collection = client.get("quickstart")
Download data#
!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'
Load documents#
from llama_index import SimpleDirectoryReader
# load documents
documents = SimpleDirectoryReader("./data/paul_graham").load_data()
Build The VectorStoreIndex with DashVectorStore#
from llama_index import VectorStoreIndex
from llama_index.vector_stores import DashVectorStore
from llama_index.storage.storage_context import StorageContext
vector_store = DashVectorStore(
collection=dashvector_collection, support_sparse_vector=True
)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(
documents, storage_context=storage_context
)
Query with Default Query Engine#
query_engine = index.as_query_engine()
response = query_engine.query("What did the author do growing up?")
from IPython.display import Markdown, display
display(Markdown(f"<b>{response}</b>"))
The author, growing up, worked on writing and programming outside of school. They wrote short stories and tried writing programs on an IBM 1401 computer. They later got a microcomputer and started programming more extensively, writing simple games and a word processor.
Query with Hybrid Query Mode#
query_engine = index.as_query_engine(vector_store_query_mode="hybrid")
response = query_engine.query("What did the author do growing up?")
display(Markdown(f"<b>{response}</b>"))
The author wrote short stories and also worked on programming, specifically on an IBM 1401 computer in 9th grade.