Predibase¶
This notebook shows how you can use Predibase-hosted LLMs within LlamaIndex. You can add Predibase to your existing LlamaIndex workflow to:
- Deploy and query pre-trained or custom open-source LLMs without the hassle
- Operationalize an end-to-end Retrieval Augmented Generation (RAG) system
- Fine-tune your own LLM in just a few lines of code
Getting Started¶
- Sign up for a free Predibase account here
- Create an account
- Go to Settings > My profile and generate a new API token.
%pip install llama-index-llms-predibase
!pip install llama-index --quiet
!pip install predibase --quiet
!pip install sentence-transformers --quiet
import os

# Paste the API token generated in Settings > My profile
os.environ["PREDIBASE_API_TOKEN"] = "{PREDIBASE_API_TOKEN}"

from llama_index.llms.predibase import PredibaseLLM
Flow 1: Query Predibase LLM directly¶
# You can query any HuggingFace or fine-tuned LLM that's hosted on Predibase
llm = PredibaseLLM(
    model_name="llama-2-13b", temperature=0.3, max_new_tokens=512
)
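For example, to query a model you have fine-tuned on Predibase, pass its deployment name to the same constructor (the model name below is a hypothetical placeholder):

# "my-finetuned-llama-2-13b" is a hypothetical placeholder for the name of
# a fine-tuned model you have deployed on Predibase
finetuned_llm = PredibaseLLM(
    model_name="my-finetuned-llama-2-13b", temperature=0.3, max_new_tokens=512
)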
result = llm.complete("Can you recommend me a nice dry white wine?")
print(result)
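You can also use the chat interface, assuming the integration inherits the standard chat support that LlamaIndex's base LLM class adapts onto complete (a minimal sketch):

from llama_index.core.llms import ChatMessage

# chat() is inherited from the LlamaIndex base LLM class, which adapts it
# onto the completion endpoint under the hood
messages = [
    ChatMessage(role="system", content="You are a helpful sommelier."),
    ChatMessage(role="user", content="Which dry white wine pairs well with oysters?"),
]
print(llm.chat(messages))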
Flow 2: Retrieval Augmented Generation (RAG) with Predibase LLM¶
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.core.embeddings import resolve_embed_model
from llama_index.core.node_parser import SentenceSplitter
Download Data¶
!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'
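If wget is unavailable (e.g. on Windows), a pure-Python download does the same thing (a sketch, assuming the requests package is installed):

import os

import requests

# Fetch the same essay and write it to the expected path
url = (
    "https://raw.githubusercontent.com/run-llama/llama_index/main/"
    "docs/examples/data/paul_graham/paul_graham_essay.txt"
)
os.makedirs("data/paul_graham", exist_ok=True)
with open("data/paul_graham/paul_graham_essay.txt", "w") as f:
    f.write(requests.get(url).text)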
Load Documents¶
documents = SimpleDirectoryReader("./data/paul_graham/").load_data()
Configure Predibase LLM¶
llm = PredibaseLLM(
    model_name="llama-2-13b",
    temperature=0.3,
    max_new_tokens=400,
    context_window=1024,
)
# Use a local sentence-transformers embedding model and split documents into chunks
embed_model = resolve_embed_model("local:BAAI/bge-small-en-v1.5")
splitter = SentenceSplitter(chunk_size=1024)
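As an alternative to passing these objects explicitly at each call site (the style used in the cells below), you can register them as global defaults (a minimal sketch using LlamaIndex's Settings object):

from llama_index.core import Settings

# Register global defaults so later index/query-engine construction
# can omit the llm= and embed_model= arguments
Settings.llm = llm
Settings.embed_model = embed_model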
Setup and Query Index¶
index = VectorStoreIndex.from_documents(
    documents, transformations=[splitter], embed_model=embed_model
)

query_engine = index.as_query_engine(llm=llm)
response = query_engine.query("What did the author do growing up?")
print(response)
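To see which chunks of the essay were retrieved to ground the answer, inspect the response's source nodes:

# Each source node pairs a retrieved chunk with its similarity score
for source_node in response.source_nodes:
    print(source_node.score)
    print(source_node.node.get_content()[:200])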