Cohere Rerank¶
If you're opening this Notebook on colab, you will probably need to install LlamaIndex 🦙.
In [ ]:
Copied!
%pip install llama-index-postprocessor-cohere-rerank
%pip install llama-index-postprocessor-cohere-rerank
In [ ]:
Copied!
!pip install llama-index
!pip install llama-index
In [ ]:
Copied!
from llama_index.core import (
VectorStoreIndex,
SimpleDirectoryReader,
pprint_response,
)
from llama_index.core import (
VectorStoreIndex,
SimpleDirectoryReader,
pprint_response,
)
/Users/suo/miniconda3/envs/llama/lib/python3.9/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html from .autonotebook import tqdm as notebook_tqdm
Download Data
In [ ]:
Copied!
!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'
!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'
In [ ]:
Copied!
# load documents
documents = SimpleDirectoryReader("./data/paul_graham/").load_data()
# build index
index = VectorStoreIndex.from_documents(documents=documents)
# load documents
documents = SimpleDirectoryReader("./data/paul_graham/").load_data()
# build index
index = VectorStoreIndex.from_documents(documents=documents)
Retrieve top 10 most relevant nodes, then filter with Cohere Rerank¶
In [ ]:
Copied!
import os
from llama_index.postprocessor.cohere_rerank import CohereRerank
api_key = os.environ["COHERE_API_KEY"]
cohere_rerank = CohereRerank(api_key=api_key, top_n=2)
import os
from llama_index.postprocessor.cohere_rerank import CohereRerank
api_key = os.environ["COHERE_API_KEY"]
cohere_rerank = CohereRerank(api_key=api_key, top_n=2)
In [ ]:
Copied!
query_engine = index.as_query_engine(
similarity_top_k=10,
node_postprocessors=[cohere_rerank],
)
response = query_engine.query(
"What did Sam Altman do in this essay?",
)
query_engine = index.as_query_engine(
similarity_top_k=10,
node_postprocessors=[cohere_rerank],
)
response = query_engine.query(
"What did Sam Altman do in this essay?",
)
In [ ]:
Copied!
pprint_response(response)
pprint_response(response)
Final Response: Sam Altman agreed to become the president of Y Combinator in October 2013. He took over starting with the winter 2014 batch, and worked with the founders to help them get through Demo Day in March 2014. He then reorganised Y Combinator to be controlled by someone other than the founders, so that it could last for a long time. ______________________________________________________________________ Source Node 1/2 Document ID: c1baaa76-acba-453b-a8d1-fdffbde1f424 Similarity: 0.845305 Text: day in 2010, when he was visiting California for interviews, Robert Morris did something astonishing: he offered me unsolicited advice. I can only remember him doing that once before. One day at Viaweb, when I was bent over double from a kidney stone, he suggested that it would be a good idea for him to take me to the hospital. That was what it ... ______________________________________________________________________ Source Node 2/2 Document ID: abc0f1aa-464a-4ae1-9a7b-2d47a9dc967e Similarity: 0.6486889 Text: due to our ignorance about investing. We needed to get experience as investors. What better way, we thought, than to fund a whole bunch of startups at once? We knew undergrads got temporary jobs at tech companies during the summer. Why not organize a summer program where they'd start startups instead? We wouldn't feel guilty for being in a sense...
Directly retrieve top 2 most similar nodes¶
In [ ]:
Copied!
query_engine = index.as_query_engine(
similarity_top_k=2,
)
response = query_engine.query(
"What did Sam Altman do in this essay?",
)
query_engine = index.as_query_engine(
similarity_top_k=2,
)
response = query_engine.query(
"What did Sam Altman do in this essay?",
)
Retrieved context is irrelevant and response is hallucinated.
In [ ]:
Copied!
pprint_response(response)
pprint_response(response)
Final Response: Sam Altman was one of the founders of Y Combinator, a startup accelerator. He was part of the first batch of startups funded by Y Combinator, which included Reddit, Justin Kan and Emmett Shear's Twitch, and Aaron Swartz. He was also involved in the Summer Founders Program, which was a summer program where undergrads could start their own startups instead of taking a summer job at a tech company. He also helped to develop a new version of Arc, a programming language, and wrote a book on Lisp. ______________________________________________________________________ Source Node 1/2 Document ID: abc0f1aa-464a-4ae1-9a7b-2d47a9dc967e Similarity: 0.7940524933077708 Text: due to our ignorance about investing. We needed to get experience as investors. What better way, we thought, than to fund a whole bunch of startups at once? We knew undergrads got temporary jobs at tech companies during the summer. Why not organize a summer program where they'd start startups instead? We wouldn't feel guilty for being in a sense... ______________________________________________________________________ Source Node 2/2 Document ID: 5d696e20-b496-47f0-9262-7aa2667c1d96 Similarity: 0.7899270712205545 Text: at RISD, but otherwise I was basically teaching myself to paint, and I could do that for free. So in 1993 I dropped out. I hung around Providence for a bit, and then my college friend Nancy Parmet did me a big favor. A rent-controlled apartment in a building her mother owned in New York was becoming vacant. Did I want it? It wasn't much more tha...