Chain-of-Abstraction LlamaPack¶
The chain-of-abstraction (CoA) LlamaPack implements a generalized version of the strategy described in the original CoA paper.
By prompting the LLM to write function calls in a chain-of-thought format, we can execute both simple and complex combinations of the function calls needed to complete a task.
The LLM is prompted to write a response containing function calls; for example, a CoA plan might look like this:
After buying the apples, Sally has [FUNC add(3, 2) = y1] apples.
Then, the wizard casts a spell to multiply the number of apples by 3,
resulting in [FUNC multiply(y1, 3) = y2] apples.
From there, the function calls can be parsed into a dependency graph and executed. Then, the placeholders in the CoA are replaced with their actual results.
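To make that concrete, here is a minimal sketch of the parse-and-execute step. This is illustrative only, not the pack's actual output parser -- the real implementation builds a proper dependency graph, while this sketch runs calls left to right, assumes each placeholder is defined before it is used, and treats non-placeholder arguments as Python literals (it also naively splits arguments on commas):

import re
from ast import literal_eval

# Matches inline calls like [FUNC add(3, 2) = y1]
FUNC_PATTERN = re.compile(r"\[FUNC (\w+)\((.*?)\) = (\w+)\]")


def execute_plan(plan: str, functions: dict) -> str:
    results = {}
    for name, raw_args, output in FUNC_PATTERN.findall(plan):
        args = []
        for arg in (a.strip() for a in raw_args.split(",")):
            # Resolve placeholders (e.g. y1) to previously computed values,
            # otherwise treat the argument as a literal
            args.append(results[arg] if arg in results else literal_eval(arg))
        results[output] = functions[name](*args)
    # Replace each [FUNC ...] span with its computed value
    return FUNC_PATTERN.sub(lambda m: str(results[m.group(3)]), plan)


plan = (
    "After buying the apples, Sally has [FUNC add(3, 2) = y1] apples. "
    "Then, the wizard casts a spell to multiply the number of apples by 3, "
    "resulting in [FUNC multiply(y1, 3) = y2] apples."
)
print(
    execute_plan(
        plan, {"add": lambda a, b: a + b, "multiply": lambda a, b: a * b}
    )
)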
As an extension to the original paper, we also run the LLM a final time to rewrite the response in a more readable and user-friendly way.
NOTE: In the original paper, the authors fine-tuned an LLM specifically for this task, as well as for specific functions and datasets. Without that fine-tuning, only highly capable LLMs (e.g. from OpenAI or Anthropic) are likely to be reliable here.
Setup¶
First, let's install the pack, along with some extra dependencies.
%pip install llama-index-core llama-index-llms-openai llama-index-embeddings-openai
%pip install llama-index-agent-coa llama-parse
import os
os.environ["OPENAI_API_KEY"] = "sk-..."
import nest_asyncio
nest_asyncio.apply()
from llama_index.core import Settings
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI
Settings.embed_model = OpenAIEmbedding(
    model="text-embedding-3-small", embed_batch_size=256
)
Settings.llm = OpenAI(model="gpt-4-turbo", temperature=0.1)
Tools setup¶
Next, we need some tools for our agent to use.
In this example, we use some classic SEC 10-K filings.
!mkdir -p 'data/10k/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10k/uber_2021.pdf' -O 'data/10k/uber_2021.pdf'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10k/lyft_2021.pdf' -O 'data/10k/lyft_2021.pdf'
from llama_index.core import StorageContext, load_index_from_storage
try:
    storage_context = StorageContext.from_defaults(
        persist_dir="./storage/lyft"
    )
    lyft_index = load_index_from_storage(storage_context)

    storage_context = StorageContext.from_defaults(
        persist_dir="./storage/uber"
    )
    uber_index = load_index_from_storage(storage_context)

    index_loaded = True
except Exception:
    index_loaded = False
from llama_parse import LlamaParse
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
# (OPTIONAL) -- Use LlamaParse for loading PDF documents
file_extractor = {
    ".pdf": LlamaParse(
        result_type="markdown",
        api_key="llx-...",
    )
}
if not index_loaded:
    # load data
    lyft_docs = SimpleDirectoryReader(
        input_files=["./data/10k/lyft_2021.pdf"],
        file_extractor=file_extractor,
    ).load_data()
    uber_docs = SimpleDirectoryReader(
        input_files=["./data/10k/uber_2021.pdf"],
        file_extractor=file_extractor,
    ).load_data()

    # build index
    lyft_index = VectorStoreIndex.from_documents(lyft_docs)
    uber_index = VectorStoreIndex.from_documents(uber_docs)

    # persist index
    lyft_index.storage_context.persist(persist_dir="./storage/lyft")
    uber_index.storage_context.persist(persist_dir="./storage/uber")
from llama_index.core.tools import QueryEngineTool
lyft_engine = lyft_index.as_query_engine(similarity_top_k=2)
uber_engine = uber_index.as_query_engine(similarity_top_k=2)
query_engine_tools = [
    QueryEngineTool.from_defaults(
        query_engine=lyft_engine,
        name="lyft_10k",
        description=(
            "Provides information about Lyft financials for year 2021. "
            "Use a detailed plain text question as input to the tool."
        ),
    ),
    QueryEngineTool.from_defaults(
        query_engine=uber_engine,
        name="uber_10k",
        description=(
            "Provides information about Uber financials for year 2021. "
            "Use a detailed plain text question as input to the tool."
        ),
    ),
]
Run the CoAAgentPack¶
With our tools ready, we can now run the agent pack!
%pip install llama-index-packs-agents-coa
# needs llama-index-packs-agents-coa
from llama_index.packs.agents_coa import CoAAgentPack
pack = CoAAgentPack(tools=query_engine_tools, llm=Settings.llm)
response = pack.run("How did Uber's revenue growth compare to Lyft's in 2021?")
==== Available Parsed Functions ====
def lyft_10k(input: string):
    """Provides information about Lyft financials for year 2021. Use a detailed plain text question as input to the tool."""
    ...

def uber_10k(input: string):
    """Provides information about Uber financials for year 2021. Use a detailed plain text question as input to the tool."""
    ...

==== Generated Chain of Abstraction ====
To compare Uber's revenue growth to Lyft's in 2021, we need to obtain the revenue growth figures for both companies for that year.

1. Retrieve Uber's revenue growth for 2021 by querying the Uber financial tool with a specific question about revenue growth:
   - [FUNC uber_10k("What was Uber's revenue growth in 2021?") = y1]

2. Retrieve Lyft's revenue growth for 2021 by querying the Lyft financial tool with a similar question about revenue growth:
   - [FUNC lyft_10k("What was Lyft's revenue growth in 2021?") = y2]

3. Compare the revenue growth figures obtained (y1 and y2) to determine which company had higher growth in 2021. This comparison will be done by the reader after the function calls have been executed.

==== Executing uber_10k with inputs ["What was Uber's revenue growth in 2021?"] ====
==== Executing lyft_10k with inputs ["What was Lyft's revenue growth in 2021?"] ====
print(str(response))
In 2021, Uber's revenue growth was higher than Lyft's. Uber's revenue grew by 57% compared to 2020, while Lyft's revenue increased by 36% compared to the prior year.
Let's recap the logs we just saw:
- The tools get parsed into Python-like definitions
- The agent is prompted to generate a CoA plan
- The function calls are parsed out of the plan and executed
- The values in the plan are filled in
- The agent generates a final response
[Advanced] -- Using the CoAAgentWorker¶
By installing the CoAAgentPack, you also get access to the underlying agent worker. With this, you can set up the agent manually, as well as customize the prompts and output parsing.
from llama_index.agent.coa import CoAAgentWorker
worker = CoAAgentWorker.from_tools(
    tools=query_engine_tools,
    llm=Settings.llm,
    verbose=True,
)
agent = worker.as_agent()
agent.chat("How did Uber's revenue growth compare to Lyft's in 2021?")
==== Available Parsed Functions ====
def lyft_10k(input: string):
    """Provides information about Lyft financials for year 2021. Use a detailed plain text question as input to the tool."""
    ...

def uber_10k(input: string):
    """Provides information about Uber financials for year 2021. Use a detailed plain text question as input to the tool."""
    ...

==== Generated Chain of Abstraction ====
To compare Uber's revenue growth to Lyft's in 2021, we need to obtain the revenue growth figures for both companies for that year.

1. Retrieve Uber's revenue growth for 2021 by querying the Uber financial tool with a specific question about revenue growth. This can be done using the function call: [FUNC uber_10k("What was Uber's revenue growth in 2021?") = y1].

2. Similarly, retrieve Lyft's revenue growth for 2021 by querying the Lyft financial tool with a specific question about revenue growth. This can be done using the function call: [FUNC lyft_10k("What was Lyft's revenue growth in 2021?") = y2].

3. Once both y1 and y2 are obtained, compare the values to determine which company had higher revenue growth in 2021. This comparison does not require a function call but involves a direct comparison of y1 and y2 to see which is greater.

==== Executing uber_10k with inputs ["What was Uber's revenue growth in 2021?"] ====
==== Executing lyft_10k with inputs ["What was Lyft's revenue growth in 2021?"] ====
AgentChatResponse(response="In 2021, Uber's revenue growth was reported as 57%. To compare this with Lyft's revenue growth, we calculate the percentage increase for Lyft based on the provided figures: Lyft's revenue in 2021 was $3,208,323,000 compared to $2,364,681,000 in 2020. The growth in revenue for Lyft can be calculated as:\n\n\\[ \\text{Growth Percentage} = \\left( \\frac{\\text{Revenue in 2021} - \\text{Revenue in 2020}}{\\text{Revenue in 2020}} \\right) \\times 100 \\]\n\\[ \\text{Growth Percentage} = \\left( \\frac{3,208,323,000 - 2,364,681,000}{2,364,681,000} \\right) \\times 100 \\approx 35.7\\% \\]\n\nThus, comparing the two, Uber's revenue growth of 57% was higher than Lyft's growth of approximately 35.7% in 2021.", sources=[], source_nodes=[], is_dummy_stream=False)
[Advanced] -- How does this actually work?¶
Under the hood, we prompt the LLM to output the CoA reasoning, parse it and run the function calls, and then refine everything into a final output.
First, the tools are converted into Python-like function definitions by parsing tool.metadata.fn_schema_str, along with each tool's name and description.
You can find that code in the utils.
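As an illustrative sketch of what that conversion might look like (tool_to_function_str is a hypothetical helper name, not the pack's real util, and it assumes fn_schema_str is a JSON schema with a properties field):

import json


# Illustrative sketch of rendering a tool as a Python-like definition.
def tool_to_function_str(tool) -> str:
    metadata = tool.metadata
    schema = json.loads(metadata.fn_schema_str)
    args = ", ".join(
        f"{name}: {prop.get('type', 'Any')}"
        for name, prop in schema.get("properties", {}).items()
    )
    return f'def {metadata.name}({args}):\n    """{metadata.description}"""\n    ...'


print(tool_to_function_str(query_engine_tools[0]))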
These definitions are then inserted into a reasoning prompt that looks like this:
REASONING_PROMPT_TEMPALTE = """Generate an abstract plan of reasoning using placeholders for the specific values and function calls needed.
The placeholders should be labeled y1, y2, etc.
Function calls should be represented as inline strings like [FUNC {{function_name}}({{input1}}, {{input2}}, ...) = {{output_placeholder}}].
Assume someone will read the plan after the functions have been executed in order to make a final response.
Not every question will require function calls to answer.
If you do invoke a function, only use the available functions, do not make up functions.
Example:
-----------
Available functions:
\`\`\`python
def add(a: int, b: int) -> int:
\"\"\"Add two numbers together.\"\"\"
...
def multiply(a: int, b: int) -> int:
\"\"\"Multiply two numbers together.\"\"\"
...
\`\`\`
Question:
Sally has 3 apples and buys 2 more. Then magically, a wizard casts a spell that multiplies the number of apples by 3. How many apples does Sally have now?
Abstract plan of reasoning:
After buying the apples, Sally has [FUNC add(3, 2) = y1] apples. Then, the wizard casts a spell to multiply the number of apples by 3, resulting in [FUNC multiply(y1, 3) = y2] apples.
Your Turn:
-----------
Available functions:
\`\`\`python
{functions}
\`\`\`
Question:
{question}
Abstract plan of reasoning:
"""
This will generate the chain-of-abstraction reasoning.
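Wiring this up amounts to a single completion call, roughly like the sketch below (functions_str is assumed to hold the Python-like tool definitions generated above):

# Sketch: generate the abstract plan with the configured LLM.
question = "How did Uber's revenue growth compare to Lyft's in 2021?"
plan = Settings.llm.complete(
    REASONING_PROMPT_TEMPALTE.format(functions=functions_str, question=question)
).text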
Then, the reasoning is parsed using the output parser.
After calling the functions and filling in values, we give the LLM a chance to refine the response, using this prompt:
REFINE_REASONING_PROMPT_TEMPALTE = """Generate a response to a question by using a previous abstract plan of reasoning. Use the previous reasoning as context to write a response to the question.
Example:
-----------
Question:
Sally has 3 apples and buys 2 more. Then magically, a wizard casts a spell that multiplies the number of apples by 3. How many apples does Sally have now?
Previous reasoning:
After buying the apples, Sally has [FUNC add(3, 2) = 5] apples. Then, the wizard casts a spell to multiply the number of apples by 3, resulting in [FUNC multiply(5, 3) = 15] apples.
Response:
After the wizard casts the spell, Sally has 15 apples.
Your Turn:
-----------
Question:
{question}
Previous reasoning:
{prev_reasoning}
Response:
"""