Retrieval-Augmented OpenAI Agent¶
In this tutorial, we show you how to use our OpenAIAgent
implementation with a tool retriever
to build an agent on top of OpenAI's function-calling API that can store/index an arbitrary number of tools. Our indexing/retrieval modules help remove the complexity of having too many functions to fit in the prompt.
Initial Setup¶
Let's start by importing some simple building blocks.
The main things we need are:
- the OpenAI API
- a place to keep conversation history
- a definition for the tools our agent can use
If you're opening this Notebook on colab, you will probably need to install LlamaIndex 🦙.
%pip install llama-index-agent-openai-legacy
!pip install llama-index
import json
from typing import Sequence
from llama_index.core.tools import BaseTool, FunctionTool
Let's define some very simple calculator tools for our agent.
def multiply(a: int, b: int) -> int:
    """Multiply two integers and return the result."""
    return a * b


def add(a: int, b: int) -> int:
    """Add two integers and return the result."""
    return a + b


def useless(a: int, b: int) -> int:
    """Toy useless function."""
    pass
multiply_tool = FunctionTool.from_defaults(fn=multiply, name="multiply")
useless_tools = [
FunctionTool.from_defaults(fn=useless, name=f"useless_{str(idx)}")
for idx in range(28)
]
add_tool = FunctionTool.from_defaults(fn=add, name="add")
all_tools = [multiply_tool] + [add_tool] + useless_tools
all_tools_map = {t.metadata.name: t for t in all_tools}
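Each FunctionTool carries metadata derived from the function's signature and docstring; this is what gets indexed for retrieval and ultimately sent to OpenAI. As a quick sanity check (a minimal sketch; the exact description format may vary across LlamaIndex versions), you can inspect it directly:
# Inspect the metadata derived from the function signature and docstring.
# (The description typically combines the signature with the docstring;
# exact formatting may differ between LlamaIndex versions.)
print(multiply_tool.metadata.name)
print(multiply_tool.metadata.description)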
Building an Object Index¶
We have an ObjectIndex
construct in LlamaIndex that allows the user to use our index data structures over arbitrary objects. The ObjectIndex handles serialization to/from the object and uses an underlying index (e.g. VectorStoreIndex, SummaryIndex, KeywordTableIndex) as the storage mechanism.
In this case, we have a large collection of Tool objects, and we'd like to define an ObjectIndex over these tools.
The index comes bundled with a retrieval mechanism, an ObjectRetriever. This can be passed into our agent so that it can perform tool retrieval at query time.
# define an "object" index over these tools
from llama_index.core import VectorStoreIndex
from llama_index.core.objects import ObjectIndex
obj_index = ObjectIndex.from_objects(
all_tools,
index_cls=VectorStoreIndex,
)
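Before handing the retriever to an agent, we can sanity-check it directly. This is a minimal sketch assuming the retriever returns the tool objects themselves, which may vary slightly across LlamaIndex versions:
# Fetch the tools most relevant to a sample query.
tool_retriever = obj_index.as_retriever(similarity_top_k=2)
retrieved_tools = tool_retriever.retrieve("multiply two numbers")
for tool in retrieved_tools:
    # We expect "multiply" to rank above the useless tools.
    print(tool.metadata.name)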
OpenAIAgent w/ Tool Retrieval¶
We provide an OpenAIAgent
implementation in LlamaIndex that can take in an ObjectRetriever
over a set of BaseTool
objects.
At query time, we first use the ObjectRetriever
to retrieve a set of relevant tools. These tools are then passed into the agent; more specifically, their function signatures are passed into the OpenAI function-calling API.
from llama_index.agent.openai import OpenAIAgent
agent = OpenAIAgent.from_tools(
tool_retriever=obj_index.as_retriever(similarity_top_k=2), verbose=True
)
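If you want to see what actually gets sent to OpenAI for a retrieved tool, you can dump its function schema. This is a hedged sketch: to_openai_tool() is available on ToolMetadata in recent llama-index-core releases (older versions exposed to_openai_function() instead):
# The JSON schema for one tool, as passed to the OpenAI function-calling API.
# NOTE: to_openai_tool() exists in recent llama-index-core versions; older
# releases used to_openai_function() instead.
print(multiply_tool.metadata.to_openai_tool())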
agent.chat("What's 212 multiplied by 122? Make sure to use Tools")
=== Calling Function ===
Calling function: multiply with args: {
  "a": 212,
  "b": 122
}
Got output: 25864
========================
Response(response='212 multiplied by 122 is 25,864.', source_nodes=[], metadata=None)
agent.chat("What's 212 added to 122 ? Make sure to use Tools")
=== Calling Function ===
Calling function: add with args: {
  "a": 212,
  "b": 122
}
Got output: 334
========================
Response(response='212 added to 122 is 334.', source_nodes=[], metadata=None)