Single-Turn Multi-Function Calling OpenAI Agents#

With the latest OpenAI API (v. 1.1.0+), users can now execute multiple function calls within a single turn of User and Agent dialogue. We’ve updated our library to enable this new feature as well, and in this notebook we’ll show you how it all works!

NOTE: OpenAI refers to this as “Parallel” function calling, but the current implementation doesn’t actually execute the multiple function calls in parallel; the model returns them together in one turn, and they are then run one after another. So, in terms of our current implementation, it’s “parallelizable” function calling.
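
For context, this is roughly what the feature looks like at the raw OpenAI API level (a minimal sketch using the v1 Python SDK directly, independent of llama_index): the model can return a list of tool_calls in a single assistant message, and the client decides how to execute them.

import json

from openai import OpenAI as RawOpenAIClient  # raw OpenAI SDK client, not llama_index's wrapper

client = RawOpenAIClient()
completion = client.chat.completions.create(
    model="gpt-3.5-turbo-1106",
    messages=[{"role": "user", "content": "What is 121 * 3? And what is 5 * 7?"}],
    tools=[
        {
            "type": "function",
            "function": {
                "name": "multiply",
                "description": "Multiply two integers",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "a": {"type": "integer"},
                        "b": {"type": "integer"},
                    },
                    "required": ["a", "b"],
                },
            },
        }
    ],
)

# A single assistant message can carry several tool calls; they arrive
# together, but nothing forces the client to execute them in parallel.
for tool_call in completion.choices[0].message.tool_calls or []:
    print(tool_call.function.name, json.loads(tool_call.function.arguments))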

from llama_index.agent import OpenAIAgent
from llama_index.llms import OpenAI
from llama_index.tools import BaseTool, FunctionTool

Setup#

If you’ve seen any of our previous notebooks on OpenAI Agents, then you’re already familiar with the cookbook recipe we follow here. But if not, or if you fancy a refresher, all we need to do (at a high level) is the following:

  1. Define a set of tools (we’ll use FunctionTool) since Agents work with tools

  2. Define the LLM for the Agent

  3. Define an OpenAIAgent

def multiply(a: int, b: int) -> int:
    """Multiple two integers and returns the result integer"""
    return a * b


multiply_tool = FunctionTool.from_defaults(fn=multiply)
def add(a: int, b: int) -> int:
    """Add two integers and returns the result integer"""
    return a + b


add_tool = FunctionTool.from_defaults(fn=add)
llm = OpenAI(model="gpt-3.5-turbo-1106")
agent = OpenAIAgent.from_tools(
    [multiply_tool, add_tool], llm=llm, verbose=True
)

Sync mode#

response = agent.chat("What is (121 * 3) + 42?")
print(str(response))
STARTING TURN 1
---------------

=== Calling Function ===
Calling function: multiply with args: {"a": 121, "b": 3}
Got output: 363
========================

=== Calling Function ===
Calling function: add with args: {"a": 363, "b": 42}
Got output: 405
========================

STARTING TURN 2
---------------

The result of (121 * 3) + 42 is 405.
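
Beyond the printed answer, the chat response also records which tool calls were made. A minimal sketch, assuming the response object exposes a sources list of tool outputs (as recent llama_index versions do):

for tool_output in response.sources:
    # each entry records the tool name, the arguments it was called with,
    # and the raw return value
    print(tool_output.tool_name, tool_output.raw_input, tool_output.raw_output)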
response = agent.stream_chat("What is (121 * 3) + 42?")
STARTING TURN 1
---------------

=== Calling Function ===
Calling function: add with args: {"a":363,"b":42}
Got output: 405
========================

STARTING TURN 2
---------------
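
The streaming call above returns before the final answer has been printed; to surface the streamed tokens you can iterate over the response. A minimal sketch, assuming the object returned by stream_chat exposes a synchronous response_gen generator (mirroring the async generator used in the next section):

for token in response.response_gen:
    # print each streamed token of the final answer as it arrives
    print(token, end="")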

Async mode#

# nest_asyncio lets us await the agent's async methods inside the
# notebook's already-running event loop
import nest_asyncio

nest_asyncio.apply()
response = await agent.achat("What is (121 * 3) + 42?")
print(str(response))
STARTING TURN 1
---------------

=== Calling Function ===
Calling function: add with args: {"a":363,"b":42}
Got output: 405
========================

STARTING TURN 2
---------------

The result of (121 * 3) + 42 is 405.
response = await agent.astream_chat("What is (121 * 3) + 42?")

# stream the final answer token by token as it is generated
async for token in response.async_response_gen():
    print(token, end="")
STARTING TURN 1
---------------

=== Calling Function ===
Calling function: multiply with args: {"a": 121, "b": 3}
Got output: 363
========================

=== Calling Function ===
Calling function: add with args: {"a": 363, "b": 42}
Got output: 405
========================

STARTING TURN 2
---------------

The result of (121 * 3) + 42 is 405.

Example from OpenAI docs#

Here’s an example straight from the OpenAI docs on Parallel function calling. (Their example gets this done in 76 lines of code, whereas with the llama_index library you can get that down to about 18 lines.)

import json


# Example dummy function hard coded to return the same weather
# In production, this could be your backend API or an external API
def get_current_weather(location, unit="fahrenheit"):
    """Get the current weather in a given location"""
    if "tokyo" in location.lower():
        return json.dumps(
            {"location": location, "temperature": "10", "unit": "celsius"}
        )
    elif "san francisco" in location.lower():
        return json.dumps(
            {"location": location, "temperature": "72", "unit": "fahrenheit"}
        )
    else:
        return json.dumps(
            {"location": location, "temperature": "22", "unit": "celsius"}
        )


weather_tool = FunctionTool.from_defaults(fn=get_current_weather)
llm = OpenAI(model="gpt-3.5-turbo-1106")
agent = OpenAIAgent.from_tools([weather_tool], llm=llm, verbose=True)
response = agent.chat(
    "What's the weather like in San Francisco, Tokyo, and Paris?"
)
STARTING TURN 1
---------------

=== Calling Function ===
Calling function: get_current_weather with args: {"location": "San Francisco", "unit": "fahrenheit"}
Got output: {"location": "San Francisco", "temperature": "72", "unit": "fahrenheit"}
========================

=== Calling Function ===
Calling function: get_current_weather with args: {"location": "Tokyo", "unit": "fahrenheit"}
Got output: {"location": "Tokyo", "temperature": "10", "unit": "celsius"}
========================

=== Calling Function ===
Calling function: get_current_weather with args: {"location": "Paris", "unit": "fahrenheit"}
Got output: {"location": "Paris", "temperature": "22", "unit": "celsius"}
========================

STARTING TURN 2
---------------

All of the function calls above were made within a single turn of dialogue between the Assistant and the User. Interestingly, an older version of GPT-3.5 is not quite as capable as its successor: it completes the same task in three separate turns. For the sake of demonstration, here it is below.

llm = OpenAI(model="gpt-3.5-turbo-0613")
agent = OpenAIAgent.from_tools([weather_tool], llm=llm, verbose=True)
response = agent.chat(
    "What's the weather like in San Francisco, Tokyo, and Paris?"
)
STARTING TURN 1
---------------

=== Calling Function ===
Calling function: get_current_weather with args: {
  "location": "San Francisco"
}
Got output: {"location": "San Francisco", "temperature": "72", "unit": "fahrenheit"}
========================

STARTING TURN 2
---------------

=== Calling Function ===
Calling function: get_current_weather with args: {
  "location": "Tokyo"
}
Got output: {"location": "Tokyo", "temperature": "10", "unit": "celsius"}
========================

STARTING TURN 3
---------------

=== Calling Function ===
Calling function: get_current_weather with args: {
  "location": "Paris"
}
Got output: {"location": "Paris", "temperature": "22", "unit": "celsius"}
========================

STARTING TURN 4
---------------

Conclusion#

And so, as you can see, the llama_index library can handle multiple function calls (as well as a single function call) within a single turn of dialogue between the user and the OpenAI agent!