OpenAI¶
If you're opening this notebook on Colab, you will probably need to install LlamaIndex 🦙.
%pip install llama-index-llms-openai
!pip install llama-index
Basic Usage¶
import os
os.environ["OPENAI_API_KEY"] = "sk-..."
Call complete with a prompt¶
from llama_index.llms.openai import OpenAI
resp = OpenAI().complete("Paul Graham is ")
print(resp)
a computer scientist, entrepreneur, and venture capitalist. He is best known for co-founding the startup accelerator Y Combinator and for his work on Lisp, a programming language. Graham has also written several influential essays on startups, technology, and entrepreneurship.
Call chat with a list of messages¶
from llama_index.core.llms import ChatMessage
from llama_index.llms.openai import OpenAI
messages = [
    ChatMessage(
        role="system", content="You are a pirate with a colorful personality"
    ),
    ChatMessage(role="user", content="What is your name"),
]
resp = OpenAI().chat(messages)
print(resp)
assistant: Ahoy matey! The name's Rainbow Roger, the most colorful pirate on the seven seas! What can I do for ye today?
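If you only need the reply text, the chat response wraps the assistant's ChatMessage directly; a quick sketch:
# The chat response bundles the assistant's ChatMessage; grab just the text.
print(resp.message.content)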
Streaming¶
Using stream_complete endpoint¶
from llama_index.llms.openai import OpenAI
llm = OpenAI()
resp = llm.stream_complete("Paul Graham is ")
for r in resp:
    print(r.delta, end="")
a computer scientist, entrepreneur, and venture capitalist. He is best known for co-founding the startup accelerator Y Combinator and for his work on programming languages and web development. Graham is also a prolific writer and has published several influential essays on technology, startups, and entrepreneurship.
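Each streamed chunk carries both the new fragment (delta) and the text accumulated so far (text), so the last chunk holds the full completion. A small sketch, assuming you want the full string afterwards:
# `r.delta` is the new fragment; `r.text` is everything received so far.
full_text = ""
for r in llm.stream_complete("Paul Graham is "):
    full_text = r.text
print(full_text)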
Using stream_chat endpoint¶
from llama_index.llms.openai import OpenAI
from llama_index.core.llms import ChatMessage
llm = OpenAI()
messages = [
    ChatMessage(
        role="system", content="You are a pirate with a colorful personality"
    ),
    ChatMessage(role="user", content="What is your name"),
]
resp = llm.stream_chat(messages)
for r in resp:
    print(r.delta, end="")
Ahoy matey! The name's Captain Rainbowbeard! Aye, I be a pirate with a love for all things colorful and bright. Me beard be as vibrant as a rainbow, and me ship be the most colorful vessel on the seven seas! What can I do for ye today, me hearty?
Configure Model¶
from llama_index.llms.openai import OpenAI
llm = OpenAI(model="gpt-3.5-turbo")
resp = llm.complete("Paul Graham is ")
print(resp)
a computer scientist, entrepreneur, and venture capitalist. He is best known for co-founding the startup accelerator Y Combinator and for his work on Lisp, a programming language. Graham has also written several influential essays on startups, technology, and entrepreneurship.
messages = [
    ChatMessage(
        role="system", content="You are a pirate with a colorful personality"
    ),
    ChatMessage(role="user", content="What is your name"),
]
resp = llm.chat(messages)
print(resp)
assistant: Ahoy matey! The name's Captain Rainbowbeard, the most colorful pirate on the seven seas! What can I do for ye today? Arrr!
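Beyond model, the constructor accepts the usual sampling parameters; a minimal sketch (the values here are illustrative, not recommendations):
from llama_index.llms.openai import OpenAI

llm = OpenAI(
    model="gpt-4o-mini",
    temperature=0.2,  # lower values give more deterministic output
    max_tokens=256,  # cap the length of each completion
)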
Function Calling¶
OpenAI models have native support for function calling. This conveniently integrates with LlamaIndex tool abstractions, letting you plug any Python function into the LLM.
In the example below, we define a function to generate a Song object.
from pydantic import BaseModel
from llama_index.core.tools import FunctionTool


class Song(BaseModel):
    """A song with name and artist."""

    name: str
    artist: str


def generate_song(name: str, artist: str) -> Song:
    """Generates a song with provided name and artist."""
    return Song(name=name, artist=artist)


tool = FunctionTool.from_defaults(fn=generate_song)
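FunctionTool.from_defaults derives the tool's name and description from the function's signature and docstring; you can inspect what will be sent to the model:
# This metadata is what gets converted into the OpenAI tool schema.
print(tool.metadata.name)
print(tool.metadata.description)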
The strict parameter tells OpenAI whether or not to use constrained sampling when generating tool calls/structured outputs, guaranteeing that the generated tool-call schema always contains the expected fields. Since this seems to increase latency, it defaults to false.
from llama_index.llms.openai import OpenAI
llm = OpenAI(model="gpt-4o-mini", strict=True)
response = llm.predict_and_call(
    [tool],
    "Pick a random song for me",
    # strict=True  # can also be set at the function level to override the class
)
print(str(response))
name='Random Vibes' artist='DJ Chill'
We can also make multiple function calls in parallel.
llm = OpenAI(model="gpt-3.5-turbo")
response = llm.predict_and_call(
    [tool],
    "Generate five songs from the Beatles",
    allow_parallel_tool_calls=True,
)
for s in response.sources:
    print(f"Name: {s.tool_name}, Input: {s.raw_input}, Output: {str(s)}")
Name: generate_song, Input: {'args': (), 'kwargs': {'name': 'Hey Jude', 'artist': 'The Beatles'}}, Output: name='Hey Jude' artist='The Beatles'
Name: generate_song, Input: {'args': (), 'kwargs': {'name': 'Let It Be', 'artist': 'The Beatles'}}, Output: name='Let It Be' artist='The Beatles'
Name: generate_song, Input: {'args': (), 'kwargs': {'name': 'Yesterday', 'artist': 'The Beatles'}}, Output: name='Yesterday' artist='The Beatles'
Name: generate_song, Input: {'args': (), 'kwargs': {'name': 'Come Together', 'artist': 'The Beatles'}}, Output: name='Come Together' artist='The Beatles'
Name: generate_song, Input: {'args': (), 'kwargs': {'name': 'Help!', 'artist': 'The Beatles'}}, Output: name='Help!' artist='The Beatles'
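Each entry in response.sources is a ToolOutput, so you can also recover the actual Song objects rather than their string form; a small sketch:
# raw_output holds the value returned by generate_song (a Song instance).
songs = [s.raw_output for s in response.sources]
print(songs[0].name, "-", songs[0].artist)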
Structured Prediction¶
An important use case for function calling is extracting structured objects. LlamaIndex provides an intuitive interface for converting any LLM into a structured LLM: simply define the target Pydantic class (it can be nested), and given a prompt, the desired object is extracted.
from llama_index.llms.openai import OpenAI
from llama_index.core.prompts import PromptTemplate
from pydantic import BaseModel
from typing import List


class MenuItem(BaseModel):
    """A menu item in a restaurant."""

    course_name: str
    is_vegetarian: bool


class Restaurant(BaseModel):
    """A restaurant with name, city, and cuisine."""

    name: str
    city: str
    cuisine: str
    menu_items: List[MenuItem]
llm = OpenAI(model="gpt-3.5-turbo")
prompt_tmpl = PromptTemplate(
    "Generate a restaurant in a given city {city_name}"
)

# Option 1: Use `as_structured_llm`
restaurant_obj = (
    llm.as_structured_llm(Restaurant)
    .complete(prompt_tmpl.format(city_name="Dallas"))
    .raw
)
# Option 2: Use `structured_predict`
# restaurant_obj = llm.structured_predict(Restaurant, prompt_tmpl, city_name="Miami")
restaurant_obj
Restaurant(name='Tasty Bites', city='Dallas', cuisine='Italian', menu_items=[MenuItem(course_name='Appetizer', is_vegetarian=True), MenuItem(course_name='Main Course', is_vegetarian=False), MenuItem(course_name='Dessert', is_vegetarian=True)])
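The extracted result is a regular Pydantic instance, so nested fields come back validated and typed:
# Access nested, validated fields directly on the extracted object.
print(restaurant_obj.name)
print([item.course_name for item in restaurant_obj.menu_items])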
Structured Prediction with Streaming¶
Any LLM wrapped with as_structured_llm supports streaming through stream_chat.
from llama_index.core.llms import ChatMessage
from IPython.display import clear_output
from pprint import pprint
input_msg = ChatMessage.from_str("Generate a restaurant in Boston")
sllm = llm.as_structured_llm(Restaurant)
stream_output = sllm.stream_chat([input_msg])
for partial_output in stream_output:
    clear_output(wait=True)
    pprint(partial_output.raw.dict())
    restaurant_obj = partial_output.raw
restaurant_obj
{'city': 'Boston', 'cuisine': 'American', 'menu_items': [{'course_name': 'Appetizer', 'is_vegetarian': True}, {'course_name': 'Main Course', 'is_vegetarian': False}, {'course_name': 'Dessert', 'is_vegetarian': True}], 'name': 'Boston Bites'}
Restaurant(name='Boston Bites', city='Boston', cuisine='American', menu_items=[MenuItem(course_name='Appetizer', is_vegetarian=True), MenuItem(course_name='Main Course', is_vegetarian=False), MenuItem(course_name='Dessert', is_vegetarian=True)])
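The streamed partials and the final object are ordinary Pydantic models, so serializing the result is straightforward; a quick sketch, assuming Pydantic v2:
# Serialize the extracted object, e.g. to persist or log it.
print(restaurant_obj.model_dump_json(indent=2))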
Async¶
from llama_index.llms.openai import OpenAI
llm = OpenAI(model="gpt-3.5-turbo")
resp = await llm.acomplete("Paul Graham is ")
print(resp)
a computer scientist, entrepreneur, and venture capitalist. He is best known for co-founding the startup accelerator Y Combinator and for his work as an essayist and author on topics related to technology, startups, and entrepreneurship. Graham is also the co-founder of Viaweb, one of the first web-based applications, which was acquired by Yahoo in 1998. He has been a prominent figure in the tech industry for many years and is known for his insightful and thought-provoking writings on a wide range of subjects.
resp = await llm.astream_complete("Paul Graham is ")
async for delta in resp:
    print(delta.delta, end="")
Paul Graham is an entrepreneur, venture capitalist, and computer scientist. He is best known for his work in the startup world, having co-founded the accelerator Y Combinator and investing in many successful startups such as Airbnb, Dropbox, and Stripe. He is also a prolific writer, having authored several books on topics such as startups, programming, and technology.
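Because these are coroutines, several requests can run concurrently with standard asyncio; a minimal sketch (the prompts are illustrative):
import asyncio

# Kick off multiple completions at once rather than serially.
prompts = ["Paul Graham is ", "Y Combinator is "]
results = await asyncio.gather(*(llm.acomplete(p) for p in prompts))
for r in results:
    print(r.text)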
Async function calling is also supported.
llm = OpenAI(model="gpt-3.5-turbo")
response = await llm.apredict_and_call([tool], "Generate a song")
print(str(response))
name='Sunshine' artist='John Smith'
Set API Key at a per-instance level¶
If desired, you can have separate LLM instances use separate API keys.
from llama_index.llms.openai import OpenAI
llm = OpenAI(model="gpt-3.5-turbo", api_key="sk-...")
resp = llm.complete("Paul Graham is ")
print(resp)
a computer scientist, entrepreneur, and venture capitalist. He is best known as the co-founder of the startup accelerator Y Combinator. Graham has also written several influential essays on startups and entrepreneurship, which have gained a wide following in the tech industry. He has been involved in the founding and funding of numerous successful startups, including Reddit, Dropbox, and Airbnb. Graham is known for his insightful and often controversial opinions on various topics, including education, inequality, and the future of technology.
Additional kwargs¶
Rather than adding the same parameters to each chat or completion call, you can set them at a per-instance level with additional_kwargs.
from llama_index.llms.openai import OpenAI
llm = OpenAI(model="gpt-3.5-turbo", additional_kwargs={"user": "your_user_id"})
resp = llm.complete("Paul Graham is ")
print(resp)
from llama_index.llms.openai import OpenAI
llm = OpenAI(model="gpt-3.5-turbo", additional_kwargs={"user": "your_user_id"})
messages = [
    ChatMessage(
        role="system", content="You are a pirate with a colorful personality"
    ),
    ChatMessage(role="user", content="What is your name"),
]
resp = llm.chat(messages)
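Any parameter the underlying OpenAI chat completions API accepts can be forwarded this way; for example, seed (a sketch, values illustrative):
from llama_index.llms.openai import OpenAI

# Forward extra OpenAI parameters alongside `user`.
llm = OpenAI(
    model="gpt-3.5-turbo",
    additional_kwargs={"user": "your_user_id", "seed": 42},
)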