Open In Colab

Pydantic Tree Summarize#

In this notebook, we demonstrate how to use tree summarize with structured outputs. Specifically, tree summarize is used to output pydantic objects.

import os
import openai

os.environ["OPENAI_API_KEY"] = "sk-..."
openai.api_key = os.environ["OPENAI_API_KEY"]

Download Data#

!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'

Load Data#

from llama_index.core import SimpleDirectoryReader
reader = SimpleDirectoryReader(
    input_files=["./data/paul_graham/paul_graham_essay.txt"]
)
docs = reader.load_data()
text = docs[0].text

Summarize#

from llama_index.core.response_synthesizers import TreeSummarize
from llama_index.core.types import BaseModel
from typing import List

Create pydantic model to structure response#

class Biography(BaseModel):
    """Data model for a biography."""

    name: str
    best_known_for: List[str]
    extra_info: str
summarizer = TreeSummarize(verbose=True, output_cls=Biography)
response = summarizer.get_response("who is Paul Graham?", [text])
5 text chunks after repacking
1 text chunks after repacking

Inspect the response#

Here, we see the response is in an instance of our Biography class.

print(response)
name='Paul Graham' best_known_for=['Writing', 'Programming', 'Art', 'Co-founding Viaweb', 'Co-founding Y Combinator', 'Essayist'] extra_info="Paul Graham is a multi-talented individual who has made significant contributions in various fields. He is known for his work in writing, programming, art, co-founding Viaweb, co-founding Y Combinator, and his essays on startups and programming. He started his career by writing short stories and programming on the IBM 1401 computer. He later became interested in artificial intelligence and Lisp programming. He wrote a book called 'On Lisp' and focused on Lisp hacking. Eventually, he decided to pursue art and attended art school. He is known for his paintings, particularly still life paintings. Graham is also a programmer, entrepreneur, and venture capitalist. He co-founded Viaweb, an early e-commerce platform, and Y Combinator, a startup accelerator. He has written influential essays on startups and programming. Additionally, he has made contributions to the field of computer programming and entrepreneurship."
print(response.name)
Paul Graham
print(response.best_known_for)
['Writing', 'Programming', 'Art', 'Co-founding Viaweb', 'Co-founding Y Combinator', 'Essayist']
print(response.extra_info)
Paul Graham is a multi-talented individual who has made significant contributions in various fields. He is known for his work in writing, programming, art, co-founding Viaweb, co-founding Y Combinator, and his essays on startups and programming. He started his career by writing short stories and programming on the IBM 1401 computer. He later became interested in artificial intelligence and Lisp programming. He wrote a book called 'On Lisp' and focused on Lisp hacking. Eventually, he decided to pursue art and attended art school. He is known for his paintings, particularly still life paintings. Graham is also a programmer, entrepreneur, and venture capitalist. He co-founded Viaweb, an early e-commerce platform, and Y Combinator, a startup accelerator. He has written influential essays on startups and programming. Additionally, he has made contributions to the field of computer programming and entrepreneurship.