LLM #

Bases: BaseLLM

The LLM class is the main class for interacting with language models.

Attributes:

system_prompt (Optional[str]): System prompt for LLM calls.
messages_to_prompt (Callable): Function to convert a list of messages to an LLM prompt.
completion_to_prompt (Callable): Function to convert a completion to an LLM prompt.
output_parser (Optional[BaseOutputParser]): Output parser to parse, validate, and correct errors programmatically.
pydantic_program_mode (PydanticProgramMode): Pydantic program mode to use for structured prediction.
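
Examples:

LLM is abstract; in practice you instantiate a concrete subclass. A minimal sketch, assuming the llama-index-llms-openai package, the OpenAI class, and the model name shown (none of which are part of this base class):

from llama_index.core.llms import ChatMessage
from llama_index.llms.openai import OpenAI  # assumed separate package

# Base attributes such as system_prompt can be passed at construction time.
llm = OpenAI(
    model="gpt-4o-mini",  # illustrative model name
    system_prompt="You are a helpful assistant.",
)
response = llm.chat([ChatMessage(role="user", content="Hello")])
print(response.message.content)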

metadata abstractmethod property #

metadata: LLMMetadata

LLM metadata.

Returns:

LLMMetadata: LLM metadata containing various information about the LLM.
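
Examples:

A short sketch of inspecting the metadata; the fields shown (model_name, context_window, num_output, is_chat_model) are assumed from LLMMetadata:

meta = llm.metadata
# Report the model name, its limits, and whether it exposes a chat interface.
print(meta.model_name)
print(meta.context_window, meta.num_output)
print(meta.is_chat_model)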

class_name classmethod #

class_name() -> str

Get the class name, used as a unique ID in serialization.

This provides a key that makes serialization robust against actual class name changes.

as_query_component #

as_query_component(partial: Optional[Dict[str, Any]] = None, **kwargs: Any) -> QueryComponent

Get query component.

chat abstractmethod #

chat(messages: Sequence[ChatMessage], **kwargs: Any) -> ChatResponse

Chat endpoint for LLM.

Parameters:

messages (Sequence[ChatMessage]): Sequence of chat messages. (required)
kwargs (Any): Additional keyword arguments to pass to the LLM. Default: {}

Returns:

ChatResponse: Chat response from the LLM.

Examples:

from llama_index.core.llms import ChatMessage

response = llm.chat([ChatMessage(role="user", content="Hello")])
print(response.message.content)

complete abstractmethod #

complete(prompt: str, formatted: bool = False, **kwargs: Any) -> CompletionResponse

Completion endpoint for LLM.

If the LLM is a chat model, the prompt is transformed into a single user message.

Parameters:

prompt (str): Prompt to send to the LLM. (required)
formatted (bool): Whether the prompt is already formatted for the LLM. Default: False
kwargs (Any): Additional keyword arguments to pass to the LLM. Default: {}

Returns:

CompletionResponse: Completion response from the LLM.

Examples:

response = llm.complete("your prompt")
print(response.text)

stream_chat abstractmethod #

stream_chat(messages: Sequence[ChatMessage], **kwargs: Any) -> ChatResponseGen

Streaming chat endpoint for LLM.

Parameters:

messages (Sequence[ChatMessage]): Sequence of chat messages. (required)
kwargs (Any): Additional keyword arguments to pass to the LLM. Default: {}

Yields:

ChatResponseGen: A generator of ChatResponse objects, each containing a new token of the response.

Examples:

from llama_index.core.llms import ChatMessage

gen = llm.stream_chat([ChatMessage(role="user", content="Hello")])
for response in gen:
    print(response.delta, end="", flush=True)

stream_complete abstractmethod #

stream_complete(prompt: str, formatted: bool = False, **kwargs: Any) -> CompletionResponseGen

Streaming completion endpoint for LLM.

If the LLM is a chat model, the prompt is transformed into a single user message.

Parameters:

prompt (str): Prompt to send to the LLM. (required)
formatted (bool): Whether the prompt is already formatted for the LLM. Default: False
kwargs (Any): Additional keyword arguments to pass to the LLM. Default: {}

Yields:

CompletionResponseGen: A generator of CompletionResponse objects, each containing a new token of the response.

Examples:

gen = llm.stream_complete("your prompt")
for response in gen:
    print(response.text, end="", flush=True)

achat abstractmethod async #

achat(messages: Sequence[ChatMessage], **kwargs: Any) -> ChatResponse

Async chat endpoint for LLM.

Parameters:

messages (Sequence[ChatMessage]): Sequence of chat messages. (required)
kwargs (Any): Additional keyword arguments to pass to the LLM. Default: {}

Returns:

ChatResponse: Chat response from the LLM.

Examples:

from llama_index.core.llms import ChatMessage

response = await llm.achat([ChatMessage(role="user", content="Hello")])
print(response.message.content)

acomplete abstractmethod async #

acomplete(prompt: str, formatted: bool = False, **kwargs: Any) -> CompletionResponse

Async completion endpoint for LLM.

If the LLM is a chat model, the prompt is transformed into a single user message.

Parameters:

prompt (str): Prompt to send to the LLM. (required)
formatted (bool): Whether the prompt is already formatted for the LLM. Default: False
kwargs (Any): Additional keyword arguments to pass to the LLM. Default: {}

Returns:

CompletionResponse: Completion response from the LLM.

Examples:

response = await llm.acomplete("your prompt")
print(response.text)

astream_chat abstractmethod async #

astream_chat(messages: Sequence[ChatMessage], **kwargs: Any) -> ChatResponseAsyncGen

Async streaming chat endpoint for LLM.

Parameters:

messages (Sequence[ChatMessage]): Sequence of chat messages. (required)
kwargs (Any): Additional keyword arguments to pass to the LLM. Default: {}

Yields:

ChatResponseAsyncGen: An async generator of ChatResponse objects, each containing a new token of the response.

Examples:

from llama_index.core.llms import ChatMessage

gen = await llm.astream_chat([ChatMessage(role="user", content="Hello")])
async for response in gen:
    print(response.delta, end="", flush=True)

astream_complete abstractmethod async #

astream_complete(prompt: str, formatted: bool = False, **kwargs: Any) -> CompletionResponseAsyncGen

Async streaming completion endpoint for LLM.

If the LLM is a chat model, the prompt is transformed into a single user message.

Parameters:

prompt (str): Prompt to send to the LLM. (required)
formatted (bool): Whether the prompt is already formatted for the LLM. Default: False
kwargs (Any): Additional keyword arguments to pass to the LLM. Default: {}

Yields:

CompletionResponseAsyncGen: An async generator of CompletionResponse objects, each containing a new token of the response.

Examples:

gen = await llm.astream_complete("your prompt")
async for response in gen:
    print(response.text, end="", flush=True)

structured_predict #

structured_predict(output_cls: BaseModel, prompt: PromptTemplate, **prompt_args: Any) -> BaseModel

Structured predict.

Parameters:

output_cls (BaseModel): Output class to use for structured prediction. (required)
prompt (PromptTemplate): Prompt template to use for structured prediction. (required)
prompt_args (Any): Additional arguments to format the prompt with. Default: {}

Returns:

BaseModel: The structured prediction output.

Examples:

from pydantic.v1 import BaseModel

class Test(BaseModel):
    \"\"\"My test class.\"\"\"
    name: str

from llama_index.core.prompts import PromptTemplate

prompt = PromptTemplate("Please predict a Test with a random name related to {topic}.")
output = llm.structured_predict(Test, prompt, topic="cats")
print(output.name)

astructured_predict async #

astructured_predict(output_cls: BaseModel, prompt: PromptTemplate, **prompt_args: Any) -> BaseModel

Async structured predict.

Parameters:

output_cls (BaseModel): Output class to use for structured prediction. (required)
prompt (PromptTemplate): Prompt template to use for structured prediction. (required)
prompt_args (Any): Additional arguments to format the prompt with. Default: {}

Returns:

BaseModel: The structured prediction output.

Examples:

from pydantic.v1 import BaseModel

class Test(BaseModel):
    \"\"\"My test class.\"\"\"
    name: str

from llama_index.core.prompts import PromptTemplate

prompt = PromptTemplate("Please predict a Test with a random name related to {topic}.")
output = await llm.astructured_predict(Test, prompt, topic="cats")
print(output.name)

stream_structured_predict #

stream_structured_predict(output_cls: BaseModel, prompt: PromptTemplate, **prompt_args: Any) -> Generator[Union[Model, List[Model]], None, None]

Stream structured predict.

Parameters:

output_cls (BaseModel): Output class to use for structured prediction. (required)
prompt (PromptTemplate): Prompt template to use for structured prediction. (required)
prompt_args (Any): Additional arguments to format the prompt with. Default: {}

Returns:

Generator[Union[Model, List[Model]], None, None]: A generator returning partial copies of the model or list of models.

Examples:

from pydantic.v1 import BaseModel

class Test(BaseModel):
    \"\"\"My test class.\"\"\"
    name: str

from llama_index.core.prompts import PromptTemplate

prompt = PromptTemplate("Please predict a Test with a random name related to {topic}.")
stream_output = llm.stream_structured_predict(Test, prompt, topic="cats")
for partial_output in stream_output:
    # stream partial outputs until completion
    print(partial_output.name)

astream_structured_predict async #

astream_structured_predict(output_cls: BaseModel, prompt: PromptTemplate, **prompt_args: Any) -> AsyncGenerator[Union[Model, List[Model]], None]

Async stream structured predict.

Parameters:

output_cls (BaseModel): Output class to use for structured prediction. (required)
prompt (PromptTemplate): Prompt template to use for structured prediction. (required)
prompt_args (Any): Additional arguments to format the prompt with. Default: {}

Returns:

AsyncGenerator[Union[Model, List[Model]], None]: An async generator returning partial copies of the model or list of models.

Examples:

from pydantic.v1 import BaseModel

class Test(BaseModel):
    \"\"\"My test class.\"\"\"
    name: str

from llama_index.core.prompts import PromptTemplate

prompt = PromptTemplate("Please predict a Test with a random name related to {topic}.")
stream_output = await llm.astream_structured_predict(Test, prompt, topic="cats")
async for partial_output in stream_output:
    # stream partial outputs until completion
    print(partial_output.name)

predict #

predict(prompt: BasePromptTemplate, **prompt_args: Any) -> str

Predict for a given prompt.

Parameters:

prompt (BasePromptTemplate): The prompt to use for prediction. (required)
prompt_args (Any): Additional arguments to format the prompt with. Default: {}

Returns:

str: The prediction output.

Examples:

from llama_index.core.prompts import PromptTemplate

prompt = PromptTemplate("Please write a random name related to {topic}.")
output = llm.predict(prompt, topic="cats")
print(output)

stream #

stream(prompt: BasePromptTemplate, **prompt_args: Any) -> TokenGen

Stream predict for a given prompt.

Parameters:

prompt (BasePromptTemplate): The prompt to use for prediction. (required)
prompt_args (Any): Additional arguments to format the prompt with. Default: {}

Yields:

TokenGen: Each streamed token (str).

Examples:

from llama_index.core.prompts import PromptTemplate

prompt = PromptTemplate("Please write a random name related to {topic}.")
gen = llm.stream(prompt, topic="cats")
for token in gen:
    print(token, end="", flush=True)

apredict async #

apredict(prompt: BasePromptTemplate, **prompt_args: Any) -> str

Async predict for a given prompt.

Parameters:

prompt (BasePromptTemplate): The prompt to use for prediction. (required)
prompt_args (Any): Additional arguments to format the prompt with. Default: {}

Returns:

str: The prediction output.

Examples:

from llama_index.core.prompts import PromptTemplate

prompt = PromptTemplate("Please write a random name related to {topic}.")
output = await llm.apredict(prompt, topic="cats")
print(output)

astream async #

astream(prompt: BasePromptTemplate, **prompt_args: Any) -> TokenAsyncGen

Async stream predict for a given prompt.

Parameters:

prompt (BasePromptTemplate): The prompt to use for prediction. (required)
prompt_args (Any): Additional arguments to format the prompt with. Default: {}

Yields:

TokenAsyncGen: An async generator that yields strings of tokens.

Examples:

from llama_index.core.prompts import PromptTemplate

prompt = PromptTemplate("Please write a random name related to {topic}.")
gen = await llm.astream(prompt, topic="cats")
async for token in gen:
    print(token, end="", flush=True)

predict_and_call #

predict_and_call(tools: List[BaseTool], user_msg: Optional[Union[str, ChatMessage]] = None, chat_history: Optional[List[ChatMessage]] = None, verbose: bool = False, **kwargs: Any) -> AgentChatResponse

Predict and call the tool.

By default, a ReAct agent is used to perform tool calling (through text prompting); function calling LLMs implement this differently, using their native tool-calling interfaces.
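
Examples:

A minimal sketch; the multiply helper is illustrative, and FunctionTool comes from llama_index.core.tools rather than this class:

from llama_index.core.tools import FunctionTool

def multiply(a: int, b: int) -> int:
    """Multiply two integers."""
    return a * b

# Wrap the function as a tool; name and description are inferred from the docstring.
tool = FunctionTool.from_defaults(fn=multiply)
response = llm.predict_and_call([tool], user_msg="What is 3 times 4?")
print(str(response))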

apredict_and_call async #

apredict_and_call(tools: List[BaseTool], user_msg: Optional[Union[str, ChatMessage]] = None, chat_history: Optional[List[ChatMessage]] = None, verbose: bool = False, **kwargs: Any) -> AgentChatResponse

Async predict and call the tool.
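
Examples:

The async variant follows the same pattern as predict_and_call; the tool below is assumed to be defined as in that example:

# Reuses the illustrative FunctionTool from the predict_and_call example.
response = await llm.apredict_and_call([tool], user_msg="What is 3 times 4?")
print(str(response))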

as_structured_llm #

as_structured_llm(output_cls: BaseModel, **kwargs: Any) -> StructuredLLM

Return a structured LLM that wraps this LLM and constrains its output to the given output class.
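
Examples:

A minimal sketch, reusing the Test model from the structured_predict examples; reading the parsed object from response.raw is an assumption about how the wrapped LLM exposes it:

from llama_index.core.llms import ChatMessage

sllm = llm.as_structured_llm(Test)
response = sllm.chat([ChatMessage(role="user", content="Name a cat.")])
print(response.raw)  # parsed Test instance (assumed to be surfaced on .raw)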