Tree Index#

Building the Tree Index

Tree-structured Index Data Structures.

llama_index.core.indices.tree.GPTTreeIndex#

alias of TreeIndex

class llama_index.core.indices.tree.TreeAllLeafRetriever(index: TreeIndex, callback_manager: Optional[CallbackManager] = None, object_map: Optional[dict] = None, verbose: bool = False, **kwargs: Any)#

GPT all leaf retriever.

This class builds a query-specific tree from leaf nodes to return a response. Using this query mode means that the tree index doesn’t need to be built when initialized, since we rebuild the tree for each query.

Parameters

text_qa_template (Optional[BasePromptTemplate]) – Question-Answer Prompt (see Prompt Templates).

as_query_component(partial: Optional[Dict[str, Any]] = None, **kwargs: Any) QueryComponent#

Get query component.

get_prompts() Dict[str, BasePromptTemplate]#

Get a prompt.

get_service_context() Optional[ServiceContext]#

Attempts to resolve a service context. Short-circuits at self.service_context, self._service_context, or self._index.service_context.

retrieve(str_or_query_bundle: Union[str, QueryBundle]) List[NodeWithScore]#

Retrieve nodes given query.

Parameters

str_or_query_bundle (QueryType) – Either a query string or a QueryBundle object.

update_prompts(prompts_dict: Dict[str, BasePromptTemplate]) None#

Update prompts.

Other prompts will remain in place.

class llama_index.core.indices.tree.TreeIndex(nodes: Optional[Sequence[BaseNode]] = None, objects: Optional[Sequence[IndexNode]] = None, index_struct: Optional[IndexGraph] = None, llm: Optional[LLM] = None, summary_template: Optional[BasePromptTemplate] = None, insert_prompt: Optional[BasePromptTemplate] = None, num_children: int = 10, build_tree: bool = True, use_async: bool = False, show_progress: bool = False, service_context: Optional[ServiceContext] = None, **kwargs: Any)#

Tree Index.

The tree index is a tree-structured index, where each node is a summary of the children nodes. During index construction, the tree is constructed in a bottoms-up fashion until we end up with a set of root_nodes.

There are a few different options during query time (see Querying an Index). The main option is to traverse down the tree from the root nodes. A secondary answer is to directly synthesize the answer from the root nodes.

Parameters
  • summary_template (Optional[BasePromptTemplate]) – A Summarization Prompt (see Prompt Templates).

  • insert_prompt (Optional[BasePromptTemplate]) – An Tree Insertion Prompt (see Prompt Templates).

  • num_children (int) – The number of children each node should have.

  • build_tree (bool) – Whether to build the tree during index construction.

  • show_progress (bool) – Whether to show progress bars. Defaults to False.

build_index_from_nodes(nodes: Sequence[BaseNode]) IS#

Build the index from nodes.

delete(doc_id: str, **delete_kwargs: Any) None#

Delete a document from the index. All nodes in the index related to the index will be deleted.

Parameters

doc_id (str) – A doc_id of the ingested document

delete_nodes(node_ids: List[str], delete_from_docstore: bool = False, **delete_kwargs: Any) None#

Delete a list of nodes from the index.

Parameters

doc_ids (List[str]) – A list of doc_ids from the nodes to delete

delete_ref_doc(ref_doc_id: str, delete_from_docstore: bool = False, **delete_kwargs: Any) None#

Delete a document and it’s nodes by using ref_doc_id.

property docstore: BaseDocumentStore#

Get the docstore corresponding to the index.

classmethod from_documents(documents: Sequence[Document], storage_context: Optional[StorageContext] = None, show_progress: bool = False, callback_manager: Optional[CallbackManager] = None, transformations: Optional[List[TransformComponent]] = None, service_context: Optional[ServiceContext] = None, **kwargs: Any) IndexType#

Create index from documents.

Parameters

documents (Optional[Sequence[BaseDocument]]) – List of documents to build the index from.

property index_id: str#

Get the index struct.

property index_struct: IS#

Get the index struct.

index_struct_cls#

alias of IndexGraph

insert(document: Document, **insert_kwargs: Any) None#

Insert a document.

insert_nodes(nodes: Sequence[BaseNode], **insert_kwargs: Any) None#

Insert nodes.

property ref_doc_info: Dict[str, RefDocInfo]#

Retrieve a dict mapping of ingested documents and their nodes+metadata.

refresh(documents: Sequence[Document], **update_kwargs: Any) List[bool]#

Refresh an index with documents that have changed.

This allows users to save LLM and Embedding model calls, while only updating documents that have any changes in text or metadata. It will also insert any documents that previously were not stored.

refresh_ref_docs(documents: Sequence[Document], **update_kwargs: Any) List[bool]#

Refresh an index with documents that have changed.

This allows users to save LLM and Embedding model calls, while only updating documents that have any changes in text or metadata. It will also insert any documents that previously were not stored.

set_index_id(index_id: str) None#

Set the index id.

NOTE: if you decide to set the index_id on the index_struct manually, you will need to explicitly call add_index_struct on the index_store to update the index store.

Parameters

index_id (str) – Index id to set.

update(document: Document, **update_kwargs: Any) None#

Update a document and it’s corresponding nodes.

This is equivalent to deleting the document and then inserting it again.

Parameters
  • document (Union[BaseDocument, BaseIndex]) – document to update

  • insert_kwargs (Dict) – kwargs to pass to insert

  • delete_kwargs (Dict) – kwargs to pass to delete

update_ref_doc(document: Document, **update_kwargs: Any) None#

Update a document and it’s corresponding nodes.

This is equivalent to deleting the document and then inserting it again.

Parameters
  • document (Union[BaseDocument, BaseIndex]) – document to update

  • insert_kwargs (Dict) – kwargs to pass to insert

  • delete_kwargs (Dict) – kwargs to pass to delete

class llama_index.core.indices.tree.TreeRootRetriever(index: TreeIndex, callback_manager: Optional[CallbackManager] = None, object_map: Optional[dict] = None, verbose: bool = False, **kwargs: Any)#

Tree root retriever.

This class directly retrieves the answer from the root nodes.

Unlike GPTTreeIndexLeafQuery, this class assumes the graph already stores the answer (because it was constructed with a query_str), so it does not attempt to parse information down the graph in order to synthesize an answer.

as_query_component(partial: Optional[Dict[str, Any]] = None, **kwargs: Any) QueryComponent#

Get query component.

get_prompts() Dict[str, BasePromptTemplate]#

Get a prompt.

get_service_context() Optional[ServiceContext]#

Attempts to resolve a service context. Short-circuits at self.service_context, self._service_context, or self._index.service_context.

retrieve(str_or_query_bundle: Union[str, QueryBundle]) List[NodeWithScore]#

Retrieve nodes given query.

Parameters

str_or_query_bundle (QueryType) – Either a query string or a QueryBundle object.

update_prompts(prompts_dict: Dict[str, BasePromptTemplate]) None#

Update prompts.

Other prompts will remain in place.

class llama_index.core.indices.tree.TreeSelectLeafEmbeddingRetriever(index: TreeIndex, embed_model: Optional[BaseEmbedding] = None, query_template: Optional[BasePromptTemplate] = None, text_qa_template: Optional[BasePromptTemplate] = None, refine_template: Optional[BasePromptTemplate] = None, query_template_multiple: Optional[BasePromptTemplate] = None, child_branch_factor: int = 1, verbose: bool = False, callback_manager: Optional[CallbackManager] = None, object_map: Optional[dict] = None, **kwargs: Any)#

Tree select leaf embedding retriever.

This class traverses the index graph using the embedding similarity between the query and the node text.

Parameters
  • query_template (Optional[BasePromptTemplate]) – Tree Select Query Prompt (see Prompt Templates).

  • query_template_multiple (Optional[BasePromptTemplate]) – Tree Select Query Prompt (Multiple) (see Prompt Templates).

  • text_qa_template (Optional[BasePromptTemplate]) – Question-Answer Prompt (see Prompt Templates).

  • refine_template (Optional[BasePromptTemplate]) – Refinement Prompt (see Prompt Templates).

  • child_branch_factor (int) – Number of child nodes to consider at each level. If child_branch_factor is 1, then the query will only choose one child node to traverse for any given parent node. If child_branch_factor is 2, then the query will choose two child nodes.

  • embed_model (Optional[BaseEmbedding]) – Embedding model to use for embedding similarity.

as_query_component(partial: Optional[Dict[str, Any]] = None, **kwargs: Any) QueryComponent#

Get query component.

get_prompts() Dict[str, BasePromptTemplate]#

Get a prompt.

get_service_context() Optional[ServiceContext]#

Attempts to resolve a service context. Short-circuits at self.service_context, self._service_context, or self._index.service_context.

retrieve(str_or_query_bundle: Union[str, QueryBundle]) List[NodeWithScore]#

Retrieve nodes given query.

Parameters

str_or_query_bundle (QueryType) – Either a query string or a QueryBundle object.

update_prompts(prompts_dict: Dict[str, BasePromptTemplate]) None#

Update prompts.

Other prompts will remain in place.

class llama_index.core.indices.tree.TreeSelectLeafRetriever(index: TreeIndex, query_template: Optional[BasePromptTemplate] = None, text_qa_template: Optional[BasePromptTemplate] = None, refine_template: Optional[BasePromptTemplate] = None, query_template_multiple: Optional[BasePromptTemplate] = None, child_branch_factor: int = 1, verbose: bool = False, callback_manager: Optional[CallbackManager] = None, object_map: Optional[dict] = None, **kwargs: Any)#

Tree select leaf retriever.

This class traverses the index graph and searches for a leaf node that can best answer the query.

Parameters
  • query_template (Optional[BasePromptTemplate]) – Tree Select Query Prompt (see Prompt Templates).

  • query_template_multiple (Optional[BasePromptTemplate]) – Tree Select Query Prompt (Multiple) (see Prompt Templates).

  • child_branch_factor (int) – Number of child nodes to consider at each level. If child_branch_factor is 1, then the query will only choose one child node to traverse for any given parent node. If child_branch_factor is 2, then the query will choose two child nodes.

as_query_component(partial: Optional[Dict[str, Any]] = None, **kwargs: Any) QueryComponent#

Get query component.

get_prompts() Dict[str, BasePromptTemplate]#

Get a prompt.

get_service_context() Optional[ServiceContext]#

Attempts to resolve a service context. Short-circuits at self.service_context, self._service_context, or self._index.service_context.

retrieve(str_or_query_bundle: Union[str, QueryBundle]) List[NodeWithScore]#

Retrieve nodes given query.

Parameters

str_or_query_bundle (QueryType) – Either a query string or a QueryBundle object.

update_prompts(prompts_dict: Dict[str, BasePromptTemplate]) None#

Update prompts.

Other prompts will remain in place.