Tree Index#
Building the Tree Index
Tree-structured Index Data Structures.
- class llama_index.indices.tree.TreeAllLeafRetriever(index: TreeIndex, callback_manager: Optional[CallbackManager] = None, object_map: Optional[dict] = None, verbose: bool = False, **kwargs: Any)#
GPT all leaf retriever.
This class builds a query-specific tree from leaf nodes to return a response. Using this query mode means that the tree index doesnβt need to be built when initialized, since we rebuild the tree for each query.
- Parameters
text_qa_template (Optional[BasePromptTemplate]) β Question-Answer Prompt (see Prompt Templates).
- as_query_component(partial: Optional[Dict[str, Any]] = None, **kwargs: Any) QueryComponent #
Get query component.
- get_prompts() Dict[str, BasePromptTemplate] #
Get a prompt.
- get_service_context() Optional[ServiceContext] #
Attempts to resolve a service context. Short-circuits at self.service_context, self._service_context, or self._index.service_context.
- retrieve(str_or_query_bundle: Union[str, QueryBundle]) List[NodeWithScore] #
Retrieve nodes given query.
- Parameters
str_or_query_bundle (QueryType) β Either a query string or a QueryBundle object.
- update_prompts(prompts_dict: Dict[str, BasePromptTemplate]) None #
Update prompts.
Other prompts will remain in place.
- class llama_index.indices.tree.TreeIndex(nodes: Optional[Sequence[BaseNode]] = None, objects: Optional[Sequence[IndexNode]] = None, index_struct: Optional[IndexGraph] = None, service_context: Optional[ServiceContext] = None, summary_template: Optional[BasePromptTemplate] = None, insert_prompt: Optional[BasePromptTemplate] = None, num_children: int = 10, build_tree: bool = True, use_async: bool = False, show_progress: bool = False, **kwargs: Any)#
Tree Index.
The tree index is a tree-structured index, where each node is a summary of the children nodes. During index construction, the tree is constructed in a bottoms-up fashion until we end up with a set of root_nodes.
There are a few different options during query time (see Querying an Index). The main option is to traverse down the tree from the root nodes. A secondary answer is to directly synthesize the answer from the root nodes.
- Parameters
summary_template (Optional[BasePromptTemplate]) β A Summarization Prompt (see Prompt Templates).
insert_prompt (Optional[BasePromptTemplate]) β An Tree Insertion Prompt (see Prompt Templates).
num_children (int) β The number of children each node should have.
build_tree (bool) β Whether to build the tree during index construction.
show_progress (bool) β Whether to show progress bars. Defaults to False.
- delete(doc_id: str, **delete_kwargs: Any) None #
Delete a document from the index. All nodes in the index related to the index will be deleted.
- Parameters
doc_id (str) β A doc_id of the ingested document
- delete_nodes(node_ids: List[str], delete_from_docstore: bool = False, **delete_kwargs: Any) None #
Delete a list of nodes from the index.
- Parameters
doc_ids (List[str]) β A list of doc_ids from the nodes to delete
- delete_ref_doc(ref_doc_id: str, delete_from_docstore: bool = False, **delete_kwargs: Any) None #
Delete a document and itβs nodes by using ref_doc_id.
- property docstore: BaseDocumentStore#
Get the docstore corresponding to the index.
- classmethod from_documents(documents: Sequence[Document], storage_context: Optional[StorageContext] = None, service_context: Optional[ServiceContext] = None, show_progress: bool = False, **kwargs: Any) IndexType #
Create index from documents.
- Parameters
documents (Optional[Sequence[BaseDocument]]) β List of documents to build the index from.
- property index_id: str#
Get the index struct.
- property index_struct: IS#
Get the index struct.
- index_struct_cls#
alias of
IndexGraph
- property ref_doc_info: Dict[str, RefDocInfo]#
Retrieve a dict mapping of ingested documents and their nodes+metadata.
- refresh(documents: Sequence[Document], **update_kwargs: Any) List[bool] #
Refresh an index with documents that have changed.
This allows users to save LLM and Embedding model calls, while only updating documents that have any changes in text or metadata. It will also insert any documents that previously were not stored.
- refresh_ref_docs(documents: Sequence[Document], **update_kwargs: Any) List[bool] #
Refresh an index with documents that have changed.
This allows users to save LLM and Embedding model calls, while only updating documents that have any changes in text or metadata. It will also insert any documents that previously were not stored.
- set_index_id(index_id: str) None #
Set the index id.
NOTE: if you decide to set the index_id on the index_struct manually, you will need to explicitly call add_index_struct on the index_store to update the index store.
- Parameters
index_id (str) β Index id to set.
- update(document: Document, **update_kwargs: Any) None #
Update a document and itβs corresponding nodes.
This is equivalent to deleting the document and then inserting it again.
- Parameters
document (Union[BaseDocument, BaseIndex]) β document to update
insert_kwargs (Dict) β kwargs to pass to insert
delete_kwargs (Dict) β kwargs to pass to delete
- update_ref_doc(document: Document, **update_kwargs: Any) None #
Update a document and itβs corresponding nodes.
This is equivalent to deleting the document and then inserting it again.
- Parameters
document (Union[BaseDocument, BaseIndex]) β document to update
insert_kwargs (Dict) β kwargs to pass to insert
delete_kwargs (Dict) β kwargs to pass to delete
- class llama_index.indices.tree.TreeRootRetriever(index: TreeIndex, callback_manager: Optional[CallbackManager] = None, object_map: Optional[dict] = None, verbose: bool = False, **kwargs: Any)#
Tree root retriever.
This class directly retrieves the answer from the root nodes.
Unlike GPTTreeIndexLeafQuery, this class assumes the graph already stores the answer (because it was constructed with a query_str), so it does not attempt to parse information down the graph in order to synthesize an answer.
- as_query_component(partial: Optional[Dict[str, Any]] = None, **kwargs: Any) QueryComponent #
Get query component.
- get_prompts() Dict[str, BasePromptTemplate] #
Get a prompt.
- get_service_context() Optional[ServiceContext] #
Attempts to resolve a service context. Short-circuits at self.service_context, self._service_context, or self._index.service_context.
- retrieve(str_or_query_bundle: Union[str, QueryBundle]) List[NodeWithScore] #
Retrieve nodes given query.
- Parameters
str_or_query_bundle (QueryType) β Either a query string or a QueryBundle object.
- update_prompts(prompts_dict: Dict[str, BasePromptTemplate]) None #
Update prompts.
Other prompts will remain in place.
- class llama_index.indices.tree.TreeSelectLeafEmbeddingRetriever(index: TreeIndex, query_template: Optional[BasePromptTemplate] = None, text_qa_template: Optional[BasePromptTemplate] = None, refine_template: Optional[BasePromptTemplate] = None, query_template_multiple: Optional[BasePromptTemplate] = None, child_branch_factor: int = 1, verbose: bool = False, callback_manager: Optional[CallbackManager] = None, object_map: Optional[dict] = None, **kwargs: Any)#
Tree select leaf embedding retriever.
This class traverses the index graph using the embedding similarity between the query and the node text.
- Parameters
query_template (Optional[BasePromptTemplate]) β Tree Select Query Prompt (see Prompt Templates).
query_template_multiple (Optional[BasePromptTemplate]) β Tree Select Query Prompt (Multiple) (see Prompt Templates).
text_qa_template (Optional[BasePromptTemplate]) β Question-Answer Prompt (see Prompt Templates).
refine_template (Optional[BasePromptTemplate]) β Refinement Prompt (see Prompt Templates).
child_branch_factor (int) β Number of child nodes to consider at each level. If child_branch_factor is 1, then the query will only choose one child node to traverse for any given parent node. If child_branch_factor is 2, then the query will choose two child nodes.
embed_model (Optional[BaseEmbedding]) β Embedding model to use for embedding similarity.
- as_query_component(partial: Optional[Dict[str, Any]] = None, **kwargs: Any) QueryComponent #
Get query component.
- get_prompts() Dict[str, BasePromptTemplate] #
Get a prompt.
- get_service_context() Optional[ServiceContext] #
Attempts to resolve a service context. Short-circuits at self.service_context, self._service_context, or self._index.service_context.
- retrieve(str_or_query_bundle: Union[str, QueryBundle]) List[NodeWithScore] #
Retrieve nodes given query.
- Parameters
str_or_query_bundle (QueryType) β Either a query string or a QueryBundle object.
- update_prompts(prompts_dict: Dict[str, BasePromptTemplate]) None #
Update prompts.
Other prompts will remain in place.
- class llama_index.indices.tree.TreeSelectLeafRetriever(index: TreeIndex, query_template: Optional[BasePromptTemplate] = None, text_qa_template: Optional[BasePromptTemplate] = None, refine_template: Optional[BasePromptTemplate] = None, query_template_multiple: Optional[BasePromptTemplate] = None, child_branch_factor: int = 1, verbose: bool = False, callback_manager: Optional[CallbackManager] = None, object_map: Optional[dict] = None, **kwargs: Any)#
Tree select leaf retriever.
This class traverses the index graph and searches for a leaf node that can best answer the query.
- Parameters
query_template (Optional[BasePromptTemplate]) β Tree Select Query Prompt (see Prompt Templates).
query_template_multiple (Optional[BasePromptTemplate]) β Tree Select Query Prompt (Multiple) (see Prompt Templates).
child_branch_factor (int) β Number of child nodes to consider at each level. If child_branch_factor is 1, then the query will only choose one child node to traverse for any given parent node. If child_branch_factor is 2, then the query will choose two child nodes.
- as_query_component(partial: Optional[Dict[str, Any]] = None, **kwargs: Any) QueryComponent #
Get query component.
- get_prompts() Dict[str, BasePromptTemplate] #
Get a prompt.
- get_service_context() Optional[ServiceContext] #
Attempts to resolve a service context. Short-circuits at self.service_context, self._service_context, or self._index.service_context.
- retrieve(str_or_query_bundle: Union[str, QueryBundle]) List[NodeWithScore] #
Retrieve nodes given query.
- Parameters
str_or_query_bundle (QueryType) β Either a query string or a QueryBundle object.
- update_prompts(prompts_dict: Dict[str, BasePromptTemplate]) None #
Update prompts.
Other prompts will remain in place.