Structured Index Configuration#

Our structured indices are documented in Structured Store Index. Below, we provide a reference of the classes that are used to configure our structured indices.

SQL wrapper around SQLDatabase in langchain.

class llama_index.utilities.sql_wrapper.SQLDatabase(engine: Engine, schema: Optional[str] = None, metadata: Optional[MetaData] = None, ignore_tables: Optional[List[str]] = None, include_tables: Optional[List[str]] = None, sample_rows_in_table_info: int = 3, indexes_in_table_info: bool = False, custom_table_info: Optional[dict] = None, view_support: bool = False, max_string_length: int = 300)#

SQL Database.

This class provides a wrapper around the SQLAlchemy engine to interact with a SQL database. It provides methods to execute SQL commands, insert data into tables, and retrieve information about the database schema. It also supports optional features such as including or excluding specific tables, sampling rows for table info, including indexes in table info, and supporting views.

Based on langchain SQLDatabase. https://github.com/langchain-ai/langchain/blob/e355606b1100097665207ca259de6dc548d44c78/libs/langchain/langchain/utilities/sql_database.py#L39

Parameters
  • engine (Engine) – The SQLAlchemy engine instance to use for database operations.

  • schema (Optional[str]) – The name of the schema to use, if any.

  • metadata (Optional[MetaData]) – The metadata instance to use, if any.

  • ignore_tables (Optional[List[str]]) – List of table names to ignore. If set, include_tables must be None.

  • include_tables (Optional[List[str]]) – List of table names to include. If set, ignore_tables must be None.

  • sample_rows_in_table_info (int) – The number of sample rows to include in table info.

  • indexes_in_table_info (bool) – Whether to include indexes in table info.

  • custom_table_info (Optional[dict]) – Custom table info to use.

  • view_support (bool) – Whether to support views.

  • max_string_length (int) – The maximum string length to use.

property dialect: str#

Return string representation of dialect to use.

property engine: Engine#

Return SQL Alchemy engine.

classmethod from_uri(database_uri: str, engine_args: Optional[dict] = None, **kwargs: Any) SQLDatabase#

Construct a SQLAlchemy engine from URI.

get_single_table_info(table_name: str) str#

Get table info for a single table.

get_table_columns(table_name: str) List[Any]#

Get table columns.

get_usable_table_names() Iterable[str]#

Get names of tables available.

insert_into_table(table_name: str, data: dict) None#

Insert data into a table.

property metadata_obj: MetaData#

Return SQL Alchemy metadata.

run_sql(command: str) Tuple[str, Dict]#

Execute a SQL statement and return a string representing the results.

If the statement returns rows, a string of the results is returned. If the statement returns no rows, an empty string is returned.

truncate_word(content: Any, *, length: int, suffix: str = '...') str#

Truncate a string to a certain number of words, based on the max string length.

SQL Container builder.

class llama_index.indices.struct_store.container_builder.SQLContextContainerBuilder(sql_database: SQLDatabase, context_dict: Optional[Dict[str, str]] = None, context_str: Optional[str] = None)#

SQLContextContainerBuilder.

Build a SQLContextContainer that can be passed to the SQL index during index construction or during query-time.

NOTE: if context_str is specified, that will be used as context instead of context_dict

Parameters
  • sql_database (SQLDatabase) – SQL database

  • context_dict (Optional[Dict[str, str]]) – context dict

build_context_container(ignore_db_schema: bool = False) SQLContextContainer#

Build index structure.

derive_index_from_context(index_cls: Type[BaseIndex], ignore_db_schema: bool = False, **index_kwargs: Any) BaseIndex#

Derive index from context.

classmethod from_documents(documents_dict: Dict[str, List[BaseNode]], sql_database: SQLDatabase, **context_builder_kwargs: Any) SQLContextContainerBuilder#

Build context from documents.

query_index_for_context(index: BaseIndex, query_str: Union[str, QueryBundle], query_tmpl: Optional[str] = 'Please return the relevant tables (including the full schema) for the following query: {orig_query_str}', store_context_str: bool = True, **index_kwargs: Any) str#

Query index for context.

A simple wrapper around the index.query call which injects a query template to specifically fetch table information, and can store a context_str.

Parameters
  • index (BaseIndex) – index data structure

  • query_str (QueryType) – query string

  • query_tmpl (Optional[str]) – query template

  • store_context_str (bool) – store context_str

Common classes for structured operations.

class llama_index.indices.common.struct_store.base.BaseStructDatapointExtractor(llm: Union[LLMPredictor, LLM], schema_extract_prompt: BasePromptTemplate, output_parser: Callable[[str], Optional[Dict[str, Any]]])#

Extracts datapoints from a structured document.

insert_datapoint_from_nodes(nodes: Sequence[BaseNode]) None#

Extract datapoint from a document and insert it.

class llama_index.indices.common.struct_store.base.SQLDocumentContextBuilder(sql_database: SQLDatabase, service_context: Optional[ServiceContext] = None, text_splitter: Optional[TextSplitter] = None, table_context_prompt: Optional[BasePromptTemplate] = None, refine_table_context_prompt: Optional[BasePromptTemplate] = None, table_context_task: Optional[str] = None)#

Builder that builds context for a given set of SQL tables.

Parameters
  • sql_database (Optional[SQLDatabase]) – SQL database to use,

  • llm_predictor (Optional[BaseLLMPredictor]) – LLM Predictor to use.

  • prompt_helper (Optional[PromptHelper]) – Prompt Helper to use.

  • text_splitter (Optional[TextSplitter]) – Text Splitter to use.

  • table_context_prompt (Optional[BasePromptTemplate]) – A Table Context Prompt (see Prompt Templates).

  • refine_table_context_prompt (Optional[BasePromptTemplate]) – A Refine Table Context Prompt (see Prompt Templates).

  • table_context_task (Optional[str]) – The query to perform on the table context. A default query string is used if none is provided by the user.

build_all_context_from_documents(documents_dict: Dict[str, List[BaseNode]]) Dict[str, str]#

Build context for all tables in the database.

build_table_context_from_documents(documents: Sequence[BaseNode], table_name: str) str#

Build context from documents for a single table.