Pandas Query Engine

Default query for PandasIndex.

WARNING: This tool gives the agent access to Python's eval function, so arbitrary code execution is possible on the machine running it. It is not recommended for production use; any such use would require heavy sandboxing or a virtual machine.

llama_index.query_engine.pandas_query_engine.GPTNLPandasQueryEngine

alias of PandasQueryEngine

llama_index.query_engine.pandas_query_engine.NLPandasQueryEngine

alias of PandasQueryEngine

class llama_index.query_engine.pandas_query_engine.PandasQueryEngine(df: DataFrame, instruction_str: Optional[str] = None, output_processor: Optional[Callable] = None, pandas_prompt: Optional[BasePromptTemplate] = None, output_kwargs: Optional[dict] = None, head: int = 5, verbose: bool = False, service_context: Optional[ServiceContext] = None, **kwargs: Any)

GPT Pandas query.

Converts a natural language query into pandas Python code.

WARNING: This tool gives the agent access to Python's eval function, so arbitrary code execution is possible on the machine running it. It is not recommended for production use; any such use would require heavy sandboxing or a virtual machine.
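To make the warning concrete, here is a minimal sketch using only the Python standard library. The string `generated_code` is a hypothetical stand-in for model output and is not taken from this library. It shows that a string passed to eval can import modules and reach the operating system:

```python
import os

# Hypothetical stand-in for code returned by a language model.
# A real response is not under the caller's control.
generated_code = "__import__('os').getcwd()"

# eval executes whatever expression the string contains, including
# imports and calls into the operating system.
result = eval(generated_code)
print(result)
```

The expression here is harmless, but the same mechanism would run a destructive call just as readily, which is why the warning calls for sandboxing.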

Parameters
  • df (pd.DataFrame) – Pandas dataframe to use.

  • instruction_str (Optional[str]) – Instruction string to use.

  • output_processor (Optional[Callable]) – Output processor. A callable that takes the output string, the pandas DataFrame, and any output kwargs, and returns a string.

  • output_kwargs (Optional[dict]) – Keyword arguments passed to the output processor. For example, output_kwargs["max_colwidth"] (an int) sets the length of text that each column can display during str(df). Set it to a higher number if the dataframe may contain long text.

  • pandas_prompt (Optional[BasePromptTemplate]) – Pandas prompt to use.

  • head (int) – Number of rows to show in the table context.
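The interplay between the output processor and its keyword arguments can be sketched as follows. This is an illustrative processor written against the documented call shape (output string, DataFrame, keyword arguments), not the library's own implementation. The function name `sketch_output_processor`, the sample data, and the fallback width of 50 are assumptions made for the example.

```python
from typing import Any

import pandas as pd


def sketch_output_processor(output: str, df: pd.DataFrame, **output_kwargs: Any) -> str:
    """Hypothetical processor: evaluate `output` as a pandas expression on `df`."""
    # Assumption: "max_colwidth" is read from the output kwargs; 50 is the
    # pandas default for display.max_colwidth.
    max_colwidth = output_kwargs.get("max_colwidth", 50)
    # eval runs arbitrary code; see the WARNING above.
    result = eval(output, {"pd": pd}, {"df": df})
    # Apply the column width only while the result is converted to a string.
    with pd.option_context("display.max_colwidth", max_colwidth):
        return str(result)


# Sample data (invented for illustration) with one long text cell.
df = pd.DataFrame({"city": ["Toronto", "Tokyo"], "note": ["x" * 80, "short"]})
expr = "df[df['city'] == 'Toronto']"

narrow = sketch_output_processor(expr, df, max_colwidth=20)
wide = sketch_output_processor(expr, df, max_colwidth=100)
print(narrow)
print(wide)
```

With the smaller width, pandas truncates the long cell and marks it with an ellipsis; with the larger width the full text is kept, which is the situation the max_colwidth description refers to.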

get_prompts() → Dict[str, BasePromptTemplate]

Get the prompts as a dictionary.

update_prompts(prompts_dict: Dict[str, BasePromptTemplate]) → None

Update prompts.

Other prompts will remain in place.

llama_index.query_engine.pandas_query_engine.default_output_processor(output: str, df: DataFrame, **output_kwargs: Any) → str

Process outputs in a default manner.