MarkdownElementNodeParser
- pydantic model llama_index.node_parser.MarkdownElementNodeParser
Markdown element node parser.
Splits a markdown document into Text Nodes and Index Nodes corresponding to embedded objects (e.g. tables).
JSON schema:
{ "title": "MarkdownElementNodeParser", "description": "Markdown element node parser.\n\nSplits a markdown document into Text Nodes and Index Nodes corresponding to embedded objects\n(e.g. tables).", "type": "object", "properties": { "include_metadata": { "title": "Include Metadata", "description": "Whether or not to consider metadata when splitting.", "default": true, "type": "boolean" }, "include_prev_next_rel": { "title": "Include Prev Next Rel", "description": "Include prev/next node relationships.", "default": true, "type": "boolean" }, "callback_manager": { "title": "Callback Manager" }, "id_func": { "title": "Id Func" }, "llm": { "title": "Llm" }, "summary_query_str": { "title": "Summary Query Str", "description": "Query string to use for summarization.", "default": "What is this table about? Give a very concise summary (imagine you are adding a new caption and summary for this table), and output the real/existing table title/caption if context provided.and output the real/existing table id if context provided.and also output whether or not the table should be kept.", "type": "string" }, "num_workers": { "title": "Num Workers", "description": "Num of works for async jobs.", "default": 4, "type": "integer" }, "show_progress": { "title": "Show Progress", "description": "Whether to show progress.", "default": true, "type": "boolean" }, "class_name": { "title": "Class Name", "type": "string", "default": "MarkdownElementNodeParser" } } }
- Config
arbitrary_types_allowed: bool = True
- Fields
- classmethod class_name() -> str
Get the class name, used as a unique ID in serialization.
This provides a key that makes serialization robust against actual class name changes.
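A minimal sketch of why a stable `class_name()` key helps, assuming a simple registry-based deserializer. `BaseComponent`, `RenamedMarkdownParser`, `REGISTRY`, and `from_json` are hypothetical names for illustration only and are not part of the library:

```python
import json
from typing import Dict, Type


class BaseComponent:
    """Hypothetical base class; not the library's implementation."""

    @classmethod
    def class_name(cls) -> str:
        return "base_component"

    def to_json(self) -> str:
        # Store the stable ID rather than type(self).__name__.
        return json.dumps({"class_name": self.class_name(), **self.__dict__})


class RenamedMarkdownParser(BaseComponent):
    """The Python class was renamed, but its serialization ID was kept."""

    def __init__(self, num_workers: int = 4) -> None:
        self.num_workers = num_workers

    @classmethod
    def class_name(cls) -> str:
        return "MarkdownElementNodeParser"


# Hypothetical registry mapping stable IDs to classes.
REGISTRY: Dict[str, Type[BaseComponent]] = {
    RenamedMarkdownParser.class_name(): RenamedMarkdownParser,
}


def from_json(payload: str) -> BaseComponent:
    """Look up the class by its stable ID and rebuild the object."""
    data = json.loads(payload)
    cls = REGISTRY[data.pop("class_name")]
    return cls(**data)
```

Because the payload stores the string returned by `class_name()`, data written before the rename still loads with `from_json` afterwards.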
- extract_elements(text: str, node_id: Optional[str] = None, table_filters: Optional[List[Callable]] = None, **kwargs: Any) -> List[Element]
Extract elements from text.
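A rough, self-contained sketch of the general idea behind element extraction, not the library's actual implementation. The `Element` dataclass here is a hypothetical stand-in, and two behaviours are assumptions: a line starting with `|` is treated as part of a table, and a table rejected by any filter is demoted to plain text:

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Element:
    """Hypothetical stand-in for the library's Element type."""

    id: str
    type: str  # "text" or "table"
    element: str


def extract_elements_sketch(
    text: str,
    node_id: Optional[str] = None,
    table_filters: Optional[List[Callable[[Element], bool]]] = None,
) -> List[Element]:
    """Group consecutive lines into text and table elements."""
    prefix = node_id or "node"
    elements: List[Element] = []
    buffer: List[str] = []
    buffer_type = "text"

    def flush() -> None:
        # Emit the buffered lines as one element, skipping empty runs.
        content = "\n".join(buffer).strip()
        if content:
            elements.append(Element(f"{prefix}_{len(elements)}", buffer_type, content))
        buffer.clear()

    for line in text.split("\n"):
        # Assumption: a line starting with "|" belongs to a table.
        line_type = "table" if line.lstrip().startswith("|") else "text"
        if line_type != buffer_type:
            flush()
            buffer_type = line_type
        buffer.append(line)
    flush()

    # Assumption: a table that fails any filter is demoted to plain text.
    for el in elements:
        if el.type == "table" and not all(f(el) for f in (table_filters or [])):
            el.type = "text"
    return elements
```

Each callable in `table_filters` receives one table element and returns a bool, which mirrors the `Optional[List[Callable]]` parameter in the signature above.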
- filter_table(table_element: Any) -> bool
Filter tables. Returns a bool indicating whether the given table element passes the filter.
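A hedged illustration of what a table filter predicate might look like. The criteria below (header, separator row, at least one data row, more than one column) are assumptions chosen for the example, and `filter_table_sketch` takes a raw markdown string rather than the library's table element:

```python
from typing import List


def split_row(row: str) -> List[str]:
    """Split one markdown table row into stripped cell strings."""
    return [cell.strip() for cell in row.strip().strip("|").split("|")]


def filter_table_sketch(table_markdown: str) -> bool:
    """Return True if the table looks worth keeping."""
    rows = [r for r in table_markdown.strip().split("\n") if r.strip()]
    if len(rows) < 3:  # need header, separator, and at least one data row
        return False
    header = split_row(rows[0])
    separator = split_row(rows[1])
    # The separator row should contain only dashes and optional colons.
    if not all(cell and set(cell) <= set("-:") for cell in separator):
        return False
    # Assumption: single-column tables carry too little structure to keep.
    return len(header) > 1 and len(header) == len(separator)
```

A predicate of this shape could be adapted to read the markdown from a table element and then be passed in a `table_filters` list.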