HTMLTagReader

class llama_index.readers.HTMLTagReader(tag: str = 'section', ignore_no_id: bool = False)

Bases: BaseReader

Read HTML files and extract text from a specific tag with BeautifulSoup.

By default, reads the text from the <section> tag.
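As a rough illustration of the tag-based extraction described above, the sketch below collects the text of every <section> element in an HTML string. It uses the standard library's html.parser instead of BeautifulSoup so that it runs without extra dependencies, and it is not the reader's actual implementation. TagTextExtractor and extract_tag_text are hypothetical names, and the ignore_no_id behavior shown (skipping tags that have no id attribute) is an assumption based on the parameter name.

```python
from html.parser import HTMLParser
from typing import Dict, List, Optional


class TagTextExtractor(HTMLParser):
    """Hypothetical helper: collect the text inside each top-level <tag> element."""

    def __init__(self, tag: str = "section") -> None:
        super().__init__()
        self.tag = tag
        self.depth = 0  # nesting depth inside the target tag
        self.current_id: Optional[str] = None
        self.chunks: List[str] = []
        self.results: List[Dict[str, Optional[str]]] = []

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            if self.depth == 0:
                # Entering an outermost target tag: remember its id, reset the buffer.
                self.current_id = dict(attrs).get("id")
                self.chunks = []
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == self.tag and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                # Join text nodes with spaces and collapse runs of whitespace.
                text = " ".join(" ".join(self.chunks).split())
                self.results.append({"id": self.current_id, "text": text})

    def handle_data(self, data):
        if self.depth > 0:
            self.chunks.append(data)


def extract_tag_text(
    html: str, tag: str = "section", ignore_no_id: bool = False
) -> List[Dict[str, Optional[str]]]:
    """Return one {"id", "text"} entry per matching tag.

    Assumption: ignore_no_id=True drops tags that lack an id attribute.
    """
    parser = TagTextExtractor(tag)
    parser.feed(html)
    parser.close()
    return [r for r in parser.results if r["id"] or not ignore_no_id]


SAMPLE = (
    '<section id="intro"><h1>Intro</h1><p>Hello world.</p></section>'
    "<section><p>No id here.</p></section>"
    "<div>Not a section.</div>"
)
sections = extract_tag_text(SAMPLE)
sections_with_id = extract_tag_text(SAMPLE, ignore_no_id=True)
```

In this sketch, nested tags of the same name are folded into their outermost parent, which may differ from how a BeautifulSoup-based search treats them.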

Methods Summary

load_data(file[, extra_info])

Load data from the input file.

Methods Documentation

load_data(file: Path, extra_info: Optional[Dict] = None) → List[Document]: Load data from the input file.
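A minimal usage sketch based only on the signatures shown on this page. The wrapper name load_sections and the extra_info key are hypothetical, and the import is deferred into the function so the snippet can be loaded even where llama_index (and the BeautifulSoup dependency the reader relies on) is not installed.

```python
from pathlib import Path
from typing import List


def load_sections(html_path: Path) -> List:
    """Hypothetical wrapper: read html_path with HTMLTagReader and return its Documents."""
    # Deferred import: requires llama_index and BeautifulSoup at call time.
    from llama_index.readers import HTMLTagReader

    # These are the documented defaults, spelled out for clarity.
    reader = HTMLTagReader(tag="section", ignore_no_id=False)

    # extra_info is optional; the key used here is purely illustrative.
    return reader.load_data(html_path, extra_info={"source": "example"})
```

Calling load_sections(Path("page.html")) on an existing HTML file should return a list of Document objects; pass a different tag to the constructor to extract text from elements other than <section>.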