HTML to text
html2text is a Python package that converts a page of HTML into clean, easy-to-read plain ASCII text.
The ASCII also happens to be valid Markdown (a text-to-HTML format).
Installation and Setup
pip install html2text
Document Transformer
See a usage example.
from langchain_community.document_transformers import Html2TextTransformer