UnstructuredXMLLoader
This notebook provides a quick overview for getting started with the UnstructuredXMLLoader document loader. The UnstructuredXMLLoader is used to load XML files. The loader works with .xml files. The page content will be the text extracted from the XML tags.
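To make "text extracted from the XML tags" concrete, here is a minimal sketch that uses only the standard library. It is not how the loader is implemented internally; it only illustrates the general shape of the result, with element text collected in document order and separated by blank lines. The sample XML and the `extract_text` helper are hypothetical and exist purely for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, made up for this illustration.
SAMPLE_XML = """
<factbook>
  <country>
    <name>Canada</name>
    <capital>Ottawa</capital>
    <sport>Hockey</sport>
  </country>
  <country>
    <name>France</name>
    <capital>Paris</capital>
    <sport>Soccer</sport>
  </country>
</factbook>
"""


def extract_text(xml_string: str) -> str:
    """Collect the non-empty text of every element and join it with blank lines."""
    root = ET.fromstring(xml_string.strip())
    chunks = []
    for element in root.iter():  # walks elements in document order
        text = (element.text or "").strip()
        if text:  # skip elements that only contain whitespace or children
            chunks.append(text)
    return "\n\n".join(chunks)


page_content = extract_text(SAMPLE_XML)
print(page_content)
```

The tag names and nesting are discarded and only the text survives, which is why a loaded document's page_content reads as a flat sequence of values.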
Overview
Integration details
Class | Package | Local | Serializable | JS support |
---|---|---|---|---|
UnstructuredXMLLoader | langchain_community | ✅ | ❌ | ✅ |
Loader features
Source | Document Lazy Loading | Native Async Support |
---|---|---|
UnstructuredXMLLoader | ✅ | ❌ |
Setup
To access the UnstructuredXMLLoader document loader you'll need to install the langchain-community integration package.
Credentials
No credentials are needed to use the UnstructuredXMLLoader.
To enable automated tracing of your model calls, set your LangSmith API key:
# os.environ["LANGSMITH_API_KEY"] = getpass.getpass("Enter your LangSmith API key: ")
# os.environ["LANGSMITH_TRACING"] = "true"
Installation
Install langchain_community.
%pip install -qU langchain_community
Initialization
Now we can instantiate our loader object and load documents:
from langchain_community.document_loaders import UnstructuredXMLLoader
loader = UnstructuredXMLLoader(
    "./example_data/factbook.xml",
)
Load
docs = loader.load()
docs[0]
Document(metadata={'source': './example_data/factbook.xml'}, page_content='United States\n\nWashington, DC\n\nJoe Biden\n\nBaseball\n\nCanada\n\nOttawa\n\nJustin Trudeau\n\nHockey\n\nFrance\n\nParis\n\nEmmanuel Macron\n\nSoccer\n\nTrinidad & Tobado\n\nPort of Spain\n\nKeith Rowley\n\nTrack & Field')
print(docs[0].metadata)
{'source': './example_data/factbook.xml'}
Lazy Load
page = []
for doc in loader.lazy_load():
    page.append(doc)
    if len(page) >= 10:
        # do some paged operation, e.g.
        # index.upsert(page)
        page = []
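The loop above only runs its paged operation once ten documents have accumulated, so when the total is not a multiple of ten, the last few documents are still sitting in `page` after the loop ends. One way to handle that is a small batching helper that also yields the final partial batch. This is a minimal sketch: `batched` is a hypothetical helper name, and a plain generator of strings stands in for `loader.lazy_load()` so the example runs without any XML file.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield lists of up to `size` items, including a final partial batch."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))  # pull at most `size` items
        if not batch:  # source exhausted
            return
        yield batch


# Stand-in for loader.lazy_load(): any iterable of documents works here.
fake_docs = (f"doc-{i}" for i in range(25))
batch_sizes = [len(batch) for batch in batched(fake_docs, 10)]
print(batch_sizes)  # [10, 10, 5]
```

With a real loader, the same pattern would read `for batch in batched(loader.lazy_load(), 10): ...`, which keeps at most one batch in memory at a time.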
API reference
For detailed documentation of all UnstructuredXMLLoader features and configurations head to the API reference: https://python.langchain.com/api_reference/community/document_loaders/langchain_community.document_loaders.xml.UnstructuredXMLLoader.html
Related
- Document loader conceptual guide
- Document loader how-to guides