
Azure AI Document Intelligence

Azure AI Document Intelligence (formerly known as Azure Form Recognizer) is a machine-learning-based service that extracts text (including handwriting), tables, document structures (e.g., titles, section headings) and key-value pairs from digital or scanned PDFs, images, Office and HTML files.

Document Intelligence supports PDF, JPEG/JPG, PNG, BMP, TIFF, HEIF, DOCX, XLSX, PPTX and HTML.

The current implementation of the loader using Document Intelligence can incorporate content page-wise and turn it into LangChain documents. The default output format is markdown, which can be easily chained with MarkdownHeaderTextSplitter for semantic document chunking. You can also use mode="single" or mode="page" to return pure text in a single page, or a document split by page.

Prerequisite

An Azure AI Document Intelligence resource in one of the 3 preview regions: East US, West US2, West Europe. Follow this document to create one if you don't have one. You will be passing <endpoint> and <key> as parameters to the loader.

%pip install --upgrade --quiet langchain langchain-community azure-ai-documentintelligence

Example 1

The first example uses a local file which will be sent to Azure AI Document Intelligence.

With the endpoint and key in hand, we can create an instance of AzureAIDocumentIntelligenceLoader:

from langchain_community.document_loaders import AzureAIDocumentIntelligenceLoader

file_path = "<filepath>"
endpoint = "<endpoint>"
key = "<key>"
loader = AzureAIDocumentIntelligenceLoader(
    api_endpoint=endpoint, api_key=key, file_path=file_path, api_model="prebuilt-layout"
)

documents = loader.load()

The default output contains one LangChain document with markdown-formatted content:

documents

Example 2

The input file can also be a public URL path, e.g., https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/rest-api/layout.png.

url_path = "<url>"
loader = AzureAIDocumentIntelligenceLoader(
    api_endpoint=endpoint, api_key=key, url_path=url_path, api_model="prebuilt-layout"
)

documents = loader.load()
documents

Example 3

You can also specify mode="page" to load the document by pages.

from langchain_community.document_loaders import AzureAIDocumentIntelligenceLoader

file_path = "<filepath>"
endpoint = "<endpoint>"
key = "<key>"
loader = AzureAIDocumentIntelligenceLoader(
    api_endpoint=endpoint,
    api_key=key,
    file_path=file_path,
    api_model="prebuilt-layout",
    mode="page",
)

documents = loader.load()

In the output, each page is stored as a separate document in the list:

for document in documents:
    print(f"Page Content: {document.page_content}")
    print(f"Metadata: {document.metadata}")

Example 4

You can also specify analysis_features=["ocrHighResolution"] to enable add-on capabilities. For more information, see: https://aka.ms/azsdk/python/documentintelligence/analysisfeature.

from langchain_community.document_loaders import AzureAIDocumentIntelligenceLoader

file_path = "<filepath>"
endpoint = "<endpoint>"
key = "<key>"
analysis_features = ["ocrHighResolution"]
loader = AzureAIDocumentIntelligenceLoader(
    api_endpoint=endpoint,
    api_key=key,
    file_path=file_path,
    api_model="prebuilt-layout",
    analysis_features=analysis_features,
)

documents = loader.load()

The output contains the LangChain document recognized with the high-resolution add-on capability:

documents

documents