Nuclia

Nuclia automatically indexes your unstructured data from any internal and external source, providing optimized search results and generative answers. It can handle video and audio transcription, image content extraction, and document parsing.

The Nuclia Understanding API document transformer splits text into paragraphs and sentences, identifies entities, provides a summary of the text, and generates embeddings for all the sentences.

To use the Nuclia Understanding API, you need to have a Nuclia account. You can create one for free at https://nuclia.cloud, and then create a NUA key.

from langchain_community.document_transformers.nuclia_text_transform import NucliaTextTransformer

%pip install --upgrade --quiet protobuf
%pip install --upgrade --quiet nucliadb-protos

import os

os.environ["NUCLIA_ZONE"] = "<YOUR_ZONE>"  # e.g. europe-1
os.environ["NUCLIA_NUA_KEY"] = "<YOUR_API_KEY>"
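
Before instantiating the tool, it can help to confirm that both environment variables set above hold real values rather than the placeholders. The following is a minimal sketch, not part of the Nuclia or LangChain API: the helper name missing_nuclia_settings and the "<YOUR_" placeholder check are assumptions made for illustration.

```python
import os

# The two variables this guide sets before creating the tool.
REQUIRED_VARS = ("NUCLIA_ZONE", "NUCLIA_NUA_KEY")


def missing_nuclia_settings(env=os.environ):
    """Hypothetical helper: list required variables that are unset,
    empty, or still contain a "<YOUR_...>" placeholder."""
    missing = []
    for name in REQUIRED_VARS:
        value = env.get(name, "")
        if not value or value.startswith("<YOUR_"):
            missing.append(name)
    return missing


if __name__ == "__main__":
    problems = missing_nuclia_settings()
    if problems:
        print("Set these variables before continuing:", ", ".join(problems))
```

The helper accepts any mapping, so it can be checked against a plain dictionary without touching the real environment.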

To use the Nuclia document transformer, you need to instantiate a NucliaUnderstandingAPI tool with enable_ml set to True:

from langchain_community.tools.nuclia import NucliaUnderstandingAPI

nua = NucliaUnderstandingAPI(enable_ml=True)

API Reference: NucliaUnderstandingAPI

The Nuclia document transformer must be called in async mode, so you need to use the atransform_documents method:

import asyncio

from langchain_community.document_transformers.nuclia_text_transform import (
    NucliaTextTransformer,
)
from langchain_core.documents import Document


async def process():
    documents = [
        Document(page_content="<TEXT 1>", metadata={}),
        Document(page_content="<TEXT 2>", metadata={}),
        Document(page_content="<TEXT 3>", metadata={}),
    ]
    nuclia_transformer = NucliaTextTransformer(nua)
    transformed_documents = await nuclia_transformer.atransform_documents(documents)
    print(transformed_documents)


asyncio.run(process())

API Reference: NucliaTextTransformer | Document
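
To try the async calling pattern without a Nuclia account, the sketch below mimics it with stand-ins. FakeDocument and FakeAsyncTransformer are hypothetical classes written for this illustration; they are not part of LangChain or Nuclia, and the "length" metadata they add says nothing about what the real transformer returns. Only the shape of the call (an awaited atransform_documents driven by asyncio.run) mirrors the example above.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class FakeDocument:
    """Hypothetical stand-in for langchain_core.documents.Document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


class FakeAsyncTransformer:
    """Hypothetical stand-in that only mimics the async calling convention."""

    async def atransform_documents(self, documents):
        async def annotate(doc):
            await asyncio.sleep(0)  # placeholder for a network round trip
            doc.metadata["length"] = len(doc.page_content)
            return doc

        # Process all documents concurrently; gather keeps the input order.
        return await asyncio.gather(*(annotate(d) for d in documents))


async def main():
    docs = [FakeDocument("first text"), FakeDocument("second")]
    return await FakeAsyncTransformer().atransform_documents(docs)


result = asyncio.run(main())
print([d.metadata for d in result])
```

Note that asyncio.run raises a RuntimeError when an event loop is already running, as is the case inside a Jupyter notebook cell. There, awaiting the coroutine directly (await main()) is the usual alternative.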
