
Welcome to Unstructured!

To start using Unstructured right away, skip ahead to the UI quickstart or API quickstart now!

What is Unstructured?

Unstructured conceptual data flow
Unstructured provides a platform and tools to ingest and process your unstructured documents for:
  • Enhancing retrieval-augmented generation (RAG): RAG boosts AI accuracy and relevance by working with the data that’s most important to you, providing results that are more current, focused, and meaningful to your queries and tasks.
  • Fueling agentic AI: Agentic AI acts like virtual teammates that can plan, decide, and take action on their own to get things done on your behalf, freeing you up for bigger challenges.
We empower your organization to take full advantage of RAG and agentic AI opportunities.
This 60-second video describes more about what Unstructured does and its benefits (no sound):
This 40-second video demonstrates a simple use case that Unstructured helps solve (no sound):
This 60-second video shows why using Unstructured is preferable to building your own similar solution:
You can use Unstructured through a user interface (UI), an API, or both. Read on to learn more.
Unstructured user interface example
Unstructured API example

Unstructured UI quickstart

This quickstart shows how, in just a few minutes, you can use the Unstructured user interface (UI) to quickly and easily see Unstructured’s best-in-class transformation results for a single file that is stored on your local computer.
This quickstart focuses on a single, local file for ease-of-use demonstration purposes. To use Unstructured later to do large-scale batch processing of multiple files and semi-structured data that are stored in remote locations, skip over to the remote quickstart after you finish this one.
If you do not already have an Unstructured account, sign up for free. After you sign up, you are automatically signed in to your new Unstructured Let’s Go account, at https://platform.unstructured.io.
If you already have an Unstructured Pay-As-You-Go or Business SaaS account, you are already signed up for Unstructured. Sign in to your existing Unstructured Pay-As-You-Go or Business SaaS account, at https://platform.unstructured.io. If you already have an Unstructured dedicated instance or in-VPC deployment, your sign-in link will be unique to your deployment. If you’re not sure what your unique sign-in link is, see your Unstructured account administrator, or email Unstructured Support at support@unstructured.io.
Do the following:
  1. After you are signed in, the Start page appears.
  2. In the Welcome area, do one of the following:
    • Click one of the sample files, such as realestate.pdf, to have Unstructured parse and transform that sample file.
    • Click Browse files, or drag and drop a file onto Drop file to test, to have Unstructured parse and transform your own file. If you choose to use your own file, the file must be 10 MB or less in size. Also, the file must be one of the following supported file types:
      File extension
      .bmp
      .csv
      .doc
      .docx
      .eml
      .epub
      .heic
      .html
      .jpeg
      .jpg
      .md
      .msg
      .odt
      .org
      .p7s
      .pdf
      .png
      .ppt
      .pptx
      .rst
      .rtf
      .tif
      .tiff
      .tsv
      .txt
      .xls
      .xlsx
      .xml
    Welcome interface on the Start page
  3. After Unstructured has finished parsing and transforming the file (a process known as partitioning), you will see the file’s contents in the Preview pane in the center and Unstructured’s results in the Result pane on the right.
    Unstructured's parse and transform results
  4. The Result pane shows a formatted view of Unstructured’s results by default. This formatted view is designed for human readability. To see the underlying JSON view of the results, which is designed for RAG and agentic AI, click JSON at the top of the Result pane. Learn about what’s in the JSON view.
    Switching to the JSON view of the results
  5. Unstructured’s initial results are based on its High Res partitioning strategy, which begins processing the file’s contents and converting these contents into a series of Unstructured document elements and metadata. This partitioning strategy provides good results overall, depending on the complexity of the file’s contents.
    This partitioning strategy also generates a bounding box for each detected object in the file. A bounding box is an imaginary rectangular box drawn around the object to show its location and extent within the file.
    After the High Res partitioning results are shown, Unstructured begins improving these initial results by using vision language models (VLMs) to apply a series of generative refinements known as enrichments. These enrichments include:
    • An image description enrichment, which uses a VLM to provide a text-based summary of the contents of each detected image.
    • A generative OCR enrichment, which uses a VLM to improve the accuracy of each block of initially-processed text.
    • A table to HTML enrichment, which uses a VLM to provide an HTML-structured representation of each detected table.
    While these enrichments are being applied, a banner appears at the top of the Result pane.
    Updating the initial results with enrichments
    To see these enrichments applied to the initial results, click Update results in the banner as soon as this button appears, which might take a minute or more.
    Seeing the initial results updated with the enrichments
    Each page that Unstructured processes by using this approach is counted as two pages for usage and billing purposes. This is because Unstructured processes each page once with its High Res partitioning strategy and then reprocesses each page with a VLM to improve the quality, accuracy, and relevance of the initial partitioning results. This two-pass process happens regardless of whether you click Update results in the banner. This two-page usage and billing behavior is a known issue and will be addressed in a future release.
  6. To synchronize the scrolling of the Preview pane’s selected contents with the Result pane’s Formatted results, rest your mouse pointer anywhere inside the contents of the Preview pane until a bounding box appears. Then click the bounding box. Unstructured automatically scrolls the Result pane’s Formatted results to match the selected bounding box. (You cannot synchronize the scrolling of the JSON results.)
    Selecting a bounding box
    To show all of the bounding boxes in the Preview pane at once, turn on the Show all bounding boxes toggle at the top of the Preview pane. You can now click any of the bounding boxes without first needing to rest your mouse pointer on them to show them.
    Showing all bounding boxes
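The file constraints in step 2 (a supported extension and a size of 10 MB or less) can be checked locally before you upload anything. The following is a minimal sketch, not part of Unstructured itself: the function name `check_upload` is hypothetical, and reading "10 MB" as 10 × 1024 × 1024 bytes is an assumption, because the page does not say which definition of a megabyte applies.

```python
from pathlib import Path

# Supported extensions, copied from the list in step 2.
SUPPORTED_EXTENSIONS = {
    ".bmp", ".csv", ".doc", ".docx", ".eml", ".epub", ".heic", ".html",
    ".jpeg", ".jpg", ".md", ".msg", ".odt", ".org", ".p7s", ".pdf",
    ".png", ".ppt", ".pptx", ".rst", ".rtf", ".tif", ".tiff", ".tsv",
    ".txt", ".xls", ".xlsx", ".xml",
}

# Assumption: "10 MB" is interpreted as 10 * 1024 * 1024 bytes.
MAX_SIZE_BYTES = 10 * 1024 * 1024


def check_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks eligible."""
    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported file type: {extension or '(none)'}")
    if size_bytes > MAX_SIZE_BYTES:
        problems.append(f"file is larger than 10 MB: {size_bytes} bytes")
    return problems
```

For a real file, the size could be read with `Path(filename).stat().st_size` and passed in as `size_bytes`.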
You can also do the following:
  • To download the JSON view of the results as a local JSON file, click the download icon to the left of the Formatted and JSON buttons in the Result pane. (You cannot download the formatted view of the results.)
    Downloading the results as a local JSON file
  • To have Unstructured partition a different file, click Add new file in the Files pane on the left, and then browse to and select the target file.
  • To view the results for a file that was previously partitioned during this session, click the file’s name in the Recent files list in the Files pane.
  • To return to the Start page, click the X (close) button at the left on the title bar, next to Transform.
  • To have Unstructured do more (such as chunking, embedding, applying additional kinds of enrichments, and processing larger files and semi-structured data in batches at scale), click Edit in Workflow Editor at the right on the title bar, and then skip over to the walkthrough.
    Switching to the workflow editor
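Once you have downloaded the JSON view of the results, you can inspect it with a few lines of Python. The sketch below counts document elements by type. It assumes the downloaded file is a JSON array of objects that each carry a `type` key alongside their text and metadata; this page does not spell out the schema, so treat the field names and the sample data as illustrative only.

```python
import json
from collections import Counter


def summarize_elements(json_text: str) -> Counter:
    """Count elements by their "type" field in a downloaded results file."""
    elements = json.loads(json_text)
    # Assumption: the file is a JSON array of objects, each with a "type" key.
    return Counter(element.get("type", "Unknown") for element in elements)


# Hypothetical sample shaped like the assumed output; not real Unstructured results.
sample = json.dumps([
    {"type": "Title", "text": "Example heading", "metadata": {"page_number": 1}},
    {"type": "NarrativeText", "text": "Example paragraph.", "metadata": {"page_number": 1}},
    {"type": "NarrativeText", "text": "Another paragraph.", "metadata": {"page_number": 2}},
])
counts = summarize_elements(sample)
```

To run this against a real download, read the file's contents (for example with `Path("results.json").read_text()`, where the file name is whatever you saved it as) and pass the text to `summarize_elements`.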
What’s next?

Unstructured API quickstart

This quickstart shows how you can use the Unstructured API to quickly and easily see Unstructured’s transformation results for files that are stored locally.
This quickstart is designed to work only by processing local files. To process files (and data) in remote file and blob storage, databases, and vector stores, you must use workflow operations in the Unstructured API other than the ones that are used in this quickstart. To learn how, see for example the notebook Dropbox-To-Pinecone Connector API Quickstart for Unstructured.
  1. If you do not already have an Unstructured account, sign up for free. After you sign up, you are automatically signed in to your Unstructured Let’s Go account, at https://platform.unstructured.io.
    If you already have an Unstructured Pay-As-You-Go or Business SaaS account, you are already signed up for Unstructured. Sign in to your existing Unstructured Pay-As-You-Go or Business SaaS account, at https://platform.unstructured.io. If you already have an Unstructured dedicated instance or in-VPC deployment, your sign-in link will be unique to your deployment. If you’re not sure what your unique sign-in link is, see your Unstructured account administrator, or email Unstructured Support at support@unstructured.io.
  2. Get your Unstructured API key:
    a. After you sign in to your Unstructured account, click API Keys on the sidebar.
    For a Business account, before you click API Keys, make sure you have selected the organizational workspace you want to create an API key for. Each API key works with one and only one organizational workspace. Learn more.
    b. Click Generate API Key.
    c. Follow the on-screen instructions to finish generating the key.
    d. Click the Copy icon next to your new key to add the key to your system’s clipboard. If you lose this key, simply return and click the Copy icon again.
  3. Now that you have your Unstructured API key, choose one of the following options to continue:
    • Use a remote notebook: This option uses a remotely hosted Google Colab notebook. No additional setup steps are required.
    • Use your local machine: This option requires you to install the Unstructured Python SDK on your local machine.
  Learn more.
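If you choose the local machine option, a common practice is to keep the API key you copied in step 2 out of your source code and read it from an environment variable instead. The following is a minimal sketch of that pattern. The variable name `UNSTRUCTURED_API_KEY` and the helper `load_api_key` are assumptions made for illustration; this page does not prescribe either, and the sketch does not call the Unstructured Python SDK.

```python
import os


def load_api_key(env_var: str = "UNSTRUCTURED_API_KEY") -> str:
    """Read the API key from an environment variable, or raise if it is unset."""
    # Assumption: the key was exported under this variable name beforehand.
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key.")
    return key
```

Failing early with a clear message avoids sending requests with an empty key and then having to diagnose an authentication error later.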

Pricing

Unstructured offers different account types with different pricing plans:
  • Let’s Go and Pay-As-You-Go - A single user, with a single workspace, hosted alongside other accounts on Unstructured’s cloud infrastructure.
  • Business - Multiple users and workspaces, with three options:
    • Business SaaS - Hosted alongside other accounts on Unstructured’s cloud infrastructure.
    • Dedicated instance - Hosted within a virtual private cloud (VPC) running inside Unstructured’s cloud infrastructure. Dedicated instances are isolated from all other accounts, for additional security and control.
    • In-VPC - Hosted within your own VPC on your own cloud infrastructure.
    Business accounts also allow for robust customization of Unstructured’s features for your unique needs.
For more details, see the Unstructured Pricing page. To upgrade your account from Let’s Go or Pay-As-You-Go to Business, email Unstructured Sales at sales@unstructured.io.
Some of these plans have billing details that are determined on a per-page basis. Unstructured calculates a page as follows:
  • For these file types, a page is a page, slide, or image: .pdf, .pptx, and .tiff.
  • For .docx files that have page metadata, Unstructured calculates the number of pages based on that metadata.
  • For all other file types, Unstructured calculates the number of pages as the file’s size divided by 100 KB.
  • For non-file data, Unstructured calculates a page as 100 KB of incoming data to be processed.
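The page-counting rules above can be sketched as a small estimator. This is an illustration under stated assumptions, not Unstructured's billing code: the function name `estimate_pages` is hypothetical, "100 KB" is read as 100 × 1024 bytes, and partial pages are rounded up to a minimum of one page. The page does not specify the byte definition or the rounding behavior.

```python
import math
from typing import Optional

# Assumption: "100 KB" is interpreted as 100 * 1024 bytes.
PAGE_SIZE_BYTES = 100 * 1024

# File types for which a page is a literal page, slide, or image.
PAGE_COUNTED_TYPES = {".pdf", ".pptx", ".tiff"}


def estimate_pages(extension: str, size_bytes: int,
                   page_count: Optional[int] = None) -> int:
    """Estimate billable pages for one file, or for non-file data (extension "")."""
    ext = extension.lower()
    if ext in PAGE_COUNTED_TYPES:
        if page_count is None:
            raise ValueError(f"{ext} files need an actual page, slide, or image count")
        return page_count
    if ext == ".docx" and page_count is not None:
        # .docx files that carry page metadata use that metadata.
        return page_count
    # Everything else: size divided by 100 KB.
    # Assumption: partial pages round up, with a minimum of one page.
    return max(1, math.ceil(size_bytes / PAGE_SIZE_BYTES))
```

For example, under these assumptions a 250 KB .docx file without page metadata is estimated at 3 pages, while the same file with metadata reporting 7 pages is estimated at 7.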

Questions? Need help?

  • For general questions about Unstructured products and pricing, email Unstructured Sales at sales@unstructured.io.
  • For technical support for Unstructured accounts, email Unstructured Support at support@unstructured.io.
  • For technical support for the Unstructured open source library, use our Slack community.
