Custom extractor with generative AI
Generative AI training and extraction let you:
- Use zero-shot and few-shot technology to get a high-performing model with little to no training data, using the foundation model.
- Use fine-tuning to further boost accuracy as you provide more and more training data.
Generative AI training methods
The training method you choose depends on the number of documents you have available and the effort you can put into training your model. There are three ways to train a generative AI model:
| Training method | Zero-shot | Few-shot | Fine-tuning |
|---|---|---|---|
| Accuracy | Medium | Medium-high | High |
| Effort | Low | Low | Medium |
| Recommended number of training documents | 0 | 5 to 10 | 10 to 50+ |
Custom extractor model versions
The following models are available for the custom extractor. To change model versions, see Manage processor versions.
Versions 1.4, 1.5, and 1.5 Pro support confidence scores.
| Model version | Description | Release channel | ML processing in US/EU | Fine-tuning in US/EU | Release date |
|---|---|---|---|---|---|
| pretrained-foundation-model-v1.4-2025-02-05 | GA model powered by the Gemini 2.0 Flash LLM. Also includes advanced OCR features such as checkbox detection. Note: Effective March 31, 2026, this version will no longer be accessible. | Stable | Yes | US, EU | February 5, 2025 |
| pretrained-foundation-model-v1.5-2025-05-05 | Production-ready candidate powered by the Gemini 2.5 Flash LLM. Recommended for those who want to experiment with newer models. | Stable | Yes | US, EU (Preview) | May 5, 2025 |
| pretrained-foundation-model-v1.5-pro-2025-06-20 | Production-ready model powered by the Gemini 2.5 Pro LLM. Supports a quota of up to 30 pages per minute for online process requests. This model has improved quality compared to v1.5, and may have higher latency. | Stable | Yes | No | June 20, 2025 |
| pretrained-foundation-model-v1.5.1-2025-08-07 | Public preview model powered by the Gemini 2.5 Flash LLM. This model has the same features as v1.5, with improved adaptive few-shot learning. | Release candidate | Yes | No | August 8, 2025 |
| pretrained-foundation-model-v1.6-pro-2025-12-01 | Preview model powered by the Gemini 3 Pro LLM. Note: This processor version uses the Vertex AI Gemini global endpoint and is not compliant with Data Residency (DRZ) standards. For example, requests in US and EU endpoints might route anywhere globally. | Release candidate | Yes | No | December 1, 2025 |
| pretrained-foundation-model-v1.6-2026-01-13 | Preview model powered by the Gemini 3 Flash LLM. Note: This processor version uses the Vertex AI Gemini global endpoint and is not compliant with Data Residency (DRZ) standards. For example, requests in US and EU endpoints might route anywhere globally. | Release candidate | Yes | No | January 13, 2026 |
To change the processor version in your project, review Managing processor versions.
To make a Quota Increase Request (QIR) for the default processor quota, follow the steps to request a quota adjustment.
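As a sketch of how you might check which of the versions above are available on a processor, the helper below parses the JSON returned by the `processorVersions.list` REST method. The sample response here is invented for illustration; the response shape is an assumption based on the ProcessorVersion resource.

```python
def list_version_ids(response: dict) -> list[str]:
    """Extract short version IDs from an assumed processorVersions.list response.

    `response` is assumed to look like:
    {"processorVersions": [{"name": ".../processorVersions/<id>"}, ...]}
    """
    versions = response.get("processorVersions", [])
    # The version ID is the last segment of the full resource name.
    return [v["name"].rsplit("/", 1)[-1] for v in versions]

# Hypothetical response for illustration:
sample = {
    "processorVersions": [
        {"name": "projects/p/locations/us/processors/x/processorVersions/pretrained-foundation-model-v1.5-2025-05-05"},
        {"name": "projects/p/locations/us/processors/x/processorVersions/pretrained-foundation-model-v1.4-2025-02-05"},
    ]
}
print(list_version_ids(sample))
```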
Initial setup
If you haven't already done so, enable billing and the Document AI APIs.
Build and evaluate a generative AI model
Create a processor and define the fields you want to extract following best practices. This is important because it impacts extraction quality.
- Go to Workbench > Custom extractor > Create processor > Assign a name.

- Go to Get started > Create new field.
Note: Field names can greatly affect the foundation model's accuracy and performance. A descriptive name is recommended.
Import documents
- Import documents with auto-labeling and assign documents to the training and test sets.
- For zero-shot, only the schema is required. To evaluate the accuracy of the model, only a test set is needed.
- For few-shot, we recommend five training documents.
- The number of test documents needed depends on the use case. Generally, more test documents are better.
- Confirm or edit the labels in the document.
Train model:
- Select the Build tab, then Create new version.
- Enter a name and select Create.

Evaluation:
- Go to Evaluate & test, select the version you just trained, then select View full evaluation.

- You now see metrics such as F1, precision, and recall for the entire document and each field.
- Decide whether performance meets your production goals. If it does not, reevaluate your training and test sets.
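The F1 score shown in the evaluation is the harmonic mean of precision and recall; a minimal sketch of that relationship:

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, reported per field and per document."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# For example, a field extracted with 0.9 precision but only 0.6 recall:
print(round(f1_score(0.9, 0.6), 3))  # → 0.72
```

Because the harmonic mean is dominated by the smaller value, a field with high precision but poor recall (or vice versa) still scores low, which is why both metrics are shown alongside F1.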
Set a new version as default:
- Navigate to Manage versions.
- Select to expand the options, then select Set as default.

Your model is now deployed. Documents sent to this processor use your custom version. You can evaluate the model's performance to check if it requires further training.
Evaluation reference
The evaluation engine can do either exact or fuzzy matching. For an exact match, the extracted value must exactly match the ground truth, or it's counted as a miss.
With fuzzy matching, extractions with slight differences, such as capitalization, still count as a match. You can change this setting on the Evaluation screen.
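To make the distinction concrete, here is an illustrative sketch of the two comparison modes. This is not the actual evaluation engine; the fuzzy rule here (ignore case and surrounding whitespace) is an assumption for demonstration, and the real engine's normalization may differ.

```python
def exact_match(extracted: str, truth: str) -> bool:
    """Exact matching: any difference from the ground truth counts as a miss."""
    return extracted == truth

def fuzzy_match(extracted: str, truth: str) -> bool:
    """Illustrative fuzzy matching: case and surrounding whitespace are ignored."""
    return extracted.strip().casefold() == truth.strip().casefold()

print(exact_match("Acme Corp", "ACME CORP"))  # → False
print(fuzzy_match("Acme Corp", "ACME CORP"))  # → True
```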

Fine-tuning
With fine-tuning, you use hundreds or thousands of documents for training.
Create a processor and define the fields you want to extract following best practices. This is important because it impacts extraction quality.
Note: Field names can greatly affect the foundation model's accuracy and performance. Be sure to give each field a descriptive name.
Import documents with auto-labeling, and assign documents to the training and test sets.
Confirm or edit the labels in the document.
Train model.
- Select the Build tab, and select Create New Version in the Fine-tuning box.
Note: Fine-tuning tunes a foundation model, which is recommended, while Train a custom model trains a conventional, non-generative-AI-based model.
Try out the default training parameters or values provided. If the results are unsatisfactory, experiment with these advanced options:
Training steps (between 100 and 400): Controls how often the weights are optimized on a batch of data during tuning.
- Too low risks ending training before convergence (under-fitting).
- Too high means the model might see the same batch of data multiple times during training, which can lead to overfitting.
- Fewer steps lead to faster training. Higher step counts can help for documents with little template variation, and lower ones for those with more variation.
Learning rate multiplier (between 0.1 and 10): Controls how quickly the model parameters are optimized on the training data. It roughly corresponds to the size of each training step.
- Low rates mean small changes in the model weights at each training step. If too low, the model might not converge to a stable solution.
- High rates mean large changes; if too high, the model can step over the optimal solution and converge to a suboptimal one instead.
- Training time is not affected by the choice of learning rate.
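These two parameters correspond to the tuning options passed in a fine-tuning train request. As a sketch, assuming the snake_case field names used in the request example later in this page and arbitrary example values:

```json
{
  "foundation_model_tuning_options": {
    "train_steps": 300,
    "learning_rate_multiplier": 1.0
  }
}
```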
Give the version a name, select the required base processor version, and select Create.

Evaluation: Go to Evaluate & test, select the version you just trained, and select View full evaluation.

- You now see metrics such as F1, precision, and recall for the entire document and each field.
- Decide whether performance meets your production goals. If not, further training documents might be required.
Set a new version as the default:
- Navigate to Manage versions.
- Select to expand the options, then select Set as default.
Note: Verify that the version's status is deployed. If it's not, select Deploy version instead. You can set the new version as the default once its status changes to deployed.
Your model is now deployed, and documents sent to this processor now use your custom version. You can evaluate the model's performance to check if it requires further training.
Auto-labeling with the foundation model
The foundation model can accurately extract fields for a variety of document types, but you can also provide additional training data to improve the accuracy of the model for specific document structures.
Document AI uses the label names you define, along with previous annotations, to make it quicker and easier to label documents at scale with auto-labeling.
- When you've created a custom processor, go to the Get Started tab.
- Select Create New Field.
Give the label a descriptive, distinct name. Choose Extract for values taken directly from the document, or Derive for values inferred by the system. This improves the foundation model's accuracy and performance.

For extraction accuracy and performance, add a description for each field describing the kinds of entities it should pick up, such as added context, insights, and prior knowledge for each entity.
Note: Clear rules promote reliable and consistent extractions, regardless of content structure or variations. To guide the model and minimize confusion in the document, include property descriptions with location information and text patterns of the property values.
Note: It's possible to update the description. However, the change won't be propagated to existing custom processor versions; each custom processor version keeps a snapshot of the description.
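As a sketch of where such a description lives, the fragment below shows a field with a description in a document schema. It assumes the camelCase JSON shape of the DocumentSchema resource; the field name `invoice_id` and its description are invented for illustration.

```json
{
  "entityTypes": [
    {
      "name": "custom_extraction_document_type",
      "baseTypes": ["document"],
      "properties": [
        {
          "name": "invoice_id",
          "valueType": "string",
          "occurrenceType": "REQUIRED_ONCE",
          "description": "Alphanumeric identifier printed near the top-right corner, usually prefixed with 'INV-'."
        }
      ]
    }
  ]
}
```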
Navigate to the Build tab, then select Import Documents.

Select the path of the documents and which set the documents should be imported into. Check the auto-labeling option and select the foundation model.
In the Build tab, select Manage Dataset.
When you see your imported documents, select one of them.

The predictions from the model are now shown highlighted in purple.
- Review each label predicted by the model, and verify that it's correct.
If there are missing fields, add those as well.
Note: It's important that all fields are as accurate as possible, or model performance will suffer. For more details, see labeling.
When the document has been reviewed, select Mark as Labeled. The document is now ready to be used by the model.
Make sure the document is in either the test or training set.
Three-level nesting
Custom Extractor now provides three levels of nesting. This feature provides better extraction for complex tables.
Note: Three levels of nesting is supported only when using generative AI ("Foundation") model types within the custom extractor. Extraction, classification, and splitting models support only two levels of nesting. You can determine the model type using the processor version API calls.
The response of these calls is a ProcessorVersion, which contains the modelType field in the v1beta3 preview.
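For example, a ProcessorVersion response can be inspected for the modelType field. The helper below assumes the JSON shape of a v1beta3 ProcessorVersion; the sample response is invented for illustration.

```python
def is_generative(processor_version: dict) -> bool:
    """True if an assumed ProcessorVersion response reports a generative (foundation) model."""
    return processor_version.get("modelType") == "MODEL_TYPE_GENERATIVE"

# Hypothetical ProcessorVersion response:
sample = {
    "name": "projects/p/locations/us/processors/x/processorVersions/pretrained-foundation-model-v1.5-2025-05-05",
    "modelType": "MODEL_TYPE_GENERATIVE",
}
print(is_generative(sample))  # → True
```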
Procedure and example
This procedure uses the following sample:

Select Get Started, and then create a field:
- Create the top-level field. In this sample, officer_appointments is used.
- Select This is a parent label.
- Select Occurrence: Optional multiple.
Select Add child field. The second-level label can now be created:
- For this level, create officer.
- Select This is a parent label.
- Select Occurrence: Optional multiple.
Select Add child field from the second-level officer, and create child labels for the third level of nesting.
When your schema is set, you can get predictions from documents with three levels of nesting using auto-labeling.
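The schema built above can be sketched as nested entity types in a DocumentSchema. This fragment assumes the camelCase JSON shape of the DocumentSchema resource; `officer_appointments` and `officer` come from the sample, while the type names and the third-level field `officer_name` are invented for illustration.

```json
{
  "entityTypes": [
    {
      "name": "custom_extraction_document_type",
      "baseTypes": ["document"],
      "properties": [
        {"name": "officer_appointments", "valueType": "officer_appointments_type", "occurrenceType": "OPTIONAL_MULTIPLE"}
      ]
    },
    {
      "name": "officer_appointments_type",
      "baseTypes": ["object"],
      "properties": [
        {"name": "officer", "valueType": "officer_type", "occurrenceType": "OPTIONAL_MULTIPLE"}
      ]
    },
    {
      "name": "officer_type",
      "baseTypes": ["object"],
      "properties": [
        {"name": "officer_name", "valueType": "string", "occurrenceType": "OPTIONAL_ONCE"}
      ]
    }
  ]
}
```

Each level of nesting is a parent property whose value type points at another entity type, so three levels means three chained entity types.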


Label cross-page nested entities
The pretrained-foundation-model-v1.5-2025-05-05 processor supports three-level nesting across pages.
Label an entity normally across a page. Note: The labeled entity is only visible on the page where it's labeled, with the navigation bar changing from page to page. By pinning the parent entity, this navigation bar persists.

Pin the parent entity with children you want to label across pages.

Navigate to the page with the child entity or entities to label.

Dataset configuration
A document dataset is required to train, up-train, or evaluate a processor version. Document AI processors learn from examples, just like humans. The dataset fuels processor stability in terms of performance.
Train dataset
To improve the model and its accuracy, train it on a dataset of your documents. The dataset is made up of documents with ground truth.
- For fine-tuning, you need a minimum of one document to train a new model version for pretrained-foundation-model-v1.2-2024-05-10 and pretrained-foundation-model-v1.3-2024-08-31.
- For few-shot, five documents are recommended.
- For zero-shot, only a schema is required.
Test dataset
The test dataset is what the model uses to generate an F1 score (accuracy). It is made up of documents with ground truth. To see how often the model is right, the ground truth is used to compare the model's predictions (fields extracted by the model) with the correct answers. The test dataset should have at least one document for pretrained-foundation-model-v1.2-2024-05-10 and pretrained-foundation-model-v1.3-2024-08-31.
Custom extractor with property descriptions
With property descriptions, you can train a model by describing what the labeled fields are like. You can provide additional context and insights for each entity. This allows the model to train by matching fields that fit the description you provide, improving extraction accuracy. Property descriptions can be specified for both parent and child entities.
Good examples of property descriptions include location information and text patterns of the property values, which help disambiguate potential sources of confusion in the document. Clear and precise property descriptions guide the model with rules that promote more reliable and consistent extractions, regardless of the specific document structure or content variations.
Update document schema for a processor
For how to set the property descriptions, refer to Update document schema.
Send a processing request with property descriptions
If the document schema already has descriptions set, you can send a process request with the instructions at Send a processing request.
Fine-tune a processor with property descriptions
Before using any of the request data, make the following replacements:
- LOCATION: your processor's location, for example:
  - us: United States
  - eu: European Union
- PROJECT_ID: your Google Cloud project ID.
- PROCESSOR_ID: the ID of your custom processor.
- DISPLAY_NAME: the display name for the new processor version.
- PRETRAINED_PROCESSOR_VERSION: the processor version identifier. Refer to Select a processor version for more information. For example: pretrained-TYPE-vX.X-YYYY-MM-DD.
- TRAIN_STEPS: training steps for model fine-tuning.
- LEARN_RATE_MULTIPLIER: learning rate multiplier for model fine-tuning.
- DOCUMENT_SCHEMA: schema for the processor. Refer to DocumentSchema representation.
HTTP method and URL:
POST https://LOCATION-documentai.googleapis.com/v1beta3/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID/processorVersions:train
Request JSON body:
{
  "processor_version": {
    "display_name": "DISPLAY_NAME",
    "model_type": "MODEL_TYPE_GENERATIVE"
  },
  "base_processor_version": "projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID/processorVersions/PRETRAINED_PROCESSOR_VERSION",
  "foundation_model_tuning_options": {
    "train_steps": TRAIN_STEPS,
    "learning_rate_multiplier": LEARN_RATE_MULTIPLIER
  },
  "document_schema": DOCUMENT_SCHEMA
}
To send your request, choose one of these options:
curl
Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or by using Cloud Shell, which automatically logs you into the gcloud CLI. You can check the currently active account by running gcloud auth list. Save the request body in a file named request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-documentai.googleapis.com/v1beta3/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID/processorVersions:train"
PowerShell
Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list. Save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-documentai.googleapis.com/v1beta3/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID/processorVersions:train" | Select-Object -Expand Content
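The same request can be assembled in Python. This is a sketch only: it assumes the v1beta3 processorVersions:train method and the placeholder names defined above, and it builds the URL and request body without sending them (sending would additionally require an OAuth access token, for example from `gcloud auth print-access-token`).

```python
def build_tuning_request(project_id: str, location: str, processor_id: str,
                         display_name: str, base_version: str,
                         train_steps: int, learning_rate_multiplier: float,
                         document_schema: dict) -> tuple[str, dict]:
    """Return the (url, body) pair for a fine-tuning train request (sketch)."""
    parent = f"projects/{project_id}/locations/{location}/processors/{processor_id}"
    url = (f"https://{location}-documentai.googleapis.com/v1beta3/"
           f"{parent}/processorVersions:train")
    body = {
        "processor_version": {
            "display_name": display_name,
            "model_type": "MODEL_TYPE_GENERATIVE",
        },
        "base_processor_version": f"{parent}/processorVersions/{base_version}",
        "foundation_model_tuning_options": {
            "train_steps": train_steps,
            "learning_rate_multiplier": learning_rate_multiplier,
        },
        "document_schema": document_schema,
    }
    return url, body

# Example with invented placeholder values:
url, body = build_tuning_request(
    "my-project", "us", "my-processor-id", "tuned-v1",
    "pretrained-foundation-model-v1.4-2025-02-05", 300, 1.0, {})
print(url)
```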
Custom extractor with signature detection
Preview
This product or feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of the Service Specific Terms. Pre-GA products and features are available "as is" and might have limited support. For more information, see the launch stage descriptions.
(Public preview) Custom extractor supports signature detection. This feature lets you detect the presence of signatures in documents. Signature detection is available only by using the derived method type. You can specify a schema with the entity type signature for such entities. The signature entities are derived using visual cues from the document.
For examples and configuration instructions, see Custom extractor with derived field and signature detection.
Custom extractor with derived fields
Preview
This product or feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of the Service Specific Terms. Pre-GA products and features are available "as is" and might have limited support. For more information, see the launch stage descriptions.
The custom extractor supports derived fields. It lets you configure a field to be populated through intelligent inference or generation based on document context, rather than direct text extraction. You can employ this for use cases such as deducing the country from an address, summarizing a document, counting items in a table, or detecting if an ID is authentic, without requiring the value to be explicitly present in the text.
For examples and configuration instructions, see Custom extractor with derived field and signature detection.
Important: Derived fields are generated on a per-page basis. Read derived field limitations for more information.
What's next
Learn about Custom extractor with derived field and signature detection.
Last updated 2026-02-19 UTC.