Normalization

For many supported fields, Document AI also returns an entity.normalizedValue in addition to the raw extracted field obtained through the textAnchor of each entity. It normalizes the literal text, and normalization often breaks the text value up into sub-fields.

This value contains the data in a standardized format to reduce post-processing and enable conversion to whatever format is selected. The mentionText, representing what is literally on the document, is never changed by normalization.
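
As a rough illustration of the difference between the two values, the following sketch reads one entity from a response that has already been parsed from JSON into a Python dictionary. The field names mentionText, normalizedValue, text, and dateValue follow the JSON form of the Document resource. The sample entity and the helper summarize_entity are hypothetical and exist only for this example.

```python
import datetime

# Hypothetical sample: one entity as it might appear in the JSON response.
sample_entity = {
    "type": "invoice_date",
    "mentionText": "Feb 3, 2024",
    "normalizedValue": {
        "text": "2024-02-03",
        "dateValue": {"year": 2024, "month": 2, "day": 3},
    },
}


def summarize_entity(entity):
    """Return (raw text, normalized text, date or None) for one entity."""
    raw = entity.get("mentionText", "")
    normalized = entity.get("normalizedValue")
    if normalized is None:
        # Not every field is normalized; only the literal text is available.
        return raw, None, None
    date_value = normalized.get("dateValue")
    parsed = None
    # Build a date only when all three sub-fields are present and non-zero.
    if date_value and all(date_value.get(k) for k in ("year", "month", "day")):
        parsed = datetime.date(
            date_value["year"], date_value["month"], date_value["day"]
        )
    return raw, normalized.get("text"), parsed


raw, text, parsed = summarize_entity(sample_entity)
print(raw, "->", text, parsed)
```

The literal text is returned unchanged next to the normalized form, so downstream code can choose either one.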

Normalized fields belong to one of the following categories.

Normalized values in the console

In the Google Cloud console, the normalized fields are annotated with G. For example:

Sample normalized field shown in the web application.

Supported processors

Here are the processors and fields that support entity enrichment and normalization:

Processors and normalized fields

Bank Statement Parser

Category: Pretrained
Solution type: Lending
Functions: OCR, Entity Extraction
Release stage: General availability
Access status: Public
Full processor details: Detailed entry
  • ending_balance
  • starting_balance
  • statement_date
  • statement_end_date
  • statement_start_date
  • table_item/transaction_deposit
  • table_item/transaction_deposit_date
  • table_item/transaction_withdrawal
  • table_item/transaction_withdrawal_date

US Passport Parser

Category: Pretrained
Solution type: Identity
Functions: OCR, Entity Extraction
Release stage: General availability
Access status: Public
Full processor details: Detailed entry
  • Date Of Birth
  • Expiration Date
  • Issue Date

Utility Parser

Category: Pretrained
Solution type: Procurement
Functions: OCR, Entity Extraction
Release stage: General availability
Access status: Limited
Full processor details: Detailed entry
  • adjusted_amount
  • amount_due
  • balance_transfer_amount
  • currency
  • currency_exchange_rate
  • delivery_date
  • due_date
  • invoice_date
  • late_fee_amount
  • line_item/amount
  • line_item/quantity
  • line_item/tax_amount
  • line_item/unit_price
  • net_amount
  • prior_amount_due
  • prior_paid_amount
  • total_amount
  • total_tax_amount

Identity Document Proofing Parser

Category: Pretrained
Solution type: Identity
Functions: OCR, Quality Analysis
Release stage: General availability
Access status: Public
Full processor details: Detailed entry
  • fraud_signals_image_manipulation
  • fraud_signals_online_duplicate (US only)
  • fraud_signals_is_identity_document
  • fraud_signals_suspicious_words

Pay Slip Parser

Category: Pretrained
Solution type: Lending
Functions: OCR, Entity Extraction
Release stage: General availability
Access status: Public
Full processor details: Detailed entry
  • bonus
  • bonus_ytd
  • commissions
  • commissions_ytd
  • direct_deposit
  • end_date
  • gross_earnings
  • gross_earnings_ytd
  • holiday
  • holiday_ytd
  • net_pay
  • net_pay_ytd
  • overtime
  • overtime_ytd
  • pay_date
  • regular_pay
  • regular_pay_ytd
  • start_date
  • vacation
  • vacation_ytd

US Driver License Parser

Category: Pretrained
Solution type: Identity
Functions: OCR, Entity Extraction
Release stage: General availability
Access status: Public
Full processor details: Detailed entry
  • Date Of Birth
  • Expiration Date
  • Issue Date

Expense Parser

Category: Pretrained
Solution type: Procurement
Functions: OCR, Entity Extraction
Release stage: General availability
Access status: Public
Full processor details: Detailed entry
  • currency
  • total_amount
  • total_tax_amount
  • net_amount
  • receipt_date
  • purchase_time
  • start_date
  • end_date
  • line_item/amount
  • line_item/payment_date
  • line_item/payment_amount

Invoice Parser

Category: Pretrained
Solution type: Procurement
Functions: OCR, Entity Extraction
Release stage: General availability
Access status: Public
Full processor details: Detailed entry
  • amount_paid_since_last_invoice
  • currency
  • currency_exchange_rate
  • delivery_date
  • due_date
  • freight_amount
  • invoice_date
  • net_amount
  • total_amount
  • total_tax_amount
  • line_item/amount
  • line_item/quantity
  • line_item/unit_price
  • vat/amount
  • vat/tax_amount
  • vat/tax_rate

Extraction processors

Custom Extractor supports normalization of all entities with the following Google Cloud common data types: dateTime, currency, money, and number.

Processors and normalized data types

Custom Extractor

Category: Extract
Solution type: Custom
Functions: OCR, Entity Extraction
Release stage: General availability
Access status: Public
Full processor details: Detailed entry
  • dateTime as STRING
  • currency as STRING
  • money as google.type.Money
  • number as FLOAT or INTEGER
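
Because money values are returned as google.type.Money, which splits an amount into whole units and nanos (billionths of a unit), a small conversion step is often useful. The following is a minimal sketch that assumes the value has been parsed from JSON into a dictionary with the keys currencyCode, units, and nanos. The helper name money_to_decimal and the sample value are hypothetical.

```python
from decimal import Decimal

NANOS_PER_UNIT = Decimal(1_000_000_000)


def money_to_decimal(money):
    """Convert a google.type.Money-shaped dict to (currency_code, Decimal)."""
    # In JSON, the int64 units field is usually serialized as a string.
    units = int(money.get("units", 0))
    nanos = int(money.get("nanos", 0))
    # units and nanos share the same sign, so they can simply be added.
    amount = Decimal(units) + Decimal(nanos) / NANOS_PER_UNIT
    return money.get("currencyCode", ""), amount


# Hypothetical sample value.
sample_money = {"currencyCode": "USD", "units": "1234", "nanos": 560000000}
print(money_to_decimal(sample_money))
```

Using Decimal rather than float avoids the binary rounding errors that matter when amounts are summed or compared.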

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2026-02-19 UTC.