Document AI client libraries
This page shows how to get started with the Cloud Client Libraries for the Document AI API. Client libraries make it easier to access Google Cloud APIs from a supported language. Although you can use Google Cloud APIs directly by making raw requests to the server, client libraries provide simplifications that significantly reduce the amount of code you need to write.
Read more about the Cloud Client Libraries and the older Google API Client Libraries in Client libraries explained.
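To make the comparison with raw requests concrete, the following sketch builds (but does not send) the kind of HTTP request a client library would otherwise assemble for you. It is an illustration only: the `/v1/...:process` path and the `rawDocument` JSON field names are assumptions taken from the public REST reference, the helper name `build_process_request` is hypothetical, and the project, processor, and token values are placeholders.

```python
import base64
import json
import urllib.request


def build_process_request(project_id, location, processor_id, pdf_bytes, access_token):
    """Build, without sending, a raw REST request that asks a processor to process a PDF."""
    # Regional endpoint and resource name follow the patterns used by the samples on this page.
    endpoint = f"{location}-documentai.googleapis.com"
    name = f"projects/{project_id}/locations/{location}/processors/{processor_id}"

    # Raw JSON requests carry binary content as base64 text.
    body = {
        "rawDocument": {
            "content": base64.b64encode(pdf_bytes).decode("ascii"),
            "mimeType": "application/pdf",
        }
    }
    return urllib.request.Request(
        url=f"https://{endpoint}/v1/{name}:process",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )


# Placeholder values; a real call also needs a valid OAuth 2.0 access token.
request = build_process_request("my-project", "us", "abc123", b"%PDF-1.4 ...", "TOKEN")
print(request.get_method(), request.full_url)
```

Endpoint selection, authentication, encoding, and response parsing are all left to the caller here, which is the work the client libraries take over.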
Install the client library
C++
See Setting up a C++ development environment for details about this client library's requirements and install dependencies.
C#
Install-Package Google.Cloud.DocumentAI.V1 -Pre
For more information, see Setting Up a C# Development Environment.
Go
go get cloud.google.com/go/documentai
For more information, see Setting Up a Go Development Environment.
Java
If you are using Maven, add the following to your pom.xml file. For more information about BOMs, see The Google Cloud Platform Libraries BOM.
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>com.google.cloud</groupId>
      <artifactId>libraries-bom</artifactId>
      <version>26.76.0</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>

<dependencies>
  <dependency>
    <groupId>com.google.cloud</groupId>
    <artifactId>google-cloud-document-ai</artifactId>
  </dependency>
</dependencies>
If you are using Gradle, add the following to your dependencies:
implementation 'com.google.cloud:google-cloud-document-ai:2.89.0'
If you are using sbt, add the following to your dependencies:
libraryDependencies += "com.google.cloud" % "google-cloud-document-ai" % "2.89.0"
If you're using Visual Studio Code or IntelliJ, you can add client libraries to your project using the following IDE plugins:
Cloud Code for VS Code
Cloud Code for IntelliJ
The plugins provide additional functionality, such as key management for service accounts. Refer to each plugin's documentation for details.
Note: Cloud Java client libraries do not currently support Android.
For more information, see Setting Up a Java Development Environment.
Node.js
npm install @google-cloud/documentai
For more information, see Setting Up a Node.js Development Environment.
PHP
composer require google/cloud-document-ai
For more information, see Using PHP on Google Cloud.
Python
pip install --upgrade google-cloud-documentai
For more information, see Setting Up a Python Development Environment.
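After running the `pip install` command above, one way to confirm that the package landed in the active environment is to ask the standard library for its installed version. This is a minimal sketch; the helper name `installed_version` is hypothetical and is not part of the client library.

```python
from importlib import metadata


def installed_version(distribution):
    """Return the installed version of a distribution, or None if it is not installed."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("google-cloud-documentai")
print(version if version else "google-cloud-documentai is not installed")
```

If this reports that the package is missing, the usual cause is that `pip` installed into a different interpreter or virtual environment than the one running the script.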
Ruby
gem install google-cloud-document_ai
For more information, see Setting Up a Ruby Development Environment.
Set up authentication
To authenticate calls to Google Cloud APIs, client libraries support Application Default Credentials (ADC); the libraries look for credentials in a set of defined locations and use those credentials to authenticate requests to the API. With ADC, you can make credentials available to your application in a variety of environments, such as local development or production, without needing to modify your application code.
For production environments, the way you set up ADC depends on the service and context. For more information, see Set up Application Default Credentials.
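The idea of looking in "a set of defined locations" can be sketched with the standard library alone. The snippet below is a simplified illustration, not the libraries' actual implementation: it assumes the commonly documented search order (the `GOOGLE_APPLICATION_CREDENTIALS` environment variable first, then the credential file written by the gcloud CLI, then the service account attached to the Google Cloud resource), and the function names are hypothetical.

```python
import os
from pathlib import Path


def well_known_adc_file(env, home):
    """Assumed location of the credential file written by the gcloud CLI."""
    if "APPDATA" in env:  # Windows keeps it under %APPDATA%
        return Path(env["APPDATA"]) / "gcloud" / "application_default_credentials.json"
    return Path(home) / ".config" / "gcloud" / "application_default_credentials.json"


def describe_adc_source(env, home):
    """Report which local credential source a lookup in the assumed order would find."""
    explicit = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if explicit:
        return f"GOOGLE_APPLICATION_CREDENTIALS -> {explicit}"
    candidate = well_known_adc_file(env, home)
    if candidate.is_file():
        return f"local ADC file -> {candidate}"
    # Nothing local: on Google Cloud, the attached service account would be tried next.
    return "no local credentials found"


print(describe_adc_source(os.environ, Path.home()))
```

Because the lookup depends only on the environment, the same application code can run unchanged on a workstation and in production.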
For a local development environment, you can set up ADC with the credentials that are associated with your Google Account:
Install the Google Cloud CLI. After installation, initialize the Google Cloud CLI by running the following command:
gcloud init
If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.
If you're using a local shell, then create local authentication credentials for your user account:
gcloud auth application-default login
You don't need to do this if you're using Cloud Shell.
If an authentication error is returned, and you are using an external identity provider (IdP), confirm that you have signed in to the gcloud CLI with your federated identity.
A sign-in screen appears. After you sign in, your credentials are stored in the local credential file used by ADC.
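To check that the sign-in produced a usable credential file without printing any secrets, you can read just its `type` field. This is a sketch under assumptions: the file is assumed to be JSON with a top-level `type` key (user credentials created by the gcloud CLI typically report `authorized_user`), the default path shown applies to Linux and macOS, and `credential_type` is a hypothetical helper.

```python
import json
from pathlib import Path


def credential_type(path):
    """Return the `type` field of a credentials JSON file, or None if it cannot be read."""
    try:
        data = json.loads(Path(path).read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # Missing file, unreadable file, or content that is not valid JSON.
        return None
    return data.get("type") if isinstance(data, dict) else None


# Assumed default location on Linux and macOS; adjust for your platform.
default_file = Path.home() / ".config" / "gcloud" / "application_default_credentials.json"
print(credential_type(default_file))
```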
Use the client library
The following example shows how to use the client library.
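Some of the samples below (Java and Node.js) print paragraphs by slicing the document's full text with the offsets stored in each paragraph's text anchor. The following sketch shows that slicing step on a hand-written dictionary, so it runs without the client library or a processor. The dictionary layout mirrors the JSON form of a response and is an assumption, as is the helper name `text_for_anchor`; unlike the samples, which read only the first segment, this version joins every segment.

```python
def text_for_anchor(full_text, text_anchor):
    """Return the slice(s) of full_text that a text anchor points at."""
    segments = text_anchor.get("textSegments") or []
    parts = []
    for segment in segments:
        # The first segment of a document may omit startIndex, which then means 0.
        start = int(segment.get("startIndex", 0))
        end = int(segment["endIndex"])
        parts.append(full_text[start:end])
    return "".join(parts)


# Hand-written stand-in for a processed document (not real API output).
document = {
    "text": "Invoice 42\nTotal due: 10 USD\n",
    "pages": [
        {
            "paragraphs": [
                {"layout": {"textAnchor": {"textSegments": [{"endIndex": "11"}]}}},
                {"layout": {"textAnchor": {"textSegments": [
                    {"startIndex": "11", "endIndex": "29"}
                ]}}},
            ]
        }
    ],
}

paragraphs = [
    text_for_anchor(document["text"], paragraph["layout"]["textAnchor"])
    for paragraph in document["pages"][0]["paragraphs"]
]
for paragraph_text in paragraphs:
    print(f"Paragraph text:\n{paragraph_text}")
```

Keeping the text in one string and storing only offsets on each layout element is why the samples need a small `getText` helper before they can print anything.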
C++
#include "google/cloud/documentai/v1/document_processor_client.h"
#include "google/cloud/location.h"
#include <fstream>
#include <iostream>
#include <string>

int main(int argc, char* argv[]) try {
  if (argc != 5) {
    std::cerr << "Usage: " << argv[0]
              << " project-id location-id processor-id filename (PDF only)\n";
    return 1;
  }
  std::string const location_id = argv[2];
  if (location_id != "us" && location_id != "eu") {
    std::cerr << "location-id must be either 'us' or 'eu'\n";
    return 1;
  }
  auto const location = google::cloud::Location(argv[1], location_id);

  namespace documentai = ::google::cloud::documentai_v1;
  auto client = documentai::DocumentProcessorServiceClient(
      documentai::MakeDocumentProcessorServiceConnection(
          location.location_id()));

  google::cloud::documentai::v1::ProcessRequest req;
  req.set_name(location.FullName() + "/processors/" + argv[3]);
  req.set_skip_human_review(true);
  auto& doc = *req.mutable_raw_document();
  doc.set_mime_type("application/pdf");
  std::ifstream is(argv[4]);
  doc.set_content(std::string{std::istreambuf_iterator<char>(is), {}});

  auto resp = client.ProcessDocument(std::move(req));
  if (!resp) throw std::move(resp).status();
  std::cout << resp->document().text() << "\n";

  return 0;
} catch (google::cloud::Status const& status) {
  std::cerr << "google::cloud::Status thrown: " << status << "\n";
  return 1;
}
C#
using Google.Cloud.DocumentAI.V1;
using Google.Protobuf;
using System;
using System.IO;

public class QuickstartSample
{
    public Document Quickstart(
        string projectId = "your-project-id",
        string locationId = "your-processor-location",
        string processorId = "your-processor-id",
        string localPath = "my-local-path/my-file-name",
        string mimeType = "application/pdf"
    )
    {
        // Create client
        var client = new DocumentProcessorServiceClientBuilder
        {
            Endpoint = $"{locationId}-documentai.googleapis.com"
        }.Build();

        // Read in local file
        using var fileStream = File.OpenRead(localPath);
        var rawDocument = new RawDocument
        {
            Content = ByteString.FromStream(fileStream),
            MimeType = mimeType
        };

        // Initialize request argument(s)
        var request = new ProcessRequest
        {
            Name = ProcessorName.FromProjectLocationProcessor(projectId, locationId, processorId).ToString(),
            RawDocument = rawDocument
        };

        // Make the request
        var response = client.ProcessDocument(request);

        var document = response.Document;
        Console.WriteLine(document.Text);
        return document;
    }
}
Go
package main

import (
	"context"
	"flag"
	"fmt"
	"log"
	"os"

	documentai "cloud.google.com/go/documentai/apiv1"
	"cloud.google.com/go/documentai/apiv1/documentaipb"
	"google.golang.org/api/option"
)

func main() {
	projectID := flag.String("project_id", "PROJECT_ID", "Cloud Project ID")
	location := flag.String("location", "us", "The Processor location")
	// Create a Processor before running sample
	processorID := flag.String("processor_id", "aaaaaaaa", "The Processor ID")
	filePath := flag.String("file_path", "invoice.pdf", "The path to the file to parse")
	mimeType := flag.String("mime_type", "application/pdf", "The mimeType of the file")
	flag.Parse()

	ctx := context.Background()

	endpoint := fmt.Sprintf("%s-documentai.googleapis.com:443", *location)
	client, err := documentai.NewDocumentProcessorClient(ctx, option.WithEndpoint(endpoint))
	if err != nil {
		// Stop here: continuing with a nil client would panic.
		log.Fatalf("error creating Document AI client: %v", err)
	}
	defer client.Close()

	// Open local file.
	data, err := os.ReadFile(*filePath)
	if err != nil {
		log.Fatalf("os.ReadFile: %v", err)
	}

	req := &documentaipb.ProcessRequest{
		Name: fmt.Sprintf("projects/%s/locations/%s/processors/%s", *projectID, *location, *processorID),
		Source: &documentaipb.ProcessRequest_RawDocument{
			RawDocument: &documentaipb.RawDocument{
				Content:  data,
				MimeType: *mimeType,
			},
		},
	}
	resp, err := client.ProcessDocument(ctx, req)
	if err != nil {
		log.Fatalf("processDocument: %v", err)
	}

	// Handle the results.
	document := resp.GetDocument()
	fmt.Printf("Document Text: %s", document.GetText())
}
Java
import com.google.cloud.documentai.v1.Document;
import com.google.cloud.documentai.v1.DocumentProcessorServiceClient;
import com.google.cloud.documentai.v1.DocumentProcessorServiceSettings;
import com.google.cloud.documentai.v1.ProcessRequest;
import com.google.cloud.documentai.v1.ProcessResponse;
import com.google.cloud.documentai.v1.RawDocument;
import com.google.protobuf.ByteString;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeoutException;

public class QuickStart {
  public static void main(String[] args)
      throws IOException, InterruptedException, ExecutionException, TimeoutException {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "your-project-id";
    String location = "your-project-location"; // Format is "us" or "eu".
    String processorId = "your-processor-id";
    String filePath = "path/to/input/file.pdf";
    quickStart(projectId, location, processorId, filePath);
  }

  public static void quickStart(
      String projectId, String location, String processorId, String filePath)
      throws IOException, InterruptedException, ExecutionException, TimeoutException {
    // Initialize client that will be used to send requests. This client only needs to be
    // created once, and can be reused for multiple requests. After completing all of your
    // requests, call the "close" method on the client to safely clean up any remaining
    // background resources.
    String endpoint = String.format("%s-documentai.googleapis.com:443", location);
    DocumentProcessorServiceSettings settings =
        DocumentProcessorServiceSettings.newBuilder().setEndpoint(endpoint).build();
    try (DocumentProcessorServiceClient client = DocumentProcessorServiceClient.create(settings)) {
      // The full resource name of the processor, e.g.:
      // projects/project-id/locations/location/processors/processor-id
      // You must create new processors in the Cloud Console first
      String name =
          String.format("projects/%s/locations/%s/processors/%s", projectId, location, processorId);

      // Read the file.
      byte[] imageFileData = Files.readAllBytes(Paths.get(filePath));

      // Wrap the file bytes in a ByteString.
      ByteString content = ByteString.copyFrom(imageFileData);

      RawDocument document =
          RawDocument.newBuilder().setContent(content).setMimeType("application/pdf").build();

      // Configure the process request.
      ProcessRequest request =
          ProcessRequest.newBuilder().setName(name).setRawDocument(document).build();

      // Recognizes text entities in the PDF document
      ProcessResponse result = client.processDocument(request);
      Document documentResponse = result.getDocument();

      // Get all of the document text as one big string
      String text = documentResponse.getText();

      // Read the text recognition output from the processor
      System.out.println("The document contains the following paragraphs:");
      Document.Page firstPage = documentResponse.getPages(0);
      List<Document.Page.Paragraph> paragraphs = firstPage.getParagraphsList();

      for (Document.Page.Paragraph paragraph : paragraphs) {
        String paragraphText = getText(paragraph.getLayout().getTextAnchor(), text);
        System.out.printf("Paragraph text:\n%s\n", paragraphText);
      }
    }
  }

  // Extract shards from the text field
  private static String getText(Document.TextAnchor textAnchor, String text) {
    if (textAnchor.getTextSegmentsList().size() > 0) {
      int startIdx = (int) textAnchor.getTextSegments(0).getStartIndex();
      int endIdx = (int) textAnchor.getTextSegments(0).getEndIndex();
      return text.substring(startIdx, endIdx);
    }
    return "[NO TEXT]";
  }
}
Node.js
/**
 * TODO(developer): Uncomment these variables before running the sample.
 */
// const projectId = 'YOUR_PROJECT_ID';
// const location = 'YOUR_PROJECT_LOCATION'; // Format is 'us' or 'eu'
// const processorId = 'YOUR_PROCESSOR_ID'; // Create processor in Cloud Console
// const filePath = '/path/to/local/pdf';

const {DocumentProcessorServiceClient} =
  require('@google-cloud/documentai').v1;

// Instantiates a client
// apiEndpoint regions available: eu-documentai.googleapis.com, us-documentai.googleapis.com (Required if using eu based processor)
// const client = new DocumentProcessorServiceClient({apiEndpoint: 'eu-documentai.googleapis.com'});
const client = new DocumentProcessorServiceClient();

async function quickstart() {
  // The full resource name of the processor, e.g.:
  // projects/project-id/locations/location/processors/processor-id
  // You must create new processors in the Cloud Console first
  const name = `projects/${projectId}/locations/${location}/processors/${processorId}`;

  // Read the file into memory.
  const fs = require('fs').promises;
  const imageFile = await fs.readFile(filePath);

  // Convert the image data to a Buffer and base64 encode it.
  const encodedImage = Buffer.from(imageFile).toString('base64');

  const request = {
    name,
    rawDocument: {
      content: encodedImage,
      mimeType: 'application/pdf',
    },
  };

  // Recognizes text entities in the PDF document
  const [result] = await client.processDocument(request);
  const {document} = result;

  // Get all of the document text as one big string
  const {text} = document;

  // Extract shards from the text field
  const getText = textAnchor => {
    if (!textAnchor.textSegments || textAnchor.textSegments.length === 0) {
      return '';
    }

    // First shard in document doesn't have startIndex property
    const startIndex = textAnchor.textSegments[0].startIndex || 0;
    const endIndex = textAnchor.textSegments[0].endIndex;

    return text.substring(startIndex, endIndex);
  };

  // Read the text recognition output from the processor
  console.log('The document contains the following paragraphs:');
  const [page1] = document.pages;
  const {paragraphs} = page1;

  for (const paragraph of paragraphs) {
    const paragraphText = getText(paragraph.layout.textAnchor);
    console.log(`Paragraph text:\n${paragraphText}`);
  }
}
PHP
# Include the autoloader for libraries installed with Composer.
require __DIR__ . '/vendor/autoload.php';

# Import the Google Cloud client library.
use Google\Cloud\DocumentAI\V1\Client\DocumentProcessorServiceClient;
use Google\Cloud\DocumentAI\V1\RawDocument;
use Google\Cloud\DocumentAI\V1\ProcessRequest;

# TODO(developer): Update the following lines before running the sample.
# Your Google Cloud Platform project ID.
$projectId = 'YOUR_PROJECT_ID';

# Your Processor Location.
$location = 'us';

# Your Processor ID as hexadecimal characters.
# Not to be confused with the Processor Display Name.
$processorId = 'YOUR_PROCESSOR_ID';

# Path for the file to read.
$documentPath = 'resources/invoice.pdf';

# Create Client.
$client = new DocumentProcessorServiceClient();

# Read in file.
$handle = fopen($documentPath, 'rb');
$contents = fread($handle, filesize($documentPath));
fclose($handle);

# Load file contents into a RawDocument.
$rawDocument = (new RawDocument())
    ->setContent($contents)
    ->setMimeType('application/pdf');

# Get the Fully-qualified Processor Name.
$fullProcessorName = $client->processorName($projectId, $location, $processorId);

# Send a ProcessRequest and get a ProcessResponse.
$request = (new ProcessRequest())
    ->setName($fullProcessorName)
    ->setRawDocument($rawDocument);

$response = $client->processDocument($request);

# Show the text found in the document.
printf('Document Text: %s', $response->getDocument()->getText());
Python
from google.api_core.client_options import ClientOptions
from google.cloud import documentai_v1

# TODO(developer): Create a processor of type "OCR_PROCESSOR".

# TODO(developer): Update and uncomment these variables before running the sample.
# project_id = "MY_PROJECT_ID"

# Processor ID as hexadecimal characters.
# Not to be confused with the Processor Display Name.
# processor_id = "MY_PROCESSOR_ID"

# Processor location. For example: "us" or "eu".
# location = "MY_PROCESSOR_LOCATION"

# Path for file to process.
# file_path = "/path/to/local/pdf"

# Set `api_endpoint` if you use a location other than "us".
opts = ClientOptions(api_endpoint=f"{location}-documentai.googleapis.com")

# Initialize Document AI client.
client = documentai_v1.DocumentProcessorServiceClient(client_options=opts)

# Get the Fully-qualified Processor path.
full_processor_name = client.processor_path(project_id, location, processor_id)

# Get a Processor reference.
request = documentai_v1.GetProcessorRequest(name=full_processor_name)
processor = client.get_processor(request=request)

# `processor.name` is the full resource name of the processor.
# For example: `projects/{project_id}/locations/{location}/processors/{processor_id}`
print(f"Processor Name: {processor.name}")

# Read the file into memory.
with open(file_path, "rb") as image:
    image_content = image.read()

# Load binary data.
# For supported MIME types, refer to https://cloud.google.com/document-ai/docs/file-types
raw_document = documentai_v1.RawDocument(
    content=image_content,
    mime_type="application/pdf",
)

# Send a request and get the processed document.
request = documentai_v1.ProcessRequest(name=processor.name, raw_document=raw_document)
result = client.process_document(request=request)
document = result.document

# Read the text recognition output from the processor.
# For a full list of `Document` object attributes, reference this page:
# https://cloud.google.com/document-ai/docs/reference/rest/v1/Document
print("The document contains the following text:")
print(document.text)
Ruby
require "google/cloud/document_ai/v1"

##
# Document AI quickstart
#
# @param project_id [String] Your Google Cloud project (e.g. "my-project")
# @param location_id [String] Your Processor Location (e.g. "us")
# @param processor_id [String] Your Processor ID (e.g. "a14dae8f043b60bd")
# @param file_path [String] Path to Local File (e.g. "invoice.pdf")
# @param mime_type [String] Refer to https://cloud.google.com/document-ai/docs/file-types (e.g. "application/pdf")
#
def quickstart project_id:, location_id:, processor_id:, file_path:, mime_type:
  # Create the Document AI client.
  client = ::Google::Cloud::DocumentAI::V1::DocumentProcessorService::Client.new do |config|
    config.endpoint = "#{location_id}-documentai.googleapis.com"
  end

  # Build the resource name from the project.
  name = client.processor_path(
    project: project_id,
    location: location_id,
    processor: processor_id
  )

  # Read the bytes into memory
  content = File.binread file_path

  # Create request
  request = Google::Cloud::DocumentAI::V1::ProcessRequest.new(
    skip_human_review: true,
    name: name,
    raw_document: {
      content: content,
      mime_type: mime_type
    }
  )

  # Process document
  response = client.process_document request

  # Handle response
  puts response.document.text
end
Additional resources
C++
The following list contains links to more resources related to the client library for C++:
C#
The following list contains links to more resources related to the client library for C#:
Go
The following list contains links to more resources related to the client library for Go:
Java
The following list contains links to more resources related to the client library for Java:
Node.js
The following list contains links to more resources related to the client library for Node.js:
PHP
The following list contains links to more resources related to the client library for PHP:
Python
The following list contains links to more resources related to the client library for Python:
Ruby
The following list contains links to more resources related to the client library for Ruby:
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2026-02-19 UTC.