Analyzing Syntax

While most Natural Language methods analyze what a given text is about, the analyzeSyntax method inspects the structure of the language itself. Syntactic Analysis breaks up the given text into a series of sentences and tokens (generally, words) and provides linguistic information about those tokens. See Morphology & Dependency Trees for details about the linguistic analysis and Language Support for a list of the languages whose syntax the Natural Language API can analyze.

This section demonstrates a few ways to detect syntax in a document. For each document, you must submit a separate request.

Analyzing Syntax in a String

Here is an example of performing syntactic analysis on a text string sent directly to the Natural Language API:

Protocol

To analyze syntax in a document, make a POST request to the documents:analyzeSyntax REST method and provide the appropriate request body as shown in the following example.

The example uses the gcloud auth application-default print-access-token command to obtain an access token for a service account set up for the project using the Google Cloud Platform gcloud CLI. For instructions on installing the gcloud CLI and setting up a project with a service account, see the Quickstart.

curl -X POST \
     -H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
     -H "Content-Type: application/json; charset=utf-8" \
     --data "{
  'encodingType': 'UTF8',
  'document': {
    'type': 'PLAIN_TEXT',
    'content': 'Google, headquartered in Mountain View, unveiled the new Android phone at the Consumer Electronic Show.  Sundar Pichai said in his keynote that users love their new Android phones.'
  }
}" "https://language.googleapis.com/v1/documents:analyzeSyntax"

If you don't specify document.language, then the language will be automatically detected. For information on which languages are supported by the Natural Language API, see Language Support. See the Document reference documentation for more information on configuring the request body.

If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:
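As a minimal sketch of the request body described above, the following Python builds the same JSON structure that the curl example sends. The helper name build_syntax_request is hypothetical and not part of any client library; it only assembles a dictionary and makes no network call. The language field is omitted unless you pass one, which corresponds to leaving document.language unset so that the language is detected automatically.

```python
import json


def build_syntax_request(content, language=None, encoding_type="UTF8"):
    """Return the JSON body for documents:analyzeSyntax as a dict.

    Hypothetical helper for illustration; it does not call the API.
    """
    document = {"type": "PLAIN_TEXT", "content": content}
    # Leave "language" out entirely to let the API detect it.
    if language is not None:
        document["language"] = language
    return {"encodingType": encoding_type, "document": document}


body = build_syntax_request("Hello, world!")
print(json.dumps(body, indent=2))
```

The printed JSON can be saved to a file and passed to curl with --data @FILE instead of an inline string, which avoids shell quoting issues for longer texts.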

{
  "sentences": [
    {
      "text": {
        "content": "Google, headquartered in Mountain View, unveiled the new Android phone at the Consumer Electronic Show.",
        "beginOffset": 0
      }
    },
    {
      "text": {
        "content": "Sundar Pichai said in his keynote that users love their new Android phones.",
        "beginOffset": 105
      }
    }
  ],
  "tokens": [
    {
      "text": {
        "content": "Google",
        "beginOffset": 0
      },
      "partOfSpeech": {
        "tag": "NOUN",
        "aspect": "ASPECT_UNKNOWN",
        "case": "CASE_UNKNOWN",
        "form": "FORM_UNKNOWN",
        "gender": "GENDER_UNKNOWN",
        "mood": "MOOD_UNKNOWN",
        "number": "SINGULAR",
        "person": "PERSON_UNKNOWN",
        "proper": "PROPER",
        "reciprocity": "RECIPROCITY_UNKNOWN",
        "tense": "TENSE_UNKNOWN",
        "voice": "VOICE_UNKNOWN"
      },
      "dependencyEdge": {
        "headTokenIndex": 7,
        "label": "NSUBJ"
      },
      "lemma": "Google"
    },
    ...
    {
      "text": {
        "content": ".",
        "beginOffset": 179
      },
      "partOfSpeech": {
        "tag": "PUNCT",
        "aspect": "ASPECT_UNKNOWN",
        "case": "CASE_UNKNOWN",
        "form": "FORM_UNKNOWN",
        "gender": "GENDER_UNKNOWN",
        "mood": "MOOD_UNKNOWN",
        "number": "NUMBER_UNKNOWN",
        "person": "PERSON_UNKNOWN",
        "proper": "PROPER_UNKNOWN",
        "reciprocity": "RECIPROCITY_UNKNOWN",
        "tense": "TENSE_UNKNOWN",
        "voice": "VOICE_UNKNOWN"
      },
      "dependencyEdge": {
        "headTokenIndex": 20,
        "label": "P"
      },
      "lemma": "."
    }
  ],
  "language": "en"
}

The tokens array contains Token objects representing the detected sentence tokens, which include information such as a token's part of speech and its position in the sentence.
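To illustrate how the Token objects might be consumed, here is a short Python sketch that walks the tokens array of a parsed JSON response and lists each token with its part-of-speech tag, dependency label, and head token. SAMPLE_RESPONSE is hand-written for this illustration and is not real API output, and summarize_tokens is a hypothetical helper. The sketch relies on headTokenIndex being a position in the same tokens array.

```python
# Hand-written sample shaped like the response above; not real API output.
SAMPLE_RESPONSE = {
    "tokens": [
        {
            "text": {"content": "Users", "beginOffset": 0},
            "partOfSpeech": {"tag": "NOUN"},
            "dependencyEdge": {"headTokenIndex": 1, "label": "NSUBJ"},
            "lemma": "user",
        },
        {
            "text": {"content": "love", "beginOffset": 6},
            "partOfSpeech": {"tag": "VERB"},
            "dependencyEdge": {"headTokenIndex": 1, "label": "ROOT"},
            "lemma": "love",
        },
        {
            "text": {"content": "phones", "beginOffset": 11},
            "partOfSpeech": {"tag": "NOUN"},
            "dependencyEdge": {"headTokenIndex": 1, "label": "DOBJ"},
            "lemma": "phone",
        },
    ]
}


def summarize_tokens(response):
    """Return (text, tag, label, head text) for every token in the response."""
    tokens = response.get("tokens", [])
    rows = []
    for token in tokens:
        edge = token["dependencyEdge"]
        # headTokenIndex points back into the same tokens array.
        head = tokens[edge["headTokenIndex"]]
        rows.append(
            (
                token["text"]["content"],
                token["partOfSpeech"]["tag"],
                edge["label"],
                head["text"]["content"],
            )
        )
    return rows


for row in summarize_tokens(SAMPLE_RESPONSE):
    print(row)
```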

gcloud

Refer to the analyze-syntax command for complete details.

To perform syntax analysis, use the gcloud CLI with the --content flag to identify the content to analyze:

gcloud ml language analyze-syntax --content="Google, headquartered in Mountain View, unveiled the new Android phone at the Consumer Electronic Show.  Sundar Pichai said in his keynote that users love their new Android phones."

If the request is successful, the server returns a response in JSON format:

{
  "sentences": [
    {
      "text": {
        "content": "Google, headquartered in Mountain View, unveiled the new Android phone at the Consumer Electronic Show.",
        "beginOffset": 0
      }
    },
    {
      "text": {
        "content": "Sundar Pichai said in his keynote that users love their new Android phones.",
        "beginOffset": 105
      }
    }
  ],
  "tokens": [
    {
      "text": {
        "content": "Google",
        "beginOffset": 0
      },
      "partOfSpeech": {
        "tag": "NOUN",
        "aspect": "ASPECT_UNKNOWN",
        "case": "CASE_UNKNOWN",
        "form": "FORM_UNKNOWN",
        "gender": "GENDER_UNKNOWN",
        "mood": "MOOD_UNKNOWN",
        "number": "SINGULAR",
        "person": "PERSON_UNKNOWN",
        "proper": "PROPER",
        "reciprocity": "RECIPROCITY_UNKNOWN",
        "tense": "TENSE_UNKNOWN",
        "voice": "VOICE_UNKNOWN"
      },
      "dependencyEdge": {
        "headTokenIndex": 7,
        "label": "NSUBJ"
      },
      "lemma": "Google"
    },
    ...
    {
      "text": {
        "content": ".",
        "beginOffset": 179
      },
      "partOfSpeech": {
        "tag": "PUNCT",
        "aspect": "ASPECT_UNKNOWN",
        "case": "CASE_UNKNOWN",
        "form": "FORM_UNKNOWN",
        "gender": "GENDER_UNKNOWN",
        "mood": "MOOD_UNKNOWN",
        "number": "NUMBER_UNKNOWN",
        "person": "PERSON_UNKNOWN",
        "proper": "PROPER_UNKNOWN",
        "reciprocity": "RECIPROCITY_UNKNOWN",
        "tense": "TENSE_UNKNOWN",
        "voice": "VOICE_UNKNOWN"
      },
      "dependencyEdge": {
        "headTokenIndex": 20,
        "label": "P"
      },
      "lemma": "."
    }
  ],
  "language": "en"
}

The tokens array contains Token objects representing the detected sentence tokens, which include information such as a token's part of speech and its position in the sentence.
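A token's position is reported as beginOffset, and its unit depends on the encodingType sent with the request: bytes for UTF8, 16-bit code units for UTF16, and code points for UTF32. The following Python sketch shows one way to turn such an offset into an index into a Python string, which is indexed by code point. The function offset_to_index is a hypothetical helper, and the sketch assumes the offset falls on a character boundary, as a token start would.

```python
def offset_to_index(text, offset, encoding_type="UTF8"):
    """Convert a beginOffset into a Python string index (code points)."""
    if encoding_type == "UTF32":
        return offset  # Already counted in code points.
    # Codec name and size in bytes of one offset unit.
    codec, unit = {"UTF8": ("utf-8", 1), "UTF16": ("utf-16-le", 2)}[encoding_type]
    data = text.encode(codec)
    # Decode the prefix that precedes the offset and count its characters.
    return len(data[: offset * unit].decode(codec))


text = "café au lait"
# "au" starts after 6 UTF-8 bytes, because "é" takes two bytes.
index = offset_to_index(text, 6, "UTF8")
print(text[index : index + 2])
```

For plain ASCII text the three encodings give identical offsets, so the conversion only matters once the text contains accented letters, non-Latin scripts, or emoji.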

Go

To learn how to install and use the client library for Natural Language, see Natural Language client libraries. For more information, see the Natural Language Go API reference documentation.

To authenticate to Natural Language, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

func analyzeSyntax(ctx context.Context, client *language.Client, text string) (*languagepb.AnnotateTextResponse, error) {
	return client.AnnotateText(ctx, &languagepb.AnnotateTextRequest{
		Document: &languagepb.Document{
			Source: &languagepb.Document_Content{
				Content: text,
			},
			Type: languagepb.Document_PLAIN_TEXT,
		},
		Features: &languagepb.AnnotateTextRequest_Features{
			ExtractSyntax: true,
		},
		EncodingType: languagepb.EncodingType_UTF8,
	})
}

Java

To learn how to install and use the client library for Natural Language, see Natural Language client libraries. For more information, see the Natural Language Java API reference documentation.

To authenticate to Natural Language, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

// Instantiate the Language client com.google.cloud.language.v1.LanguageServiceClient
try (com.google.cloud.language.v1.LanguageServiceClient language =
    com.google.cloud.language.v1.LanguageServiceClient.create()) {
  com.google.cloud.language.v1.Document doc =
      com.google.cloud.language.v1.Document.newBuilder()
          .setContent(text)
          .setType(com.google.cloud.language.v1.Document.Type.PLAIN_TEXT)
          .build();
  AnalyzeSyntaxRequest request =
      AnalyzeSyntaxRequest.newBuilder()
          .setDocument(doc)
          .setEncodingType(com.google.cloud.language.v1.EncodingType.UTF16)
          .build();
  // Analyze the syntax in the given text
  AnalyzeSyntaxResponse response = language.analyzeSyntax(request);
  // Print the response
  for (Token token : response.getTokensList()) {
    System.out.printf("\tText: %s\n", token.getText().getContent());
    System.out.printf("\tBeginOffset: %d\n", token.getText().getBeginOffset());
    System.out.printf("Lemma: %s\n", token.getLemma());
    System.out.printf("PartOfSpeechTag: %s\n", token.getPartOfSpeech().getTag());
    System.out.printf("\tAspect: %s\n", token.getPartOfSpeech().getAspect());
    System.out.printf("\tCase: %s\n", token.getPartOfSpeech().getCase());
    System.out.printf("\tForm: %s\n", token.getPartOfSpeech().getForm());
    System.out.printf("\tGender: %s\n", token.getPartOfSpeech().getGender());
    System.out.printf("\tMood: %s\n", token.getPartOfSpeech().getMood());
    System.out.printf("\tNumber: %s\n", token.getPartOfSpeech().getNumber());
    System.out.printf("\tPerson: %s\n", token.getPartOfSpeech().getPerson());
    System.out.printf("\tProper: %s\n", token.getPartOfSpeech().getProper());
    System.out.printf("\tReciprocity: %s\n", token.getPartOfSpeech().getReciprocity());
    System.out.printf("\tTense: %s\n", token.getPartOfSpeech().getTense());
    System.out.printf("\tVoice: %s\n", token.getPartOfSpeech().getVoice());
    System.out.println("DependencyEdge");
    System.out.printf("\tHeadTokenIndex: %d\n", token.getDependencyEdge().getHeadTokenIndex());
    System.out.printf("\tLabel: %s\n\n", token.getDependencyEdge().getLabel());
  }
  return response.getTokensList();
}

Node.js

To learn how to install and use the client library for Natural Language, see Natural Language client libraries. For more information, see the Natural Language Node.js API reference documentation.

To authenticate to Natural Language, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

// Imports the Google Cloud client library
const language = require('@google-cloud/language');

// Creates a client
const client = new language.LanguageServiceClient();

/**
 * TODO(developer): Uncomment the following line to run this code.
 */
// const text = 'Your text to analyze, e.g. Hello, world!';

// Prepares a document, representing the provided text
const document = {
  content: text,
  type: 'PLAIN_TEXT',
};

// Need to specify an encodingType to receive word offsets
const encodingType = 'UTF8';

// Detects the syntax of the document
const [syntax] = await client.analyzeSyntax({document, encodingType});

console.log('Tokens:');
syntax.tokens.forEach(part => {
  console.log(`${part.partOfSpeech.tag}: ${part.text.content}`);
  console.log('Morphology:', part.partOfSpeech);
});

Python

To learn how to install and use the client library for Natural Language, see Natural Language client libraries. For more information, see the Natural Language Python API reference documentation.

To authenticate to Natural Language, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

from google.cloud import language_v1


def sample_analyze_syntax(text_content):
    """
    Analyzing Syntax in a String

    Args:
      text_content The text content to analyze
    """

    client = language_v1.LanguageServiceClient()

    # text_content = 'This is a short sentence.'

    # Available types: PLAIN_TEXT, HTML
    type_ = language_v1.Document.Type.PLAIN_TEXT

    # Optional. If not specified, the language is automatically detected.
    # For list of supported languages:
    # https://cloud.google.com/natural-language/docs/languages
    language = "en"
    document = {"content": text_content, "type_": type_, "language": language}

    # Available values: NONE, UTF8, UTF16, UTF32
    encoding_type = language_v1.EncodingType.UTF8

    response = client.analyze_syntax(
        request={"document": document, "encoding_type": encoding_type}
    )
    # Loop through tokens returned from the API
    for token in response.tokens:
        # Get the text content of this token. Usually a word or punctuation.
        text = token.text
        print(f"Token text: {text.content}")
        print(f"Location of this token in overall document: {text.begin_offset}")
        # Get the part of speech information for this token.
        # Part of speech is defined in:
        # http://www.lrec-conf.org/proceedings/lrec2012/pdf/274_Paper.pdf
        part_of_speech = token.part_of_speech
        # Get the tag, e.g. NOUN, ADJ for Adjective, et al.
        print(
            "Part of Speech tag: {}".format(
                language_v1.PartOfSpeech.Tag(part_of_speech.tag).name
            )
        )
        # Get the voice, e.g. ACTIVE or PASSIVE
        print(
            "Voice: {}".format(
                language_v1.PartOfSpeech.Voice(part_of_speech.voice).name
            )
        )
        # Get the tense, e.g. PAST, FUTURE, PRESENT, et al.
        print(
            "Tense: {}".format(
                language_v1.PartOfSpeech.Tense(part_of_speech.tense).name
            )
        )
        # See API reference for additional Part of Speech information available
        # Get the lemma of the token. Wikipedia lemma description
        # https://en.wikipedia.org/wiki/Lemma_(morphology)
        print(f"Lemma: {token.lemma}")
        # Get the dependency tree parse information for this token.
        # For more information on dependency labels:
        # http://www.aclweb.org/anthology/P13-2017
        dependency_edge = token.dependency_edge
        print(f"Head token index: {dependency_edge.head_token_index}")
        print(
            "Label: {}".format(
                language_v1.DependencyEdge.Label(dependency_edge.label).name
            )
        )

    # Get the language of the text, which will be the same as
    # the language specified in the request or, if not specified,
    # the automatically-detected language.
    print(f"Language of the text: {response.language}")

Additional languages

C#: Please follow the C# setup instructions on the client libraries page and then visit the Natural Language reference documentation for .NET.

PHP: Please follow the PHP setup instructions on the client libraries page and then visit the Natural Language reference documentation for PHP.

Ruby: Please follow the Ruby setup instructions on the client libraries page and then visit the Natural Language reference documentation for Ruby.

Analyzing Syntax from Cloud Storage

For your convenience, the Natural Language API can perform syntactic analysis directly on a file located in Cloud Storage, without the need to send the contents of the file in the body of your request.

Here is an example of performing syntactic analysis on a file located in Cloud Storage.

Protocol

To analyze syntax in a document stored in Cloud Storage, make a POST request to the documents:analyzeSyntax REST method and provide the appropriate request body with the path to the document as shown in the following example.

curl -X POST \
     -H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
     -H "Content-Type: application/json; charset=utf-8" \
     --data "{
  'encodingType': 'UTF8',
  'document': {
    'type': 'PLAIN_TEXT',
    'gcsContentUri': 'gs://<bucket-name>/<object-name>'
  }
}" "https://language.googleapis.com/v1/documents:analyzeSyntax"

If you don't specify document.language, then the language will be automatically detected. For information on which languages are supported by the Natural Language API, see Language Support. See the Document reference documentation for more information on configuring the request body.

If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:
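As a rough sketch of the Cloud Storage variant of the request body, the following Python checks that a URI has the gs://<bucket-name>/<object-name> shape before placing it in gcsContentUri. Both split_gcs_uri and build_gcs_syntax_request are hypothetical helpers written for this illustration; they only check the URI's form and do not verify that the bucket or object exists.

```python
from urllib.parse import urlsplit


def split_gcs_uri(uri):
    """Split gs://<bucket-name>/<object-name> into (bucket, object name)."""
    parts = urlsplit(uri)
    object_name = parts.path.lstrip("/")
    if parts.scheme != "gs" or not parts.netloc or not object_name:
        raise ValueError(f"expected gs://<bucket-name>/<object-name>, got {uri!r}")
    return parts.netloc, object_name


def build_gcs_syntax_request(uri, encoding_type="UTF8"):
    """Return the JSON body for analyzing a Cloud Storage object as a dict."""
    split_gcs_uri(uri)  # Raises ValueError if the URI is malformed.
    return {
        "encodingType": encoding_type,
        "document": {"type": "PLAIN_TEXT", "gcsContentUri": uri},
    }


# The bucket and object names here are placeholders.
print(build_gcs_syntax_request("gs://my-bucket/notes/hello.txt"))
```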

{
  "sentences": [
    {
      "text": {
        "content": "Hello, world!",
        "beginOffset": 0
      }
    }
  ],
  "tokens": [
    {
      "text": {
        "content": "Hello",
        "beginOffset": 0
      },
      "partOfSpeech": {
        "tag": "X",
        // ...
      },
      "dependencyEdge": {
        "headTokenIndex": 2,
        "label": "DISCOURSE"
      },
      "lemma": "Hello"
    },
    {
      "text": {
        "content": ",",
        "beginOffset": 5
      },
      "partOfSpeech": {
        "tag": "PUNCT",
        // ...
      },
      "dependencyEdge": {
        "headTokenIndex": 2,
        "label": "P"
      },
      "lemma": ","
    },
    // ...
  ],
  "language": "en"
}

The tokens array contains Token objects representing the detected sentence tokens, which include information such as a token's part of speech and its position in the sentence.
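The response lists sentences and tokens in separate arrays, so relating a token to its sentence takes a small lookup. One possible approach, sketched below in Python, compares each token's beginOffset with the sentence start offsets using a binary search. The sample data is hand-written rather than real API output, group_tokens_by_sentence is a hypothetical helper, and the sketch assumes sentences are ordered by offset and that the request used an encodingType other than NONE so that offsets are meaningful.

```python
from bisect import bisect_right

# Hand-written sample for the text "Hello, world! Bye."; not real API output.
SAMPLE = {
    "sentences": [
        {"text": {"content": "Hello, world!", "beginOffset": 0}},
        {"text": {"content": "Bye.", "beginOffset": 14}},
    ],
    "tokens": [
        {"text": {"content": "Hello", "beginOffset": 0}},
        {"text": {"content": ",", "beginOffset": 5}},
        {"text": {"content": "world", "beginOffset": 7}},
        {"text": {"content": "!", "beginOffset": 12}},
        {"text": {"content": "Bye", "beginOffset": 14}},
        {"text": {"content": ".", "beginOffset": 17}},
    ],
}


def group_tokens_by_sentence(response):
    """Return one list of token strings per sentence, in sentence order."""
    starts = [s["text"]["beginOffset"] for s in response["sentences"]]
    groups = [[] for _ in starts]
    for token in response["tokens"]:
        # Find the last sentence that starts at or before this token.
        index = max(bisect_right(starts, token["text"]["beginOffset"]) - 1, 0)
        groups[index].append(token["text"]["content"])
    return groups


print(group_tokens_by_sentence(SAMPLE))
```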

gcloud

Refer to the analyze-syntax command for complete details.

To perform syntax analysis on a file in Cloud Storage, use the gcloud command line tool with the --content-file flag to identify the file path that contains the content to analyze:

gcloud ml language analyze-syntax --content-file=gs://YOUR_BUCKET_NAME/YOUR_FILE_NAME

If the request is successful, the server returns a response in JSON format:

{
  "sentences": [
    {
      "text": {
        "content": "Hello, world!",
        "beginOffset": 0
      }
    }
  ],
  "tokens": [
    {
      "text": {
        "content": "Hello",
        "beginOffset": 0
      },
      "partOfSpeech": {
        "tag": "X",
        // ...
      },
      "dependencyEdge": {
        "headTokenIndex": 2,
        "label": "DISCOURSE"
      },
      "lemma": "Hello"
    },
    {
      "text": {
        "content": ",",
        "beginOffset": 5
      },
      "partOfSpeech": {
        "tag": "PUNCT",
        // ...
      },
      "dependencyEdge": {
        "headTokenIndex": 2,
        "label": "P"
      },
      "lemma": ","
    },
    // ...
  ],
  "language": "en"
}

The tokens array contains Token objects representing the detected sentence tokens, which include information such as a token's part of speech and its position in the sentence.
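Each token's dependencyEdge names its head by index, which is enough to rebuild the dependency tree on the client side. The Python sketch below does this for a small hand-written token list (not real API output), treating a token whose headTokenIndex equals its own index as a root. The helper names build_dependency_tree and render_tree are hypothetical.

```python
# Hand-written tokens for "She reads books"; not real API output.
TOKENS = [
    {"text": {"content": "She"}, "dependencyEdge": {"headTokenIndex": 1, "label": "NSUBJ"}},
    {"text": {"content": "reads"}, "dependencyEdge": {"headTokenIndex": 1, "label": "ROOT"}},
    {"text": {"content": "books"}, "dependencyEdge": {"headTokenIndex": 1, "label": "DOBJ"}},
]


def build_dependency_tree(tokens):
    """Return (root indices, mapping of head index to child indices)."""
    children = {i: [] for i in range(len(tokens))}
    roots = []
    for i, token in enumerate(tokens):
        head = token["dependencyEdge"]["headTokenIndex"]
        if head == i:
            roots.append(i)  # A root token is its own head.
        else:
            children[head].append(i)
    return roots, children


def render_tree(tokens, roots, children):
    """Return indented lines, one per token, in depth-first order."""
    lines = []

    def walk(index, depth):
        token = tokens[index]
        label = token["dependencyEdge"]["label"]
        lines.append("  " * depth + f'{token["text"]["content"]} ({label})')
        for child in children[index]:
            walk(child, depth + 1)

    for root in roots:
        walk(root, 0)
    return lines


roots, children = build_dependency_tree(TOKENS)
print("\n".join(render_tree(TOKENS, roots, children)))
```

A response covering several sentences yields one root per sentence, which is why the sketch collects roots in a list rather than assuming a single one.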

Go

To learn how to install and use the client library for Natural Language, see Natural Language client libraries. For more information, see the Natural Language Go API reference documentation.

To authenticate to Natural Language, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

// The client is passed in as a parameter, matching the string-based example.
func analyzeSyntaxFromGCS(ctx context.Context, client *language.Client, gcsURI string) (*languagepb.AnnotateTextResponse, error) {
	return client.AnnotateText(ctx, &languagepb.AnnotateTextRequest{
		Document: &languagepb.Document{
			Source: &languagepb.Document_GcsContentUri{
				GcsContentUri: gcsURI,
			},
			Type: languagepb.Document_PLAIN_TEXT,
		},
		Features: &languagepb.AnnotateTextRequest_Features{
			ExtractSyntax: true,
		},
		EncodingType: languagepb.EncodingType_UTF8,
	})
}

Java

To learn how to install and use the client library for Natural Language, see Natural Language client libraries. For more information, see the Natural Language Java API reference documentation.

To authenticate to Natural Language, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

// Instantiate the Language client com.google.cloud.language.v1.LanguageServiceClient
try (com.google.cloud.language.v1.LanguageServiceClient language =
    com.google.cloud.language.v1.LanguageServiceClient.create()) {
  com.google.cloud.language.v1.Document doc =
      com.google.cloud.language.v1.Document.newBuilder()
          .setGcsContentUri(gcsUri)
          .setType(com.google.cloud.language.v1.Document.Type.PLAIN_TEXT)
          .build();
  AnalyzeSyntaxRequest request =
      AnalyzeSyntaxRequest.newBuilder()
          .setDocument(doc)
          .setEncodingType(com.google.cloud.language.v1.EncodingType.UTF16)
          .build();
  // Analyze the syntax in the given text
  AnalyzeSyntaxResponse response = language.analyzeSyntax(request);
  // Print the response
  for (Token token : response.getTokensList()) {
    System.out.printf("\tText: %s\n", token.getText().getContent());
    System.out.printf("\tBeginOffset: %d\n", token.getText().getBeginOffset());
    System.out.printf("Lemma: %s\n", token.getLemma());
    System.out.printf("PartOfSpeechTag: %s\n", token.getPartOfSpeech().getTag());
    System.out.printf("\tAspect: %s\n", token.getPartOfSpeech().getAspect());
    System.out.printf("\tCase: %s\n", token.getPartOfSpeech().getCase());
    System.out.printf("\tForm: %s\n", token.getPartOfSpeech().getForm());
    System.out.printf("\tGender: %s\n", token.getPartOfSpeech().getGender());
    System.out.printf("\tMood: %s\n", token.getPartOfSpeech().getMood());
    System.out.printf("\tNumber: %s\n", token.getPartOfSpeech().getNumber());
    System.out.printf("\tPerson: %s\n", token.getPartOfSpeech().getPerson());
    System.out.printf("\tProper: %s\n", token.getPartOfSpeech().getProper());
    System.out.printf("\tReciprocity: %s\n", token.getPartOfSpeech().getReciprocity());
    System.out.printf("\tTense: %s\n", token.getPartOfSpeech().getTense());
    System.out.printf("\tVoice: %s\n", token.getPartOfSpeech().getVoice());
    System.out.println("DependencyEdge");
    System.out.printf("\tHeadTokenIndex: %d\n", token.getDependencyEdge().getHeadTokenIndex());
    System.out.printf("\tLabel: %s\n\n", token.getDependencyEdge().getLabel());
  }
  return response.getTokensList();
}

Node.js

To learn how to install and use the client library for Natural Language, see Natural Language client libraries. For more information, see the Natural Language Node.js API reference documentation.

To authenticate to Natural Language, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

// Imports the Google Cloud client library
const language = require('@google-cloud/language');

// Creates a client
const client = new language.LanguageServiceClient();

/**
 * TODO(developer): Uncomment the following lines to run this code
 */
// const bucketName = 'Your bucket name, e.g. my-bucket';
// const fileName = 'Your file name, e.g. my-file.txt';

// Prepares a document, representing a text file in Cloud Storage
const document = {
  gcsContentUri: `gs://${bucketName}/${fileName}`,
  type: 'PLAIN_TEXT',
};

// Need to specify an encodingType to receive word offsets
const encodingType = 'UTF8';

// Detects the syntax of the document
const [syntax] = await client.analyzeSyntax({document, encodingType});

console.log('Parts of speech:');
syntax.tokens.forEach(part => {
  console.log(`${part.partOfSpeech.tag}: ${part.text.content}`);
  console.log('Morphology:', part.partOfSpeech);
});

Python

To learn how to install and use the client library for Natural Language, see Natural Language client libraries. For more information, see the Natural Language Python API reference documentation.

To authenticate to Natural Language, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

from google.cloud import language_v1


def sample_analyze_syntax(gcs_content_uri):
    """
    Analyzing Syntax in text file stored in Cloud Storage

    Args:
      gcs_content_uri Google Cloud Storage URI where the file content is located.
      e.g. gs://[Your Bucket]/[Path to File]
    """

    client = language_v1.LanguageServiceClient()

    # gcs_content_uri = 'gs://cloud-samples-data/language/syntax-sentence.txt'

    # Available types: PLAIN_TEXT, HTML
    type_ = language_v1.Document.Type.PLAIN_TEXT

    # Optional. If not specified, the language is automatically detected.
    # For list of supported languages:
    # https://cloud.google.com/natural-language/docs/languages
    language = "en"
    document = {
        "gcs_content_uri": gcs_content_uri,
        "type_": type_,
        "language": language,
    }

    # Available values: NONE, UTF8, UTF16, UTF32
    encoding_type = language_v1.EncodingType.UTF8

    response = client.analyze_syntax(
        request={"document": document, "encoding_type": encoding_type}
    )
    # Loop through tokens returned from the API
    for token in response.tokens:
        # Get the text content of this token. Usually a word or punctuation.
        text = token.text
        print(f"Token text: {text.content}")
        print(f"Location of this token in overall document: {text.begin_offset}")
        # Get the part of speech information for this token.
        # Part of speech is defined in:
        # http://www.lrec-conf.org/proceedings/lrec2012/pdf/274_Paper.pdf
        part_of_speech = token.part_of_speech
        # Get the tag, e.g. NOUN, ADJ for Adjective, et al.
        print(
            "Part of Speech tag: {}".format(
                language_v1.PartOfSpeech.Tag(part_of_speech.tag).name
            )
        )
        # Get the voice, e.g. ACTIVE or PASSIVE
        print(
            "Voice: {}".format(
                language_v1.PartOfSpeech.Voice(part_of_speech.voice).name
            )
        )
        # Get the tense, e.g. PAST, FUTURE, PRESENT, et al.
        print(
            "Tense: {}".format(
                language_v1.PartOfSpeech.Tense(part_of_speech.tense).name
            )
        )
        # See API reference for additional Part of Speech information available
        # Get the lemma of the token. Wikipedia lemma description
        # https://en.wikipedia.org/wiki/Lemma_(morphology)
        print(f"Lemma: {token.lemma}")
        # Get the dependency tree parse information for this token.
        # For more information on dependency labels:
        # http://www.aclweb.org/anthology/P13-2017
        dependency_edge = token.dependency_edge
        print(f"Head token index: {dependency_edge.head_token_index}")
        print(
            "Label: {}".format(
                language_v1.DependencyEdge.Label(dependency_edge.label).name
            )
        )

    # Get the language of the text, which will be the same as
    # the language specified in the request or, if not specified,
    # the automatically-detected language.
    print(f"Language of the text: {response.language}")

Additional languages

C#: Please follow the C# setup instructions on the client libraries page and then visit the Natural Language reference documentation for .NET.

PHP: Please follow the PHP setup instructions on the client libraries page and then visit the Natural Language reference documentation for PHP.

Ruby: Please follow the Ruby setup instructions on the client libraries page and then visit the Natural Language reference documentation for Ruby.

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2026-02-19 UTC.