Google Natural Language API

Mark Edmondson

2025-06-28

Source: vignettes/nlp.Rmd

The Google Natural Language API reveals the structure and meaning of text by offering powerful machine learning models in an easy-to-use REST API. You can use it to extract information about people, places, events and much more, mentioned in text documents, news articles or blog posts. You can also use it to understand sentiment about your product on social media or parse intent from customer conversations happening in a call center or a messaging app.

Read more on the Google Natural Language API.

The Natural Language API offers several types of natural language analysis. You can request them individually; by default, all of them are returned. The available analyses are:

  • Entity analysis - Finds named entities (currently proper names and common nouns) in the text along with entity types, salience, mentions for each entity, and other properties. If possible, it also returns metadata about the entity, such as a Wikipedia URL.
  • Syntax - Analyzes the syntax of the text and provides sentence boundaries and tokenization along with part-of-speech tags, dependency trees, and other properties.
  • Sentiment - The overall sentiment of the text, represented by a magnitude in [0, +inf) and a score between -1.0 (negative sentiment) and 1.0 (positive sentiment).
  • Content Classification - Analyzes a document and returns a list of content categories that apply to the text found in the document. A complete list of content categories can be found in the Google Natural Language API documentation.
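
As a minimal sketch of requesting a single analysis instead of the default, the wrapper below passes `nlp_type = "analyzeSentiment"` to `gl_nlp()`. The name `sentiment_only` is hypothetical, and the `nlp_type` value is assumed from the package documentation; check `?gl_nlp` for the options in your installed version. Defining the function sends nothing to the API; calling it needs an authenticated session.

```r
# Hypothetical helper: request only sentiment analysis for each text.
# Assumes googleLanguageR is installed and authenticated before it is called.
sentiment_only <- function(texts) {
  stopifnot(is.character(texts), length(texts) > 0)
  googleLanguageR::gl_nlp(texts, nlp_type = "analyzeSentiment")
}

# Not run here, because it requires Google Cloud credentials:
# res <- sentiment_only(c("I love this product", "The call dropped twice"))
```

Restricting the request to one analysis keeps the response smaller when the other results are not needed.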

Demo for Entity Analysis

You can pass a vector of text, which will call the API for each element. The return value is a list with one element per type of analysis; most elements are lists of tibbles or data frames holding one entry per input text.

    library(googleLanguageR)

    # random text from Wikipedia
    texts <- c(
      "Norma is a small constellation in the Southern Celestial Hemisphere between Ara and Lupus, one of twelve drawn up in the 18th century by French astronomer Nicolas Louis de Lacaille and one of several depicting scientific instruments. Its name refers to a right angle in Latin, and is variously considered to represent a rule, a carpenter's square, a set square or a level. It remains one of the 88 modern constellations. Four of Norma's brighter stars make up a square in the field of faint stars. Gamma2 Normae is the brightest star with an apparent magnitude of 4.0. Mu Normae is one of the most luminous stars known, but is partially obscured by distance and cosmic dust. Four star systems are known to harbour planets. ",
      "Solomon Wariso (born 11 November 1966 in Portsmouth) is a retired English sprinter who competed primarily in the 200 and 400 metres.[1] He represented his country at two outdoor and three indoor World Championships and is the British record holder in the indoor 4 × 400 metres relay."
    )

    nlp_result <- gl_nlp(texts)

Each text has its own entry in the returned tibbles:

    str(nlp_result, max.level = 2)
    List of 7
     $ sentences        :List of 2
      ..$ :'data.frame': 7 obs. of  4 variables:
      ..$ :'data.frame': 1 obs. of  4 variables:
     $ tokens           :List of 2
      ..$ :'data.frame': 139 obs. of  17 variables:
      ..$ :'data.frame': 54 obs. of  17 variables:
     $ entities         :List of 2
      ..$ :Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 52 obs. of  9 variables:
      ..$ :Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 8 obs. of  9 variables:
     $ language         : chr [1:2] "en" "en"
     $ text             : chr [1:2] "Norma is a small constellation in the Southern Celestial Hemisphere between Ara and Lupus, one of twelve drawn "| __truncated__ "Solomon Wariso (born 11 November 1966 in Portsmouth) is a retired English sprinter who competed primarily in th"| __truncated__
     $ documentSentiment:Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 2 obs. of  2 variables:
      ..$ magnitude: num [1:2] 2.4 0.1
      ..$ score    : num [1:2] 0.3 0.1
     $ classifyText     :Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 1 obs. of  2 variables:
      ..$ name      : chr "/Science/Astronomy"
      ..$ confidence: num 0.93

Sentence structure and sentiment:

    ## sentences structure
    nlp_result$sentences[[2]]
    content
    1 Solomon Wariso (born 11 November 1966 in Portsmouth) is a retired English sprinter who competed primarily in the 200 and 400 metres.[1] He represented his country at two outdoor and three indoor World Championships and is the British record holder in the indoor 4 × 400 metres relay.
      beginOffset magnitude score
    1           0       0.1   0.1

Information on what words (tokens) are within each text:

    # word tokens data
    str(nlp_result$tokens[[1]])
    'data.frame': 139 obs. of  17 variables:
     $ content       : chr  "Norma" "is" "a" "small" ...
     $ beginOffset   : int  0 6 9 11 17 31 34 38 47 57 ...
     $ tag           : chr  "NOUN" "VERB" "DET" "ADJ" ...
     $ aspect        : chr  "ASPECT_UNKNOWN" "ASPECT_UNKNOWN" "ASPECT_UNKNOWN" "ASPECT_UNKNOWN" ...
     $ case          : chr  "CASE_UNKNOWN" "CASE_UNKNOWN" "CASE_UNKNOWN" "CASE_UNKNOWN" ...
     $ form          : chr  "FORM_UNKNOWN" "FORM_UNKNOWN" "FORM_UNKNOWN" "FORM_UNKNOWN" ...
     $ gender        : chr  "GENDER_UNKNOWN" "GENDER_UNKNOWN" "GENDER_UNKNOWN" "GENDER_UNKNOWN" ...
     $ mood          : chr  "MOOD_UNKNOWN" "INDICATIVE" "MOOD_UNKNOWN" "MOOD_UNKNOWN" ...
     $ number        : chr  "SINGULAR" "SINGULAR" "NUMBER_UNKNOWN" "NUMBER_UNKNOWN" ...
     $ person        : chr  "PERSON_UNKNOWN" "THIRD" "PERSON_UNKNOWN" "PERSON_UNKNOWN" ...
     $ proper        : chr  "PROPER" "PROPER_UNKNOWN" "PROPER_UNKNOWN" "PROPER_UNKNOWN" ...
     $ reciprocity   : chr  "RECIPROCITY_UNKNOWN" "RECIPROCITY_UNKNOWN" "RECIPROCITY_UNKNOWN" "RECIPROCITY_UNKNOWN" ...
     $ tense         : chr  "TENSE_UNKNOWN" "PRESENT" "TENSE_UNKNOWN" "TENSE_UNKNOWN" ...
     $ voice         : chr  "VOICE_UNKNOWN" "VOICE_UNKNOWN" "VOICE_UNKNOWN" "VOICE_UNKNOWN" ...
     $ headTokenIndex: int  1 1 4 4 1 4 9 9 9 5 ...
     $ label         : chr  "NSUBJ" "ROOT" "DET" "AMOD" ...
     $ value         : chr  "Norma" "be" "a" "small" ...
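
The `headTokenIndex` column points at each token's parent in the dependency tree. In the output above the ROOT token "is" has index 1, which suggests the index counts from 0, so a lookup in R needs a shift of one. The sketch below uses a mock data frame rather than a live API result: the first four words, the offsets and the head indices are copied from the `str()` output, while the remaining six words are read off the source sentence and are an assumption.

```r
# Mock of the first ten tokens of text 1 (see the assumptions stated above)
tokens <- data.frame(
  content = c("Norma", "is", "a", "small", "constellation",
              "in", "the", "Southern", "Celestial", "Hemisphere"),
  beginOffset = c(0L, 6L, 9L, 11L, 17L, 31L, 34L, 38L, 47L, 57L),
  headTokenIndex = c(1L, 1L, 4L, 4L, 1L, 4L, 9L, 9L, 9L, 5L),
  stringsAsFactors = FALSE
)

# headTokenIndex counts from 0 and R vectors count from 1, so shift by one
tokens$head <- tokens$content[tokens$headTokenIndex + 1L]

tokens[, c("content", "head")]
```

On the real result, the same lookup would be applied to `nlp_result$tokens[[1]]`; a token whose head is itself is the root of the sentence.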

The entities identified within each text, with an optional Wikipedia URL if one is available:

    nlp_result$entities
    [[1]]
    # A tibble: 52 x 9
       name           type         salience mid   wikipedia_url magnitude score beginOffset mention_type
       <chr>          <chr>           <dbl> <chr> <chr>             <dbl> <dbl>       <int> <chr>
     1 angle          OTHER         0.0133  NA    NA                  0     0           261 COMMON
     2 Ara            ORGANIZATION  0.0631  NA    NA                  0     0            76 PROPER
     3 astronomer     NA           NA       NA    NA                 NA    NA           144 COMMON
     4 carpenter      PERSON        0.0135  NA    NA                  0     0           328 COMMON
     5 constellation  OTHER         0.150   NA    NA                  0     0            17 COMMON
     6 constellations OTHER         0.0140  NA    NA                  0.9   0.9         405 COMMON
     7 distance       OTHER         0.00645 NA    NA                  0     0           649 COMMON
     8 dust           OTHER         0.00645 NA    NA                  0.3  -0.3         669 COMMON
     9 field          LOCATION      0.00407 NA    NA                  0.6  -0.6         476 COMMON
    10 French         LOCATION      0.0242  NA    NA                  0     0           137 PROPER
    # ... with 42 more rows

    [[2]]
    # A tibble: 8 x 9
      name                type         salience mid         wikipedia_url    magnitude score beginOffset mention_type
      <chr>               <chr>           <dbl> <chr>       <chr>                <dbl> <dbl>       <int> <chr>
    1 British             LOCATION       0.0255 NA          NA                     0     0           226 PROPER
    2 country             LOCATION       0.0475 NA          NA                     0     0           155 COMMON
    3 English             OTHER          0.0530 NA          NA                     0     0            66 PROPER
    4 Portsmouth          LOCATION       0.0530 /m/0619_    https://en.wiki…       0     0            41 PROPER
    5 record holder       PERSON         0.0541 NA          NA                     0     0           234 COMMON
    6 Solomon Wariso      ORGANIZATION   0.156  /g/120x5nf6 https://en.wiki…       0     0             0 PROPER
    7 sprinter            PERSON         0.600  NA          NA                     0     0            74 COMMON
    8 World Championships EVENT          0.0113 NA          NA                     0.1   0.1         195 PROPER
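
A common next step is to rank entities by salience and keep those that were matched to a knowledge-base entry. The sketch below works on a mock data frame whose values are copied from the second table above; `wikipedia_url` is left out because it is truncated in the printed output, so a non-missing `mid` is used as the marker for a linked entity.

```r
# Mock of the second entities table, with values copied from the output above
entities <- data.frame(
  name = c("British", "country", "English", "Portsmouth", "record holder",
           "Solomon Wariso", "sprinter", "World Championships"),
  type = c("LOCATION", "LOCATION", "OTHER", "LOCATION", "PERSON",
           "ORGANIZATION", "PERSON", "EVENT"),
  salience = c(0.0255, 0.0475, 0.0530, 0.0530, 0.0541, 0.156, 0.600, 0.0113),
  mid = c(NA, NA, NA, "/m/0619_", NA, "/g/120x5nf6", NA, NA),
  stringsAsFactors = FALSE
)

# Most salient entities first
ranked <- entities[order(entities$salience, decreasing = TRUE), ]
head(ranked, 3)

# Entities that carry a knowledge-base identifier
linked <- entities[!is.na(entities$mid), ]
linked$name

# Number of entities per type
table(entities$type)
```

With the real result, `nlp_result$entities[[2]]` would take the place of the mock `entities` object.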

Sentiment of the entire text:

    nlp_result$documentSentiment
    # A tibble: 2 x 2
      magnitude score
          <dbl> <dbl>
    1       2.4   0.3
    2       0.1   0.1
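
Score and magnitude are easier to read together: the score gives the direction, while the magnitude gives the overall amount of emotional content, so a score near zero with a high magnitude can indicate mixed sentiment. The sketch below turns the two numbers into a label. The function `label_sentiment` and its cut-offs are hypothetical and would need tuning for real data.

```r
# Values copied from the documentSentiment output above
sentiment <- data.frame(magnitude = c(2.4, 0.1), score = c(0.3, 0.1))

# Hypothetical cut-offs; tune them for your own data
label_sentiment <- function(score, magnitude, score_cut = 0.25, mag_cut = 1) {
  ifelse(score >= score_cut, "positive",
    ifelse(score <= -score_cut, "negative",
      ifelse(magnitude >= mag_cut, "mixed", "neutral")))
}

sentiment$label <- label_sentiment(sentiment$score, sentiment$magnitude)
sentiment
```

With these cut-offs the first text is labelled positive and the second neutral.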

The category for the text, as defined by Google's list of content categories:

    nlp_result$classifyText
    # A tibble: 1 x 2
      name               confidence
      <chr>                   <dbl>
    1 /Science/Astronomy       0.93
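
Category names are slash-separated paths, so they can be split into levels for grouping or filtering. The sketch below does this on a mock copy of the output above; the 0.5 confidence threshold is a hypothetical choice, not a recommendation from the API.

```r
# Values copied from the classifyText output above
classify <- data.frame(name = "/Science/Astronomy", confidence = 0.93,
                       stringsAsFactors = FALSE)

# Drop the leading slash, then split the path into its levels
parts <- strsplit(sub("^/", "", classify$name), "/", fixed = TRUE)
classify$top_level <- vapply(parts, function(p) p[1], character(1))
classify$depth <- lengths(parts)

# Keep only categories above a hypothetical confidence threshold
confident <- classify[classify$confidence >= 0.5, ]
confident
```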

The language for the text:

    nlp_result$language
    # [1] "en" "en"

The original text passed in, to help when working with the output:

    nlp_result$text
    [1] "Norma is a small constellation in the Southern Celestial Hemisphere between Ara and Lupus, one of twelve drawn up in the 18th century by French astronomer Nicolas Louis de Lacaille and one of several depicting scientific instruments. Its name refers to a right angle in Latin, and is variously considered to represent a rule, a carpenter's square, a set square or a level. It remains one of the 88 modern constellations. Four of Norma's brighter stars make up a square in the field of faint stars. Gamma2 Normae is the brightest star with an apparent magnitude of 4.0. Mu Normae is one of the most luminous stars known, but is partially obscured by distance and cosmic dust. Four star systems are known to harbour planets."
    [2] "Solomon Wariso (born 11 November 1966 in Portsmouth) is a retired English sprinter who competed primarily in the 200 and 400 metres.[1] He represented his country at two outdoor and three indoor World Championships and is the British record holder in the indoor 4 × 400 metres relay."
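
Because the text is returned alongside the results, the per-text pieces can be lined up in one data frame. The sketch below assumes that every element keeps the order of the input vector, which matches the output shown in this vignette. It uses a mock `nlp_result` with shortened texts, so it runs without calling the API.

```r
# Mock of the relevant parts of the result, using values shown in this vignette
# (texts shortened here for readability)
nlp_result <- list(
  text = c("Norma is a small constellation ...",
           "Solomon Wariso (born 11 November 1966 ..."),
  language = c("en", "en"),
  documentSentiment = data.frame(magnitude = c(2.4, 0.1), score = c(0.3, 0.1))
)

# One row per input text, assuming all elements share the input order
summary_df <- data.frame(
  text = substr(nlp_result$text, 1, 30),
  language = nlp_result$language,
  nlp_result$documentSentiment,
  stringsAsFactors = FALSE
)
summary_df
```

The `classifyText` element is left out on purpose: in the output above it has one row for two texts, so it cannot be matched to the inputs by position.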
