Inspecting text for sensitive data
Sensitive Data Protection can detect and classify sensitive data within text content. Given text input, the DLP API returns details about any infoTypes found in the text, a likelihood value, and offset information.
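As a rough illustration of how the offset information can be used, the following sketch takes a finding shaped like the REST JSON output shown later on this page and recovers the matched text from the input. The helper names `quote_from_codepoints` and `quote_from_bytes` are hypothetical, and the code assumes that offsets arrive as strings and that a zero `start` value may be omitted from the JSON.

```python
def quote_from_codepoints(text: str, finding: dict) -> str:
    """Slice the input by Unicode code points using codepointRange."""
    rng = finding["location"]["codepointRange"]
    # Offsets are 64-bit integers, which the JSON output encodes as strings.
    return text[int(rng.get("start", 0)):int(rng["end"])]


def quote_from_bytes(text: str, finding: dict) -> str:
    """Slice the UTF-8 encoding of the input using byteRange."""
    rng = finding["location"]["byteRange"]
    data = text.encode("utf-8")
    return data[int(rng.get("start", 0)):int(rng["end"])].decode("utf-8")


text = "My phone number is (415) 555-0890"
finding = {
    "infoType": {"name": "PHONE_NUMBER"},
    "likelihood": "VERY_LIKELY",
    "location": {
        "byteRange": {"start": "19", "end": "33"},
        "codepointRange": {"start": "19", "end": "33"},
    },
}

print(quote_from_codepoints(text, finding))
print(quote_from_bytes(text, finding))
```

For ASCII-only input the two ranges are identical. They can differ when the text contains multi-byte characters, in which case the code point range is the one that lines up with Python string indexing.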
Best Practices
Identify and prioritize scanning
It's important to identify your resources and specify which have the highest priority for scanning. When just getting started, you may have a large backlog of data that needs classification, and it will be impossible to scan it all immediately. Start with the data that poses the highest risk, for example, data that is frequently accessed, widely accessible, or unknown.
Reduce latency
Latency is affected by several factors: the amount of data to scan, the storage repository being scanned, and the type and number of infoTypes that are enabled.
To help reduce job latency, you can try the following:
- Enable sampling.
- Avoid enabling infoTypes you don't need. While useful in certain scenarios, some infoTypes (including PERSON_NAME, FEMALE_NAME, MALE_NAME, FIRST_NAME, LAST_NAME, DATE_OF_BIRTH, LOCATION, STREET_ADDRESS, and ORGANIZATION_NAME) can make requests run much more slowly than requests that do not include them.
- Always specify infoTypes explicitly. Do not use an empty infoTypes list.
- Consider organizing the data to be inspected into a table with rows and columns, if possible, to reduce network round trips.
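To make the last two points concrete, here is a minimal sketch that packs several records into a single table-shaped item and builds one request with an explicit infoTypes list. The helper names `rows_to_table_item` and `build_inspect_request` are hypothetical, and the dictionary keys follow the snake_case field names that the Python samples on this page pass to the client.

```python
from typing import Dict, List, Sequence


def rows_to_table_item(headers: Sequence[str], rows: Sequence[Sequence[str]]) -> Dict:
    """Pack rows of strings into one table-shaped content item."""
    for row in rows:
        if len(row) != len(headers):
            raise ValueError("Every row must have one value per header.")
    return {
        "table": {
            "headers": [{"name": header} for header in headers],
            "rows": [
                {"values": [{"string_value": str(cell)} for cell in row]}
                for row in rows
            ],
        }
    }


def build_inspect_request(project: str, item: Dict, info_types: List[str]) -> Dict:
    """Build a single request with an explicit, non-empty infoTypes list."""
    if not info_types:
        raise ValueError("Specify infoTypes explicitly.")
    return {
        "parent": f"projects/{project}/locations/global",
        "inspect_config": {
            "info_types": [{"name": name} for name in info_types],
            "include_quote": True,
        },
        "item": item,
    }


item = rows_to_table_item(
    ["name", "email"],
    [["Gary", "gary@example.com"], ["Ana", "ana@example.com"]],
)
request = build_inspect_request("my-project", item, ["EMAIL_ADDRESS"])
print(len(request["item"]["table"]["rows"]))
```

The resulting dictionary could then be passed to `dlp.inspect_content(request=request)` in the same way as the Python samples on this page, so that many records are inspected in one round trip instead of one call per value.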
Limit the scope of your first scans
For best results, limit the scope of your first scans instead of scanning all of your data. Start with a few requests. Your findings will be more meaningful when you fine-tune what detectors to enable and what exclusion rules might be needed to reduce false positives. Avoid turning on all infoTypes if you don't need them all, as false positives or unusable findings may make it harder to assess your risk. While useful in certain scenarios, some infoTypes such as DATE, TIME, DOMAIN_NAME, and URL detect a broad range of findings and may not be useful to turn on.
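A narrowly scoped first configuration might look like the sketch below: a short list of infoTypes, a cap on findings, and one dictionary-based exclusion rule for values known to be false positives. The function name `build_narrow_inspect_config` and the default values are assumptions for illustration, and passing the matching type as the string `"MATCHING_TYPE_FULL_MATCH"` assumes that the client accepts enum names as strings, as the samples on this page do for `min_likelihood`.

```python
from typing import Dict, List


def build_narrow_inspect_config(
    info_types: List[str],
    excluded_words: List[str],
    max_findings: int = 100,
) -> Dict:
    """Build an inspect_config limited to a few infoTypes, with one exclusion rule."""
    if not info_types:
        raise ValueError("Specify at least one infoType explicitly.")
    named = [{"name": name} for name in info_types]
    config = {
        "info_types": named,
        "min_likelihood": "POSSIBLE",
        "include_quote": True,
        "limits": {"max_findings_per_request": max_findings},
    }
    if excluded_words:
        # Drop findings that exactly match a known false-positive value.
        config["rule_set"] = [
            {
                "info_types": named,
                "rules": [
                    {
                        "exclusion_rule": {
                            "dictionary": {"word_list": {"words": excluded_words}},
                            "matching_type": "MATCHING_TYPE_FULL_MATCH",
                        }
                    }
                ],
            }
        ]
    return config


config = build_narrow_inspect_config(["EMAIL_ADDRESS"], ["noreply@example.com"])
print(config["rule_set"][0]["rules"][0]["exclusion_rule"]["matching_type"])
```

Starting from a configuration like this and widening it as the findings are reviewed tends to keep the first results small enough to check by hand.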
On-premises, hybrid, and multi-cloud scans
If the data to be scanned resides on-premises or outside of Google Cloud, use the API methods content.inspect and content.deidentify to scan the content to classify findings and pseudonymize content without persisting the content outside of your local storage.
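The sketch below shows one way the two calls could be paired for locally held text: it builds an inspection request and a de-identification request that share the same item and infoTypes. The function name `build_local_scan_requests` and the key name `example-transient-key` are hypothetical. The use of a crypto hash transformation with a transient key is only one possible choice and is an assumption here; a setup that needs stable pseudonyms across requests would more likely use a managed key.

```python
from typing import Dict, List, Tuple


def build_local_scan_requests(
    project: str, text: str, info_types: List[str]
) -> Tuple[Dict, Dict]:
    """Build paired inspect and de-identify requests for one piece of local text."""
    parent = f"projects/{project}/locations/global"
    item = {"value": text}
    inspect_config = {"info_types": [{"name": name} for name in info_types]}

    inspect_request = {
        "parent": parent,
        "inspect_config": inspect_config,
        "item": item,
    }
    deidentify_request = {
        "parent": parent,
        "inspect_config": inspect_config,
        "deidentify_config": {
            "info_type_transformations": {
                "transformations": [
                    {
                        "primitive_transformation": {
                            # Assumption: hash each finding with a transient key.
                            "crypto_hash_config": {
                                "crypto_key": {
                                    "transient": {"name": "example-transient-key"}
                                }
                            }
                        }
                    }
                ]
            }
        },
        "item": item,
    }
    return inspect_request, deidentify_request


inspect_req, deid_req = build_local_scan_requests(
    "my-project", "My email is gary@example.com", ["EMAIL_ADDRESS"]
)
print(sorted(deid_req))
```

With a client created as in the Python samples on this page, these dictionaries would be passed to `dlp.inspect_content(request=inspect_req)` and `dlp.deidentify_content(request=deid_req)`. Only the text in the request is sent for processing, and the results can be written back to local storage.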
Inspecting a text string
Following is sample JSON and code in several languages that demonstrate how to use the DLP API to inspect strings of text for sensitive data.
Important: The code on this page requires that you first set up a Sensitive Data Protection client. For more information about installing and creating a Sensitive Data Protection client, see Sensitive Data Protection client libraries. (Sending JSON to Sensitive Data Protection REST endpoints does not require a client library.)
C#
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
using System;
using System.Collections.Generic;
using System.Linq;
using Google.Api.Gax.ResourceNames;
using Google.Cloud.Dlp.V2;
using static Google.Cloud.Dlp.V2.InspectConfig.Types;

public class InspectString
{
    public static InspectContentResponse Inspect(
        string projectId,
        string dataValue,
        string minLikelihood,
        int maxFindings,
        bool includeQuote,
        IEnumerable<InfoType> infoTypes,
        IEnumerable<CustomInfoType> customInfoTypes)
    {
        var inspectConfig = new InspectConfig
        {
            MinLikelihood = (Likelihood)Enum.Parse(typeof(Likelihood), minLikelihood, true),
            Limits = new FindingLimits
            {
                MaxFindingsPerRequest = maxFindings
            },
            IncludeQuote = includeQuote,
            InfoTypes = { infoTypes },
            CustomInfoTypes = { customInfoTypes }
        };
        var request = new InspectContentRequest
        {
            Parent = new LocationName(projectId, "global").ToString(),
            Item = new ContentItem
            {
                Value = dataValue
            },
            InspectConfig = inspectConfig
        };

        var dlp = DlpServiceClient.Create();
        var response = dlp.InspectContent(request);

        PrintResponse(includeQuote, response);

        return response;
    }

    private static void PrintResponse(bool includeQuote, InspectContentResponse response)
    {
        var findings = response.Result.Findings;
        if (findings.Any())
        {
            Console.WriteLine("Findings:");
            foreach (var finding in findings)
            {
                if (includeQuote)
                {
                    Console.WriteLine($"  Quote: {finding.Quote}");
                }
                Console.WriteLine($"  InfoType: {finding.InfoType}");
                Console.WriteLine($"  Likelihood: {finding.Likelihood}");
            }
        }
        else
        {
            Console.WriteLine("No findings.");
        }
    }
}
Go
import (
	"context"
	"fmt"
	"io"

	dlp "cloud.google.com/go/dlp/apiv2"
	"cloud.google.com/go/dlp/apiv2/dlppb"
)

// inspectString inspects a given string, and prints results.
func inspectString(w io.Writer, projectID, textToInspect string) error {
	// projectID := "my-project-id"
	// textToInspect := "My name is Gary and my email is gary@example.com"
	ctx := context.Background()

	// Initialize client.
	client, err := dlp.NewClient(ctx)
	if err != nil {
		return err
	}
	defer client.Close() // Closing the client safely cleans up background resources.

	// Create and send the request.
	req := &dlppb.InspectContentRequest{
		Parent: fmt.Sprintf("projects/%s/locations/global", projectID),
		Item: &dlppb.ContentItem{
			DataItem: &dlppb.ContentItem_Value{
				Value: textToInspect,
			},
		},
		InspectConfig: &dlppb.InspectConfig{
			InfoTypes: []*dlppb.InfoType{
				{Name: "PHONE_NUMBER"},
				{Name: "EMAIL_ADDRESS"},
				{Name: "CREDIT_CARD_NUMBER"},
			},
			IncludeQuote: true,
		},
	}
	resp, err := client.InspectContent(ctx, req)
	if err != nil {
		return err
	}

	// Process the results.
	result := resp.Result
	fmt.Fprintf(w, "Findings: %d\n", len(result.Findings))
	for _, f := range result.Findings {
		fmt.Fprintf(w, "\tQuote: %s\n", f.Quote)
		fmt.Fprintf(w, "\tInfo type: %s\n", f.InfoType.Name)
		fmt.Fprintf(w, "\tLikelihood: %s\n", f.Likelihood)
	}
	return nil
}
Java
import com.google.cloud.dlp.v2.DlpServiceClient;
import com.google.privacy.dlp.v2.ByteContentItem;
import com.google.privacy.dlp.v2.ByteContentItem.BytesType;
import com.google.privacy.dlp.v2.ContentItem;
import com.google.privacy.dlp.v2.Finding;
import com.google.privacy.dlp.v2.InfoType;
import com.google.privacy.dlp.v2.InspectConfig;
import com.google.privacy.dlp.v2.InspectContentRequest;
import com.google.privacy.dlp.v2.InspectContentResponse;
import com.google.privacy.dlp.v2.LocationName;
import com.google.protobuf.ByteString;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class InspectString {

  public static void main(String[] args) throws Exception {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "your-project-id";
    String textToInspect = "My name is Gary and my email is gary@example.com";
    inspectString(projectId, textToInspect);
  }

  // Inspects the provided text.
  public static void inspectString(String projectId, String textToInspect) throws IOException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (DlpServiceClient dlp = DlpServiceClient.create()) {
      // Specify the type and content to be inspected.
      ByteContentItem byteItem =
          ByteContentItem.newBuilder()
              .setType(BytesType.TEXT_UTF8)
              .setData(ByteString.copyFromUtf8(textToInspect))
              .build();
      ContentItem item = ContentItem.newBuilder().setByteItem(byteItem).build();

      // Specify the type of info the inspection will look for.
      List<InfoType> infoTypes = new ArrayList<>();
      // See https://cloud.google.com/dlp/docs/infotypes-reference for complete list of info types
      for (String typeName : new String[] {"PHONE_NUMBER", "EMAIL_ADDRESS", "CREDIT_CARD_NUMBER"}) {
        infoTypes.add(InfoType.newBuilder().setName(typeName).build());
      }

      // Construct the configuration for the Inspect request.
      InspectConfig config =
          InspectConfig.newBuilder().addAllInfoTypes(infoTypes).setIncludeQuote(true).build();

      // Construct the Inspect request to be sent by the client.
      InspectContentRequest request =
          InspectContentRequest.newBuilder()
              .setParent(LocationName.of(projectId, "global").toString())
              .setItem(item)
              .setInspectConfig(config)
              .build();

      // Use the client to send the API request.
      InspectContentResponse response = dlp.inspectContent(request);

      // Parse the response and process results
      System.out.println("Findings: " + response.getResult().getFindingsCount());
      for (Finding f : response.getResult().getFindingsList()) {
        System.out.println("\tQuote: " + f.getQuote());
        System.out.println("\tInfo type: " + f.getInfoType().getName());
        System.out.println("\tLikelihood: " + f.getLikelihood());
      }
    }
  }
}
Node.js
// Imports the Google Cloud Data Loss Prevention library
const DLP = require('@google-cloud/dlp');

// Instantiates a client
const dlp = new DLP.DlpServiceClient();

// The project ID to run the API call under
// const projectId = 'my-project';

// The string to inspect
// const string = 'My name is Gary and my email is gary@example.com';

// The minimum likelihood required before returning a match
// const minLikelihood = 'LIKELIHOOD_UNSPECIFIED';

// The maximum number of findings to report per request (0 = server maximum)
// const maxFindings = 0;

// The infoTypes of information to match
// const infoTypes = [{ name: 'PHONE_NUMBER' }, { name: 'EMAIL_ADDRESS' }, { name: 'CREDIT_CARD_NUMBER' }];

// The customInfoTypes of information to match
// const customInfoTypes = [{ infoType: { name: 'DICT_TYPE' }, dictionary: { wordList: { words: ['foo', 'bar', 'baz']}}},
//   { infoType: { name: 'REGEX_TYPE' }, regex: {pattern: '\\(\\d{3}\\) \\d{3}-\\d{4}'}}];

// Whether to include the matching string
// const includeQuote = true;

async function inspectString() {
  // Construct item to inspect
  const item = {value: string};

  // Construct request
  const request = {
    parent: `projects/${projectId}/locations/global`,
    inspectConfig: {
      infoTypes: infoTypes,
      customInfoTypes: customInfoTypes,
      minLikelihood: minLikelihood,
      includeQuote: includeQuote,
      limits: {
        maxFindingsPerRequest: maxFindings,
      },
    },
    item: item,
  };

  // Run request
  const [response] = await dlp.inspectContent(request);
  const findings = response.result.findings;
  if (findings.length > 0) {
    console.log('Findings:');
    findings.forEach(finding => {
      if (includeQuote) {
        console.log(`\tQuote: ${finding.quote}`);
      }
      console.log(`\tInfo type: ${finding.infoType.name}`);
      console.log(`\tLikelihood: ${finding.likelihood}`);
    });
  } else {
    console.log('No findings.');
  }
}
inspectString();
PHP
use Google\Cloud\Dlp\V2\Client\DlpServiceClient;
use Google\Cloud\Dlp\V2\ContentItem;
use Google\Cloud\Dlp\V2\InfoType;
use Google\Cloud\Dlp\V2\InspectConfig;
use Google\Cloud\Dlp\V2\InspectContentRequest;
use Google\Cloud\Dlp\V2\Likelihood;

/**
 * @param string $projectId
 * @param string $textToInspect
 */
function inspect_string(string $projectId, string $textToInspect): void
{
    // Instantiate a client.
    $dlp = new DlpServiceClient();

    // Construct request
    $parent = "projects/$projectId/locations/global";
    $item = (new ContentItem())
        ->setValue($textToInspect);
    $inspectConfig = (new InspectConfig())
        // The infoTypes of information to match
        ->setInfoTypes([
            (new InfoType())->setName('PHONE_NUMBER'),
            (new InfoType())->setName('EMAIL_ADDRESS'),
            (new InfoType())->setName('CREDIT_CARD_NUMBER')
        ])
        // Whether to include the matching string
        ->setIncludeQuote(true);

    // Run request
    $inspectContentRequest = (new InspectContentRequest())
        ->setParent($parent)
        ->setInspectConfig($inspectConfig)
        ->setItem($item);
    $response = $dlp->inspectContent($inspectContentRequest);

    // Print the results
    $findings = $response->getResult()->getFindings();
    if (count($findings) == 0) {
        print('No findings.' . PHP_EOL);
    } else {
        print('Findings:' . PHP_EOL);
        foreach ($findings as $finding) {
            print('  Quote: ' . $finding->getQuote() . PHP_EOL);
            print('  Info type: ' . $finding->getInfoType()->getName() . PHP_EOL);
            $likelihoodString = Likelihood::name($finding->getLikelihood());
            print('  Likelihood: ' . $likelihoodString . PHP_EOL);
        }
    }
}
Python
from typing import List

import google.cloud.dlp


def inspect_string(
    project: str,
    content_string: str,
    info_types: List[str],
    custom_dictionaries: List[str] = None,
    custom_regexes: List[str] = None,
    min_likelihood: str = None,
    max_findings: str = None,
    include_quote: bool = True,
) -> None:
    """Uses the Data Loss Prevention API to analyze strings for protected data.
    Args:
        project: The Google Cloud project id to use as a parent resource.
        content_string: The string to inspect.
        info_types: A list of strings representing info types to look for.
            A full list of info type categories can be fetched from the API.
        min_likelihood: A string representing the minimum likelihood threshold
            that constitutes a match. One of: 'LIKELIHOOD_UNSPECIFIED',
            'VERY_UNLIKELY', 'UNLIKELY', 'POSSIBLE', 'LIKELY', 'VERY_LIKELY'.
        max_findings: The maximum number of findings to report; 0 = no maximum.
        include_quote: Boolean for whether to display a quote of the detected
            information in the results.
    Returns:
        None; the response from the API is printed to the terminal.
    """

    # Instantiate a client.
    dlp = google.cloud.dlp_v2.DlpServiceClient()

    # Prepare info_types by converting the list of strings into a list of
    # dictionaries (protos are also accepted).
    info_types = [{"name": info_type} for info_type in info_types]

    # Prepare custom_info_types by parsing the dictionary word lists and
    # regex patterns.
    if custom_dictionaries is None:
        custom_dictionaries = []
    dictionaries = [
        {
            "info_type": {"name": f"CUSTOM_DICTIONARY_{i}"},
            "dictionary": {"word_list": {"words": custom_dict.split(",")}},
        }
        for i, custom_dict in enumerate(custom_dictionaries)
    ]
    if custom_regexes is None:
        custom_regexes = []
    regexes = [
        {
            "info_type": {"name": f"CUSTOM_REGEX_{i}"},
            "regex": {"pattern": custom_regex},
        }
        for i, custom_regex in enumerate(custom_regexes)
    ]
    custom_info_types = dictionaries + regexes

    # Construct the configuration dictionary. Keys which are None may
    # optionally be omitted entirely.
    inspect_config = {
        "info_types": info_types,
        "custom_info_types": custom_info_types,
        "min_likelihood": min_likelihood,
        "include_quote": include_quote,
        "limits": {"max_findings_per_request": max_findings},
    }

    # Construct the `item`.
    item = {"value": content_string}

    # Convert the project id into a full resource id.
    parent = f"projects/{project}"

    # Call the API.
    response = dlp.inspect_content(
        request={"parent": parent, "inspect_config": inspect_config, "item": item}
    )

    # Print out the results.
    if response.result.findings:
        for finding in response.result.findings:
            try:
                if finding.quote:
                    print(f"Quote: {finding.quote}")
            except AttributeError:
                pass
            print(f"Info type: {finding.info_type.name}")
            print(f"Likelihood: {finding.likelihood}")
    else:
        print("No findings.")
Ruby
# project_id   = "Your Google Cloud project ID"
# content      = "The text to inspect"
# max_findings = "Maximum number of findings to report per request (0 = server maximum)"

require "google/cloud/dlp"

dlp = Google::Cloud::Dlp.dlp_service
inspect_config = {
  # The types of information to match
  info_types:     [{ name: "PERSON_NAME" }, { name: "US_STATE" }],

  # Only return results above a likelihood threshold (0 for all)
  min_likelihood: :POSSIBLE,

  # Limit the number of findings (0 for no limit)
  limits:         { max_findings_per_request: max_findings },

  # Whether to include the matching string in the response
  include_quote:  true
}

# The item to inspect
item_to_inspect = { value: content }

# Run request
parent = "projects/#{project_id}/locations/global"
response = dlp.inspect_content parent:         parent,
                               inspect_config: inspect_config,
                               item:           item_to_inspect

# Print the results
if response.result.findings.empty?
  puts "No findings"
else
  response.result.findings.each do |finding|
    puts "Quote: #{finding.quote}"
    puts "Info type: #{finding.info_type.name}"
    puts "Likelihood: #{finding.likelihood}"
  end
end
REST
See the JSON quickstart for more information about using the DLP API with JSON.
JSON Input:
POST https://dlp.googleapis.com/v2/projects/[PROJECT_ID]/content:inspect?key={YOUR_API_KEY}

{
  "item":{
    "value":"My phone number is (415) 555-0890"
  },
  "inspectConfig":{
    "includeQuote":true,
    "minLikelihood":"POSSIBLE",
    "infoTypes":[
      {
        "name":"PHONE_NUMBER"
      }
    ]
  }
}
JSON Output:
{
  "result":{
    "findings":[
      {
        "quote":"(415) 555-0890",
        "infoType":{
          "name":"PHONE_NUMBER"
        },
        "likelihood":"VERY_LIKELY",
        "location":{
          "byteRange":{
            "start":"19",
            "end":"33"
          },
          "codepointRange":{
            "start":"19",
            "end":"33"
          }
        },
        "createTime":"2018-11-13T19:29:15.412Z"
      }
    ]
  }
}
Inspecting a text file
The code samples below demonstrate how to check a text file for sensitive content.
C#
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using Google.Api.Gax.ResourceNames;
using Google.Cloud.Dlp.V2;
using Google.Protobuf;
using static Google.Cloud.Dlp.V2.ByteContentItem.Types;

public class DlpInspectFile
{
    public static IEnumerable<Finding> InspectFile(string projectId, string filePath, BytesType fileType)
    {
        // Instantiate a client.
        var dlp = DlpServiceClient.Create();

        // Get the bytes from the file.
        ByteString fileBytes;
        using (Stream f = new FileStream(filePath, FileMode.Open))
        {
            fileBytes = ByteString.FromStream(f);
        }

        // Construct a request.
        var request = new InspectContentRequest
        {
            Parent = new LocationName(projectId, "global").ToString(),
            Item = new ContentItem
            {
                ByteItem = new ByteContentItem()
                {
                    Data = fileBytes,
                    Type = fileType
                }
            },
            InspectConfig = new InspectConfig
            {
                // The info types of information to match
                InfoTypes =
                {
                    new InfoType { Name = "PHONE_NUMBER" },
                    new InfoType { Name = "EMAIL_ADDRESS" },
                    new InfoType { Name = "CREDIT_CARD_NUMBER" }
                },
                // The minimum likelihood before returning a match
                MinLikelihood = Likelihood.Unspecified,
                // Whether to include the matching string
                IncludeQuote = true,
                Limits = new InspectConfig.Types.FindingLimits
                {
                    // The maximum number of findings to report per request
                    // (0 = server maximum)
                    MaxFindingsPerRequest = 0
                }
            }
        };

        // Execute request
        var response = dlp.InspectContent(request);

        // Inspect response
        var findings = response.Result.Findings;
        if (findings.Any())
        {
            Console.WriteLine("Findings:");
            foreach (var finding in findings)
            {
                Console.WriteLine($"Quote: {finding.Quote}");
                Console.WriteLine($"InfoType: {finding.InfoType}");
                Console.WriteLine($"Likelihood: {finding.Likelihood}");
            }
        }
        else
        {
            Console.WriteLine("No findings.");
        }
        return findings;
    }
}
Go
import (
	"context"
	"fmt"
	"io"
	"os"

	dlp "cloud.google.com/go/dlp/apiv2"
	"cloud.google.com/go/dlp/apiv2/dlppb"
)

// inspectTextFile inspects a text file at a given filePath, and prints results.
func inspectTextFile(w io.Writer, projectID, filePath string) error {
	// projectID := "my-project-id"
	// filePath := "path/to/image.png"
	ctx := context.Background()

	// Initialize client.
	client, err := dlp.NewClient(ctx)
	if err != nil {
		return err
	}
	defer client.Close() // Closing the client safely cleans up background resources.

	// Gather the resources for the request.
	data, err := os.ReadFile(filePath)
	if err != nil {
		return err
	}

	// Create and send the request.
	req := &dlppb.InspectContentRequest{
		Parent: fmt.Sprintf("projects/%s/locations/global", projectID),
		Item: &dlppb.ContentItem{
			DataItem: &dlppb.ContentItem_ByteItem{
				ByteItem: &dlppb.ByteContentItem{
					Type: dlppb.ByteContentItem_TEXT_UTF8,
					Data: data,
				},
			},
		},
		InspectConfig: &dlppb.InspectConfig{
			InfoTypes: []*dlppb.InfoType{
				{Name: "PHONE_NUMBER"},
				{Name: "EMAIL_ADDRESS"},
				{Name: "CREDIT_CARD_NUMBER"},
			},
			IncludeQuote: true,
		},
	}
	resp, err := client.InspectContent(ctx, req)
	if err != nil {
		return fmt.Errorf("InspectContent: %w", err)
	}

	// Process the results.
	fmt.Fprintf(w, "Findings: %d\n", len(resp.Result.Findings))
	for _, f := range resp.Result.Findings {
		fmt.Fprintf(w, "\tQuote: %s\n", f.Quote)
		fmt.Fprintf(w, "\tInfo type: %s\n", f.InfoType.Name)
		fmt.Fprintf(w, "\tLikelihood: %s\n", f.Likelihood)
	}
	return nil
}
Java
import com.google.cloud.dlp.v2.DlpServiceClient;
import com.google.privacy.dlp.v2.ByteContentItem;
import com.google.privacy.dlp.v2.ByteContentItem.BytesType;
import com.google.privacy.dlp.v2.ContentItem;
import com.google.privacy.dlp.v2.Finding;
import com.google.privacy.dlp.v2.InfoType;
import com.google.privacy.dlp.v2.InspectConfig;
import com.google.privacy.dlp.v2.InspectContentRequest;
import com.google.privacy.dlp.v2.InspectContentResponse;
import com.google.privacy.dlp.v2.LocationName;
import com.google.protobuf.ByteString;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class InspectTextFile {

  public static void main(String[] args) throws Exception {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "your-project-id";
    String filePath = "path/to/file.txt";
    inspectTextFile(projectId, filePath);
  }

  // Inspects the specified text file.
  public static void inspectTextFile(String projectId, String filePath) throws IOException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (DlpServiceClient dlp = DlpServiceClient.create()) {
      // Specify the type and content to be inspected.
      ByteString fileBytes = ByteString.readFrom(new FileInputStream(filePath));
      ByteContentItem byteItem =
          ByteContentItem.newBuilder().setType(BytesType.TEXT_UTF8).setData(fileBytes).build();
      ContentItem item = ContentItem.newBuilder().setByteItem(byteItem).build();

      // Specify the type of info the inspection will look for.
      List<InfoType> infoTypes = new ArrayList<>();
      // See https://cloud.google.com/dlp/docs/infotypes-reference for complete list of info types
      for (String typeName : new String[] {"PHONE_NUMBER", "EMAIL_ADDRESS", "CREDIT_CARD_NUMBER"}) {
        infoTypes.add(InfoType.newBuilder().setName(typeName).build());
      }

      // Construct the configuration for the Inspect request.
      InspectConfig config =
          InspectConfig.newBuilder().addAllInfoTypes(infoTypes).setIncludeQuote(true).build();

      // Construct the Inspect request to be sent by the client.
      InspectContentRequest request =
          InspectContentRequest.newBuilder()
              .setParent(LocationName.of(projectId, "global").toString())
              .setItem(item)
              .setInspectConfig(config)
              .build();

      // Use the client to send the API request.
      InspectContentResponse response = dlp.inspectContent(request);

      // Parse the response and process results
      System.out.println("Findings: " + response.getResult().getFindingsCount());
      for (Finding f : response.getResult().getFindingsList()) {
        System.out.println("\tQuote: " + f.getQuote());
        System.out.println("\tInfo type: " + f.getInfoType().getName());
        System.out.println("\tLikelihood: " + f.getLikelihood());
      }
    }
  }
}
Node.js
// Imports the Google Cloud Data Loss Prevention library
const DLP = require('@google-cloud/dlp');

// Import other required libraries
const fs = require('fs');
const mime = require('mime');

// Instantiates a client
const dlp = new DLP.DlpServiceClient();

// The project ID to run the API call under
// const projectId = 'my-project';

// The path to a local file to inspect. Can be a text, JPG, or PNG file.
// const filepath = 'path/to/image.png';

// The minimum likelihood required before returning a match
// const minLikelihood = 'LIKELIHOOD_UNSPECIFIED';

// The maximum number of findings to report per request (0 = server maximum)
// const maxFindings = 0;

// The infoTypes of information to match
// const infoTypes = [{ name: 'PHONE_NUMBER' }, { name: 'EMAIL_ADDRESS' }, { name: 'CREDIT_CARD_NUMBER' }];

// The customInfoTypes of information to match
// const customInfoTypes = [{ infoType: { name: 'DICT_TYPE' }, dictionary: { wordList: { words: ['foo', 'bar', 'baz']}}},
//   { infoType: { name: 'REGEX_TYPE' }, regex: {pattern: '\\(\\d{3}\\) \\d{3}-\\d{4}'}}];

// Whether to include the matching string
// const includeQuote = true;

async function inspectFile() {
  // Construct file data to inspect
  const fileTypeConstant =
    ['image/jpeg', 'image/bmp', 'image/png', 'image/svg'].indexOf(
      mime.getType(filepath)
    ) + 1;
  const fileBytes = Buffer.from(fs.readFileSync(filepath)).toString('base64');
  const item = {
    byteItem: {
      type: fileTypeConstant,
      data: fileBytes,
    },
  };

  // Construct request
  const request = {
    parent: `projects/${projectId}/locations/global`,
    inspectConfig: {
      infoTypes: infoTypes,
      customInfoTypes: customInfoTypes,
      minLikelihood: minLikelihood,
      includeQuote: includeQuote,
      limits: {
        maxFindingsPerRequest: maxFindings,
      },
    },
    item: item,
  };

  // Run request
  const [response] = await dlp.inspectContent(request);
  const findings = response.result.findings;
  if (findings.length > 0) {
    console.log('Findings:');
    findings.forEach(finding => {
      if (includeQuote) {
        console.log(`\tQuote: ${finding.quote}`);
      }
      console.log(`\tInfo type: ${finding.infoType.name}`);
      console.log(`\tLikelihood: ${finding.likelihood}`);
    });
  } else {
    console.log('No findings.');
  }
}
Python
import mimetypes
from typing import List
from typing import Optional

import google.cloud.dlp


def inspect_file(
    project: str,
    filename: str,
    info_types: List[str],
    min_likelihood: str = None,
    custom_dictionaries: List[str] = None,
    custom_regexes: List[str] = None,
    max_findings: Optional[int] = None,
    include_quote: bool = True,
    mime_type: str = None,
) -> None:
    """Uses the Data Loss Prevention API to analyze a file for protected data.
    Args:
        project: The Google Cloud project id to use as a parent resource.
        filename: The path to the file to inspect.
        info_types: A list of strings representing info types to look for.
            A full list of info type categories can be fetched from the API.
        min_likelihood: A string representing the minimum likelihood threshold
            that constitutes a match. One of: 'LIKELIHOOD_UNSPECIFIED',
            'VERY_UNLIKELY', 'UNLIKELY', 'POSSIBLE', 'LIKELY', 'VERY_LIKELY'.
        max_findings: The maximum number of findings to report; 0 = no maximum.
        include_quote: Boolean for whether to display a quote of the detected
            information in the results.
        mime_type: The MIME type of the file. If not specified, the type is
            inferred via the Python standard library's mimetypes module.
    Returns:
        None; the response from the API is printed to the terminal.
    """

    # Instantiate a client.
    dlp = google.cloud.dlp_v2.DlpServiceClient()

    # Prepare info_types by converting the list of strings into a list of
    # dictionaries (protos are also accepted).
    if not info_types:
        info_types = ["FIRST_NAME", "LAST_NAME", "EMAIL_ADDRESS"]
    info_types = [{"name": info_type} for info_type in info_types]

    # Prepare custom_info_types by parsing the dictionary word lists and
    # regex patterns.
    if custom_dictionaries is None:
        custom_dictionaries = []
    dictionaries = [
        {
            "info_type": {"name": f"CUSTOM_DICTIONARY_{i}"},
            "dictionary": {"word_list": {"words": custom_dict.split(",")}},
        }
        for i, custom_dict in enumerate(custom_dictionaries)
    ]
    if custom_regexes is None:
        custom_regexes = []
    regexes = [
        {
            "info_type": {"name": f"CUSTOM_REGEX_{i}"},
            "regex": {"pattern": custom_regex},
        }
        for i, custom_regex in enumerate(custom_regexes)
    ]
    custom_info_types = dictionaries + regexes

    # Construct the configuration dictionary. Keys which are None may
    # optionally be omitted entirely.
    inspect_config = {
        "info_types": info_types,
        "custom_info_types": custom_info_types,
        "min_likelihood": min_likelihood,
        "include_quote": include_quote,
        "limits": {"max_findings_per_request": max_findings},
    }

    # If mime_type is not specified, guess it from the filename.
    if mime_type is None:
        mime_guess = mimetypes.MimeTypes().guess_type(filename)
        mime_type = mime_guess[0]

    # Select the content type index from the list of supported types.
    # https://github.com/googleapis/googleapis/blob/master/google/privacy/dlp/v2/dlp.proto / message ByteContentItem
    supported_content_types = {
        None: 0,  # "Unspecified" or BYTES_TYPE_UNSPECIFIED
        "image/jpeg": 1,  # IMAGE_JPEG
        "image/bmp": 2,  # IMAGE_BMP
        "image/png": 3,  # IMAGE_PNG
        "image/svg": 4,  # IMAGE_SVG - Adjusted to "image/svg+xml" for correct MIME type
        "text/plain": 5,  # TEXT_UTF8
        # Note: No specific MIME type for general "image", mapping to IMAGE for any image type not specified
        "image": 6,  # IMAGE - Any image type
        "application/msword": 7,  # WORD_DOCUMENT
        "application/pdf": 8,  # PDF
        "application/powerpoint": 9,  # POWERPOINT_DOCUMENT
        "application/msexcel": 10,  # EXCEL_DOCUMENT
        "application/avro": 11,  # AVRO
        "text/csv": 12,  # CSV
        "text/tsv": 13,  # TSV
    }
    content_type_index = supported_content_types.get(mime_type, 0)

    # Construct the item, containing the file's byte data.
    with open(filename, mode="rb") as f:
        item = {"byte_item": {"type_": content_type_index, "data": f.read()}}

    # Convert the project id into a full resource id.
    parent = f"projects/{project}"

    # Call the API.
    response = dlp.inspect_content(
        request={"parent": parent, "inspect_config": inspect_config, "item": item}
    )

    # Print out the results.
    if response.result.findings:
        for finding in response.result.findings:
            try:
                print(f"Quote: {finding.quote}")
            except AttributeError:
                pass
            print(f"Info type: {finding.info_type.name}")
            print(f"Likelihood: {finding.likelihood}")
    else:
        print("No findings.")
Ruby
# project_id   = "Your Google Cloud project ID"
# filename     = "The file path to the file to inspect"
# max_findings = "Maximum number of findings to report per request (0 = server maximum)"

require "google/cloud/dlp"

dlp = Google::Cloud::Dlp.dlp_service
inspect_config = {
  # The types of information to match
  info_types:     [{ name: "PERSON_NAME" }, { name: "PHONE_NUMBER" }],

  # Only return results above a likelihood threshold (0 for all)
  min_likelihood: :POSSIBLE,

  # Limit the number of findings (0 for no limit)
  limits:         { max_findings_per_request: max_findings },

  # Whether to include the matching string in the response
  include_quote:  true
}

# The item to inspect
file = File.open filename, "rb"
item_to_inspect = { byte_item: { type: :BYTES_TYPE_UNSPECIFIED, data: file.read } }

# Run request
parent = "projects/#{project_id}/locations/global"
response = dlp.inspect_content parent:         parent,
                               inspect_config: inspect_config,
                               item:           item_to_inspect

# Print the results
if response.result.findings.empty?
  puts "No findings"
else
  response.result.findings.each do |finding|
    puts "Quote: #{finding.quote}"
    puts "Info type: #{finding.info_type.name}"
    puts "Likelihood: #{finding.likelihood}"
  end
end
What's next
- Work through the Redacting Sensitive Data with Sensitive Data Protection codelab.
- Learn how to inspect images for sensitive data.
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2025-12-17 UTC.