Deploy a job to import logs from Cloud Storage to Cloud Logging
This document describes how you deploy the reference architecture described in Import logs from Cloud Storage to Cloud Logging.
These instructions are intended for engineers and developers, including DevOps, site reliability engineers (SREs), and security investigators, who want to configure and run the log importing job. This document also assumes you are familiar with running Cloud Run import jobs, and how to use Cloud Storage and Cloud Logging.
Architecture
The following diagram shows how Google Cloud services are used in this reference architecture:
For details, see Import logs from Cloud Storage to Cloud Logging.
Objectives
- Create and configure a Cloud Run import job
- Create a service account to run the job
Costs
In this document, you use the following billable components of Google Cloud:

- Cloud Run
- Cloud Storage
- Cloud Logging

To generate a cost estimate based on your projected usage, use the pricing calculator.
Before you begin
Ensure that the logs you intend to import were previously exported to Cloud Storage, which means that they're already organized in the expected export format.
In the Google Cloud console, activate Cloud Shell.
In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
Roles required to select or create a project
- Select a project: Selecting a project doesn't require a specific IAM role; you can select any project that you've been granted a role on.
- Create a project: To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.
Set the destination project in Cloud Shell:

    gcloud config set project PROJECT_ID

Replace PROJECT_ID with the destination project ID.
Note: We recommend that you create a new designated project for the imported logs. If you use an existing project, imported logs might get routed to unwanted destinations, which can cause extra charges or accidental export. To ensure logs don't route to unwanted destinations, review the filters on all the sinks, including _Required and _Default. Ensure that sinks are not inherited from organizations or folders.

Make sure that billing is enabled for your Google Cloud project.
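To review the sink configuration that the preceding note describes, you can list the sinks in the destination project and inspect their filters with gcloud. This is a minimal sketch; _Default and _Required are the built-in sinks that exist in every project:

    # List all sinks in the destination project, with their destinations and filters.
    gcloud logging sinks list --project=PROJECT_ID

    # Inspect the filter of the _Default sink; repeat for any other sink in the list.
    gcloud logging sinks describe _Default --project=PROJECT_ID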
Required roles
To get the permissions that you need to deploy this solution, ask your administrator to grant you the following IAM roles:
- To grant the Logs Writer role on the log bucket: Project IAM Admin (roles/resourcemanager.projectIamAdmin) on the destination project
- To grant the Storage Object Viewer role on the storage bucket: Storage Admin (roles/storage.admin) on the project where the storage bucket is hosted
- To create a service account: Create Service Accounts (roles/iam.serviceAccountCreator) on the destination project
- To enable services on the project: Service Usage Admin (roles/serviceusage.serviceUsageAdmin) on the destination project
- To upgrade the log bucket and delete imported logs: Logging Admin (roles/logging.admin) on the destination project
- To create, run, and modify the import job: Cloud Run Developer (roles/run.developer) on the destination project
For more information about granting roles, see Manage access to projects, folders, and organizations.

You might also be able to get the required permissions through custom roles or other predefined roles.
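For illustration, the following sketch shows how an administrator might grant one of the roles from the preceding list by using gcloud; the user email is hypothetical:

    # Grant the Logging Admin role on the destination project
    # (repeat with the other roles from the list above as needed).
    gcloud projects add-iam-policy-binding PROJECT_ID \
        --member=user:alex@example.com \
        --role=roles/logging.admin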
Upgrade the log bucket to use Log Analytics
We recommend that you use the default log bucket, and upgrade it to use Log Analytics. However, in a production environment, you can use your own log bucket if the default bucket doesn't meet your requirements. If you decide to use your own bucket, you must route logs that are ingested to the destination project to this log bucket. For more information, see Configure log buckets and Create a sink.

When you upgrade the bucket, you can use SQL to query and analyze your logs. There's no additional cost to upgrade the bucket or use Log Analytics.

Note: After you upgrade a bucket, it can't be downgraded. For details about settings and restrictions, see Upgrade a bucket to use Log Analytics.

To upgrade the default log bucket in the destination project, do the following:
Upgrade the default log bucket to use Log Analytics:
    gcloud logging buckets update BUCKET_ID --location=LOCATION --enable-analytics

Replace the following:

- BUCKET_ID: the name of the log bucket (for example, _Default)
- LOCATION: a supported region (for example, global)
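To confirm that the upgrade took effect, you can describe the bucket and check that Log Analytics is enabled. A minimal sketch, assuming the default bucket in the global location; look for analyticsEnabled: true in the output:

    # Describe the log bucket; analyticsEnabled: true indicates the upgrade succeeded.
    gcloud logging buckets describe _Default --location=global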
Create the Cloud Run import job
When you create the job, you can use the prebuilt container image that is provided for this reference architecture. If you need to modify the implementation to change the 30-day retention period or if you have other requirements, you can build your own custom image.
In Cloud Shell, create the job with the configurations and environment variables:
    gcloud run jobs create JOB_NAME \
        --image=IMAGE_URL \
        --region=REGION \
        --tasks=TASKS \
        --max-retries=0 \
        --task-timeout=60m \
        --cpu=CPU \
        --memory=MEMORY \
        --set-env-vars=END_DATE=END_DATE,LOG_ID=LOG_ID,START_DATE=START_DATE,STORAGE_BUCKET_NAME=STORAGE_BUCKET_NAME,PROJECT_ID=PROJECT_ID

Replace the following:

- JOB_NAME: the name of your job.
- IMAGE_URL: the reference to the container image; use us-docker.pkg.dev/cloud-devrel-public-resources/samples/import-logs-solution or the URL of the custom image, if you built one by using the instructions in GitHub.
- REGION: the region where you want your job to be located; to avoid additional costs, we recommend keeping the job region the same or within the same multi-region as the Cloud Storage bucket region. For example, if your bucket is multi-region US, you can use us-central1. For details, see Cost optimization.
- TASKS: the number of tasks that the job must run. The default value is 1. You can increase the number of tasks if timeouts occur.
- CPU: the CPU limit, which can be 1, 2, 4, 6, or 8 CPUs. The default value is 2. You can increase the number if timeouts occur; for details, see Configure CPU limits.
- MEMORY: the memory limit. The default value is 2Gi. You can increase the number if timeouts occur; for details, see Configure memory limits.
- END_DATE: the end of the date range in the format MM/DD/YYYY. Logs with timestamps earlier than or equal to this date are imported.
- LOG_ID: the log identifier of the logs you want to import. The log ID is a part of the logName field of the log entry. For example, cloudaudit.googleapis.com.
- START_DATE: the start of the date range in the format MM/DD/YYYY. Logs with timestamps later than or equal to this date are imported.
- STORAGE_BUCKET_NAME: the name of the Cloud Storage bucket where logs are stored (without the gs:// prefix).
The max-retries option is set to zero to prevent retries for failed tasks, which can cause duplicate log entries.

If the Cloud Run job fails due to a timeout, an incomplete import can result. To prevent incomplete imports due to timeouts, increase the tasks value, as well as the CPU and memory resources. Increasing these values might increase costs. For details about costs, see Cost optimization.
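For illustration, here is the create command with sample values filled in; the job name, date range, bucket name, and project ID are hypothetical, and the image URL is the prebuilt image mentioned earlier:

    # Example: import cloudaudit.googleapis.com logs for August 2024
    # from the bucket my-bucket into the project my-project.
    gcloud run jobs create import-logs-job \
        --image=us-docker.pkg.dev/cloud-devrel-public-resources/samples/import-logs-solution \
        --region=us-central1 \
        --tasks=1 \
        --max-retries=0 \
        --task-timeout=60m \
        --cpu=2 \
        --memory=2Gi \
        --set-env-vars=END_DATE=08/31/2024,LOG_ID=cloudaudit.googleapis.com,START_DATE=08/01/2024,STORAGE_BUCKET_NAME=my-bucket,PROJECT_ID=my-project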
Create a service account to run your Cloud Run job
In Cloud Shell, create the user-managed service account:

    gcloud iam service-accounts create SA_NAME

Replace SA_NAME with the name of the service account.
Grant the Storage Object Viewer role on the storage bucket:

    gcloud storage buckets add-iam-policy-binding gs://STORAGE_BUCKET_NAME \
        --member=serviceAccount:SA_NAME@PROJECT_ID.iam.gserviceaccount.com \
        --role=roles/storage.objectViewer

Replace the following:

- STORAGE_BUCKET_NAME: the name of the storage bucket that you used in the import job configuration. For example, my-bucket.
- PROJECT_ID: the destination project ID.
Grant the Logs Writer role on the log bucket:

    gcloud projects add-iam-policy-binding PROJECT_ID \
        --member=serviceAccount:SA_NAME@PROJECT_ID.iam.gserviceaccount.com \
        --role=roles/logging.logWriter

Set the service account for the Cloud Run job:

    gcloud run jobs update JOB_NAME \
        --region=REGION \
        --service-account SA_NAME@PROJECT_ID.iam.gserviceaccount.com

Replace REGION with the same region where you deployed the Cloud Run import job.
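To verify that the job now runs as the service account, you can describe the job and check the service account shown in the output. A minimal sketch:

    # Describe the job; the output includes the service account the job runs as.
    gcloud run jobs describe JOB_NAME --region=REGION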
Run the import job
In Cloud Shell, execute the created job:

    gcloud run jobs execute JOB_NAME \
        --region=REGION
For more information, see Execute jobs and Manage job executions.

If you need to rerun the job, delete the previously imported logs to avoid creating duplicates. For details, see Delete imported logs later in this document.

When you query the imported logs, duplicates don't appear in the query results. Cloud Logging removes duplicates (log entries from the same project, with the same insertion ID and timestamp) from query results. For more information, see the insert_id field in the Logging API reference.
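To monitor the execution that the execute command starts, you can list and describe the job's executions. A minimal sketch:

    # List executions of the job, including their completion status.
    gcloud run jobs executions list --job=JOB_NAME --region=REGION

    # Inspect a specific execution; EXECUTION_NAME comes from the list output.
    gcloud run jobs executions describe EXECUTION_NAME --region=REGION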
Verify results
To validate that the job has completed successfully, in Cloud Shell, you can query import results:

    gcloud logging read 'log_id("imported_logs") AND timestamp<=END_DATE'

The output shows the imported logs. If this project was used to run more than one import job within the specified timeframe, the output shows imported logs from those jobs as well.
For more options and details about querying log entries, see gcloud logging read.
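For illustration, here is a version of the query with a concrete timestamp and output controls; the date value is hypothetical:

    # Show up to ten imported entries with timestamps up to the end date.
    gcloud logging read \
        'log_id("imported_logs") AND timestamp<="2024-08-31T23:59:59Z"' \
        --limit=10 \
        --project=PROJECT_ID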
Delete imported logs
If you need to run the same job more than once, delete the previously imported logs to avoid duplicate entries and increased costs.

To delete imported logs, in Cloud Shell, run the logs delete command:

    gcloud logging logs delete imported_logs

Be aware that deleting imported logs purges all log entries that were imported to the destination project, not only the results of the last import job execution.
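Before you delete, you can confirm what the command will purge by sampling the imported_logs log. A minimal sketch:

    # Sample one entry from imported_logs; everything under this log ID
    # in the destination project is removed by the delete command above.
    gcloud logging read 'log_id("imported_logs")' --limit=1 --project=PROJECT_ID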
What's next
- Review the implementation code in the GitHub repository.
- Learn how to analyze imported logs by using Log Analytics and SQL.
- For more reference architectures, diagrams, and best practices, explore the Cloud Architecture Center.
Contributors
Author: Leonid Yankulin | Developer Relations Engineer
Other contributors:
- Summit Tuladhar | Senior Staff Software Engineer
- Wilton Wong | Enterprise Architect
- Xiang Shen | Solutions Architect