Design secure deployment pipelines
A deployment pipeline is an automated process that takes code or prebuilt artifacts and deploys them to a test environment or a production environment. Deployment pipelines are commonly used to deploy applications, configuration, or cloud infrastructure (infrastructure as code), and they can play an important role in the overall security posture of a cloud deployment.
This guide is intended for DevOps and security engineers and describes best practices for designing secure deployment pipelines based on your confidentiality, integrity, and availability requirements.
Architecture
The following diagram shows the flow of data in a deployment pipeline. It illustrates how you can turn your artifacts into resources.
Deployment pipelines are often part of a larger continuous integration/continuous deployment (CI/CD) workflow and are typically implemented using one of the following models:
Push model: In this model, you implement the deployment pipeline using a central CI/CD system such as Jenkins or GitLab. This CI/CD system might run on Google Cloud, on-premises, or on a different cloud environment. Often, the same CI/CD system is used to manage multiple deployment pipelines.
The push model leads to a centralized architecture with a few CI/CD systems that are used for managing a potentially large number of resources or applications. For example, you might use a single Jenkins or GitLab instance to manage your entire production environment, including all its projects and applications.
Pull model: In this model, the deployment process is implemented by an agent that is deployed alongside the resource, for example in the same Kubernetes cluster. The agent pulls artifacts or source code from a centralized location, and deploys them locally. Each agent manages one or two resources.
The pull model leads to a more decentralized architecture with a potentially large number of single-purpose agents.
Compared to manual deployments, consistently using deployment pipelines can have the following benefits:
- Increased efficiency, because no manual work is required.
- Increased reliability, because the process is fully automated and repeatable.
- Increased traceability, because you can trace all deployments to changes in code or to input artifacts.
To do its job, a deployment pipeline requires access to the resources it manages:
- A pipeline that deploys infrastructure by using tools like Terraform might need to create, modify, or even delete resources like VM instances, subnets, or Cloud Storage buckets.
- A pipeline that deploys applications might need to upload new container images to Artifact Registry, and deploy new application versions to App Engine, Cloud Run, or Google Kubernetes Engine (GKE).
- A pipeline that manages settings or deploys configuration files might need to modify VM instance metadata or Kubernetes configurations, or modify data in Cloud Storage.
If your deployment pipelines aren't properly secured, their access to Google Cloud resources can become a weak spot in your security posture. Weakened security can lead to several kinds of attacks, including the following:
Pipeline poisoning attacks: Instead of attacking a resource directly, a bad actor might attempt to compromise the deployment pipeline, its configuration, or its underlying infrastructure. Taking advantage of the pipeline's access to Google Cloud, the bad actor could make the pipeline perform malicious actions on Cloud resources, as shown in the following diagram:
Supply chain attacks: Instead of attacking the deployment pipeline, a bad actor might attempt to compromise or replace pipeline input, including source code, libraries, or container images, as shown in the following diagram:
To determine whether your deployment pipelines are appropriately secured, it's insufficient to look only at the allow policies and deny policies of Google Cloud resources in isolation. Instead, you must consider the entire graph of systems that directly or indirectly grant access to a resource. This graph includes the following information:
- The deployment pipeline, its underlying CI/CD system, and its underlying infrastructure
- The source code repository, its underlying servers, and its underlying infrastructure
- Input artifacts, their storage locations, and their underlying infrastructure
- Systems that produce the input artifacts, and their underlying infrastructure
Complex input graphs make it difficult to identify user access to resources and systemic weaknesses.
The following sections describe best practices for designing deployment pipelines in a way that helps you manage the size of the graph, and reduce the risk of lateral movement and supply chain attacks.
Assess security objectives
Your resources on Google Cloud are likely to vary in how sensitive they are. Some resources might be highly sensitive because they're business critical or confidential. Other resources might be less sensitive because they're ephemeral or only intended for testing purposes.
To design a secure deployment pipeline, you must first understand the resources the pipeline needs to access, and how sensitive these resources are. The more sensitive your resources, the more you should focus on securing the pipeline.
The resources accessed by deployment pipelines might include:
- Applications, such as those running on Cloud Run or App Engine
- Cloud resources, such as VM instances or Cloud Storage buckets
- Data, such as Cloud Storage objects, BigQuery records, or files
Some of these resources might have dependencies on other resources, for example:
- Applications might access data, cloud resources, and other applications.
- Cloud resources, such as VM instances or Cloud Storage buckets, might contain applications or data.
As shown in the preceding diagram, dependencies affect how sensitive a resource is. For example, if you use an application that accesses highly sensitive data, typically you should treat that application as highly sensitive. Similarly, if a cloud resource like a Cloud Storage bucket contains sensitive data, then you typically should treat the bucket as sensitive.
Because of these dependencies, it's best to first assess the sensitivity of your data. Once you've assessed your data, you can examine the dependency chain and assess the sensitivity of your Cloud resources and applications.
Categorize the sensitivity of your data
To understand the sensitivity of the data in your deployment pipeline, consider the following three objectives:
- Confidentiality: You must protect the data from unauthorized access.
- Integrity: You must protect the data against unauthorized modification or deletion.
- Availability: You must ensure that authorized people and systems can access the data in your deployment pipeline.
For each of these objectives, ask yourself what would happen if your pipeline were breached:
- Confidentiality: How damaging would it be if data was disclosed to a bad actor, or leaked to the public?
- Integrity: How damaging would it be if data was modified or deleted by a bad actor?
- Availability: How damaging would it be if a bad actor disrupted your data access?
To make the results comparable across resources, it's useful to introduce security categories. Standards for Security Categorization (FIPS-199) suggests using the following four categories:
- High: Damage would be severe or catastrophic
- Moderate: Damage would be serious
- Low: Damage would be limited
- Not applicable: The standard doesn't apply
Depending on your environment and context, a different set of categories could be more appropriate.
The confidentiality and integrity of pipeline data exist on a spectrum, based on the security categories just discussed. The following subsections contain examples of resources with different confidentiality and integrity measurements:
Resources with low confidentiality, but low, moderate, and high integrity
The following resource examples all have low confidentiality:
- Low integrity: Test data
- Moderate integrity: Public web server content, policy constraints for your organization
- High integrity: Container images, disk images, application configurations, access policies (allow and deny lists), liens, access-level data
Resources with medium confidentiality, but low, moderate, and high integrity
The following resource examples all have medium confidentiality:
- Low integrity: Internal web server content
- Moderate integrity: Audit logs
- High integrity: Application configuration files
Resources with high confidentiality, but low, moderate, and high integrity
The following resource examples all have high confidentiality:
- Low integrity: Usage data and personally identifiable information
- Moderate integrity: Secrets
- High integrity: Financial data, KMS keys
Categorize applications based on the data that they access
When an application accesses sensitive data, the application and the deployment pipeline that manages the application can also become sensitive. To qualify that sensitivity, look at the data that the application and the pipeline need to access.
Once you've identified and categorized all data accessed by an application, you can use the following categories to initially categorize the application before you design a secure deployment pipeline:
- Confidentiality: Highest category of any data accessed
- Integrity: Highest category of any data accessed
- Availability: Highest category of any data accessed
This initial assessment provides guidance, but there might be additional factors to consider, for example:
- Two sets of data might have low confidentiality in isolation. But when combined, they could reveal new insights. If an application has access to both sets of data, you might need to categorize it as medium- or high-confidentiality.
- If an application has access to high-integrity data, then you should typically categorize the application as high-integrity. But if that access is read-only, a categorization of high-integrity might be too strict.
For details on a formalized approach to categorize applications, see Guide for Mapping Types of Information and Information Systems to Security Categories (NIST SP 800-60 Vol. 2 Rev1).
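To make the highest-category rule concrete, the following minimal Python sketch computes an application's initial category per objective. The category scale follows FIPS-199; the data structures are illustrative assumptions for this guide, not part of any Google Cloud API.

```python
from dataclasses import dataclass

# Ordered from least to most sensitive, following FIPS-199.
LEVELS = ["not_applicable", "low", "moderate", "high"]

@dataclass
class Category:
    confidentiality: str
    integrity: str
    availability: str

def highest(values):
    """Return the most sensitive level among the given values."""
    return max(values, key=LEVELS.index)

def initial_app_category(data_categories):
    """Initial application category: highest category of any data accessed."""
    return Category(
        confidentiality=highest(c.confidentiality for c in data_categories),
        integrity=highest(c.integrity for c in data_categories),
        availability=highest(c.availability for c in data_categories),
    )

# Example: an app that reads audit logs and public web content.
audit_logs = Category("moderate", "moderate", "low")
web_content = Category("low", "moderate", "low")
print(initial_app_category([audit_logs, web_content]))
# Category(confidentiality='moderate', integrity='moderate', availability='low')
```

You can then adjust the result downward where the additional factors above apply, such as read-only access to high-integrity data.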
Categorize cloud resources based on the data and applications they host
Any data or application that you deploy on Google Cloud is hosted by a Google Cloud resource:
- An application might be hosted by an App Engine service, a VM instance, or a GKE cluster.
- Your data might be hosted by a persistent disk, a Cloud Storage bucket, or a BigQuery dataset.
When a cloud resource hosts sensitive data or applications, the resource and the deployment pipeline that manages the resource can also become sensitive. For example, you should consider a Cloud Run service and its deployment pipeline to be as sensitive as the application that it's hosting.
After categorizing your data and your applications, create an initial security category for the cloud resource. To do so, determine a level from the following categories:
- Confidentiality: Highest category of any data or application hosted
- Integrity: Highest category of any data or application hosted
- Availability: Highest category of any data or application hosted
When making your initial assessment, don't be too strict. For example:
- If you encrypt highly confidential data, treat the encryption key as highly confidential. But, you can use a lower security category for the resource containing the data.
- If you store redundant copies of data, or run redundant instances of the same applications across multiple resources, you can make the category of the resource lower than the category of the data or application it hosts.
Constrain the use of deployment pipelines
If your deployment pipeline needs to access sensitive Google Cloud resources, you must consider its security posture. The more sensitive the resources, the more effort you should put into securing the pipeline. However, you might encounter the following practical limitations:
- When using existing infrastructure or an existing CI/CD system, that infrastructure might constrain the security level you can realistically achieve. For example, your CI/CD system might only support a limited set of security controls, or it might be running on infrastructure that you consider less secure than some of your production environments.
- When setting up new infrastructure and systems to run your deployment pipeline, securing all components in a way that meets your most stringent security requirements might not be cost effective.
To deal with these limitations, it can be useful to set constraints on what scenarios should and shouldn't use deployment pipelines and a particular CI/CD system. For example, the most sensitive deployments are often better handled outside of a deployment pipeline. These deployments could be manual, using a privileged session management system or a privileged access management system, or something else, like tool proxies.
To set your constraints, define which access controls you want to enforce based on your resource categories. Consider the guidance offered in the following table:
| Category of resource | Access controls |
|---|---|
| Low | No approval required |
| Moderate | Team lead must approve |
| High | Multiple leads must approve and actions must be recorded |
Contrast these requirements with the capabilities of your source code management (SCM) and CI/CD systems by asking the following questions, among others:
- Do your SCM or CI/CD systems support the necessary access controls and approval mechanisms?
- Are the controls protected from being subverted if bad actors attack theunderlying infrastructure?
- Is the configuration that defines the controls appropriately secured?
Depending on the capabilities and limitations imposed by your SCM or CI/CD systems, you can then define your data and application constraints for your deployment pipelines. Consider the guidance offered in the following table:
| Category of resource | Constraints |
|---|---|
| Low | Deployment pipelines can be used, and developers can self-approve deployments. |
| Moderate | Deployment pipelines can be used, but a team lead has to approve every commit and deployment. |
| High | Don't use deployment pipelines. Instead, administrators have to use a privileged access management system and session recording. |
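As an illustration of how a pipeline step might enforce such constraints, here's a minimal Python sketch. The approval thresholds mirror the preceding table; the function and its values are hypothetical, not an API of any particular CI/CD system.

```python
# Approvals required before a deployment may proceed, by resource category.
# None means the pipeline must refuse: per the table above, high-category
# resources are managed through privileged access management instead.
APPROVALS_REQUIRED = {"low": 0, "moderate": 1, "high": None}

def authorize_deployment(category: str, approvals: int) -> None:
    required = APPROVALS_REQUIRED[category]
    if required is None:
        raise PermissionError(
            "High-category resources must not be deployed through this "
            "pipeline; use a privileged access management system."
        )
    if approvals < required:
        raise PermissionError(
            f"{category} deployments require {required} approval(s), "
            f"got {approvals}."
        )

authorize_deployment("moderate", approvals=1)  # passes
```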
Maintain resource availability
Using a deployment pipeline to manage resources can impact the availability of those resources and can introduce new risks:
- Causing outages: A deployment pipeline might push faulty code or configuration files, causing a previously working system to break, or data to become unusable.
- Prolonging outages: To fix an outage, you might need to rerun a deployment pipeline. If the deployment pipeline is broken or unavailable for other reasons, that could prolong the outage.
A pipeline that can cause or prolong outages poses a denial of service risk: A bad actor might use the deployment pipeline to intentionally cause an outage.
Create emergency access procedures
When a deployment pipeline is the only way to deploy or configure an application or resource, pipeline availability can become critical. In extreme cases, where a deployment pipeline is the only way to manage a business-critical application, you might also need to consider the deployment pipeline business-critical.
Because deployment pipelines are often made from multiple systems and tools, maintaining a high level of availability can be difficult or uneconomical.
You can reduce the influence of deployment pipelines on availability by creating emergency access procedures. For example, create an alternative access path that can be used if the deployment pipeline isn't operational.
Creating an emergency access procedure typically requires most of the following processes:
- Maintain one or more user accounts with privileged access to relevant Google Cloud resources.
- Store the credentials of emergency-access user accounts in a safe location, or use a privileged access management system to broker access.
- Establish a procedure that authorized employees can follow to access the credentials.
- Audit and review the use of emergency-access user accounts, as in the sketch that follows this list.
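For the audit step, one approach is to periodically query Cloud Audit Logs for any activity performed by an emergency-access account. The following sketch uses the google-cloud-logging Python client; the project ID and account email are placeholders.

```python
from itertools import islice

from google.cloud import logging

client = logging.Client(project="my-project")  # placeholder project ID

# Match audit-log entries recorded for the emergency-access account.
log_filter = (
    'logName:"cloudaudit.googleapis.com" AND '
    'protoPayload.authenticationInfo.principalEmail='
    '"emergency-admin@example.com"'
)

# Review the most recent uses; every entry found here should map to a
# known, documented incident.
entries = client.list_entries(filter_=log_filter, order_by=logging.DESCENDING)
for entry in islice(entries, 20):
    print(entry.timestamp, entry.log_name)
```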
Ensure that input artifacts meet your availability demands
Deployment pipelines typically need to download source code from a central source code repository before they can perform a deployment. If the source code repository isn't available, running the deployment pipeline is likely to fail.
Many deployment pipelines also depend on third-party artifacts. Such artifacts might include libraries from sources such as npm, Maven Central, or the NuGet Gallery, as well as container base images and .deb and .rpm packages. If one of the third-party sources is unavailable, running the deployment pipeline might fail.
To maintain a certain level of availability, you must ensure that the input artifacts of your deployment pipeline all meet the same or higher availability requirements. The following list can help you ensure the availability of input artifacts:
- Limit the number of sources for input artifacts, particularly third-party sources
- Maintain a cache of input artifacts that deployment pipelines can use if source systems are unavailable, as in the sketch that follows
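Here is a minimal sketch of the caching approach, using only the Python standard library. The URL, cache directory, and checksum are placeholders; a production pipeline would more likely use a dedicated artifact cache such as a registry mirror.

```python
import hashlib
import pathlib
import urllib.request

def fetch_artifact(url: str, cache_dir: str, expected_sha256: str) -> bytes:
    """Download an input artifact, falling back to a local cache if the
    source is unavailable. The checksum is verified in both cases."""
    cache_file = pathlib.Path(cache_dir) / hashlib.sha256(url.encode()).hexdigest()
    try:
        data = urllib.request.urlopen(url, timeout=30).read()
    except OSError:  # source unavailable: fall back to the cache
        if not cache_file.exists():
            raise
        data = cache_file.read_bytes()
    if hashlib.sha256(data).hexdigest() != expected_sha256:
        raise ValueError(f"Checksum mismatch for {url}")
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_bytes(data)  # refresh the cache with verified content
    return data
```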
Treat deployment pipelines and their infrastructure like production systems
Deployment pipelines often serve as the connective tissue between development, staging, and production environments. Depending on the environment, they might implement multiple stages:
- In the first stage, the deployment pipeline updates a development environment.
- In the next stage, the deployment pipeline updates a staging environment.
- In the final stage, the deployment pipeline updates the production environment.
When using a deployment pipeline across multiple environments, ensure that the pipeline meets the availability demands of each environment. Because production environments typically have the highest availability demands, you should treat the deployment pipeline and its underlying infrastructure like a production system. In other words, apply the same access control, security, and quality standards to the infrastructure running your deployment pipelines as you do for your production systems.
Limit the scope of deployment pipelines
The more resources that a deployment pipeline can access, the more damage it can cause if compromised. A compromised deployment pipeline that has access to multiple projects or even your entire organization could, in the worst case, cause lasting damage to all your data and applications on Google Cloud.
To help avoid this worst-case scenario, limit the scope of your deployment pipelines. Define the scope of each deployment pipeline so it only needs access to a relatively small number of resources on Google Cloud:
- Instead of granting access at the project level, grant deployment pipelines access only to individual resources, as in the sketch following this list.
- Avoid granting access to resources across multiple Google Cloud projects.
- Split deployment pipelines into multiple stages if they need access to multiple projects or environments. Then, secure the stages individually.
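For example, instead of granting the pipeline's service account a project-wide role, you can bind a role on a single Cloud Storage bucket. The following sketch uses the google-cloud-storage Python client; the bucket name and service account are placeholders.

```python
from google.cloud import storage

client = storage.Client()
bucket = client.bucket("my-app-artifacts")  # placeholder bucket name

# Grant the pipeline's service account access to this one bucket only,
# rather than a project-level role that covers every bucket in the project.
policy = bucket.get_iam_policy(requested_policy_version=3)
policy.bindings.append({
    "role": "roles/storage.objectAdmin",
    "members": {"serviceAccount:pipeline@my-project.iam.gserviceaccount.com"},
})
bucket.set_iam_policy(policy)
```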
Maintain confidentiality
A deployment pipeline must maintain the confidentiality of the data it manages. One of the primary risks related to confidentiality is data exfiltration.
There are multiple ways in which a bad actor might attempt to use a deployment pipeline to exfiltrate data from your Google Cloud resources. These ways include:
- Direct: A bad actor might modify the deployment pipeline or its configuration so that it extracts data from your Google Cloud resources and then copies it elsewhere.
- Indirect: A bad actor might use the deployment pipeline to deploy compromised code, which then steals data from your Google Cloud environment.
You can reduce confidentiality risks by minimizing access to confidential resources. Removing all access to confidential resources might not be practical, however. Therefore, you must design your deployment pipeline to meet the confidentiality demands of the resources it manages. To determine these demands, you can use the following approach:
- Determine the data, applications, and resources the deployment pipeline needs to access, and categorize them.
- Find the resource with the highest confidentiality category and use it as an initial category for the deployment pipeline.
Similar to the categorization process for applications and cloud resources, this initial assessment isn't always appropriate. For example, you might use a deployment pipeline to create resources that will eventually contain highly confidential information. If you restrict the deployment pipeline so that it can create, but can't read, these resources, then a lower confidentiality category might be sufficient.
To maintain confidentiality, the Bell–LaPadula model suggests that a deployment pipeline must not:
- Consume input artifacts of higher confidentiality
- Write data to a resource of lower confidentiality
According to the Bell–LaPadula model, the preceding diagram shows how data should flow in the pipeline to help ensure data confidentiality.
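Here is a minimal sketch of these two rules as a pre-deployment check. The three-level scale is the hypothetical categorization from earlier in this guide, not a library API.

```python
LEVELS = {"low": 0, "moderate": 1, "high": 2}

def check_confidentiality_flow(pipeline: str, source: str, target: str) -> None:
    """Bell-LaPadula style check for a pipeline that reads from `source`
    and writes to `target`, given each element's confidentiality level."""
    if LEVELS[source] > LEVELS[pipeline]:
        raise PermissionError(
            "Pipeline must not consume input of higher confidentiality.")
    if LEVELS[target] < LEVELS[pipeline]:
        raise PermissionError(
            "Pipeline must not write to a resource of lower confidentiality.")

check_confidentiality_flow("moderate", source="low", target="high")  # passes
```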
Don't let deployment pipelines read data they don't need
Deployment pipelines often don't need access to data, but they might still have it. Such over-granting of access can result from:
- Granting incorrect access permissions. A deployment pipeline might be granted access to Cloud Storage on the project level, for example. As a result, the deployment pipeline can access all Cloud Storage buckets in the project, although access to a single bucket might be sufficient.
- Using an overly permissive role. A deployment pipeline might be granted a role that provides full access to Cloud Storage, for example. However, the permission to create new buckets would suffice.
The more data that a pipeline can access, the higher the risk that someone or something can steal your data. To help minimize this risk, avoid granting deployment pipelines access to any data that they don't need. Many deployment pipelines don't need data access at all, because their sole purpose is to manage configuration or software deployments.
Don't let deployment pipelines write to locations they don't need
To remove data, a bad actor needs access and a way to transfer the data out of your environment. The more storage and network locations a deployment pipeline can send data to, the more likely it is that a bad actor can use one of those locations for exfiltration.
You can help reduce risk by limiting the number of network and storage locations where a pipeline can send data:
- Revoke write access to resources that the pipeline doesn't need, even if the resources don't contain any confidential data.
- Block internet access, or restrict connections to an allow-listed set of network locations.
Restricting outbound access is particularly important for pipelines that you've categorized as moderately confidential or highly confidential because they have access to confidential data or cryptographic key material.
Use VPC Service Controls to help prevent compromised deployments from stealing data
Instead of letting the deployment pipeline perform data exfiltration, a bad actor might attempt to use the deployment pipeline to deploy compromised code. That compromised code can then steal data from within your Google Cloud environment.
You can help reduce the risk of such data-theft threats by using VPC Service Controls. VPC Service Controls let you restrict the set of resources and APIs that can be accessed from within certain Google Cloud projects.
Maintain integrity
To keep your Google Cloud environment secure, you must protect its integrity. This includes:
- Preventing unauthorized modification or deletion of data or configuration
- Preventing untrusted code or configuration from being deployed
- Ensuring that all changes leave a clear audit trail
Deployment pipelines can help you maintain the integrity of your environment by letting you:
- Implement approval processes, for example in the form of code reviews
- Enforce a consistent process for all configuration or code changes
- Run automated tests or quick checks before each deployment
For these measures to be effective, you must ensure that bad actors can't undermine or sidestep them. To prevent such activity, you must protect the integrity of:
- The deployment pipeline and its configuration
- The underlying infrastructure
- All inputs consumed by the deployment pipeline
To prevent the deployment pipeline from becoming vulnerable, try to ensure that the integrity standards of the deployment pipeline match or exceed the integrity demands of the resources it manages. To determine these demands, you can use the following approach:
- Determine the data, applications, and resources the deployment pipeline needs to access, and categorize them.
- Find the resource with the highest integrity category and use it as the category for the deployment pipeline.
To maintain the integrity of the deployment pipeline, the Biba model suggests that:
- The deployment pipeline must not consume input artifacts of lower integrity.
- The deployment pipeline must not write data to a resource of higher integrity.
According to the Biba model, the preceding diagram shows how data should flow in the pipeline to help ensure data integrity.
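The Biba rules are the mirror image of the Bell–LaPadula check sketched earlier. Expressed the same way, with the inequalities flipped:

```python
LEVELS = {"low": 0, "moderate": 1, "high": 2}

def check_integrity_flow(pipeline: str, source: str, target: str) -> None:
    """Biba-style check: inputs must not be of lower integrity than the
    pipeline, and the pipeline must not write to higher-integrity targets."""
    if LEVELS[source] < LEVELS[pipeline]:
        raise PermissionError(
            "Pipeline must not consume input artifacts of lower integrity.")
    if LEVELS[target] > LEVELS[pipeline]:
        raise PermissionError(
            "Pipeline must not write to a resource of higher integrity.")

check_integrity_flow("moderate", source="high", target="low")  # passes
```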
Verify the authenticity of input artifacts
Many deployment pipelines consume artifacts from third-party sources. Such artifacts might include:
- Docker base images
- .rpm or .deb packages
- Maven, npm, or NuGet libraries
A bad actor might attempt to modify your deployment pipeline so that it uses compromised versions of third-party artifacts by:
- Compromising the repository that stores the artifacts
- Modifying the deployment pipeline's configuration to use a different source repository
- Uploading malicious packages with similar names, or names that contain typos
Many package managers let you verify the authenticity of a package by supporting code-signing. For example, you can use PGP to sign RPM and Maven packages. You can use Authenticode to sign NuGet packages.
You can use code-signing to reduce the risk of falling victim to compromised third-party packages by:
- Requiring that all third-party artifacts are signed
- Maintaining a curated list of trusted publisher certificates or public keys
- Letting the deployment pipeline verify the signature of third-party artifacts against the trusted publishers list
Alternatively, you can verify the hashes of artifacts. You can use this approach for artifacts that don't support code-signing and change infrequently.
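For the hash-verification approach, a pipeline step can pin expected digests and check downloaded artifacts against them. Here's a standard-library sketch; the file name and digest are placeholders.

```python
import hashlib

# Pinned digests for artifacts that change infrequently. The entry below
# is a placeholder; record real digests out of band, from a trusted source.
PINNED_SHA256 = {
    "third-party-tool-1.2.3.tar.gz": "0123abcd...",  # placeholder digest
}

def verify_artifact(path: str) -> None:
    """Fail the pipeline if an artifact's digest doesn't match the pin."""
    expected = PINNED_SHA256[path.rsplit("/", 1)[-1]]
    with open(path, "rb") as f:
        actual = hashlib.sha256(f.read()).hexdigest()
    if actual != expected:
        raise ValueError(f"{path}: digest {actual} doesn't match pinned value")
```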
Ensure that underlying infrastructure meets your integrity demands
Instead of compromising the deployment pipeline itself, bad actors might attempt to compromise its infrastructure, including:
- The CI/CD software that runs the deployment pipeline
- The tools used by the pipeline, for example Terraform, kubectl, or Docker
- The operating system and all its components
Because the infrastructure that underlies deployment pipelines is often complex and might contain components from various vendors or sources, this type of security breach can be difficult to detect.
You can help reduce the risk of compromised infrastructure by:
- Holding the infrastructure and all its components to the same integrity standards as the deployment pipeline and the Google Cloud resources that it manages
- Making sure tools come from a trusted source and verifying their authenticity
- Regularly rebuilding infrastructure from scratch
- Running the deployment pipeline on Shielded VMs
Apply integrity controls in the pipeline
While bad actors are a threat, they aren't the only possible source of software or configuration changes that can impair the integrity of your Google Cloud environment. Such changes can also originate from developers: they might be accidental, stem from unawareness, or result from typos and other mistakes.
You can help reduce the risk of inadvertently applying risky changes by configuring deployment pipelines to apply additional integrity controls, one of which is sketched after the following list. Such controls can include:
- Performing static analysis of code and configuration
- Requiring all changes to pass a set of rules (policy as code)
- Limiting the number of changes that can be done at the same time
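As one example of such a control, a pipeline step can inspect a Terraform plan (exported with `terraform show -json`) and reject risky changes before they're applied. The thresholds and protected resource types below are illustrative assumptions.

```python
import json
import sys

MAX_CHANGES = 10  # illustrative limit on simultaneous changes
PROTECTED_TYPES = {"google_kms_crypto_key", "google_storage_bucket"}

def check_plan(path: str) -> None:
    """Policy-as-code style check over a Terraform plan in JSON format."""
    with open(path) as f:
        plan = json.load(f)
    changes = [
        rc for rc in plan.get("resource_changes", [])
        if rc["change"]["actions"] not in (["no-op"], ["read"])
    ]
    if len(changes) > MAX_CHANGES:
        sys.exit(f"Plan touches {len(changes)} resources; limit is {MAX_CHANGES}.")
    for rc in changes:
        if "delete" in rc["change"]["actions"] and rc["type"] in PROTECTED_TYPES:
            sys.exit(f"Refusing to delete protected resource {rc['address']}.")

# Usage: terraform show -json plan > plan.json, then run this check.
check_plan("plan.json")
```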
What's next
- Learn about our best practices for using service accounts in deployment pipelines.
- Review our best practices for securing service accounts.
- Learn more about Investigating and responding to threats.
- For more reference architectures, diagrams, and best practices, explore the Cloud Architecture Center.