Data deletion on Google Cloud
This content was last updated in May 2024, and represents the status quo as of the time it was written. Google's security policies and systems may change going forward, as we continually improve protection for our customers.
This document gives an overview of the secured process that occurs when you delete your customer data on Google Cloud. As defined in the Google Cloud Terms of Service, customer data is data that is provided to Google by customers or end users through the services under the account.
This document describes how customer data is stored in Google Cloud, the deletion pipeline, and how we prevent any reconstruction of data that is stored in our platform.
For information about our data deletion commitments, see Cloud Data Processing Addendum (Customers).
Data storage and replication
Google Cloud offers storage services and database services such as Bigtable and Spanner. Most Google Cloud applications and services access Google's storage infrastructure indirectly using these cloud services.
Data replication is critical to achieving low-latency, highly available, scalable, and durable solutions. Redundant copies of customer data can be stored locally and regionally and even globally, depending on your configuration and the demands of your projects. Actions taken on data in Google Cloud may be simultaneously replicated in multiple data centers, so that customer data is highly available. When performance-impacting changes occur in the hardware, software, or network environment, customer data is automatically shifted from one system or facility to another, subject to customers' configuration settings, so that customer projects continue performing at scale and without interruption.
At the physical storage level, customer data is stored at rest in two types of systems: active storage systems and backup storage systems. These two types of systems process data differently. Active storage systems are Google Cloud's production servers that run Google's application and storage layers. Active systems are mass arrays of disks and drives used to write new data as well as store and retrieve data in multiple replicated copies. Active storage systems are optimized to perform live read and write operations on customer data at speed and scale.
Google's backup storage systems store full and incremental copies of Google's active systems for a defined period of time to help Google recover data and systems in the event of a catastrophic outage or disaster. Unlike active systems, backup systems are designed to receive periodic snapshots of Google systems, and backup copies are retired after a limited window of time as new backup copies are made.
Throughout the storage systems described above, customer data is encrypted when stored at rest. For more information, see Default encryption at rest.
Data deletion pipeline
After customer data is stored in Google Cloud, our systems are designed to store the data securely until the data deletion pipeline completes its stages. This section describes the deletion stages.
Stage 1: Deletion request
The deletion of customer data begins when you initiate a deletion request. Generally, a deletion request is directed to a specific resource, a Google Cloud project, or your Google account. Deletion requests might be handled in different ways depending on the scope of your request:
- Resource deletion: Individual resources containing customer data, such as Cloud Storage buckets, can be deleted in a number of ways from the Google Cloud console or by using the API. For example, you can issue a remove bucket or gcloud storage rm command to delete a storage bucket through the command line, or you can select a storage bucket and delete it from the Google Cloud console.
- Project deletion: As a Google Cloud project owner, you can shut down a project. Deleting a project acts as a bulk deletion request for all resources tied to the corresponding project number.
- Google account deletion: When you delete your Google account, it deletes all projects that aren't associated with an organization and that are solely owned by you. When there are multiple owners for a non-organization project, the project is not deleted until all owners are removed from the project or delete their Google accounts. This process ensures that projects continue so long as they have an owner.
- Google Workspace or Cloud Identity account deletion: Organizations that are bound to a Google Workspace or Cloud Identity account are deleted when you delete a Google Workspace or Cloud Identity account. For more information, see Delete your organization's Google Account.
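The Google account deletion rule in the list above can be read as a simple predicate over project ownership. The following is a minimal sketch of that rule only. The Project class, its fields, and the function name are hypothetical names introduced for illustration; they are not part of any Google Cloud API.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    """Hypothetical, simplified model of a project (illustration only)."""
    project_id: str
    owners: set[str] = field(default_factory=set)
    in_organization: bool = False


def projects_deleted_with_account(account: str, projects: list[Project]) -> list[str]:
    """Return IDs of projects that would be deleted along with `account`.

    Encodes only the rule stated in the text: a project is deleted when it
    is not associated with an organization and `account` is its sole owner.
    """
    return [
        p.project_id
        for p in projects
        if not p.in_organization and p.owners == {account}
    ]


projects = [
    Project("solo-project", owners={"alice@example.com"}),
    Project("shared-project", owners={"alice@example.com", "bob@example.com"}),
    Project("org-project", owners={"alice@example.com"}, in_organization=True),
]

print(projects_deleted_with_account("alice@example.com", projects))
# ['solo-project']
```

In this sketch, the shared project survives because it still has another owner, and the organization project is outside the scope of the rule.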
You use deletion requests primarily to manage your data. However, Google can issue deletion requests automatically; for instance, when you end your relationship with Google.
Stage 2: Soft deletion
Soft deletion is the point in the process that provides a brief internal staging and recovery period, which helps ensure that there is time to recover any data that was marked for deletion by accident or error. Individual Google Cloud products might adopt and configure such a defined recovery period before the data is deleted from the underlying storage systems, so long as the period fits within Google's overall deletion timeline.
When projects are deleted, Google Cloud first identifies the unique project number, then broadcasts a suspension signal to the Google Cloud products (for example, Compute Engine and Bigtable) that contain that project number. In this case, Compute Engine suspends operations that are keyed to that project number and the relevant tables in Bigtable enter an internal recovery period of up to 30 days. At the end of the recovery period, Google Cloud broadcasts a signal to the same products to begin logical deletion of resources tied to the unique project number. Then Google waits (and, when necessary, rebroadcasts the signal) to collect an acknowledgement signal (ACK) from the applicable products to complete project deletion.
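The broadcast, wait, and rebroadcast pattern described above can be sketched as a loop that keeps signaling products until each one acknowledges. This is a generic illustration and not Google's implementation: the Product callable, the flaky_product helper, and the max_rounds limit are all assumptions made for the example.

```python
from typing import Callable

# Hypothetical stand-in for a product endpoint: it receives a project number
# and returns True if the product acknowledged (ACK) the deletion signal.
Product = Callable[[int], bool]


def broadcast_until_acked(
    project_number: int,
    products: dict[str, Product],
    max_rounds: int = 5,
) -> set[str]:
    """Signal every product, rebroadcasting to those that have not yet
    acknowledged. Returns the names of products that never acknowledged."""
    pending = set(products)
    for _ in range(max_rounds):
        if not pending:
            break
        # A product stays pending only if it did not return an ACK.
        pending = {name for name in pending if not products[name](project_number)}
    return pending


def flaky_product(fail_first: int) -> Product:
    """Build a fake product that misses the first `fail_first` signals."""
    remaining = {"failures": fail_first}

    def handle(project_number: int) -> bool:
        if remaining["failures"] > 0:
            remaining["failures"] -= 1
            return False
        return True

    return handle


products = {
    "compute": flaky_product(0),   # acknowledges on the first signal
    "bigtable": flaky_product(2),  # acknowledges on the third signal
}
unacked = broadcast_until_acked(123456789, products)
print(unacked)  # set()
```

Project deletion counts as complete in this sketch only when the returned set is empty; a non-empty result would call for further retries or investigation.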
When a Google account is closed, Google Cloud might impose an internal recovery period of up to 30 days, depending on past account activity. After that grace period expires, a signal that contains the deleted billing account user ID is broadcast to Google products, and Google Cloud resources tied solely to that user ID are marked for deletion.
Stage 3: Logical deletion from active systems
After the data is marked for deletion and any recovery period has expired, the data is deleted successively from Google's active and backup storage systems. On active systems, data is deleted in two ways.
In all Google Cloud products under the Compute, Storage, and Database project categories, except Cloud Storage, copies of the deleted data are marked as available storage and overwritten over time. In an active storage system like Bigtable, deleted data is stored as entries within a massive structured table. Compacting existing tables to overwrite deleted data can be expensive, as it requires re-writing tables of existing (non-deleted) data, so mark-and-sweep garbage collection and major compaction events are scheduled to occur at regular intervals to reclaim storage space and overwrite deleted data.
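To illustrate why deleted data is reclaimed by scheduled compaction instead of immediately, the following toy table appends a deletion marker on delete and removes the old bytes only when the whole log is rewritten. It is a deliberately simplified sketch and does not reflect how Bigtable is implemented; the ToyTable class and its methods are hypothetical.

```python
class ToyTable:
    """Toy append-only key-value table with deletion markers (illustration only)."""

    _TOMBSTONE = object()  # marker meaning "this key was deleted"

    def __init__(self) -> None:
        self._log: list[tuple[str, object]] = []  # append-only (key, value) entries

    def put(self, key: str, value: bytes) -> None:
        self._log.append((key, value))

    def delete(self, key: str) -> None:
        # Logical delete: append a marker. The old bytes remain in the log.
        self._log.append((key, self._TOMBSTONE))

    def get(self, key: str):
        # The most recent entry for a key wins.
        for k, v in reversed(self._log):
            if k == key:
                return None if v is self._TOMBSTONE else v
        return None

    def compact(self) -> None:
        """Rewrite the log, keeping only the latest live value per key."""
        latest: dict[str, object] = {}
        for k, v in self._log:
            latest[k] = v
        self._log = [(k, v) for k, v in latest.items() if v is not self._TOMBSTONE]

    def stored_entries(self) -> int:
        return len(self._log)


table = ToyTable()
table.put("row1", b"customer data")
table.put("row2", b"other data")
table.delete("row1")

print(table.get("row1"))       # None: hidden from reads right away
print(table.stored_entries())  # 3: the deleted bytes are still stored
table.compact()
print(table.stored_entries())  # 1: only row2 survives the rewrite
```

Note that compact() has to touch every live entry to drop one deleted entry, which is the cost that motivates running compaction at intervals.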
In Cloud Storage, customer data is also deleted through cryptographic erasure. This is an industry standard technique that renders data unreadable by deleting the encryption keys that are needed to decrypt that data. One advantage of using cryptographic erasure, whether it involves Google-supplied or customer-supplied encryption keys, is that logical deletion can be completed even before all deleted blocks of that data are overwritten in Google Cloud's active and backup storage systems.
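The idea behind cryptographic erasure can be shown with a small sketch: ciphertext and keys are kept separately, and deleting the key alone makes the data unreadable even though the ciphertext bytes still exist. The ToyEncryptedStore class is hypothetical, and the XOR one-time pad is used only to keep the example dependency-free; production systems rely on vetted ciphers and key management services, not on code like this.

```python
import secrets


class ToyEncryptedStore:
    """Toy store that keeps ciphertext and keys separately (illustration only)."""

    def __init__(self) -> None:
        self._ciphertexts: dict[str, bytes] = {}
        self._keys: dict[str, bytes] = {}

    def put(self, name: str, plaintext: bytes) -> None:
        # One random key byte per data byte (one-time pad, toy cipher).
        key = secrets.token_bytes(len(plaintext))
        self._keys[name] = key
        self._ciphertexts[name] = bytes(p ^ k for p, k in zip(plaintext, key))

    def get(self, name: str) -> bytes:
        key = self._keys.get(name)
        if key is None:
            raise KeyError(f"no key for {name!r}: data is unreadable")
        return bytes(c ^ k for c, k in zip(self._ciphertexts[name], key))

    def crypto_erase(self, name: str) -> None:
        # Deleting only the key completes the logical deletion. The
        # ciphertext blocks can be overwritten later by normal reclamation.
        del self._keys[name]

    def has_ciphertext(self, name: str) -> bool:
        return name in self._ciphertexts


store = ToyEncryptedStore()
store.put("object-1", b"customer data")
assert store.get("object-1") == b"customer data"

store.crypto_erase("object-1")
print(store.has_ciphertext("object-1"))  # True: the bytes are still on "disk"
try:
    store.get("object-1")
except KeyError:
    print("object-1 can no longer be decrypted")
```

The same reasoning explains why logical deletion can complete before every block is overwritten: without the key, the remaining ciphertext carries no recoverable information.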
Stage 4: Expiration from backup systems
Similar to deletion from Google's active systems, deleted data is eliminated from backup systems using both overwriting and cryptographic techniques. In the case of backup systems, however, customer data is typically stored within large aggregate snapshots of active systems that are retained for static periods of time to ensure business continuity in the event of a disaster (for example, an outage affecting an entire data center), when the time and expense of restoring a system entirely from backup systems might become necessary. Consistent with reasonable business continuity practices, full and incremental snapshots of active systems are made on daily, weekly, and monthly cycles and retired after a predefined period of time to make room for the newest snapshots.
When a backup is retired, it is marked as available space and overwritten as new daily, weekly, or monthly backups are performed.
Note that any reasonable backup cycle imposes a pre-defined delay in propagating a data deletion request through backup systems. When customer data is deleted from active systems, it is no longer copied into backup systems. Backups that were performed before deletion are expired regularly based on the pre-defined backup cycle.
Finally, cryptographic erasure of the deleted data might occur before the backup that contains customer data has expired. Without the encryption key that was used to encrypt specific customer data, the customer data is unrecoverable even during its remaining lifespan on Google's backup systems.
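The delay that a backup cycle imposes can be sketched with simple date arithmetic: only backups taken before the deletion can contain the data, and each one stops mattering once its retention period ends. The retention values below are hypothetical and chosen only for illustration; they are not Google's actual backup retention periods.

```python
from datetime import date, timedelta

# Hypothetical retention periods, chosen only for illustration.
RETENTION = {
    "daily": timedelta(days=7),
    "weekly": timedelta(days=35),
    "monthly": timedelta(days=150),
}


def backups_still_holding_data(backups, deleted_on, today):
    """Return backups that were taken before the deletion and have not expired."""
    return [
        (kind, taken)
        for kind, taken in backups
        if taken < deleted_on and taken + RETENTION[kind] > today
    ]


backups = [
    ("monthly", date(2024, 5, 1)),
    ("weekly", date(2024, 5, 5)),
    ("daily", date(2024, 5, 9)),
    ("daily", date(2024, 5, 11)),  # taken after deletion: never held the data
]
deleted_on = date(2024, 5, 10)

print(backups_still_holding_data(backups, deleted_on, today=date(2024, 5, 20)))
print(backups_still_holding_data(backups, deleted_on, today=date(2024, 10, 1)))
```

With these assumed values, the first call still reports the monthly and weekly backups, while the second call reports none, because every backup that predates the deletion has expired.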
Deletion timeline
Google Cloud is engineered to achieve a high degree of speed, availability, durability, and consistency. The design of systems optimized for these performance attributes must be balanced carefully with the need to achieve timely data deletion. Google Cloud commits to delete customer data within a maximum period of about six months (180 days). This commitment incorporates the stages of Google's deletion pipeline described above, including the following:
- Stage 2: After the deletion request is made, data is typically marked for deletion immediately and our goal is to perform this step within a maximum period of 24 hours. After the data is marked for deletion, an internal recovery period of up to 30 days might apply depending on the service or deletion request.
- Stage 3: The time needed to complete garbage collection tasks and achieve logical deletion from active systems. These processes might occur immediately after the deletion request is received, depending on the level of data replication and the timing of ongoing garbage collection cycles. After the deletion request is made, it generally takes about two months to delete data from active systems, which is typically enough time to complete two major garbage collection cycles and ensure that logical deletion is completed.
- Stage 4: The Google backup cycle is designed to expire deleted data within data center backups within six months of the deletion request. Deletion may occur sooner depending on the level of data replication and the timing of Google's ongoing backup cycles.
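As a worked example of the stages above, the following sketch turns the stated upper bounds into calendar dates for a given deletion request. The function and key names are hypothetical, and "about two months" is approximated as 60 days, which is an assumption of this sketch and not a figure from the text.

```python
from datetime import datetime, timedelta

# Upper bounds taken from the text. ACTIVE_SYSTEMS approximates
# "about two months" as 60 days (an assumption of this sketch).
MARK_FOR_DELETION = timedelta(hours=24)
RECOVERY_PERIOD = timedelta(days=30)
ACTIVE_SYSTEMS = timedelta(days=60)
BACKUP_SYSTEMS = timedelta(days=180)


def deletion_milestones(requested_at: datetime) -> dict[str, datetime]:
    """Return the latest expected date for each stage of the pipeline."""
    return {
        "marked_for_deletion_by": requested_at + MARK_FOR_DELETION,
        # The recovery period starts once the data is marked for deletion.
        "recovery_period_ends_by": requested_at + MARK_FOR_DELETION + RECOVERY_PERIOD,
        "deleted_from_active_systems_by": requested_at + ACTIVE_SYSTEMS,
        "expired_from_backups_by": requested_at + BACKUP_SYSTEMS,
    }


milestones = deletion_milestones(datetime(2024, 5, 1, 12, 0))
for name, when in milestones.items():
    print(f"{name}: {when:%Y-%m-%d %H:%M}")
# marked_for_deletion_by: 2024-05-02 12:00
# recovery_period_ends_by: 2024-06-01 12:00
# deleted_from_active_systems_by: 2024-06-30 12:00
# expired_from_backups_by: 2024-10-28 12:00
```

These dates are maximums under the stated assumptions; as the list notes, each stage can finish sooner depending on replication and cycle timing.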
The following diagram shows the stages of Google Cloud's deletion pipeline and when data is erased from active and backup systems.
Ensure safe and secure media sanitization
A disciplined media sanitization program enhances the security of the deletion process by preventing forensic or laboratory attacks on the physical storage media after it has reached the end of its lifecycle.
Google meticulously tracks the location and status of all storage equipment within our data centers, through acquisition, installation, retirement, and destruction, using barcodes and asset tags that are tracked in Google's asset database. Various techniques such as biometric identification, metal detection, cameras, vehicle barriers, and laser-based intrusion detection systems are used to prevent equipment from leaving the data center floor without authorization. For more information, see the Google infrastructure security design overview.
Physical storage media can be decommissioned for a range of reasons. If a component fails to pass a performance test at any point during its lifecycle, it is removed from inventory and retired. Google also upgrades obsolete hardware to improve processing speed and energy efficiency, or increase storage capacity. Whether hardware is decommissioned due to failure, upgrade, or any other reason, storage media is decommissioned using appropriate safeguards. Google hard drives use technologies like full disk encryption (FDE) and drive locking to help protect data at rest during decommission. When a hard drive is retired, authorized individuals verify that the disk is erased by overwriting the drive with zeros and performing a multi-step verification process to ensure the drive contains no data.
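The overwrite-then-verify pattern can be sketched in a few lines. This example operates on an ordinary temporary file as a stand-in for a drive, and the function names are hypothetical. It is not Google's sanitization procedure, and overwriting a file through a filesystem does not guarantee that the underlying physical blocks are erased, so treat it only as an illustration of the two steps.

```python
import os
import tempfile


def overwrite_with_zeros(path: str, chunk_size: int = 1024 * 1024) -> None:
    """Overwrite the existing contents of `path` with zero bytes, in place."""
    remaining = os.path.getsize(path)
    with open(path, "r+b") as f:
        while remaining > 0:
            n = min(chunk_size, remaining)
            f.write(b"\x00" * n)
            remaining -= n
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the writes to storage


def verify_all_zeros(path: str, chunk_size: int = 1024 * 1024) -> bool:
    """Read the file back and confirm that every byte is zero."""
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            if any(chunk):
                return False
    return True


# Stand-in for a retired drive: a temporary file holding some data.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"customer data" * 1000)
    drive_path = tmp.name

print(verify_all_zeros(drive_path))  # False: data is still present
overwrite_with_zeros(drive_path)
print(verify_all_zeros(drive_path))  # True: read-back finds only zeros
os.remove(drive_path)
```

Keeping the verification as a separate read-back pass mirrors the text: the erase is not trusted until an independent check confirms that no data remains.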
If the storage media cannot be erased for any reason, it is stored securely until it can be physically destroyed. Depending on available equipment, we either crush and deform the drive or shred the drive into small pieces. In either case, the disk is recycled at a secure facility, ensuring that no one will be able to read data on retired Google disks. Each data center adheres to a strict disposal policy and uses the techniques described to achieve compliance with NIST SP 800-88 Revision 1, Guidelines for Media Sanitization, and DoD 5220.22-M, National Industrial Security Program Operating Manual.
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies.