Transcoding of gzip-compressed files

This page discusses the conversion of files to and from a gzip-compressed state. The page includes an overview of transcoding, best practices for working with associated metadata, and compressed file behavior in Cloud Storage.

Transcoding and gzip

gzip is a form of data compression: it typically reduces the size of a file. This allows the file to be transferred faster and stored using less space than if it were not compressed. Compressing a file can reduce both cost and transfer time.

Transcoding, in Cloud Storage, is the automatic changing of a file's compression before it's served to a requester. When transcoding results in a file becoming gzip-compressed, it can be considered compressive, whereas when the result is a file that is no longer gzip-compressed, it can be considered decompressive. Cloud Storage supports the decompressive form of transcoding.
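
To make the size reduction concrete, the following minimal sketch uses Python's standard `gzip` module. The sample payload is hypothetical, and the actual savings depend entirely on the data being compressed.

```python
import gzip

# Hypothetical sample payload: repetitive plain text compresses well.
original = b"timestamp=2024-01-01 level=INFO message=request served\n" * 2000

compressed = gzip.compress(original)     # the form you would store
restored = gzip.decompress(compressed)   # the form a decompressed response carries

print(f"original:   {len(original)} bytes")
print(f"compressed: {len(compressed)} bytes")
print(f"round trip intact: {restored == original}")
```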

Cloud Storage does not support decompressive transcoding for Brotli-compressed objects.

Decompressive transcoding

Decompressive transcoding allows you to store compressed versions of files in Cloud Storage, which reduces at-rest storage costs, while still serving the file itself to the requester, without any compression. This is useful, for example, when serving files to customers.

In order for decompressive transcoding to occur, an object must meet two criteria:

  1. The file is gzip-compressed when stored in Cloud Storage.

  2. The object's metadata includes Content-Encoding: gzip.

When an object meets these two criteria, it undergoes decompressive transcoding when served, and the response containing the object does not contain a Content-Encoding or Content-Length header.

There are two ways to prevent decompressive transcoding from occurring for an object that is otherwise eligible:

  • If the request for the object includes an Accept-Encoding: gzip header, the object is served as-is in that specific request, along with a Content-Encoding: gzip response header.

  • If the Cache-Control metadata field for the object is set to no-transform, the object is served as a compressed object in all subsequent requests, regardless of any Accept-Encoding request headers.

Preventing decompressive transcoding is useful, for example, if you want to reduce outbound data transfer cost or time, or if you want to validate that the downloaded objects have the expected crc32c/md5 checksums.
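
The rules in this section can be summarized as a small decision function. This is only an illustrative model of the behavior described above, not Cloud Storage code: the name `plan_response` and the dictionary shapes are hypothetical, header parsing is simplified (quality values such as `gzip;q=0` are ignored), and the stored bytes are assumed to really be gzip-compressed whenever the metadata says so.

```python
def plan_response(metadata: dict, request_headers: dict) -> dict:
    """Model whether one request is served with decompressive transcoding.

    `metadata` holds object metadata, `request_headers` the request headers.
    Both are plain dicts keyed by header name (hypothetical shapes).
    """
    content_encoding = metadata.get("Content-Encoding", "").strip().lower()
    cache_control = metadata.get("Cache-Control", "").lower()
    accept_encoding = request_headers.get("Accept-Encoding", "").lower()

    # Criterion from the list above: metadata declares gzip.
    eligible = content_encoding == "gzip"
    # Opt-out 1: Cache-Control: no-transform on the object.
    no_transform = "no-transform" in [d.strip() for d in cache_control.split(",")]
    # Opt-out 2: the requester accepts gzip for this specific request.
    accepts_gzip = "gzip" in [
        token.split(";")[0].strip() for token in accept_encoding.split(",")
    ]

    if eligible and not no_transform and not accepts_gzip:
        # Transcoded: served decompressed, without a Content-Encoding header.
        return {"transcoded": True, "response_headers": {}}

    # Served as stored; a gzip object keeps its Content-Encoding header.
    headers = {"Content-Encoding": "gzip"} if eligible else {}
    return {"transcoded": False, "response_headers": headers}


print(plan_response({"Content-Encoding": "gzip"}, {}))
print(plan_response({"Content-Encoding": "gzip"}, {"Accept-Encoding": "gzip"}))
```

The model covers only the two opt-outs listed above; it does not attempt to capture the special handling of doubly compressed objects.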

Considerations

Keep in mind the following when working with decompressive transcoding:

  • Decompressive transcoding invalidates integrity checking. If requesters of your data rely on the checksum for integrity checking, you shouldn't use decompressive transcoding.

  • Decompressive transcoding allows you to store objects in Cloud Storage in a compressed state, saving space and costs. However, charges for downloading the object are based on its decompressed size, because that is the size of the served object.

  • When accessed from within a Cloud Storage FUSE-mounted bucket, objects do not undergo decompressive transcoding and are read as compressed.
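
The first two considerations can be reproduced locally. The sketch below assumes that the checksum recorded for an object is computed over the bytes as stored (that is, the compressed bytes), and it uses a base64-encoded MD5 as the example checksum. The payload and the helper name `md5_b64` are hypothetical.

```python
import base64
import gzip
import hashlib


def md5_b64(data: bytes) -> str:
    """Return the base64-encoded MD5 digest of `data`."""
    return base64.b64encode(hashlib.md5(data).digest()).decode("ascii")


original = b"hello, transcoding\n" * 500   # hypothetical plain-text payload
stored = gzip.compress(original)           # bytes kept at rest
stored_md5 = md5_b64(stored)               # checksum of the stored (compressed) bytes

served = gzip.decompress(stored)           # bytes a transcoded response delivers

# The requester hashes what it received, which no longer matches the stored checksum.
print("checksum matches after transcoding:", md5_b64(served) == stored_md5)
# The served (decompressed) size is what download charges are based on.
print("stored size:", len(stored), "served size:", len(served))
```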

Content-Type vs. Content-Encoding

There are several behaviors that you should be aware of concerning how Content-Type and Content-Encoding relate to transcoding. Both are metadata stored along with an object. See Viewing and Editing Object Metadata for step-by-step instructions on how to add metadata to objects.

Content-Type should be included in all uploads and indicates the type of object being uploaded. For example:

Content-Type: text/plain

indicates that the uploaded object is a plain-text file. While there is no check to guarantee the specified Content-Type matches the true nature of an uploaded object, incorrectly specifying its type will at best cause requesters to receive something other than what they were expecting and could lead to unintended behaviors.

Content-Encoding is optional and can, if desired, be included in the upload of files that are compressed. For example:

Content-Encoding: gzip

indicates that the uploaded object is gzip-compressed. As with Content-Type, there is no check to guarantee the specified Content-Encoding is actually applied to the uploaded object, and incorrectly specifying an object's encoding could lead to unintended behavior on subsequent download requests.

Good practices

  • When uploading a gzip-compressed object, the recommended way to set your metadata is to specify both the Content-Type and Content-Encoding. For example, for a compressed, plain-text file:

    Content-Type: text/plain
    Content-Encoding: gzip

    This gives the most information about the state of the object to anyone accessing it. Doing so also makes the object eligible for decompressive transcoding when it is later downloaded, allowing client applications to handle the semantics of the Content-Type correctly.

    Note: To automatically gzip and set the Content-Encoding metadata of files you upload, you can include the --gzip-local flag when using gcloud storage cp. This method sets Cache-Control: no-transform on the uploaded objects' metadata, so if you want the objects to be eligible for decompressive transcoding, you must edit the object metadata to remove no-transform.

  • Alternatively, you can upload the object with the Content-Type set to indicate compression and NO Content-Encoding at all. For example:

    Content-Type: application/gzip

    However, in this case the only thing immediately known about the object is that it is gzip-compressed, with no information regarding the underlying object type. Moreover, the object is not eligible for decompressive transcoding.

Discouraged practices

  • While it is possible to do so, a file that is gzip-compressed should not be uploaded with the compressed nature of the file omitted. For example, for a gzip-compressed plain-text file, you should avoid only setting Content-Type: text/plain. Doing so misrepresents the state of the object as it will be delivered to a requester.

  • Similarly, objects should not be uploaded with an omitted Content-Type, even if a Content-Encoding is included. Depending on how the upload is made, doing so may result in Content-Type being set to a default value, or it may result in the request being rejected.

Incorrect practices

  • You should not set your metadata to redundantly report the compression of the object:

    Content-Type: application/gzip
    Content-Encoding: gzip

    This implies you are uploading a gzip-compressed object that has been gzip-compressed a second time, when that is not usually the case (if you actually plan to doubly compress a file, see the "Using gzip on compressed objects" section below). When decompressive transcoding occurs on such an incorrectly reported object, the object is served identity encoded, but requesters think that they have received an object which still has a layer of compression associated with it. Attempts to decompress the object will fail.

  • Similarly, a file that is not gzip-compressed should not be uploaded with Content-Encoding: gzip. Doing so makes the object appear to be eligible for transcoding, but when requests for the object are made, attempts at transcoding fail.
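
The good, discouraged, and incorrect combinations above can be checked before upload with a small helper. This is a hedged sketch rather than an official validator: `review_upload` is a hypothetical name, and detecting gzip by its two leading magic bytes (`1f 8b`) is a heuristic that cannot tell a deliberately doubly compressed object from a redundantly labeled one.

```python
import gzip
from typing import Optional

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def review_upload(
    data: bytes, content_type: Optional[str], content_encoding: Optional[str]
) -> str:
    """Classify an upload's metadata against the practices listed above."""
    is_gzip = data[:2] == GZIP_MAGIC
    ctype = (content_type or "").strip().lower()
    cenc = (content_encoding or "").strip().lower()

    if cenc == "gzip" and not is_gzip:
        return "incorrect: Content-Encoding: gzip on data that is not gzip-compressed"
    if cenc == "gzip" and ctype == "application/gzip":
        return "incorrect: compression reported twice (unless doubly compressed)"
    if not ctype:
        return "discouraged: Content-Type omitted"
    if is_gzip and cenc != "gzip" and ctype != "application/gzip":
        return "discouraged: compressed nature of the file omitted"
    return "ok"


sample = gzip.compress(b"plain text body\n")
print(review_upload(sample, "text/plain", "gzip"))
print(review_upload(sample, "application/gzip", "gzip"))
print(review_upload(b"plain text body\n", "text/plain", "gzip"))
```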

Using gzip on compressed objects

Some objects, such as many video, audio, and image files, not to mention gzip files themselves, are already compressed. Using gzip on such objects offers virtually no benefit: in almost all cases, doing so makes the object larger due to gzip overhead. For this reason, using gzip on compressed content is generally discouraged and may cause undesired behaviors.

For example, while Cloud Storage allows "doubly compressed" objects (that is, objects that are gzip-compressed but also have an underlying Content-Type that is itself compressed) to be uploaded and stored, it does not allow objects to be served in a doubly compressed state unless their Cache-Control metadata includes no-transform. Instead, it removes the outer, gzip, level of compression, drops the Content-Encoding response header, and serves the resulting object. This occurs even for requests with Accept-Encoding: gzip. The file that is received by the client thus does not have the same checksum as what was uploaded and stored in Cloud Storage, so any integrity checks fail.
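
The lack of benefit from a second gzip pass is easy to observe locally. The sketch below builds a hypothetical, deterministic text payload, compresses it once and then again, and prints the three sizes. The exact numbers depend on the data and the compression library, so none are quoted here.

```python
import gzip
import random

# Hypothetical payload: pseudo-random words from a small vocabulary, so the
# text is compressible but not trivially repetitive. Seeded for repeatability.
rng = random.Random(0)
vocabulary = [f"word{i}" for i in range(200)]
text = " ".join(rng.choices(vocabulary, k=20000)).encode("ascii")

once = gzip.compress(text)    # first pass: large reduction
twice = gzip.compress(once)   # second pass: input is already compressed

print("uncompressed:    ", len(text))
print("gzip once:       ", len(once))
print("gzip twice:      ", len(twice))

# Recovering the original requires removing both layers.
recovered = gzip.decompress(gzip.decompress(twice))
print("recovered intact:", recovered == text)
```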

Using the Range header

When transcoding occurs, if the request for the object includes a Range header, that header is silently ignored. This means that requests for partial content are not fulfilled, and the response instead serves the entire requested object. For example, if you have a 10 GB object that is eligible for transcoding, but include the header Range: bytes=0-10000 in the request, you still receive the entire 10 GB object.

This behavior arises because it is not possible to select a range from a compressed file without first decompressing the file in its entirety: each request for part of a file would be accompanied by the decompression of the entire, potentially large, file, which would poorly utilize resources. You should be aware of this behavior and avoid using the Range header when using transcoding, as charges are incurred for the transmission of the entire object and not just the range requested. For more information on allowed response behavior to requests with Range headers, see the specification.

If requests with Range headers are needed, you should ensure that transcoding does not occur for the requested object. You can achieve this by choosing the appropriate properties when uploading objects to begin with. For example, range requests for objects with Content-Type: application/gzip and no Content-Encoding are performed as requested.
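
The difference between the two cases can be modeled in a few lines. The function below is an illustrative simulation of the behavior described in this section, not Cloud Storage code: `serve` and its parameters are hypothetical, and it assumes a request without Accept-Encoding: gzip for an object without Cache-Control: no-transform.

```python
import gzip
from typing import Optional, Tuple


def serve(
    stored: bytes,
    content_encoding: Optional[str],
    byte_range: Optional[Tuple[int, int]] = None,
) -> bytes:
    """Return the response body for a request, per the rules in this section.

    `byte_range` is an inclusive (first, last) pair, as in Range: bytes=first-last.
    """
    if content_encoding == "gzip":
        # Transcoding occurs: the Range header is ignored and the whole
        # decompressed object is served.
        return gzip.decompress(stored)
    if byte_range is None:
        return stored
    first, last = byte_range
    # No transcoding: the requested slice of the stored bytes is served.
    return stored[first : last + 1]


payload = b"0123456789" * 1000
gz = gzip.compress(payload)

# Eligible for transcoding: the full 10,000-byte payload comes back.
print(len(serve(gz, "gzip", (0, 9))))
# Stored as application/gzip with no Content-Encoding: only 10 bytes come back.
print(len(serve(gz, None, (0, 9))))
```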

Last updated 2026-02-18 UTC.