Best practices for media workloads

This page describes the best practices when using Cloud Storage for media workloads. These workloads often include various Google Cloud products like Media CDN, Live Stream API, Transcoder API, and Video Stitcher API.

Overview

Google Cloud offers solutions to optimize the following types of media workloads:

  • Media production: Includes workloads such as post-production of movies, including video editing, that are compute heavy and often use GPUs for high performance computing. Often, media-related data residing in Cloud Storage is processed by applications running in Compute Engine or Google Kubernetes Engine, and the output of this processing is written back to Cloud Storage. These workloads require scaling aggregate read and write throughput from Cloud Storage to a compute cluster while keeping GPU idle time low. They also require low read and write latencies, which is crucial for reducing tail latency.
  • Media asset management: Includes organizing your media assets for efficient storage, retrieval, and usage.
  • Content serving and distribution: Includes streaming media to users, including video on demand (VoD) and livestreaming services. For VoD, when users request content that isn't cached on the content delivery network (CDN), the content is fetched from the Cloud Storage buckets. For livestreaming requests, the content is written to the Cloud Storage bucket and read from the CDN simultaneously.

Best practices for media workloads

For best practices that apply to media workloads, see the following sections.

Data transfer

Use Storage Transfer Service to upload more than 1 TiB of raw media files from an on-premises source, such as a video camera or on-premises storage, to Cloud Storage. Storage Transfer Service enables seamless data movement across object and file storage systems. For smaller transfers, choose the service to transfer data to and from Cloud Storage or between file systems based on your transfer scenario.
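For smaller transfers of individual files, the Cloud Storage client library is often enough. The following is a minimal sketch, assuming hypothetical bucket, file, and object names:

```python
from google.cloud import storage

def upload_media_file(bucket_name: str, source_path: str, object_name: str) -> None:
    """Uploads a single media file to Cloud Storage (suitable for smaller transfers)."""
    client = storage.Client()
    bucket = client.bucket(bucket_name)
    blob = bucket.blob(object_name)
    blob.upload_from_filename(source_path)

# All names below are placeholders for illustration only.
upload_media_file("my-media-bucket", "/mnt/footage/scene-01.mov", "ingest/scene-01.mov")
```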

Bucket location

For workloads that require compute resources, such as media production, you should create buckets in the same region or dual-region as the compute resources. This method helps optimize performance by lowering read and write latencies for your processing workloads, and it also reduces cost and bandwidth usage. For more guidance about choosing the bucket location, see Bucket location considerations.
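As a minimal sketch, assuming a placeholder bucket name and region, the following creates a regional bucket colocated with the compute resources that process the media:

```python
from google.cloud import storage

client = storage.Client()

# Create the bucket in the same region as the Compute Engine or GKE resources
# that run the processing workload (bucket name and region are placeholders).
bucket = client.bucket("my-media-production-bucket")
bucket = client.create_bucket(bucket, location="us-central1")
print(f"Created {bucket.name} in {bucket.location}")
```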

Storage class

The storage class you should select differs depending on the type of media workload. The recommended storage classes for different media workloads are as follows:

  • For managing media assets, such as archive videos, the default storage class of a bucket should be Archive storage. You can specify a different storage class for objects that have different availability or access needs.
  • For media production and content serving workloads, as data is read frequently from a Cloud Storage bucket, you should store the data in Standard storage.

For more guidance about choosing the storage class for your bucket, see Storage class.
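As a minimal sketch of these recommendations, assuming placeholder bucket and object names, the following creates an asset-management bucket with Archive as the default storage class and rewrites one existing object that needs faster access to Standard storage:

```python
from google.cloud import storage

client = storage.Client()

# Bucket for archived media assets; Archive is the default storage class
# (bucket name and location are placeholders).
bucket = client.bucket("my-media-archive-bucket")
bucket.storage_class = "ARCHIVE"
bucket = client.create_bucket(bucket, location="us-central1")

# An existing object with different access needs can be rewritten to another
# class, for example Standard storage for a frequently read asset.
blob = bucket.blob("masters/episode-001.mxf")
blob.update_storage_class("STANDARD")
```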

Data lifecycle management

For managing your media assets, you should manage the object lifecycle for your buckets by defining a lifecycle configuration. With the Object Lifecycle Management feature, you can manage the data lifecycle, including setting a Time to Live (TTL) for objects, retaining noncurrent versions of objects, and downgrading storage classes of objects to help manage costs.

When data access patterns are predictable, you can set the lifecycle configuration for a bucket. For unknown or unpredictable access patterns, you can enable the Autoclass feature for your bucket. With Autoclass, Cloud Storage automatically moves data that is not frequently accessed to colder storage classes.
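The following sketch shows both approaches, with illustrative ages and a placeholder bucket name; the Autoclass property requires a recent version of the Python client library:

```python
from google.cloud import storage

client = storage.Client()

def configure_lifecycle_rules(bucket_name: str) -> None:
    """Predictable access patterns: define explicit lifecycle rules."""
    bucket = client.get_bucket(bucket_name)
    # Downgrade objects to Archive storage after 90 days and delete them after
    # roughly five years (ages are illustrative only).
    bucket.add_lifecycle_set_storage_class_rule(storage_class="ARCHIVE", age=90)
    bucket.add_lifecycle_delete_rule(age=1825)
    bucket.patch()

def enable_autoclass(bucket_name: str) -> None:
    """Unpredictable access patterns: let Autoclass manage storage classes."""
    bucket = client.get_bucket(bucket_name)
    bucket.autoclass_enabled = True  # requires a recent client library version
    bucket.patch()

# Use one approach or the other for a given bucket (placeholder name).
configure_lifecycle_rules("my-media-assets-bucket")
```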

Best practices for content serving and distribution workloads

For both VoD and livestreaming workloads, the goal is to avoid any playback errors, playback start delays, or buffering while playing a video on the end users' video player. These workloads also require scaling of reads to account for a large number of concurrent viewers. In all cases, customer traffic reads should go through a CDN.

For best practices that apply to content serving and distribution workloads, see the following sections.

Use the CDN effectively

Using a content delivery network (CDN) in front of the Cloud Storage bucket improves the end-user experience because the CDN caches content, reducing latency and increasing bandwidth efficiency. A CDN lets you reduce the total cost of ownership (TCO) by reducing bandwidth costs, optimizing resource utilization, and improving performance. Using Media CDN helps reduce the TCO for serving content to end users because the cache-fill cost for Media CDN is zero. You can use Media CDN as the origin for other third-party CDNs. With other CDNs, you still get some TCO reduction when serving content from the Media CDN cache instead of from the origin.

If you are using a third-party CDN, CDN Interconnect enables selected providers to establish direct peering links with Google's edge network at various locations. Your network traffic egressing from Google Cloud through one of these links benefits from the direct connectivity to supported CDN providers and is billed automatically with reduced pricing. For a list of approved providers, see Google-approved service providers.

The following sections describe the options to configure when setting up a CDN:

Select the origin shield location

The origin shield location is a cache between the CDN and Cloud Storage. If your CDN lets you select the origin shield location, follow the CDN's guidelines on whether to place the origin shield closer to the region of your Cloud Storage bucket or closer to where your end-user traffic is concentrated. An origin shield protects your origin server from overloading. CDNs with origin shielding help increase origin offload by adding an extra cache between the origin and the CDN. For example, Media CDN provides a deeply tiered edge infrastructure that is designed to actively minimize cache fill wherever possible.

Enable request coalescing

Ensure that request coalescing (also called request collapsing) is enabled for your CDN. Collapsing multiple requests into a single request reduces the Cloud Storage class B operation cost. CDNs have distributed caches deployed across the globe but provide a way to collapse multiple end-user requests into a single request to the origin. For example, Media CDN actively collapses multiple user-driven cache fill requests for the same cache key into a single origin request per edge node, thereby reducing the number of requests made to the buckets.

Configure the retry behavior on CDN

Ensure that you configure retries on your CDN for server errors with HTTP 5xx response codes (502, 503, 504). CDNs support origin retries, allowing unsuccessful requests to the origin to be retried. Most CDNs let you specify the number of retries for the current origin. For information about retrying origin requests in Media CDN, see Retry origin requests.

Location options for content distribution

For workloads reading data from Cloud Storage that isn't cached on the CDN, such as content serving and distribution of VoD content, consider the following factors when selecting a location for your bucket:

  • To optimize for cost, buckets created in a single region have the lowest storage cost.
  • To optimize for availability, consider the following:
    • For most media workloads, using dual-region buckets is recommended because your objects are replicated in two regions for better availability.
    • For use cases that require content serving and analytics with geo-redundancy, use multi-region buckets for the highest availability.
  • To optimize for latency and reduce network costs, consider the following:
    • For VoD, choose regions closest to where most of your end users are, or the region with the highest traffic concentration.
    • During livestreaming, buckets receive write requests from transcoders and read requests from a CDN that caches and distributes the content to end users. For enhanced streaming performance, choose regional buckets that are colocated with the compute resources used for transcoding.

Optimize video segment length for livestreams

For livestreams, the lowest recommended segment size is two seconds because short video segments are more sensitive to long-tail write latencies. Long-tail write latencies refer to slow or delayed write operations for content that is infrequently accessed or has a low volume of requests.

The physical distance between the bucket location and the end users' playback location affects the transmission time. If your end users are far from the bucket location, we recommend using a longer video segment size.

To provide viewers the best experience, we recommend using a retry strategy and request hedging for writes on the transcoders to mitigate long-tail latencies of more than two seconds for writes to Cloud Storage, and experimenting with longer buffer times of approximately ten seconds.
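As a minimal sketch of request hedging for segment uploads (not a built-in client library feature), the following starts a backup upload of the same segment if the first attempt hasn't completed within a latency budget; the bucket name, object names, and two-second threshold are illustrative assumptions:

```python
import concurrent.futures

from google.cloud import storage

client = storage.Client()
bucket = client.bucket("my-livestream-origin-bucket")  # placeholder name

def upload_segment(segment_path: str, object_name: str) -> str:
    """Uploads one video segment; the client library retries transient errors."""
    blob = bucket.blob(object_name)
    blob.upload_from_filename(segment_path)
    return object_name

def hedged_upload(segment_path: str, object_name: str, hedge_after_s: float = 2.0) -> str:
    """Starts a second, identical upload if the first is slower than the budget.

    Whichever request finishes first wins; both write the same bytes to the
    same object name, so the slower duplicate is harmless.
    """
    with concurrent.futures.ThreadPoolExecutor(max_workers=2) as pool:
        first = pool.submit(upload_segment, segment_path, object_name)
        try:
            return first.result(timeout=hedge_after_s)
        except concurrent.futures.TimeoutError:
            backup = pool.submit(upload_segment, segment_path, object_name)
            done, _ = concurrent.futures.wait(
                [first, backup], return_when=concurrent.futures.FIRST_COMPLETED
            )
            return next(iter(done)).result()

hedged_upload("segments/seg_000123.ts", "live/stream-1/seg_000123.ts")
```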

Ramp up QPS gradually

Cloud Storage buckets have an initial IO capacity of 1,000 object writes per second and 5,000 object reads per second. For livestream workloads, the guideline is to scale your requests gradually by starting at 1,000 writes per second and 5,000 reads per second, and incrementally doubling the request rate every 20 minutes. This method lets Cloud Storage redistribute the load across multiple servers, and improves the availability and latency of your bucket by reducing the chances of playback issues.
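To illustrate the ramp-up arithmetic, the following sketch computes how long a gradual ramp takes when the request rate doubles at each interval; the starting rates and 20-minute interval mirror the guideline above:

```python
import math

def minutes_to_reach(target_qps: float, start_qps: float, doubling_minutes: int = 20) -> int:
    """Returns how long a gradual ramp-up takes, doubling the rate at each interval."""
    if target_qps <= start_qps:
        return 0
    doublings = math.log2(target_qps / start_qps)
    return math.ceil(doublings) * doubling_minutes

# Reads start at 5,000 QPS; reaching 20,000 QPS takes two doublings (~40 minutes).
print(minutes_to_reach(target_qps=20_000, start_qps=5_000))  # 40
# Writes start at 1,000 QPS; reaching 3,000 QPS also takes two doublings (~40 minutes).
print(minutes_to_reach(target_qps=3_000, start_qps=1_000))   # 40
```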

For a livestream event with higher QPS, you should implement scaling on your bucket either by prewarming your bucket or by enabling hierarchical namespace on your bucket. Before implementing scaling on your bucket, you should perform the following tasks:

Estimate your QPS to the origin

For example, for a livestream with one million viewers, the CDN receives one million QPS. Assuming your CDN has a cache hit rate of 99.0%, the resulting traffic to Cloud Storage is 1% of the total viewer requests, which equals 10,000 QPS. This value is more than the initial IO capacity.
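As a worked version of this estimate, the following sketch derives the origin QPS from the viewer request rate and the cache hit rate, using the example numbers above:

```python
def origin_qps(viewer_qps: float, cache_hit_rate: float) -> float:
    """Estimates the request rate that reaches Cloud Storage after CDN caching."""
    return viewer_qps * (1.0 - cache_hit_rate)

# One million viewer requests per second at a 99% cache hit rate leaves
# roughly 10,000 QPS hitting the origin bucket.
print(round(origin_qps(viewer_qps=1_000_000, cache_hit_rate=0.99)))  # 10000
```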

Monitor the QPS and troubleshoot any scaling errors

You should monitor the QPS and troubleshoot any scaling errors. For more information, see Overview of monitoring in Cloud Storage. To monitor the read and write requests, observe the Total read/list/get request count chart and the Total write request count chart, respectively, in the Google Cloud console. If you scale the QPS on buckets faster than the ramp-up guidelines mentioned in the preceding section, you might encounter the 429 Too many requests error. Learn how to resolve the 429 Too many requests error.
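Outside of the console, the same request counts can be read programmatically from Cloud Monitoring. A minimal sketch, assuming the storage.googleapis.com/api/request_count metric and placeholder project and bucket names:

```python
import time

from google.cloud import monitoring_v3

client = monitoring_v3.MetricServiceClient()
project_name = "projects/my-project"  # placeholder project ID

now = int(time.time())
interval = monitoring_v3.TimeInterval(
    {
        "start_time": {"seconds": now - 3600},  # last hour
        "end_time": {"seconds": now},
    }
)

# Request counts for one bucket, broken down by API method.
results = client.list_time_series(
    request={
        "name": project_name,
        "filter": (
            'metric.type = "storage.googleapis.com/api/request_count" '
            'AND resource.labels.bucket_name = "my-livestream-origin-bucket"'
        ),
        "interval": interval,
        "view": monitoring_v3.ListTimeSeriesRequest.TimeSeriesView.FULL,
    }
)

for series in results:
    method = series.metric.labels.get("method", "unknown")
    print(method, series.points[0].value.int64_value)
```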

The following sections describe how to scale your bucket for higher QPS after you have estimated the QPS to the origin.

Implement QPS scaling on your bucket by prewarming your bucket

You can expedite the scaling process ahead of a livestreaming event by prewarming your bucket. Before the livestreaming event, generate synthetic traffic to your bucket that matches the maximum QPS you expect the CDN's origin server to receive during the event, plus an additional 50% buffer, factoring in the expected cache hit rate of your CDN. For example, if you estimated the QPS to your origin to be 10,000, then your simulated traffic should target 15,000 requests per second to prepare your origin for the event.

For this simulated traffic, you can use either the previous event's live feed files, such as segments and manifests, or test files. Ensure that you have distinct files throughout the warmup process.

When generating this simulated traffic, follow a gradual scaling approach, starting at 5,000 requests per second and progressively increasing until you reach your target. Allocate sufficient time before your event to achieve the estimated load. For example, reaching 15,000 requests per second by doubling the load every 20 minutes from an initial 5,000 requests per second takes approximately 30 minutes.

The origin server maintains the capacity as long as the traffic remains consistent. The origin server's capacity gradually decreases to its baseline level over 24 hours. If your origin server experiences multi-hour gaps between livestream events, we recommend that you simulate traffic before each event.

Use hierarchical namespace enabled buckets for high initial QPS

Cloud Storage buckets with hierarchical namespace enabled provide up to eight times the initial QPS compared to buckets without hierarchical namespace. The higher initial QPS makes it easier to scale data-intensive workloads and provides enhanced throughput. For information about limitations of buckets with hierarchical namespace enabled, see Limitations.

Avoid sequential names for video segments for scaling QPS

With QPS scaling, requests are redistributed across multiple servers. However, you might encounter performance bottlenecks when all the objects use a non-randomized or sequential prefix. Using completely random names instead of sequential names gives you the best load distribution. However, if you want to use sequential numbers or timestamps as part of your object names, introduce randomness into the object names by adding a hash value before the sequence number or timestamp. For example, if the original object name you want to use is my-bucket/2016-05-10-12-00-00/file1, you can compute the MD5 hash of the original object name and add the first six characters of the hash as a prefix to the object name. The new object name becomes my-bucket/2fa764-2016-05-10-12-00-00/file1. For more information, see Use a naming convention that distributes load evenly across key ranges. If you can't avoid sequential naming for video segments, use buckets with hierarchical namespace enabled to get higher QPS.
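A minimal sketch of that renaming scheme, using a timestamp-based object name like the example above:

```python
import hashlib

def randomized_object_name(object_name: str, prefix_len: int = 6) -> str:
    """Prefixes a sequential object name with part of its MD5 hash to spread load."""
    digest = hashlib.md5(object_name.encode("utf-8")).hexdigest()
    return f"{digest[:prefix_len]}-{object_name}"

# A name such as "2016-05-10-12-00-00/file1" becomes something like
# "2fa764-2016-05-10-12-00-00/file1" (the exact prefix depends on the hash input).
print(randomized_object_name("2016-05-10-12-00-00/file1"))
```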

Use different buckets for each livestream

For concurrent livestreams, using different buckets for each livestream will help you scale the read and write load effectively without reaching the IO limits for the bucket. Using different buckets for each livestream decreases large outlier latencies due to scaling delays.
