Select a managed container runtime environment

This document helps you to assess your application requirements and choose between Cloud Run and Google Kubernetes Engine (GKE) Autopilot, based on technical and organizational considerations. This document is for cloud architects who need to choose a Google Cloud target container runtime environment for their workloads. It assumes that you're familiar with Kubernetes and Google Cloud, and that you have some knowledge of cloud serverless runtime environments like Cloud Run, Cloud Run functions, or AWS Lambda.

Google Cloud offers several runtime environment options that have a range of capabilities. The following diagram shows the range of Google Cloud managed offerings:

Google Cloud offerings from most managed to least managed.

The diagram shows the following:

  • Most-managed runtime environments (the focus of this guide):

    These options are managed by Google, with no user management of underlying compute infrastructure.

  • Least-managed runtime environments:

    These options require some degree of user-level infrastructure management, such as the virtual machines (VMs) that underlie the compute capabilities. VMs in GKE Standard are the Kubernetes cluster nodes. VMs in Compute Engine are the core platform offering, which you can customize to suit your requirements.

This guide helps you to choose between the most-managed runtime environments, Cloud Run and GKE Autopilot. For a broader view of Google Cloud runtime environments, see the Google Cloud Application Hosting Options guide.

Overview of environments

This section provides an overview of Cloud Run and GKE Autopilot capabilities. Cloud Run and GKE Autopilot are both tightly integrated within Google Cloud, so there is a lot of commonality between the two. Both platforms support multiple options for load balancing with Google's highly reliable and scalable load balancing services. They also both support VPC networking, Identity-Aware Proxy (IAP), and Google Cloud Armor for when more granular, private networking is a requirement. Both platforms charge you only for the exact resources that you use for your applications.

From a software delivery perspective, as container runtime environments, Cloud Run and GKE Autopilot are supported by services that make up the Google Cloud container ecosystem. These services include Cloud Build, Artifact Registry, Binary Authorization, and continuous delivery with Cloud Deploy, to help ensure that your applications are safely and reliably deployed to production. This means that you and your teams own the build and deployment decisions.

Because of the commonality between the two platforms, you might want to take advantage of the strengths of each by adopting a flexible approach to where you deploy your applications, as detailed in the guide Use GKE and Cloud Run together. The following sections describe unique aspects of Cloud Run and Autopilot.

Cloud Run

Cloud Run is a serverless managed compute platform that lets you run your applications directly on top of Google's scalable infrastructure. Cloud Run provides automation and scaling for two main kinds of applications:

  • Cloud Run services: For code that responds to web requests.
  • Cloud Run jobs: For code that performs one or more background tasks and then exits when the work is done.

With these two deployment models, Cloud Run can support a wide range of application architectures while enabling best practices and letting developers focus on code.
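To make the service deployment model concrete, the following is a minimal sketch of a Cloud Run service expressed in the YAML format that Cloud Run accepts. The service name is hypothetical, and the image shown is Google's public sample container; substitute your own:

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello-service        # hypothetical service name
spec:
  template:
    spec:
      containers:
        # Google's public "hello" sample container; replace with your image.
        - image: us-docker.pkg.dev/cloudrun/container/hello
          ports:
            - containerPort: 8080
```

A manifest like this can be applied with `gcloud run services replace`, or the equivalent service can be created directly with `gcloud run deploy`.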

Cloud Run also supports deploying application code from the following sources:

  • Individual lightweight functions
  • Full applications from source code
  • Containerized applications

Cloud Run incorporates a build-and-deploy capability that supports both functions as a service (FaaS) and building from source, alongside the prebuilt container runtime capability. When you use Cloud Run in this way, the steps of building and deploying the container image that runs your application are fully automated, and they don't require custom configuration from you.

GKE Autopilot

GKE Autopilot is the default and recommended cluster mode of operation in GKE. Autopilot lets you run applications on Kubernetes without the overhead of managing infrastructure. When you use Autopilot, Google manages key underlying aspects of your cluster configuration, including node provisioning and scaling, default security posture, and other preconfigured settings. With Autopilot managing node resources, you pay only for the resources that are requested by your workloads. Autopilot continuously monitors and optimizes infrastructure resourcing to ensure the best fit while providing an SLA for your workloads.

GKE Autopilot supports workloads that might not be a good fit for Cloud Run. For example, GKE Autopilot commonly supports long-lived or stateful workloads.

Choose a runtime environment

In general, if the characteristics of your workload are suitable for a managed platform, the serverless runtime environment of Cloud Run is ideal. Using Cloud Run can result in less infrastructure to manage, less self-managed configuration, and therefore lower operational overhead. Unless you specifically want or need Kubernetes, we recommend that you consider serverless first as your target runtime environment. Although Kubernetes provides the powerful abstraction of an open platform, using it adds complexity. If you don't need Kubernetes, then we recommend that you consider whether your application is a good fit for serverless. If there are criteria that make your workload less suitable for serverless, then we recommend using Autopilot.

The following sections provide more detail about some of the criteria that can help you answer these questions, particularly the question of whether the workload is a fit for serverless. Given the commonality between Autopilot and Cloud Run that's described in the preceding sections, migration between the platforms is a straightforward task when there aren't any technical or other blockers. To explore migration options in more detail, see Migrate from Cloud Run to GKE and Migrate from Kubernetes to Cloud Run.

When you choose a runtime environment for your workload, you need to factor in technical considerations and organizational considerations. Technical considerations are characteristics of your application or the Google Cloud runtime environment. Organizational considerations are non-technical characteristics of your organization or team that might influence your decision.

Technical considerations

Some of the technical considerations that will influence your choice of platform are the following:

  • Control and configurability: Granularity of control of the execution environment.
  • Network traffic management and routing: Configurability of interactions over the network.
  • Horizontal and vertical scalability: Support for dynamically growing and shrinking capacity.
  • Support for stateful applications: Capabilities for storing persistent state.
  • CPU architecture: Support for different CPU types.
  • Accelerator offload (GPUs and TPUs): Ability to offload computation to dedicated hardware.
  • High memory, CPU, and other resource capacity: Level of various resources consumed.
  • Explicit dependency on Kubernetes: Requirements for Kubernetes API usage.
  • Complex RBAC for multi-tenancy: Support for sharing pooled resources.
  • Maximum container task timeout: Execution duration of long-lived applications or components.

The following sections detail these technical considerations to help you choose a runtime environment.

Control and configurability

Compared to Cloud Run, GKE Autopilot provides more granular control of the execution environment for your workloads. Within the context of a Pod, Kubernetes provides many configurable primitives that you can tune to meet your application requirements. Configuration options include privilege level, quality of service parameters, custom handlers for container lifecycle events, and process namespace sharing between multiple containers.
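As an illustration, the following Pod manifest sketches several of these primitives together. The Pod name and container image are hypothetical placeholders:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-app                   # hypothetical name
spec:
  shareProcessNamespace: true         # share one process namespace across containers
  containers:
    - name: app
      image: example.com/app:latest   # hypothetical image
      securityContext:                # privilege-level control
        runAsNonRoot: true
        allowPrivilegeEscalation: false
      resources:                      # requests == limits yields the Guaranteed QoS class
        requests:
          cpu: 500m
          memory: 256Mi
        limits:
          cpu: 500m
          memory: 256Mi
      lifecycle:
        preStop:                      # custom handler for a container lifecycle event
          exec:
            command: ["/bin/sh", "-c", "sleep 5"]
```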

Cloud Run directly supports a subset of the Kubernetes Pod API surface, which is described in the reference YAML for the Cloud Run Service object and in the reference YAML for the Cloud Run Job object. These reference guides can help you to evaluate the two platforms alongside your application requirements.

The container contract for the Cloud Run execution environment is relatively straightforward and will suit most serving workloads. However, the contract specifies some requirements that must be fulfilled. If your application or its dependencies can't fulfill those requirements, or if you require a finer degree of control over the execution environment, then Autopilot might be more suitable.

If you want to reduce the time that you spend on configuration and administration, consider choosing Cloud Run as your runtime environment. Cloud Run has fewer configuration options than Autopilot, so it can help you to maximize developer productivity and reduce operational overhead.

Network traffic management and routing

Both Cloud Run and GKE Autopilot integrate with Google Cloud Load Balancing. However, GKE Autopilot additionally provides a rich and powerful set of primitives for configuring the networking environment for service-to-service communications. The configuration options include granular permissions and segregation at the network layer by using namespaces and network policies, port remapping, and built-in DNS service discovery within the cluster. GKE Autopilot also supports the highly configurable and flexible Gateway API. This functionality provides powerful control over the way that traffic is routed into and between services in the cluster.
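For example, a standard Kubernetes NetworkPolicy can restrict which Pods can reach a service at the network layer. The namespace and label names below are hypothetical:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-only     # hypothetical policy name
  namespace: backend            # hypothetical namespace
spec:
  podSelector:
    matchLabels:
      app: api                  # applies to Pods labeled app=api
  policyTypes:
    - Ingress
  ingress:
    - from:
        # only Pods in the "frontend" namespace may connect
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: frontend
```

Policies like this compose with namespaces and cluster DNS to segment service-to-service traffic without changes to application code.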

Because Autopilot is highly configurable, it can be the best option if you have multiple services with a high degree of networking codependency, or complex requirements around how traffic is routed between your application components. An example of this pattern is a distributed application that is decomposed into numerous microservices that have complex patterns of interdependence. In such scenarios, Autopilot networking configuration options can help you to manage and control the interactions between services.

Horizontal and vertical scalability

Cloud Run and GKE Autopilot both support manual and automatic horizontal scaling for services and jobs. Horizontal scaling provides increased processing power when required, and it removes the added processing power when it isn't needed. For a typical workload, Cloud Run can usually scale out more quickly than GKE Autopilot to respond to spikes in the number of requests per second. As an example, the video demonstration "What's New in Serverless Compute?" shows Cloud Run scaling from zero to over 10,000 instances in approximately 10 seconds. To increase the speed of horizontal scaling on Kubernetes (at some additional cost), Autopilot lets you provision extra compute capacity.
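On Autopilot, automatic horizontal scaling is typically configured with a standard HorizontalPodAutoscaler. The following sketch, with hypothetical names and thresholds, scales a Deployment on CPU utilization:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: app-hpa             # hypothetical name
spec:
  scaleTargetRef:           # the workload to scale
    apiVersion: apps/v1
    kind: Deployment
    name: app               # hypothetical Deployment name
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # add replicas above ~70% average CPU
```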

If your application can't scale by adding more instances to increase the level of resources that are available, then it might be a better fit for Autopilot. Autopilot supports vertical scaling to dynamically vary the amount of processing power that's available without increasing the number of running instances of the application.
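Vertical scaling on Kubernetes is commonly expressed with a VerticalPodAutoscaler object, which GKE supports. A minimal sketch, with a hypothetical Deployment name:

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: app-vpa             # hypothetical name
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app               # hypothetical Deployment name
  updatePolicy:
    updateMode: "Auto"      # let the autoscaler adjust Pod resource requests
```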

Cloud Run can automatically scale your applications down to zero replicas while they aren't being used, which is helpful for certain use cases that have a special focus on cost optimization. Because your applications can scale to zero, there are multiple optimization steps that you can take to minimize the time between the arrival of a request and the time at which your application is up and running, and able to process the request.
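One common mitigation for cold starts is to keep a minimum number of instances warm. In Cloud Run's YAML representation, this is an annotation on the revision template, shown here as a fragment:

```yaml
spec:
  template:
    metadata:
      annotations:
        # keep at least one instance warm to avoid cold starts
        # (billed while idle, so this trades cost for latency)
        autoscaling.knative.dev/minScale: "1"
```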

Support for stateful applications

Autopilot offers complete Kubernetes Volume support, backed by Persistent Disks, that lets you run a broad range of stateful deployments, including self-managed databases. Both Cloud Run and GKE Autopilot let you connect with other services like Filestore and Cloud Storage buckets. They also both include the ability to mount object-store buckets into the file system with Cloud Storage FUSE.
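On Autopilot, persistent storage is typically requested with a PersistentVolumeClaim, which GKE fulfills by provisioning a Persistent Disk. A minimal sketch with a hypothetical claim name and size:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-pvc            # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce         # mountable read-write by a single node
  resources:
    requests:
      storage: 10Gi         # dynamically provisioned as a Persistent Disk
```

A Pod or StatefulSet then references the claim by name in its `volumes` section, and the data outlives any individual Pod.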

Cloud Run uses an in-memory file system, which might not be a good fit for applications that require a persistent local file system. In addition, the local in-memory file system is shared with the memory of your application. Therefore, the ephemeral file system and the application and container memory usage both contribute toward exhausting the memory limit. You can avoid this issue if you use a dedicated in-memory volume with a size limit.
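In Cloud Run's YAML representation, a dedicated in-memory volume with a size limit looks like the following fragment. The volume name, size, and mount path are hypothetical:

```yaml
spec:
  template:
    spec:
      volumes:
        - name: scratch         # hypothetical volume name
          emptyDir:
            medium: Memory
            sizeLimit: 128Mi    # caps the volume's memory consumption
      containers:
        - image: example.com/app:latest   # hypothetical image
          volumeMounts:
            - name: scratch
              mountPath: /scratch
```

The size limit caps how much of the instance's memory the volume can consume, so a runaway write to `/scratch` fails at the limit instead of exhausting the container's memory.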

A Cloud Run service or job container has a maximum task timeout. A container running within a pod in an Autopilot cluster can be rescheduled, subject to any constraints that are configured with Pod Disruption Budgets (PDBs). However, pods can run for up to seven days when they're protected from eviction caused by node auto-upgrades or scale-down events. Typically, task timeout is more likely to be a consideration for batch workloads in Cloud Run. For long-lived workloads, and for batch tasks that can't be completed within the maximum task duration, Autopilot might be the best option.
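A Pod Disruption Budget is a small standalone object. The following sketch, with hypothetical names, ensures that voluntary disruptions such as node upgrades never take a workload below two running Pods:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: worker-pdb          # hypothetical name
spec:
  minAvailable: 2           # keep at least 2 Pods during voluntary disruptions
  selector:
    matchLabels:
      app: batch-worker     # hypothetical Pod label
```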

CPU architecture

All Google Cloud compute platforms support the x86 CPU architecture. Cloud Run doesn't support Arm architecture processors, but Autopilot supports managed nodes that are backed by Arm architecture. If your workload requires Arm architecture, you will need to use Autopilot.
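On Autopilot, scheduling a workload onto Arm nodes is typically expressed with a node selector on the Pod spec, as in this fragment (the container name and image are hypothetical, and the image must be built for arm64):

```yaml
spec:
  nodeSelector:
    kubernetes.io/arch: arm64       # request Arm-backed Autopilot nodes
  containers:
    - name: app
      image: example.com/app:arm64  # hypothetical multi-arch or arm64 image
```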

Accelerator offload

Autopilot supports the use of GPUs and the use of TPUs, including the ability to consume reserved resources. Cloud Run supports the use of GPUs with some limitations.
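On Autopilot, a GPU is requested through a node selector for the accelerator type plus a GPU resource limit on the container, as in this fragment. The accelerator model shown is one example; the container name and image are hypothetical:

```yaml
spec:
  nodeSelector:
    cloud.google.com/gke-accelerator: nvidia-tesla-t4   # example accelerator type
  containers:
    - name: gpu-app
      image: example.com/gpu-app:latest   # hypothetical image
      resources:
        limits:
          nvidia.com/gpu: 1               # one GPU per container instance
```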

High memory, CPU, and other resource requirements

Compared to GKE Autopilot resource request limits, the maximum CPU and memory resources that can be consumed by a single Cloud Run service or job (a single instance) are limited. Depending on the characteristics of your workloads, Cloud Run might have other limits that constrain the resources that are available. For example, the startup timeout and the maximum number of outbound connections might be limited with Cloud Run. With Autopilot, some limits might not apply or might have higher permitted values.

Explicit dependency on Kubernetes

Some applications, libraries, or frameworks might have an explicit dependency on Kubernetes. The Kubernetes dependency might be a result of one of the following:

  1. The application requirements (for example, the application calls Kubernetes APIs, or uses Kubernetes custom resources).
  2. The requirements of the tooling that's used to configure or deploy the application (such as Helm).
  3. The support requirements of a third-party creator or supplier.

In these scenarios, Autopilot is the target runtime environment because Cloud Run doesn't support Kubernetes.

Complex RBAC for multi-tenancy

If your organization has particularly complex organizational structures or requirements for multi-tenancy, then use Autopilot so that you can take advantage of Kubernetes' Role-Based Access Control (RBAC). For a simpler option, you can use the security and segregation capabilities that are built into Cloud Run.
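As a sketch of how RBAC supports multi-tenancy, the following grants one tenant team deployment rights only within its own namespace. The namespace, role, and group names are hypothetical:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: team-a             # hypothetical tenant namespace
  name: deploy-manager          # hypothetical role name
rules:
  - apiGroups: ["apps"]
    resources: ["deployments"]
    verbs: ["get", "list", "create", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: team-a-deployers
  namespace: team-a
subjects:
  - kind: Group
    name: team-a-developers@example.com   # hypothetical Google group
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: deploy-manager
  apiGroup: rbac.authorization.k8s.io
```

Because the Role is namespaced, members of the group can manage Deployments in `team-a` but can't touch other tenants' namespaces.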

Organizational considerations

The following are some of the organizational considerations that will influence your choice of environment:

  • Broad technical strategy: Your organization's technical direction.
  • Leveraging the Kubernetes ecosystem: Interest in leveraging the OSS community.
  • Existing in-house tooling: Incumbent use of certain tooling.
  • Development team profiles: Developer skill-sets and experience.
  • Operational support: Operations teams' capabilities and focus.

The following sections detail these organizational considerations to help you choose an environment.

Broad technical strategy

Organizations or teams might have agreed-upon strategies for preferring certain technologies over others. For example, if a team has an agreement to standardize where possible on either serverless or Kubernetes, that agreement might influence or even dictate a target runtime environment.

If a given workload isn't a good fit for the runtime environment that's specified in the strategy, you might decide to do one or more of the following, with the accompanying caveats:

  • Rearchitect the workload. However, if the workload isn't a good fit, doing so might result in non-optimal performance, cost, security, or other characteristics.
  • Register the workload as an exception to the strategic direction. However, if exceptions are overused, doing so can result in a disparate technology portfolio.
  • Reconsider the strategy. However, doing so can result in policy overhead that can impede or block progress.

Leveraging the Kubernetes ecosystem

As part of the broad technical strategy described earlier, organizations or teams might decide to select Kubernetes as their platform of choice because of its significant and growing ecosystem. This choice is distinct from selecting Kubernetes because of technical application dependencies, as described in the preceding section, Explicit dependency on Kubernetes. The consideration to use the Kubernetes ecosystem places emphasis on an active community, rich third-party tooling, and strong standards and portability. Leveraging the Kubernetes ecosystem can accelerate your development velocity and reduce time to market.

Existing in-house tooling

In some cases, it can be advantageous to use existing tooling ecosystems in your organization or team (for any of the environments). For example, if you're using Kubernetes, you might opt to continue using deployment tooling like ArgoCD, security and policy tooling like Gatekeeper, and package management like Helm. Existing tooling might include established rules for organizational compliance automation and other functionality that might be costly or require a long lead time to implement for an alternative target environment.

Development team profiles

An application or workload team might have prior experience with Kubernetes that can accelerate the team's velocity and capability to deliver on Autopilot. It can take time for a team to become proficient with a new runtime environment. Depending on the operating model, the upskilling period can potentially lead to lower platform reliability.

For a growing team, hiring capability might influence an organization's choice of platform. In some markets, Kubernetes skills might be scarce and therefore command a hiring premium. Choosing an environment such as Cloud Run can help you to streamline the hiring process and allow for more rapid team growth within your budget.

Operational support

When you choose a runtime environment, consider the experience and abilities of your SRE, DevOps, and platform teams, and other operational staff. The capabilities of the operational teams to effectively support the production environment are crucial from a reliability perspective. It's also critical that operational teams can support pre-production environments to ensure that developer velocity isn't impeded by downtime, reliance on manual processes, or cumbersome deployment mechanisms.

If you use Kubernetes, a central operations or platform engineering team can handle Autopilot Kubernetes upgrades. Although the upgrades are automatic, operational staff will typically monitor them closely to ensure minimal disruption to your workloads. Some organizations choose to manually upgrade control plane versions. GKE also includes capabilities to streamline and simplify the management of applications across multiple clusters.

In contrast to Autopilot, Cloud Run doesn't require ongoing management overhead or control plane upgrades. By using Cloud Run, you can simplify your operations processes. Selecting a single runtime environment simplifies those processes further. If you opt to use multiple runtime environments, you need to ensure that the team has the capacity, capabilities, and interest to support those runtime environments.

Selection

To begin the selection process, talk with the various stakeholders. For each application, assemble a working group that consists of developers, operational staff, representatives of any central technology governance group, internal application users and consumers, security and cloud financial optimization teams, and other roles or groups within your organization that might be relevant. You might choose to circulate an information-gathering survey to collate application characteristics, and share the results in advance of the session. We recommend that you select a small working group that includes only the required stakeholders. Not all representatives might be required for every working session.

You might also find it useful to include representatives from other teams or groups that have experience in building and running applications on either Autopilot or Cloud Run, or both. Use the technical and organizational considerations from this document to guide your conversation and evaluate your application's suitability for each of the potential platforms.

We recommend that you schedule a check-in after some months have passed to confirm or revisit the decision based on the outcomes of deploying your application in the new environment.


Contributors

Author: Henry Bell | Cloud Solutions Architect


Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2024-08-30 UTC.