File storage on Compute Engine

Last reviewed 2025-07-29 UTC

File storage, also known as network-attached storage (NAS), provides file-level access that lets applications read and update information shared across multiple machines. Some on-premises file storage solutions have a scale-up architecture and simply add storage to a fixed amount of compute resources. Other file storage solutions have a scale-out architecture, where capacity and compute (performance) can be added incrementally to an existing file system as needed. In both storage architectures, one or more virtual machines (VMs) can access the storage.

Although some file systems use a native POSIX client, many storage systems use a protocol that enables client machines to mount a file system and access the files as if they were hosted locally. The most common protocols for exporting file shares are Network File System (NFS) for Linux (and in some cases Windows) and Server Message Block (SMB) for Windows.
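As an illustration, mounting an NFS export on a Linux client typically looks like the following sketch. The server address `10.0.0.2`, the export path `/share1`, and the mount point are placeholder values; package names and mount options vary by distribution and file server.

```shell
# Install the NFS client (Debian/Ubuntu; package names vary by distro).
sudo apt-get install -y nfs-common

# Create a local mount point and mount the exported share.
# 10.0.0.2:/share1 is a placeholder for your file server's export.
sudo mkdir -p /mnt/share1
sudo mount -t nfs -o rw,hard 10.0.0.2:/share1 /mnt/share1

# Files on the share now appear under /mnt/share1 as if they were local.
ls /mnt/share1
```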

This document describes the following options for sharing files: the Google-managed Filestore, Managed Lustre, and NetApp Volumes services; partner solutions available in Cloud Marketplace; and read-only persistent disks.

An underlying factor in the performance and predictability of all Google Cloud services is the network stack that Google has evolved over many years. With the Jupiter Fabric, Google built a robust, scalable, and stable networking stack that can continue to evolve without affecting your workloads. As Google improves and bolsters its network capabilities internally, your file-sharing solution benefits from the added performance.

One feature of Google Cloud that can help you get the most out of your investment is the ability to specify custom VM types. When choosing the size of your filer, you can pick exactly the right mix of memory and CPU, so that your filer operates at optimal performance without being oversubscribed.

Note that Cloud Storage is also a great way to store petabytes or exabytes of data with high levels of redundancy at a low cost, but Cloud Storage has a different performance profile and API than the file servers discussed here.

Summary of file-server solutions

The following table summarizes the file-server solutions and features:

| Solution | Optimal dataset | Throughput | Managed support | Export protocols |
| --- | --- | --- | --- | --- |
| Filestore Basic | 1 TiB to 64 TiB | Up to 1.2 GiB/s | Fully managed by Google | NFSv3 |
| Filestore Zonal | 1 TiB to 100 TiB | Up to 26 GiB/s | Fully managed by Google | NFSv3, NFSv4.1 |
| Filestore Regional | 1 TiB to 100 TiB | Up to 26 GiB/s | Fully managed by Google | NFSv3, NFSv4.1 |
| Managed Lustre | 18 TiB to 8 PiB | Up to 1 TB/s | Fully managed by Google | POSIX |
| NetApp Volumes | 1 GiB to 1 PiB | 1 MB/s to 30 GiB/s | Fully managed by Google | NFSv3, NFSv4.1, SMB3 |
| Read-only Persistent Disk | < 64 TB | 240 to 1,200 MBps | No | Direct attachment |

Durable disks and Local SSD

If you have data that only needs to be accessed by a single VM or doesn't change over time, you can avoid a file server altogether by using the durable disks offered by Compute Engine: Hyperdisk or Persistent Disk. You can format Hyperdisk and Persistent Disk volumes with a file system such as ext4 or XFS and attach them to VMs in either read-write or read-only mode. This means that you can first attach a volume to an instance, load it with the data you need, and then attach it as a read-only disk to hundreds of VMs simultaneously. Employing read-only disks doesn't work for all use cases, but it can greatly reduce complexity compared to using a file server.
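The load-then-share pattern described above can be sketched with `gcloud` as follows. The instance names, disk name, and zone are placeholders for this example; check the `gcloud compute instances attach-disk` reference for current flags and limits on read-only attachments.

```shell
# Sketch: share a prepared disk read-only across many VMs.
# 1. Attach the disk read-write to a single "loader" VM and load the data.
gcloud compute instances attach-disk loader-vm \
    --disk=shared-data --mode=rw --zone=us-central1-a

# 2. After loading the data, detach the disk from the loader VM.
gcloud compute instances detach-disk loader-vm \
    --disk=shared-data --zone=us-central1-a

# 3. Attach the same disk read-only to each consumer VM.
for vm in worker-1 worker-2 worker-3; do
  gcloud compute instances attach-disk "$vm" \
      --disk=shared-data --mode=ro --zone=us-central1-a
done
```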

Durable disks deliver consistent performance. All Persistent Disk volumes of the same size (and, for SSD Persistent Disk, the same number of vCPUs) that you attach to your instance have the same performance characteristics. You don't need to pre-warm or test your disks before using them in production.

The cost of persistent disks is simple to determine because there are no I/O costs to consider after you provision your volume. Persistent disks can also be resized when required. This lets you start with a low-cost, low-capacity volume, and you don't need to create additional instances or disks to scale your capacity.
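For example, growing a disk in place might look like the following sketch. The disk name, zone, target size, and device path are placeholders; after resizing the disk, you also grow the file system on the VM.

```shell
# Sketch: resize a persistent disk without recreating it.
gcloud compute disks resize data-disk --size=500GB --zone=us-central1-a

# On the VM, grow the file system to use the new capacity
# (resize2fs for ext4; use xfs_growfs instead for XFS).
sudo resize2fs /dev/sdb
```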

If total storage capacity is the main requirement, you can use low-cost standard persistent disks. For the best performance while remaining durable, you can use SSD persistent disks.

Furthermore, it's important that you choose the correct Compute Engine persistent disk capacity and number of vCPUs to ensure that your file server's storage devices receive the required storage bandwidth, IOPS, and network bandwidth. The network bandwidth for VMs depends on the machine type that you choose. For example, A4 VMs have a maximum network bandwidth of up to 3,600 Gbps. For more information, see Machine families resource and comparison guide. For information about tuning persistent disks, see About Persistent Disk performance.

If your data is ephemeral and requires sub-millisecond latency and high I/O operations per second (IOPS), you can take advantage of up to 9 TB of Local SSDs for extreme performance. Local SSDs provide GB/s of bandwidth and millions of IOPS, all without consuming your instances' allotted network bandwidth. Keep in mind, though, that Local SSDs have trade-offs in availability, durability, and flexibility.
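Preparing an attached Local SSD for use typically looks like the following sketch. The device path `/dev/nvme0n1` is a common value for NVMe Local SSDs but may differ on your VM, and the mount point is a placeholder.

```shell
# Sketch: format and mount a Local SSD for scratch data.
# The device path may differ; /dev/nvme0n1 is typical for NVMe Local SSD.
sudo mkfs.ext4 -F /dev/nvme0n1
sudo mkdir -p /mnt/scratch
sudo mount /dev/nvme0n1 /mnt/scratch
```

Because Local SSD data doesn't survive events such as stopping the VM, this pattern suits caches and temporary working sets rather than data of record.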

For more information about the storage options for Compute Engine, see Design an optimal storage strategy for your cloud workload.

Considerations when choosing a file storage solution

Choosing a file storage solution requires you to make tradeoffs regarding manageability, cost, performance, and scalability. Making the decision is easier if you have a well-defined workload, which often isn't the case. Where workloads evolve over time or are highly variable, it's prudent to trade cost savings for flexibility and elasticity, so you can grow into your solution. On the other hand, if you have a temporary, well-understood workload, you can create a purpose-built file storage architecture that you can tear down and rebuild to meet your immediate storage needs.

One of the first decisions to make is whether you want to pay for a managed storage service, a solution that includes product support, or an unsupported solution.

  • Managed file storage services are the easiest to operate, because either Google or a partner handles all operations. These services might even provide a service level agreement (SLA) for availability, like most other Google Cloud services.
  • Unmanaged, yet supported, solutions provide additional flexibility. Partners can help with any issues, but the day-to-day operation of the storage solution is left to the user.
  • Unsupported solutions require the most effort to deploy and maintain, leaving all issues to the user. These solutions are not covered in this document.

Your next decision involves determining the solution's durability and availability requirements. Most file solutions are zonal and don't provide protection by default if the zone fails, so it's important to consider whether you need a disaster recovery (DR) solution that protects against zonal failures. It's also important to understand the application's requirements for durability and availability. For example, the choice of Local SSDs or persistent disks in your deployment has a big impact, as does the configuration of the file solution software. Each solution requires careful planning to achieve high durability, high availability, and protection against zonal and regional failures.

Finally, consider the locations (that is, zones, regions, or on-premises data centers) where you need to access the data. The locations of the compute farms that access your data influence your choice of filer solution, because only some solutions allow hybrid on-premises and in-cloud access.

Managed file storage solutions

This section describes the Google-managed solutions for file storage.

Filestore Basic

Filestore Basic instances are suitable for file sharing, software development, and GKE workloads. You can choose either HDD or SSD for storing data; SSD provides better performance. With either option, capacity scales up incrementally, and you can protect the data by using backups.
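Provisioning and mounting a Filestore Basic instance might look like the following sketch. The instance name, zone, share name, capacity, and network are placeholder values; check the `gcloud filestore instances create` reference for the current flags and capacity ranges.

```shell
# Sketch: create a Filestore Basic SSD instance.
gcloud filestore instances create my-filer \
    --zone=us-central1-a --tier=BASIC_SSD \
    --file-share=name=share1,capacity=2560GB \
    --network=name=default

# Mount the share from a client VM. Replace FILER_IP with the IP
# address reported by `gcloud filestore instances describe`.
sudo mkdir -p /mnt/share1
sudo mount -t nfs FILER_IP:/share1 /mnt/share1
```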

Filestore Zonal

Filestore Zonal simplifies enterprise storage and data management on Google Cloud and across hybrid clouds. Filestore Zonal delivers cost-effective, high-performance parallel access to global data while maintaining strict consistency, powered by a dynamically scalable, distributed file system. With Filestore Zonal, existing NFS applications and NAS workflows can run in the cloud without requiring refactoring, yet retain the benefits of enterprise data services (for example, snapshots and backups). The Filestore CSI driver allows seamless data persistence, portability, and sharing for containerized workloads.

You can scale Filestore Zonal instances on demand. This lets you create and expand file system infrastructure when required, ensuring that storage performance and capacity always align with your dynamic workflow requirements. As a Filestore Zonal cluster expands, both metadata and I/O performance scale linearly. This scaling lets you enhance and accelerate a broad range of data-intensive workflows, including high performance computing, analytics, cross-site data aggregation, DevOps, and many more. As a result, Filestore Zonal is a great fit for data-centric industries such as life sciences (for example, genome sequencing), financial services, and media and entertainment.

To further protect critical data, Filestore Zonal also lets you take and keep periodic snapshots, create backups, and replicate to another region. With Filestore, you can recover an individual file or an entire file system in less than 10 minutes from any of the prior recovery points.

Filestore Regional

Filestore Regional is a fully managed, cloud-native NFS solution that lets you deploy critical file-based applications in Google Cloud, backed by an SLA that delivers 99.99% regional availability. Filestore Regional is designed for applications that demand high availability. With a few mouse clicks (or a few gcloud commands or API calls), you can provision NFS shares that are synchronously replicated across three zones within a region. If any zone within the region becomes unavailable, Filestore Regional continues to transparently serve data to the application with no operational intervention.
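One of the gcloud commands mentioned above might look like the following sketch. The instance name, region, share name, capacity, and network are placeholders; verify the current flag names and capacity bands in the `gcloud filestore instances create` reference before use.

```shell
# Sketch: provision a regional NFS share that is synchronously
# replicated across three zones in us-central1.
gcloud filestore instances create critical-filer \
    --location=us-central1 --tier=REGIONAL \
    --file-share=name=share1,capacity=1024GB \
    --network=name=default
```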

To further protect critical data, Filestore Regional also lets you take and keep periodic snapshots, create backups, and replicate to another region. With Filestore, you can recover an individual file or an entire file system in less than 10 minutes from any of the prior recovery points.

For critical applications like SAP, both the database and application tiers need to be highly available. To satisfy this requirement, you can deploy the SAP database tier to Google Cloud Hyperdisk Extreme in multiple zones, using built-in database high availability. Similarly, the NetWeaver application tier, which requires shared executables across many VMs, can be deployed to Filestore Regional, which replicates the NetWeaver data across multiple zones within a region. The end result is a highly available, three-tier, mission-critical application architecture.

Note: For more information about region-specific considerations, see Geography and regions.

IT organizations are also increasingly deploying stateful applications in containers on Google Kubernetes Engine (GKE). This often causes them to rethink which storage infrastructure to use to support those applications. You can use block storage (Hyperdisk or Persistent Disk), file storage (Filestore Basic, Zonal, or Regional), or object storage (Cloud Storage). Filestore Basic HDD for GKE and Filestore multishares for GKE, combined with the Filestore CSI driver, let organizations give multiple GKE Pods shared file access, providing an increased level of availability for mission-critical workloads.

Managed Lustre

Managed Lustre is a Google-managed service that provides high-throughput, low-latency storage for tightly coupled HPC workloads. It significantly accelerates HPC workloads and AI training and inference by providing high-throughput, low-latency access to massive datasets. For information about using Managed Lustre for AI and ML workloads, see Design storage for AI and ML workloads in Google Cloud. Managed Lustre distributes data across multiple storage nodes, which enables concurrent access by many VMs. This parallel access eliminates the bottlenecks that occur with conventional file systems, and it enables workloads to rapidly ingest and process the vast amounts of data they require.

NetApp Volumes

NetApp Volumes is a fully managed Google service that lets you quickly mount shared file storage to your Google Cloud compute instances. NetApp Volumes supports SMB, NFS, and multi-protocol access. NetApp Volumes delivers high performance to your applications at low latency, with robust data-protection capabilities: snapshots, copies, cross-region replication, and backup. The service is suitable for applications with both sequential and random workloads, and it can scale across hundreds or thousands of Compute Engine instances. In seconds, you can provision and protect volumes that range in size from GiBs to a PiB. With multiple service levels (Flex, Standard, Premium, and Extreme), NetApp Volumes delivers the appropriate performance for your workload, without affecting availability.

Partner solutions in Cloud Marketplace

The following partner-provided solutions are available in Cloud Marketplace.

NetApp Cloud Volumes ONTAP

NetApp Cloud Volumes ONTAP (NetApp CVO) is a customer-managed, cloud-based solution that brings the full feature set of ONTAP, NetApp's leading data management operating system, to Google Cloud. NetApp CVO is deployed within your VPC, with billing and support from Google. The ONTAP software runs on a Compute Engine VM and uses a combination of persistent disks and Cloud Storage buckets (if tiering is enabled) to store the NAS data. The built-in filer accommodates the NAS volumes using thin provisioning, so you pay only for the storage you use. As the data grows, additional persistent disks are added to the aggregate capacity pool.

NetApp CVO abstracts the underlying infrastructure and lets you create virtual data volumes, carved out of the aggregate pool, that are consistent with all other ONTAP volumes in any cloud or on-premises environment. The data volumes you create support all versions of NFS, SMB, multi-protocol NFS/SMB, and iSCSI. They support a broad range of file-based workloads, including web and rich media content, used across many industries such as electronic design automation (EDA) and media and entertainment.

NetApp CVO supports instant, space-saving, point-in-time snapshots; built-in, block-level, incremental-forever backup to Cloud Storage; and cross-region asynchronous replication for disaster recovery. The option to select the type of Compute Engine instance and persistent disks lets you achieve the performance you want for your workloads. Even when operating in a high-performance configuration, NetApp CVO implements storage efficiencies such as deduplication, compaction, and compression, as well as auto-tiering of infrequently used data to the Cloud Storage bucket, enabling you to store petabytes of data while significantly reducing overall storage costs.

DDN Infinia

If you need advanced AI data orchestration, you can use DDN Infinia, which is available in Google Cloud Marketplace. Infinia provides an AI-focused data intelligence solution that's optimized for inference, training, and real-time analytics. It enables ultra-fast data ingestion, metadata-rich indexing, and seamless integration with AI frameworks like TensorFlow and PyTorch.

The following are the key features of DDN Infinia:

  • High performance: Delivers sub-millisecond latency and multiple TB/s of throughput.
  • Scalability: Supports scaling from terabytes to exabytes and can accommodate more than 100,000 GPUs and one million simultaneous clients in a single deployment.
  • Multi-tenancy with predictable quality of service (QoS): Offers secure, isolated environments for multiple tenants, with predictable QoS for consistent performance across workloads.
  • Unified data access: Enables seamless integration with existing applications and workflows through built-in multi-protocol support, including Amazon S3-compatible, CSI, and Cinder interfaces.
  • Advanced security: Features built-in encryption, fault-domain-aware erasure coding, and snapshots that help to ensure data protection and compliance.

Nasuni Cloud File Storage

Nasuni replaces enterprise file servers and NAS devices, and all associated infrastructure, including backup and DR hardware, with a simpler, low-cost cloud alternative. Nasuni uses Google Cloud object storage to deliver a more efficient software-as-a-service (SaaS) storage solution that scales to handle rapid, unstructured file data growth. Nasuni is designed to handle department, project, and organizational file shares and application workflows for every employee, wherever they work.

Nasuni offers three packages, with pricing for companies and organizations of all sizes, so they can grow and expand as needed.

Its benefits include the following:

  • Cloud-based primary file storage for up to 70% less. Nasuni's architecture takes advantage of built-in object lifecycle management policies. These policies allow complete flexibility for use with Cloud Storage classes, including Standard, Nearline, Coldline, and Archive. By using the immediate-access Archive class for primary storage with Nasuni, you can realize cost savings of up to 70%.

  • Departmental and organizational file shares in the cloud. Nasuni's cloud-based architecture offers a single global namespace across Google Cloud regions, with no limits on the number of files, file sizes, or snapshots, letting you store files directly from your desktop into Google Cloud through standard NAS (SMB) drive-mapping protocols.

  • Built-in backup and disaster recovery. Nasuni's "set it and forget it" operations make it simple to manage global file storage. Backup and DR are included, and a single management console lets you oversee and control the environment anywhere, anytime.

  • Replaces aging file servers. Nasuni makes it simple to migrate Microsoft Windows file servers and other existing file storage systems to Google Cloud, reducing the costs and management complexity of these environments.

Sycomp Intelligent Data Storage Platform

Sycomp Intelligent Data Storage Platform, which is available in Google Cloud Marketplace, lets you run your high performance computing (HPC), AI and ML, and big data workloads in Google Cloud. With Sycomp Storage, you can concurrently access data from thousands of VMs, reduce costs by automatically managing tiers of storage, and run your application on-premises or in Google Cloud. Sycomp Storage can be deployed quickly, and it supports access to your data through NFS and the IBM Storage Scale client.

IBM Storage Scale is a parallel file system that helps to securely manage large volumes (PBs) of data. Sycomp Storage is well suited for HPC, AI, ML, big data, and other applications that require a POSIX-compliant shared file system. With adaptable storage capacity and performance scaling, Sycomp Storage can support small to large HPC, AI, and ML workloads.

After you deploy a cluster in Google Cloud, you decide how you want to use it. Choose whether you want to use the cluster only in the cloud or in hybrid mode, by connecting to existing on-premises IBM Storage Scale clusters, third-party NFS NAS solutions, or other object-based storage solutions.

Contributors

Author: Sean Derrington | Group Product Manager, Storage

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-07-29 UTC.