Profile-based configurations for AI/ML workloads

This document describes how you can use profile-based configurations to streamline adoption and enhance the performance of Cloud Storage FUSE for your artificial intelligence or machine learning (AI/ML) workloads.

To streamline Cloud Storage FUSE configuration for your training, checkpointing, or serving workloads, you can apply a pre-configured profile based on your workload type by using the profile field or the --profile option. Each profile specifies a predefined, optimized set of Cloud Storage FUSE features for caching, threading, and buffer sizes, which helps you get high performance with minimal effort. The profile values are aiml-training, aiml-checkpointing, and aiml-serving for training, checkpointing, and serving workloads, respectively.

Considerations

  • You can set the --profile option or profile field only during a mount operation. To change the --profile option or profile field, you need to remount your Cloud Storage FUSE bucket.

  • When you use profile-based configurations, Cloud Storage FUSE sets the metadata cache capacity and time to live (TTL) to unlimited, meaning that entries are never evicted from the metadata cache. If your virtual machine doesn't have enough memory, you might experience Out of Memory (OOM) errors. Therefore, we recommend reviewing your memory capacity before you apply profile-based configurations. OOM errors are more likely to occur on machines with less than 1 TiB of memory.

  • When a Cloud Storage FUSE parameter is configured in multiple ways, the following order of precedence applies (from highest to lowest):

    1. Values set directly in a gcsfuse command or a Cloud Storage FUSE configuration file.
    2. Values set by a profile, where the profile is specified using the --profile option in a gcsfuse command or the profile field in a Cloud Storage FUSE configuration file.
    3. Default values automatically applied when Cloud Storage FUSE detects a high-performance machine type. For more information, see Automated configuration values for high-performance machine types.

  • Cloud Storage FUSE CSI volumes in Google Kubernetes Engine Pods don't support the profile field or --profile option.

  • File caching can't be enabled using profile-based configurations because file caching requires Cloud Storage FUSE configuration fields and Cloud Storage FUSE CLI options that can't be generalized. To enable file caching for serving, training, or checkpointing workloads, you must configure the file caching options or fields explicitly.
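The order of precedence described in the considerations above can be sketched in Python. This is an illustrative model only, not how Cloud Storage FUSE is implemented: the function name resolve_setting is hypothetical, and the machine-type default values are placeholders rather than real defaults.

```python
from typing import Any, Mapping


def resolve_setting(name: str,
                    explicit: Mapping[str, Any],
                    profile: Mapping[str, Any],
                    machine_defaults: Mapping[str, Any]) -> Any:
    """Return the effective value of one setting, checking the
    highest-precedence source first."""
    for layer in (explicit, profile, machine_defaults):
        if name in layer:
            return layer[name]
    return None  # Not set by any of the three sources.


# Value set directly in the gcsfuse command or configuration file.
explicit = {"metadata-cache:ttl-secs": 600}
# Values that a profile applies (taken from the profile listings in this document).
profile_values = {"metadata-cache:ttl-secs": -1,
                  "metadata-cache:negative-ttl-secs": 0}
# Placeholder values standing in for high-performance machine-type defaults.
machine_defaults = {"metadata-cache:ttl-secs": 60,
                    "file-system:rename-dir-limit": 1000}

for key in ("metadata-cache:ttl-secs",
            "metadata-cache:negative-ttl-secs",
            "file-system:rename-dir-limit"):
    print(key, "=", resolve_setting(key, explicit, profile_values, machine_defaults))
```

In this sketch the explicitly set TTL of 600 wins over the profile's -1, the negative TTL comes from the profile, and the rename limit falls through to the placeholder machine-type default.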

Apply profile-based configurations for training workloads

The training-specific profile optimizes performance for high-throughput reads of large datasets, which helps prevent Cloud GPUs and Cloud TPU hardware from waiting for data.

To apply the training-specific profile, specify either profile: aiml-training using a Cloud Storage FUSE configuration file or --profile=aiml-training using the Cloud Storage FUSE CLI. The following configurations are then applied:

    # Create implicit directories locally when accessed:
    -implicit-dirs
    # Disable caching for lookups of files or directories that don't exist:
    -metadata-cache:negative-ttl-secs:0
    # Keep cached metadata (file attributes, types) indefinitely time-wise:
    -metadata-cache:ttl-secs:-1
    # Allow unlimited size for the file attribute (stat) cache:
    -metadata-cache:stat-cache-max-size-mb:-1
    # Allow unlimited size for the file/directory type cache:
    -metadata-cache:type-cache-max-size-mb:-1
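As a minimal sketch of the CLI route, the following Python helper assembles a gcsfuse mount command that selects a profile. The helper name build_mount_command, the bucket name my-training-bucket, and the mount point /mnt/training-data are hypothetical placeholders; only the --profile option and its three values come from this document. The command is printed rather than run, so the sketch has no side effects.

```python
import shlex
from typing import List

# Profile values named in this document.
KNOWN_PROFILES = ("aiml-training", "aiml-checkpointing", "aiml-serving")


def build_mount_command(profile: str, bucket: str, mount_point: str) -> List[str]:
    """Build the argument list for mounting a bucket with a profile."""
    if profile not in KNOWN_PROFILES:
        raise ValueError(f"unknown profile: {profile!r}")
    return ["gcsfuse", f"--profile={profile}", bucket, mount_point]


# Hypothetical bucket and mount point, for illustration only.
cmd = build_mount_command("aiml-training", "my-training-bucket", "/mnt/training-data")
print(shlex.join(cmd))
# prints: gcsfuse --profile=aiml-training my-training-bucket /mnt/training-data
```

Because the profile can be set only at mount time, switching to a different profile would mean unmounting and running a command like this again with the new value.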

Apply profile-based configurations for checkpointing workloads

The checkpointing-specific profile optimizes performance for high-throughput writes of large files. It reduces the time it takes to save multi-gigabyte checkpoints, which minimizes training pauses.

To apply the checkpointing-specific profile, specify either profile: aiml-checkpointing using a Cloud Storage FUSE configuration file or --profile=aiml-checkpointing using the Cloud Storage FUSE CLI. The following configurations are then applied:

    # Create implicit directories locally when accessed:
    -implicit-dirs
    # Disable caching for lookups of files/dirs that don't exist:
    -metadata-cache:negative-ttl-secs:0
    # Keep cached metadata (file attributes, types) indefinitely time-wise:
    -metadata-cache:ttl-secs:-1
    # Allow unlimited size for the file attribute (stat) cache:
    -metadata-cache:stat-cache-max-size-mb:-1
    # Allow unlimited size for the file/directory type cache:
    -metadata-cache:type-cache-max-size-mb:-1
    # Cache the entire file when any part is read sequentially:
    -file-cache:cache-file-for-range-read:true
    # Allow renaming directories with a lot of files in non-HNS buckets.
    -file-system:rename-dir-limit:200000
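As a sketch of the configuration-file route, the following Python snippet renders a small configuration file that selects the checkpointing profile and, optionally, sets one value explicitly. The helper name render_config, the file name, and the value 500000 are hypothetical, and the nested layout of the file-system field is an assumption inferred from the colon-separated names in the listing above. Per the order of precedence described earlier, a value set directly in the file would override the value the profile applies.

```python
import tempfile
from pathlib import Path
from typing import Optional


def render_config(profile: str, rename_dir_limit: Optional[int] = None) -> str:
    """Render a minimal configuration file body that selects a profile."""
    lines = [f"profile: {profile}"]
    if rename_dir_limit is not None:
        # Explicit value; assumed to take precedence over the profile's 200000.
        lines += ["file-system:", f"  rename-dir-limit: {rename_dir_limit}"]
    return "\n".join(lines) + "\n"


config_text = render_config("aiml-checkpointing", rename_dir_limit=500000)

# Write to a temporary directory so the sketch doesn't touch real config files.
config_path = Path(tempfile.mkdtemp()) / "gcsfuse-checkpointing.yaml"
config_path.write_text(config_text)
print(config_text, end="")
```

The resulting file could then be supplied when mounting, for example through the gcsfuse --config-file option.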

Apply profile-based configurations for serving workloads

The serving-specific profile optimizes performance for serving workloads by improving data access and caching mechanisms.

To apply the serving-specific profile, specify either profile: aiml-serving using a Cloud Storage FUSE configuration file or --profile=aiml-serving using the Cloud Storage FUSE CLI. The following configurations are then applied:

    # Create implicit directories locally when accessed:
    -implicit-dirs
    # Disable caching for lookups of files/dirs that don't exist:
    -metadata-cache:negative-ttl-secs:0
    # Keep cached metadata (file attributes, types) indefinitely time-wise:
    -metadata-cache:ttl-secs:-1
    # Allow unlimited size for the file attribute (stat) cache:
    -metadata-cache:stat-cache-max-size-mb:-1
    # Allow unlimited size for the file/directory type cache:
    -metadata-cache:type-cache-max-size-mb:-1
    # Cache the entire file when any part is read sequentially:
    -file-cache:cache-file-for-range-read:true
    # Enable kernel-list-cache to make listing faster as this is a readonly file system hierarchy.
    -file-system:kernel-list-cache-ttl-secs:-1
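To compare what each profile applies, it can help to read a listing like the one above into a nested structure. The following Python sketch does that for the serving profile. It assumes that each colon-separated entry names a group, a field, and a value, and that an entry without a value is a boolean switch; the function names are hypothetical and this is not a Cloud Storage FUSE API.

```python
from typing import Any, Dict, Iterable


def parse_value(text: str) -> Any:
    """Convert 'true'/'false' to bool and integers to int; keep other text."""
    if text in ("true", "false"):
        return text == "true"
    try:
        return int(text)
    except ValueError:
        return text


def to_nested(entries: Iterable[str]) -> Dict[str, Any]:
    """Turn entries such as '-metadata-cache:ttl-secs:-1' into a nested dict."""
    config: Dict[str, Any] = {}
    for raw in entries:
        entry = raw.strip()
        if not entry or entry.startswith("#"):
            continue  # Skip blank lines and comments.
        if entry.startswith("-"):
            entry = entry[1:].strip()  # Drop the leading list marker only.
        *path, last = entry.split(":")
        if not path:
            config[last] = True  # Bare name: treated as a switch that is on.
            continue
        node = config
        for key in path[:-1]:
            node = node.setdefault(key, {})
        node[path[-1]] = parse_value(last)
    return config


serving = to_nested([
    "-implicit-dirs",
    "-metadata-cache:negative-ttl-secs:0",
    "-metadata-cache:ttl-secs:-1",
    "-metadata-cache:stat-cache-max-size-mb:-1",
    "-metadata-cache:type-cache-max-size-mb:-1",
    "-file-cache:cache-file-for-range-read:true",
    "-file-system:kernel-list-cache-ttl-secs:-1",
])
print(serving["file-system"])
```

The same function can be run on the training and checkpointing listings to see, for example, that only the serving profile sets kernel-list-cache-ttl-secs.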

Last updated 2025-12-17 UTC.