Hammerspace boosts AI data access performance three ways

Hammerspace has accelerated its data orchestrating platform software with faster and more scalable metadata reads, and better data placement in a GPU server’s directly-attached storage drives.

The company is also supporting OCI, the Oracle public cloud, and adding finer-grained access control to prevent inappropriate data exposure. Hammerspace’s Data Platform software product, renamed from its prior Global Data Environment moniker, brings data from file and object storage systems into a global namespace and tiers data between storage devices and media to optimize costs and access speeds. It uses pNFS and supports Nvidia’s GPUDirect so that it can pump data at high speed to Nvidia GPU servers.


Hammerspace SVP for Global Marketing, Molly Presley, said: “AI is fundamentally changing how organizations interact with their data. Workloads that were once separate are now deeply interconnected, and the data platform must keep pace. 

“The v5.2 advancements strengthen our ability to unify and accelerate data for AI, HPC and enterprise environments without requiring customers to rebuild storage silos or redesign their infrastructure. It marks another important step toward enabling truly AI-ready data everywhere.”

Hammerspace says its v5.2 Data Platform software achieved a 33.7 percent higher IO500 overall score than the results for the previous version, published five months ago, with total bandwidth doubling and individual sub-tests showing dramatic improvements, including a gain of more than 800 percent in IOR-Hard-Read. IOR (Interleaved or Random I/O) tests are scored as bandwidth, and the IOR-Hard-Read test carries out small, unaligned, and interleaved reads (typically 47 KB transfers) from a single shared file across multiple MPI (Message Passing Interface) processes. This simulates contended, metadata-intensive workloads, like those involving lock contention in distributed file systems such as Lustre, GPFS/Spectrum Scale, or NFS. It’s distinct from caching.
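To make the IOR-Hard access pattern concrete, here is a minimal Python sketch of how interleaved records from several processes land in one shared file. It is an illustration only, not the benchmark's code: the 47,008-byte record size is taken from the IO500 ior-hard definition (the "47 KB" above), the 4,096-byte block size is an assumption, and the function names are hypothetical.

```python
# Assumptions (not from the article): 47008-byte records, as used by the
# IO500 ior-hard phase, and a 4096-byte file system block size.
RECORD_BYTES = 47008
BLOCK_BYTES = 4096


def record_offset(rank: int, segment: int, num_ranks: int) -> int:
    """Byte offset of the record that `rank` touches in `segment`.

    Records from all ranks are interleaved: segment 0 holds one record
    per rank, segment 1 holds the next record per rank, and so on.
    """
    return (segment * num_ranks + rank) * RECORD_BYTES


def is_block_aligned(offset: int) -> bool:
    """True if the offset falls exactly on a block boundary."""
    return offset % BLOCK_BYTES == 0


if __name__ == "__main__":
    num_ranks = 4
    for segment in range(2):
        for rank in range(num_ranks):
            off = record_offset(rank, segment, num_ranks)
            print(f"segment={segment} rank={rank} "
                  f"offset={off} aligned={is_block_aligned(off)}")
```

Because the record size is not a multiple of the block size, neighboring processes routinely touch the same blocks of the shared file, which is what makes the test sensitive to locking and metadata overhead rather than raw streaming bandwidth.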

The company says it achieved this acceleration through client-side NFS performance enhancements that it has contributed to the standard Linux kernel, and it has made sure its Data Platform software uses these new kernel features. That means, it says, all the file and object storage systems it orchestrates benefit from the acceleration when their stored data is accessed through the Hammerspace software.

The company has also added Tier 0 affinitization, locality-aware intelligence for both reads and writes within a GPU compute cluster, to its Tier 0 software. Tier 0 brings a GPU server’s local, direct-attached storage into the Hammerspace namespace. A GPU server can access data on these drives, Hammerspace says, even faster than data in GPUDirect-accelerated external storage systems.

Tier 0 affinitization ensures data is transferred to the requesting GPU server’s own local drives, rather than to the local drives of some other GPU server in the cluster. This reduces east-west network traffic within the cluster. It’s automatic, transparent, and enabled by default.

Hammerspace says: “Tier 0 delivers the best possible performance when a compute node can use its own local Tier 0 volume for I/O, instead of crossing the network to read or write to another node’s NVMe. To make that happen, the Anvil [Hammerspace metadata server component] needs to recognize when the pNFS client requesting a layout is also hosting a Tier 0 storage volume and then place that local volume first in the layout.”
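The "local volume first" idea in that quote can be sketched in a few lines of Python. This is a hypothetical illustration, not Hammerspace's Anvil code: the Volume type, the host names, and the order_layout function are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Volume:
    """Hypothetical Tier 0 volume record (names are illustrative only)."""
    name: str
    host: str  # node that physically holds the NVMe volume


def order_layout(client_host: str, volumes: list[Volume]) -> list[Volume]:
    """Return the volumes with any volume hosted on the client itself first.

    sorted() is stable, so the relative order of the remaining (remote)
    volumes is preserved.
    """
    return sorted(volumes, key=lambda v: v.host != client_host)


if __name__ == "__main__":
    volumes = [
        Volume("t0-a", "gpu01"),
        Volume("t0-b", "gpu02"),
        Volume("t0-c", "gpu03"),
    ]
    # A client running on gpu02 gets its own volume at the head of the layout.
    print([v.name for v in order_layout("gpu02", volumes)])
```

A client that hosts no Tier 0 volume simply gets the list back in its original order, so the preference costs nothing for non-GPU clients in this sketch.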

A third performance booster removes metadata scaling limitations. Hammerspace’s Shared Referrals mechanism distributes the namespace across as many metadata servers as are needed to accommodate extreme file counts. Hammerspace says it “ensures linear scalability so performance and responsiveness remain steady even as data estates for AI and HPC environments explode.”
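Hammerspace has not detailed how Shared Referrals work internally. As a generic illustration of the referral idea in NFSv4-style namespaces, where a path is redirected to whichever server owns that subtree, here is a Python sketch. The table contents, server names, and resolve_metadata_server function are hypothetical and do not represent Hammerspace's design.

```python
from typing import Dict

# Hypothetical referral table: namespace subtree -> metadata server.
# The root entry "/" must be present as the fallback.
REFERRALS: Dict[str, str] = {
    "/": "mds-0",
    "/projects": "mds-1",
    "/projects/genomics": "mds-2",
}


def resolve_metadata_server(path: str, referrals: Dict[str, str]) -> str:
    """Pick the server whose subtree is the longest prefix of `path`."""
    best = "/"
    for subtree in referrals:
        # Match on whole path components, so "/projectsX" does not
        # match the "/projects" subtree.
        prefix = subtree.rstrip("/") + "/"
        if (path == subtree or path.startswith(prefix)) and len(subtree) > len(best):
            best = subtree
    return referrals[best]


if __name__ == "__main__":
    for p in ("/scratch/tmp", "/projects/plan.txt", "/projects/genomics/run1"):
        print(p, "->", resolve_metadata_server(p, REFERRALS))
```

In a scheme like this, adding a metadata server means adding a table entry for the subtree it takes over, which is one way a namespace can grow without a single server handling every lookup.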

Inadvertent data exposure is being controlled by adding Kerberos authentication and Labeled NFS support. This enables SELinux and other Mandatory Access Control (MAC) systems to transport and enforce security labels across NFS, providing consistent, fine-grained control over data access. It should please customers in regulated industries, government and private research areas.

Hammerspace already supports its software running in the AWS, Azure and Google clouds, providing a hybrid, data-orchestrated, on-premises-to-public-cloud environment. Now it’s adding support for Oracle Cloud Infrastructure (OCI). It says new shapes – the OCI term for a server or VM instance configuration – will be supported, including bare metal. It will also add support for dedicated OCI regions to help with data sovereignty requirements.

The Hammerspace v5.2 Data Platform software will be generally available in December. Read more about Tier 0 affinitization in a blog.