US20100223539A1 - High efficiency, high performance system for writing data from applications to a safe file system - Google Patents

High efficiency, high performance system for writing data from applications to a safe file system

Info

Publication number
US20100223539A1
Authority
US
United States
Prior art keywords
data
server
disk
compute node
disks
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
US12/734,201
Other versions
US8316288B2 (en)
Inventor
Paul Nowoczynski
Nathan Stone
Jared Yanovich
Jason Sommerfield
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Carnegie Mellon University
Original Assignee
Carnegie Mellon University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2007-11-09
Filing date
2008-11-07
Publication date
2010-09-02
Application filed by Carnegie Mellon University
Priority to US12/734,201
Assigned to CARNEGIE MELLON UNIVERSITY. Assignment of assignors interest (see document for details). Assignors: STONE, NATHAN; NOWOCZYNSKI, PAUL; YANOVICH, JARED
Publication of US20100223539A1
Application granted
Publication of US8316288B2
Expired - Fee Related
Adjusted expiration

Abstract

Systems and methods for increasing the efficiency of data storage processes for high performance, high core number computing systems. In one embodiment, the systems of the present invention perform sequential I/O whenever possible. To achieve a high degree of sequentiality, the block allocation scheme is determined by the next available block on the next available disk. This simple, non-deterministic data placement method is extremely effective for providing sequential data streams to the spindle by minimizing costly seeks. The sequentiality of the allocation scheme is not affected by the number of clients, the degree of randomization within the incoming data streams, the logical byte addresses of the incoming requests' file extents, or the RAID attributes (i.e., parity position) of the block.
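
The allocation rule in the abstract (next available block on the next available disk) can be illustrated with a small Python sketch. It is not the patented implementation: the names `Disk` and `NextAvailableAllocator`, the `busy` flag, and the round-robin scan are assumptions introduced here only to show why each disk sees a strictly sequential write stream no matter how random the incoming logical offsets are.

```python
from dataclasses import dataclass


@dataclass
class Disk:
    """Hypothetical disk that only ever appends at its write head."""
    name: str
    next_block: int = 0   # next free block; only ever increases
    busy: bool = False    # True while the disk cannot accept a write

    def append(self) -> int:
        block = self.next_block
        self.next_block += 1
        return block


class NextAvailableAllocator:
    """Assigns each incoming fragment to the next disk that is free.

    Placement ignores the logical file offset, so it cannot be derived
    from the offset later; the chosen (disk, block) pair is recorded.
    """

    def __init__(self, disks):
        self.disks = list(disks)
        self.cursor = 0
        self.placement = {}   # (client, logical_offset) -> (disk name, block)

    def allocate(self, client, logical_offset):
        # Scan round-robin from the cursor for the first disk that is free.
        for step in range(len(self.disks)):
            disk = self.disks[(self.cursor + step) % len(self.disks)]
            if not disk.busy:
                self.cursor = (self.cursor + step + 1) % len(self.disks)
                block = disk.append()
                self.placement[(client, logical_offset)] = (disk.name, block)
                return disk.name, block
        raise RuntimeError("no disk available")


# Two clients issue writes at scattered logical offsets.
alloc = NextAvailableAllocator([Disk("d0"), Disk("d1"), Disk("d2")])
requests = [("a", 9000), ("b", 16), ("a", 4), ("b", 70000), ("a", 512), ("b", 1)]
placements = [alloc.allocate(client, offset) for client, offset in requests]
```

In this sketch the block numbers handed out on each disk rise by one per write regardless of client or offset, which is the sequentiality property the abstract describes. The cost is the `placement` map: because placement is not a function of the logical address, the location of every fragment has to be recorded for later reads.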

Claims (18)

1. A computing platform comprising:
a compute node comprising:
multiple processing cores for executing an application; and
multiple vector-based cache buffers, wherein data from I/O calls from execution of the application are aggregated in the cache buffers according to a plurality of parity groups;
a storage server cluster in communication with the compute node, wherein the storage server cluster comprises a plurality of I/O servers, wherein each I/O server is connected to and controls a plurality of disk drive data storage systems, wherein:
data in the cache buffers are transmitted from the compute node to at least one of the I/O servers of the storage server cluster and stored in queues on the at least one I/O server; and
the disk drive data storage systems write the data from the queues sequentially in data fragments to disks of the disk drive data storage systems such that data fragments of differing parity groups are not written on the same disk.
10. A method for storing data from a compute node, wherein the compute node comprises multiple processing cores for executing an application and multiple vector-based cache buffers, the method comprising:
aggregating data from I/O calls from execution of the application in the cache buffers according to a plurality of parity groups;
transmitting the data in the buffers to at least one I/O server of a storage server cluster that is in communication with the compute node, wherein each I/O server is connected to and controls a plurality of disk drive data storage systems;
storing the data transmitted from the compute node in queues on the at least one I/O server; and
writing the data from the queues sequentially in data fragments to disks of the disk drive data storage systems such that data fragments of differing parity groups are not written on the same disk.
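
The steps recited in the method claim (aggregate into parity groups, transmit, queue on the I/O server, write sequentially in fragments) can be sketched in Python as below. Everything concrete here is an assumption made for illustration: the fragment size, three data fragments per group, byte-wise XOR as the parity function, and the names `aggregate`, `xor_parity`, and `IOServer`. Network transmission is reduced to a method call, and the disk-placement constraint in the final claim step is not modeled.

```python
from collections import deque
from functools import reduce

FRAGMENT_SIZE = 4     # bytes per fragment (assumed; tiny for illustration)
DATA_PER_GROUP = 3    # data fragments per parity group (assumed)


def xor_parity(fragments):
    """Byte-wise XOR of equal-length fragments (assumed parity function)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*fragments))


def aggregate(io_calls):
    """Compute-node step: pack I/O payloads into parity groups.

    Each group is DATA_PER_GROUP data fragments followed by one parity fragment.
    """
    stream = b"".join(io_calls)
    group_bytes = FRAGMENT_SIZE * DATA_PER_GROUP
    stream += b"\0" * (-len(stream) % group_bytes)   # pad to whole groups
    groups = []
    for start in range(0, len(stream), group_bytes):
        data = [stream[i:i + FRAGMENT_SIZE]
                for i in range(start, start + group_bytes, FRAGMENT_SIZE)]
        groups.append(data + [xor_parity(data)])
    return groups


class IOServer:
    """Server step: queue incoming fragments, then append them to disk logs."""

    def __init__(self, n_disks):
        self.queue = deque()
        self.disks = [[] for _ in range(n_disks)]   # each list is append-only
        self.cursor = 0

    def receive(self, group_id, group):
        for index, fragment in enumerate(group):
            self.queue.append((group_id, index, fragment))

    def drain(self):
        # Pop fragments in arrival order and append each to the next disk.
        while self.queue:
            self.disks[self.cursor].append(self.queue.popleft())
            self.cursor = (self.cursor + 1) % len(self.disks)


calls = [b"hello ", b"parallel ", b"world"]   # stand-ins for application I/O calls
groups = aggregate(calls)
server = IOServer(n_disks=4)
for gid, group in enumerate(groups):
    server.receive(gid, group)
server.drain()
```

With XOR parity, any single lost fragment of a group can be rebuilt by XOR-ing the remaining fragments of that group, which is why the sketch keeps the group id and fragment index alongside each queued fragment.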

US12/734,201 | 2007-11-09 | 2008-11-07 | High efficiency, high performance system for writing data from applications to a safe file system | Expired - Fee Related | US8316288B2 (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
US12/734,201, US8316288B2 (en) | 2007-11-09 | 2008-11-07 | High efficiency, high performance system for writing data from applications to a safe file system

Applications Claiming Priority (3)

Application Number | Priority Date | Filing Date | Title
US247907P | 2007-11-09 | 2007-11-09 |
US12/734,201, US8316288B2 (en) | 2007-11-09 | 2008-11-07 | High efficiency, high performance system for writing data from applications to a safe file system
PCT/US2008/082794, WO2009062029A1 (en) | 2007-11-09 | 2008-11-07 | Efficient high performance system for writing data from applications to a safe file system

Related Parent Applications (1)

Application Number | Title | Priority Date | Filing Date
US61002479 (Division) | | 2007-11-09 |

Publications (2)

Publication Number | Publication Date
US20100223539A1 (en) | 2010-09-02
US8316288B2 (en) | 2012-11-20

Family

ID=40626191

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
US12/734,201 (Expired - Fee Related), US8316288B2 (en) | High efficiency, high performance system for writing data from applications to a safe file system | 2007-11-09 | 2008-11-07

Country Status (2)

Country | Link
US (1) | US8316288B2 (en)
WO (1) | WO2009062029A1 (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US9286261B1 (en) | 2011-11-14 | 2016-03-15 | Emc Corporation | Architecture and method for a burst buffer using flash technology
KR102295769B1 (en) | 2014-05-20 | 2021-08-30 | 삼성전자주식회사 | Storage system and method of operation of the storage system
JP6818982B2 (en) | 2015-06-01 | 2021-01-27 | エスゼット ディージェイアイ テクノロジー カンパニー リミテッド (Sz Dji Technology Co., Ltd) | How to store files
US10140041B1 (en) | 2017-07-28 | 2018-11-27 | EMC IP Holding Company LLC | Mapped RAID (redundant array of independent disks) in a data storage system with RAID extent sub-groups that are used to perform drive extent allocation and data striping for sequential data accesses to a storage object
US10592111B1 (en) | 2017-07-28 | 2020-03-17 | EMC IP Holding Company LLC | Assignment of newly added data storage drives to an original data storage drive partnership group and a new data storage drive partnership group in a mapped RAID (redundant array of independent disks) system
US10318169B2 (en) | 2017-10-27 | 2019-06-11 | EMC IP Holding Company LLC | Load balancing of I/O by moving logical unit (LUN) slices between non-volatile storage represented by different rotation groups of RAID (Redundant Array of Independent Disks) extent entries in a RAID extent table of a mapped RAID data storage system
CN110096219B (en) | 2018-01-31 | 2022-08-02 | 伊姆西Ip控股有限责任公司 | Effective capacity of a pool of drive zones generated from a group of drives

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US5163131A (en)* | 1989-09-08 | 1992-11-10 | Auspex Systems, Inc. | Parallel i/o network file server architecture
US5832198A (en)* | 1996-03-07 | 1998-11-03 | Philips Electronics North America Corporation | Multiple disk drive array with plural parity groups
US5996088A (en)* | 1997-01-22 | 1999-11-30 | Oracle Corporation | High-speed database checkpointing through sequential I/O to disk
US7593972B2 (en)* | 2001-04-13 | 2009-09-22 | Ge Medical Systems Information Technologies, Inc. | Application service provider based redundant archive services for medical archives and/or imaging systems
US7593975B2 (en)* | 2001-12-21 | 2009-09-22 | Netapp, Inc. | File system defragmentation technique to reallocate data blocks if such reallocation results in improved layout
US20110145535A1 (en)* | 2004-02-19 | 2011-06-16 | Takato Kusama | Method for rearranging logical volume
US7519636B2 (en)* | 2005-03-30 | 2009-04-14 | Sap Ag | Key sequenced clustered I/O in a database management system
US20080256427A1 (en)* | 2007-01-31 | 2008-10-16 | International Business Machines Corporation | System, method, and service for providing a generic raid engine and optimizer
US20090249173A1 (en)* | 2008-03-28 | 2009-10-01 | Hitachi, Ltd. | Storage system and data storage method
US20100023847A1 (en)* | 2008-07-28 | 2010-01-28 | Hitachi, Ltd. | Storage Subsystem and Method for Verifying Data Using the Same

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US20130304775A1 (en)* | 2012-05-11 | 2013-11-14 | Xyratex Technology Limited | Storage unit for high performance computing system, storage network and methods
US8862561B1 (en)* | 2012-08-30 | 2014-10-14 | Google Inc. | Detecting read/write conflicts
US9407677B2 (en) | 2012-09-21 | 2016-08-02 | Nyse Group, Inc. | High performance data streaming
US20140089459A1 (en)* | 2012-09-21 | 2014-03-27 | Nyse Group, Inc. | High performance data streaming
US9450999B2 (en)* | 2012-09-21 | 2016-09-20 | Nyse Group, Inc. | High performance data streaming
US9477682B1 (en)* | 2013-03-13 | 2016-10-25 | Emc Corporation | Parallel compression of data chunks of a shared data object using a log-structured file system
US20140280759A1 (en)* | 2013-03-15 | 2014-09-18 | International Business Machines Corporation | Data transmission for transaction processing in a networked environment
US9473565B2 (en)* | 2013-03-15 | 2016-10-18 | International Business Machines Corporation | Data transmission for transaction processing in a networked environment
US9473561B2 (en)* | 2013-03-15 | 2016-10-18 | International Business Machines Corporation | Data transmission for transaction processing in a networked environment
US20140280680A1 (en)* | 2013-03-15 | 2014-09-18 | International Business Machines Corporation | Data transmission for transaction processing in a networked environment
US10353775B1 (en)* | 2014-08-06 | 2019-07-16 | SK Hynix Inc. | Accelerated data copyback
US11163447B2 (en)* | 2017-09-03 | 2021-11-02 | Ashish Govind Khurange | Dedupe file system for bulk data migration to cloud platform
US20230409234A1 (en)* | 2022-05-17 | 2023-12-21 | Western Digital Technologies, Inc. | Data Storage Device and Method for Host Multi-Command Queue Grouping Based on Write-Size Alignment in a Multi-Queue-Depth Environment
US12067293B2 (en)* | 2022-05-17 | 2024-08-20 | SanDisk Technologies, Inc. | Data storage device and method for host multi-command queue grouping based on write-size alignment in a multi-queue-depth environment

Also Published As

Publication number | Publication date
WO2009062029A1 (en) | 2009-05-14
US8316288B2 (en) | 2012-11-20

Similar Documents

Publication | Title
US8316288B2 (en) | High efficiency, high performance system for writing data from applications to a safe file system
US5881311A (en) | Data storage subsystem with block based data management
US9378093B2 (en) | Controlling data storage in an array of storage devices
US11074129B2 (en) | Erasure coded data shards containing multiple data objects
US8914597B2 (en) | Data archiving using data compression of a flash copy
US10042751B1 (en) | Method and system for multi-tier all-flash array
US6745286B2 (en) | Interface architecture
US7146476B2 (en) | Emulated storage system
US7054927B2 (en) | File system metadata describing server directory information
US6862692B2 (en) | Dynamic redistribution of parity groups
US10621057B2 (en) | Intelligent redundant array of independent disks with resilvering beyond bandwidth of a single drive
US20020169827A1 (en) | Hot adding file system processors
US20020191311A1 (en) | Dynamically scalable disk array
US20020124137A1 (en) | Enhancing disk array performance via variable parity based load balancing
US20160191665A1 (en) | Computing system with distributed compute-enabled storage group and method of operation thereof
US20110296422A1 (en) | Switch-Aware Parallel File System
US10572464B2 (en) | Predictable allocation latency in fragmented log structured file systems
KR20230131486A (en) | Delivery file system and method
US10474572B2 (en) | Intelligent redundant array of independent disks with high performance recompaction
Nowoczynski et al. | Zest checkpoint storage system for large supercomputers
WO2015161140A1 (en) | System and method for fault-tolerant block data storage
CN120803378A (en) | Load balancing method and device of storage system, storage medium and electronic equipment
Nowoczynski et al. | Checkpoint Storage System for Large Supercomputers

Legal Events

Code | Title | Description
AS | Assignment | Owner name: CARNEGIE MELLON UNIVERSITY, PENNSYLVANIA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: NOWOCZYNSKI, PAUL; STONE, NATHAN; YANOVICH, JARED; SIGNING DATES FROM 20090106 TO 20090107; REEL/FRAME: 022074/0799
STCF | Information on status: patent grant | Free format text: PATENTED CASE
REMI | Maintenance fee reminder mailed |
FPAY | Fee payment | Year of fee payment: 4
SULP | Surcharge for late payment |
FEPP | Fee payment procedure | Free format text: MAINTENANCE FEE REMINDER MAILED (ORIGINAL EVENT CODE: REM.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY
LAPS | Lapse for failure to pay maintenance fees | Free format text: PATENT EXPIRED FOR FAILURE TO PAY MAINTENANCE FEES (ORIGINAL EVENT CODE: EXP.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY
STCH | Information on status: patent discontinuation | Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362
FP | Lapsed due to failure to pay maintenance fee | Effective date: 20201120

