CROSS-REFERENCE TO RELATED APPLICATIONS
This application is a continuation of and claims a priority benefit from U.S. application Ser. No. 13/613,373 filed on Sep. 13, 2012 and entitled Tape Library Emulation with Automatic Configuration and Data Retention, which is a continuation of and claims a priority benefit from U.S. application Ser. No. 11/356,726, filed on Feb. 17, 2006 and entitled Tape Library Emulation with Automatic Configuration and Data Retention, which claims priority under 35 U.S.C. Section 119(e) to Provisional Application 60/654,714, filed on Feb. 17, 2005. The disclosures of these applications are hereby incorporated by reference in their entireties. Furthermore, any and all priority claims identified in the Application Data Sheet, or any correction thereto, are hereby incorporated by reference under 37 C.F.R. §1.57.
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to systems and methods for storing electronic data and has applicability to enterprise data backup systems.
2. Description of the Related Art
Improving backup and restore performance is a continuing desire of enterprise data managers. In a typical computing environment, magnetic disk drives are used as the primary storage mechanism for active data, whereas magnetic tapes are used for data backup and archive. The magnetic disks provide rapid and reliable access to data, but they are perceived as being more expensive. In addition, since they are non-removable, they are at risk of physical disasters. Magnetic tape storage is perceived as being less expensive and, because tape cartridges are removable, they can be moved to offsite locations to protect against physical disasters. Therefore, most backup software in use has been optimized for use with magnetic tape technology.
Reading and writing data on a tape requires that the reel be unwound until the desired location is found. Once in the appropriate location, the read or write operation can begin. Because of the mechanical nature of this access, read and write operations are slow and often fail. In many situations, it would be beneficial to provide the random access speed and the reliability of a magnetic disk drive to backup systems while still allowing for the possibility of offsite storage. As a result, a new category of magnetic disk systems, called virtual tape technology, is becoming popular.
Virtual tape systems are magnetic disk systems that transparently emulate a tape drive and/or a tape library. They provide the same physical connections to a host, such as SCSI, Fibre Channel or Ethernet. This allows them to connect in the same way as the tape systems they are replacing or augmenting. They also provide the same logical response to tape drive and robot commands, which allows the same backup software to remain in use. The emulator is also able to send the host computer the expected tape-drive interrupt signals such as beginning-of-tape, end-of-tape, and inter-record-gap. In this case, such a system can plug right in to an existing tape based storage system without a need for the user to change the storage network or software environment.
Although such systems have been successful in the marketplace, the currently available devices still do not fully take advantage of the properties of disk storage in a way that provides maximum flexibility and usefulness.
SUMMARY OF THE INVENTION
In one embodiment, the invention comprises a method of emulating a tape library data storage system using one or more hard disk drives. The method comprises querying one or more physical tape libraries in a data storage system to acquire a configuration of the one or more physical tape libraries in the data storage system. Data storage space is allocated on the one or more hard disk drives to virtual devices of one or more virtual tape libraries, wherein the one or more virtual tape libraries comprise virtual devices emulating physical devices in the acquired configuration of the one or more physical tape libraries in the tape storage system. Data storage space is also allocated on the one or more hard disk drives to at least one additional virtual device associated with the one or more virtual tape libraries, wherein the extra virtual device has no corresponding physical device in the data storage system.
In another embodiment, a method of emulating data storage on a magnetic tape media using one or more hard disk drives comprises allocating data storage space of the one or more hard disk drives to one or more virtual tape libraries, wherein the one or more virtual tape libraries comprise one or more virtual devices emulating states of one or more physical devices of the one or more physical tape libraries in a tape storage system, and storing data in the storage space allocated to the one or more virtual tape libraries according to a first user defined periodic schedule. This method further includes replicating the data stored on the one or more virtual tape libraries onto the one or more physical tape libraries according to a second user defined periodic schedule.
In another embodiment, a method of handling data storage on a hard disk storage system implemented to emulate one or more attached tape libraries comprises requesting export of at least one physical tape from a physical tape library and write-protecting the data on the hard disk storage system that is associated with the virtual tape corresponding to the physical tape to be exported.
In another embodiment, a method of emulating a tape library data storage system using one or more hard disk drives comprises allocating data storage space on the one or more hard disk drives to virtual devices of one or more virtual tape libraries, wherein the one or more virtual tape libraries comprise virtual devices emulating physical devices of one or more physical tape libraries in the tape storage system. The method further includes allocating data storage space on the one or more hard disk drives to at least one additional virtual tape library that has no corresponding physical tape library in the data storage system.
In another embodiment, the invention comprises a storage system comprising at least one disk based storage appliance and at least one tape library. The disk based storage appliance is configured to respond to commands generated by backup software as at least one emulated tape library. The disk based storage appliance stores data files in an emulated tape library accessible to the backup software that are also stored on tapes that have been previously removed from the tape library. Thus, tape archive and disk based read access to a set of data files is simultaneously provided.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a schematic of one embodiment of a data backup system in which the invention may advantageously be used.
FIG. 2 is a functional block diagram of certain components of an embodiment of the backup system of FIG. 1.
FIG. 3 is a flow chart of the operation of one embodiment of the system of FIG. 2.
FIG. 4 is a flow chart of a method of configuring a virtual tape library in one embodiment of the invention.
FIG. 5 is a functional block diagram of a virtual and physical library configuration in one embodiment of the invention.
FIG. 6 is a flow chart illustrating a method of physical and virtual library synchronization.
FIG. 7 is a functional block diagram of a virtual and physical library configuration in another embodiment of the invention.
FIG. 8 is a flow chart of a method of disk storage data retention in one embodiment of the invention.
FIGS. 9A-9C are functional block diagrams of virtual and physical library configurations in another embodiment of the invention.
FIG. 10 is a flow chart of the operation of a virtual shelf in the system of FIG. 2.
FIGS. 11A and 11B are functional block diagrams of virtual and physical library configurations in another embodiment of the invention.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT
Preferred embodiments of the present invention will now be described with reference to the accompanying Figures, wherein like numerals refer to like elements throughout. The terminology used in the description presented herein is intended to be interpreted in its broadest reasonable manner, even though it is being utilized in conjunction with a detailed description of certain specific preferred embodiments of the present invention. This is further emphasized below with respect to some particular terms used herein. Any terminology intended to be interpreted by the reader in any restricted manner will be overtly and specifically defined as such in this specification.
FIG. 1 illustrates one example of a system including a hard disk based appliance with tape emulation features that can be used in a data protection environment. In this system, application servers 12 are connected to each other and to a backup server 16 over a network 14. In one embodiment, the backup server 16 communicates directly with the disk appliance 18 and has no direct communication to the tape system 20. In this embodiment, the tape system 20 is under the control of the disk appliance 18 via a SCSI, iSCSI, Ethernet, Fibre Channel, or other protocol communication link 21. It will be appreciated that multiple tape systems may be connected to communication link 21.
Backups from application servers 12 are received by appliance 18 (via the backup server 16) and are written to disk based (preferably RAID) storage of appliance 18. The disk appliance 18 may include an internal disk drive array, and may alternatively or additionally connect to an external disk drive array through a storage adapter which may, for example, be configured as a Fibre Channel or SCSI interface.
Appliance 18 may then automate the process of transferring the data stored on disk media to physical tape media in tape system 20 for archival purposes. As explained further below, the transfer of the disk stored data to physical tape media may be done without user intervention on a periodic basis. Furthermore, the appliance 18 may periodically monitor the tape system 20 for changes such as tape import or export (a tape being installed or removed from one of the physical tape libraries) and generate appropriate actions to ensure that the RAID storage virtual media emulates the physical media on tape system 20.
It will be appreciated that the hardware components, functionality, and software present in the backup server 16, disk appliance 18, and tape drive/library can be combined and/or separated in various ways. For example, the disks of appliance 18 can be located in a separate device. As another example, the tape drive/library 20 hardware and functions can be integral with the disk appliance 18 rather than provided as a separate unit. As described above, the appliance 18 can be configured to interact with the backup server 16 in exactly the same manner and format of communication as the tape drive/library 20. In this way, software on the backup server 16 that is configured to communicate and store data using tape commands and tape data formats can utilize the disk based appliance 18 without modification. Speed is still improved in many cases such as restore operations, however, because tape commands such as moving to a desired block can be accomplished on disk with the virtual tape much faster than with a physical tape cartridge in a physical tape drive.
FIG. 2 is a functional block diagram of certain components of an embodiment of the backup system of FIG. 1. In the example of FIG. 2, the disk appliance 18 has a first communication link 23 (of any protocol) connected to a backup server 16. The backup server will typically contain a backup software program 24A that controls data transfer from the application servers 12 (FIG. 1) to the appliance 18. The disk appliance 18 will also typically host another software program 24B that is used to configure the appliance 18, and define how the appliance 18 responds to commands and data received over the link 23 from the server backup software 24A. Software 24B can wholly or partly reside in memory in the appliance 18 and/or the backup server 16. In one embodiment, the disk appliance software is accessed via a browser program on backup server 16 or any other computer on the network.
The appliance 18 is also coupled to another communication link 21 that is connected to three physical tape libraries (PTL) 20A, 20B and 20C. More or fewer tape libraries may be provided; it will be appreciated that three is merely an example. In some embodiments, a single physical tape library can be partitioned to behave as if it were multiple separate tape libraries, as described in U.S. Pat. No. 6,328,766, the entire disclosure of which is hereby incorporated by reference. It will be appreciated that any or all communication links connecting the devices of FIG. 1 could be over the same network link.
The disk appliance 18 in the example of FIG. 2 is configured to include three virtual tape libraries (VTL) 22A, 22B and 22C. The VTL1 (22A) contains virtual devices emulating the physical devices of PTL1 (20A), while virtual devices in VTL2 (22B) and VTL3 (22C) emulate the physical devices of PTL2 (20B) and PTL3 (20C), respectively. The “emulation” of tape libraries in disk appliances is known in the art, and those of skill in the art understand and can create hardware and software components for an appliance 18 that can emulate tape libraries.
Generally, tape library emulation is understood to mean that the appliance 18 responds to commands and data transfers from the backup server with responses and data that the backup server expects from a tape library being emulated, even though no physical tape library is in direct communication with the backup server 16. In the discussion that follows, manipulation of both physical objects and virtual objects is described. When such terminology is applied to virtual objects, such as “moving” or “creating” a virtual tape, or “allocating” some portion of disk appliance 18 to a virtual object, these terms are intended as generally used in the art to mean that the disk appliance 18 is configured or re-configured to respond to commands from the backup server with the same responses that would be produced by a physical device having characteristics corresponding to the virtual device in its latest virtual configuration.
Although emulation of tape libraries in disk based appliances is known, their usefulness has been limited due to complexities in the management of the combined disk and tape storage environment. In accordance with the inventions described herein, data storage systems with improved properties, data access, and simplified management are provided.
It is often desirable for the emulated tape libraries 22 to be configured identically to the physical tape libraries 20. To accomplish this in an efficient and easily administered manner, the appliance 18 may be configured to use standard query commands to detect the configuration of the physical tape libraries such as the number and type of tape drives, the number of storage slots, etc. Upon receiving this information, the appliance 18 can be configured automatically to emulate a tape library identical to a given physical library coupled to the appliance 18. In addition, if a new PTL, PTL4 (20D) for example, is added to the tape storage system, the disk appliance can query the PTL4 (20D) and then emulate the discovered physical devices with virtual devices in VTL4 (22D).
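The automatic configuration described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: `LibraryConfig`, `VirtualLibrary`, `query_physical_library`, and `build_virtual_library` are hypothetical names, the query is a stub that reads a dictionary rather than issuing real query commands to a library, and deriving the virtual library name by replacing "PTL" with "VTL" is a convenience for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LibraryConfig:
    """Inventory reported by a physical library (hypothetical structure)."""
    name: str
    drive_types: tuple   # one entry per physical tape drive
    slot_count: int      # number of tape storage slots
    export_slots: int    # number of tape export slots


@dataclass
class VirtualLibrary:
    """Disk-side description of an emulated library (hypothetical structure)."""
    name: str
    drives: list = field(default_factory=list)
    slots: dict = field(default_factory=dict)
    export_slots: int = 0


def query_physical_library(ptl):
    # Stand-in for the standard query commands; `ptl` is a plain dict here.
    return LibraryConfig(
        name=ptl["name"],
        drive_types=tuple(ptl["drives"]),
        slot_count=ptl["slots"],
        export_slots=ptl.get("export_slots", 1),
    )


def build_virtual_library(cfg):
    # Mirror every discovered device with a virtual counterpart.
    return VirtualLibrary(
        name=cfg.name.replace("PTL", "VTL"),
        drives=[f"virtual {kind}" for kind in cfg.drive_types],
        slots={f"V{i}": None for i in range(1, cfg.slot_count + 1)},
        export_slots=cfg.export_slots,
    )
```

For example, passing `{"name": "PTL4", "drives": ["drive-type-A"], "slots": 5}` through both functions yields a virtual library named `VTL4` with one virtual drive and five empty virtual slots.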
The disk appliance software 24B comprises user interface software to control the configuration of features of disk appliance 18. Features controllable by the disk appliance software 24B described further below include, for example, control over replication of the data stored on the VTL's by the backup software 24A to the PTL's for synchronization, data retention time limits upon removal of tapes from the PTL, and virtual device configuration management software.
During operation, the backup software is configured to perform backup and restore operations to PTL1, PTL2, and PTL3 as defined and managed by the system administrator as if the appliance 18 were not present. This is typically performed on a periodic schedule fixed by a system administrator via backup software 24A. The commands and interactions are received by the appliance 18, and the appliance 18 interacts with the backup server 16 as if it were a collection of PTL's. As described above, the data stored on the appliance 18 by the backup software is periodically or on command (via the appliance software 24B) transferred to the physical tapes in the physical tape libraries 20 so that the data intended for storage on tapes in the PTL's is physically present on those tapes within reasonable and desired time frames defined by a system administrator. A second periodic schedule different from the periodic backup schedule can be used for synchronizing data in the virtual libraries with data in the physical libraries.
It is one aspect of some embodiments of the invention that the disk appliance can implement an additional emulated tape library, referred to herein as the “virtual shelf” (VTS) 26. The shelf 26 can be used to allow access to data on tape cartridges that have been exported from the PTL's. This feature is described further below.
FIG. 3 illustrates a process that may be performed by the system of FIG. 2. Referring now to FIG. 3, process 100 starts at step 105 by performing configuration management of the various virtual devices in the disk appliance 18. FIGS. 4 and 5 discussed below provide additional detail concerning this process that can be performed in some invention embodiments. Disk appliance 18 can query PTL's connected to it via various network connections to acquire configuration information at step 106. Configuration information acquired at step 106 can include newly added components such as the PTL4 (20D) shown in FIG. 2. Configuration information changes can also include modification of existing hardware and deletion of existing hardware in the attached PTL's 20.
Process 100 continues at step 110 where data transferred under control of backup software on the backup server 16 is received by the appliance 18. As discussed above, the backup server 16 communicates data to be backed up on PTL 20 directly to disk appliance 18. Disk appliance 18 stores the received data to the various virtual devices that correspond to the physical devices requested by the backup server 16. Process 100 continues at step 115 where data stored on VTL 22 is replicated on the PTL 20 so as to synchronize the data between the VTL 22 and the PTL 20. This replication may be periodic (e.g., every day, every 12 hours, etc.). The replication policies may be user settable as discussed above.
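A periodic replication pass of the kind performed at step 115 might be sketched as follows. This is a simplified illustration, not the appliance's actual mechanism: the function names are hypothetical, each tape is modeled as an append-only list of records, and a virtual tape and its physical counterpart share the same dictionary key.

```python
from datetime import datetime, timedelta


def replication_due(last_run, interval, now):
    """Return True once the user-set replication interval has elapsed."""
    return now - last_run >= interval


def replicate(virtual_tapes, physical_tapes):
    """Append to each physical tape the records it is missing.

    Both arguments map a tape label to a list of records. Tapes are
    assumed to be append-only, so a shorter physical tape is simply behind.
    Returns the labels of the tapes that were updated.
    """
    updated = []
    for label, v_records in virtual_tapes.items():
        p_records = physical_tapes.setdefault(label, [])
        if len(p_records) < len(v_records):
            p_records.extend(v_records[len(p_records):])
            updated.append(label)
    return updated
```

With a 12-hour interval, `replication_due` returns `False` six hours after the last run and `True` at twelve hours, at which point `replicate` brings the physical tapes up to date.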
At step 120, the process 100 continues with the monitoring of the PTL(s) 20 for detection of imported or exported physical tapes. The import and/or export handling acts are carried out at step 125. Several example cases of tape import/export and the handling of these cases are discussed below. It will be appreciated that data backups of step 110, replication to the physical libraries of step 115, and the import/export handling of tapes are performed periodically and/or on command as required in any order on any desired schedule in an ongoing manner during operation of the system.
Referring now to FIGS. 4 and 5, virtual library configuration management in some embodiments of the invention is illustrated. In the example of FIG. 5, the PTL 20 contains one physical tape drive 28, a robotic media changer 30, five tape storage slots 32 (labeled 1 through 5) containing five tape cartridges labeled A-E, and a tape export slot 34. Accordingly, during step 106 of FIG. 4, the appliance 18 queries the PTL 20 to detect the PTL configuration. After gathering the information, the corresponding VTL 22 is created in the appliance at step 107, emulating the hardware of PTL 20 to contain one virtual tape drive 36, a virtual media changer robot 38, five virtual tape slots 40 (labeled V1 through V5) containing five virtual tape cartridges labeled VA through VE, and a tape export slot 42. The disk appliance 18 may implement the virtual media changer 38 as a standard SCSI media changer device, as defined in the T10 SMC-2 document. Each virtual tape drive 36 inside the VTL 22 may comprise an emulation of a standard SCSI sequential device, as defined in the T10 SSC-2 document. The process of “creating” or “implementing” the virtual library typically comprises the creation of data files stored in the control/memory circuits 31 of the appliance 18 that define the appliance response to commands and communications received from the backup server 16. The data format, organization, and disk space allocation to implement virtual tape cartridges in the disk appliance 18 may be performed as described in U.S. patent application Ser. Nos. 11/215,740 and 10/943,779, the entire disclosures of which are hereby incorporated by reference in their entireties.
In addition to emulating physical devices of attached PTL's 20, disk appliance 18 can (at step 108) receive requests from users, e.g., users of the backup server 16 utilizing the disk appliance software 24B shown in FIG. 2, to add additional virtual devices that do not have corresponding physical devices in attached PTL's. For example, a user may request to add a second virtual drive 37, when the attached PTL 20 has only one physical drive 28. The disk appliance will then emulate the requested additional virtual devices at step 109.
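One way to picture a virtual library holding both mirrored and user-added devices is sketched below. The class and method names are hypothetical, and recording the physical counterpart as an optional label is an assumption made for the example, not a format taken from this description.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class VirtualDrive:
    label: str
    # Label of the emulated physical drive, or None for a user-added
    # drive that has no physical counterpart.
    physical_counterpart: Optional[str]


@dataclass
class VirtualLibrary:
    drives: list = field(default_factory=list)

    def add_extra_drive(self, label):
        """Add a virtual drive that mirrors no physical device (step 109)."""
        drive = VirtualDrive(label, None)
        self.drives.append(drive)
        return drive

    def extra_drives(self):
        """List the drives that exist only in the virtual library."""
        return [d for d in self.drives if d.physical_counterpart is None]
```

In the example of the text, a library created with virtual drive 36 mirroring physical drive 28 would, after `add_extra_drive("virtual drive 37")`, report two drives, one of which has no physical counterpart.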
As discussed above in relation to the disk appliance software 24B as shown in FIG. 2, the configuration management task 105 may also modify data retention policies, replication time periods and other user settable options.
FIGS. 6 and 7 illustrate certain steps in a process of handling export of a physical tape from the PTL 20. For example, the backup software 24A may send a command to the appliance 18 to export from the tape library the tape that is in slot 4, designated D in FIG. 7. In other cases, a user with access to the PTL 20 may use a keypad or separate control input to the PTL 20 to command export of tape D. In the first case, the appliance 18 receives the command from the backup server. In the second case, the appliance 18 receives a message from the PTL 20 that the selected tape is being exported. Thus, tape export requests to the PTL are monitored by the appliance 18 at step 150. The process then continues to decision block 155, where a check is made as to whether any replication is needed to synchronize the physical tape to be exported with the corresponding virtual tape. If the tapes are fully synchronized, then permission to export the physical tape is issued at step 160. If the physical tape and virtual tape do not match, then replication is performed at step 165. After the data on the virtual tape is replicated on the physical tape, the permission to export the physical tape is issued at step 160. As shown in FIG. 7, when the tape is exported from the PTL, the corresponding virtual tape is removed from the VTL 22, and the virtual slot appears empty.
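The export handling of FIG. 6 can be sketched in a few lines of Python. This is an illustration under simplifying assumptions: `handle_export` is a hypothetical name, each library is modeled as a dictionary mapping a slot number to a list of records (or `None` for an empty slot), and replication is modeled as copying the virtual tape's records to the physical tape.

```python
def handle_export(slot, ptl_slots, vtl_slots):
    """Process an export request for the tape in `slot`.

    Returns the list of actions taken and the records on the exported tape.
    """
    actions = []
    # Decision block 155: is the physical tape in sync with the virtual tape?
    if ptl_slots[slot] != vtl_slots[slot]:
        # Step 165: replicate the virtual tape onto the physical tape.
        ptl_slots[slot] = list(vtl_slots[slot])
        actions.append("replicate")
    # Step 160: permission to export is issued only once the tapes match.
    actions.append("permit_export")
    exported = ptl_slots[slot]
    # The physical tape leaves the library and the virtual slot appears empty.
    ptl_slots[slot] = None
    vtl_slots[slot] = None
    return actions, exported
```

If the physical tape in slot 4 lags the virtual tape by one record, the call returns the actions `["replicate", "permit_export"]` and the exported tape carries the full record set.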
FIG. 8 shows a flow diagram illustrating certain steps in another process of handling export of a physical tape where a data retention policy is implemented. At decision block 165, it is determined whether a tape is being exported. Next, instead of having the virtual tape removed from the virtual library 22 upon physical tape export as shown in FIG. 7, at step 175 the virtual tape cartridge remains in the virtual library but is indicated as write protected, illustrated in FIG. 9A as shaded. In this way, the content of the virtual tape cartridge D will remain synchronized with exported physical tape cartridge D, but the backup server will have read access to the files stored there, even though the physical tape is no longer present, perhaps having been moved to an offsite remote location for disaster recovery safekeeping. This feature of appliance 18 allows the benefits of both on-site read access to archived files for fast restores, and simultaneous offsite data storage of the same files in an easily managed way. In advantageous embodiments, a data retention time limit may be selected by the system administrator via the disk appliance software 24B shown in FIG. 2.
While a virtual tape is write-protected, the virtual slot that it is stored in (virtual slot V4 in this example) is not available for importation of a new virtual tape. If a new physical tape D′ (e.g., a replacement tape for the exported physical tape) is imported to the PTL 20 in the physical slot 4, then it will be detected at step 185, also illustrated in FIG. 9B. A check is made at decision block 190 as to whether or not the write protection of the virtual tape VD located in the virtual slot V4 where the new physical tape was inserted has expired. If the write protection time has been reached, then the write protection is canceled at step 195 and a new virtual tape VD′ is emulated (step 200) in the virtual slot V4 where the expired write protected virtual tape VD was located, as shown in FIG. 9C. If the write protection has not expired, then the disk appliance masks the imported tape at step 205. During backups, the backup software will utilize space on the non-write protected tapes until the write protection for the tape cartridge VD expires and the cartridge VD is exported from the VTL 22.
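The retention behavior of FIG. 8 can be sketched as follows. The names `VirtualTape`, `on_physical_export`, and `on_physical_import` are hypothetical, and representing the retention limit as an expiry timestamp on the virtual tape is an assumption made for this illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class VirtualTape:
    label: str
    protected_until: Optional[datetime] = None  # None means writable

    def write_protected(self, now):
        return self.protected_until is not None and now < self.protected_until


def on_physical_export(vtape, now, retention):
    # Step 175: keep the virtual tape in its slot, but write protect it
    # for the administrator-selected retention period.
    vtape.protected_until = now + retention


def on_physical_import(vtl_slots, slot, new_label, now):
    """Handle a new physical tape detected in `slot` (step 185)."""
    current = vtl_slots[slot]
    # Decision block 190: has the write protection expired?
    if current is not None and current.write_protected(now):
        return "masked"                        # step 205
    vtl_slots[slot] = VirtualTape(new_label)   # steps 195 and 200
    return "emulated"
```

With an assumed 30-day retention period, importing a replacement tape one day after export leaves VD in slot V4 and masks the new tape, while importing it after 31 days replaces VD with VD′.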
FIG. 10 shows a flow diagram illustrating certain steps in an exemplary process of utilizing the virtual tape shelf (VTS) 26 as discussed above and shown in FIG. 2. The exporting of physical tape D from PTL 20 will be used as an example embodiment of a way to use the VTS 26 and the process of FIG. 10. Configurations of PTL 20, VTL 22, and VTS 26 are illustrated in FIGS. 11A and 11B. Starting at step 210, a data storage area is allocated to the VTS 26. Step 210 may be performed as part of the configuration management step 105 shown in FIG. 4A. The VTS 26 may include one or more virtual tape drives 52, 54, as well as a virtual robotic cartridge exchanger 56. Also included are multiple virtual tape storage slots, designated V1 through V6 in FIGS. 11A and 11B. At step 215, a command to the PTL 20 corresponding to removal or export of physical tape D is detected. At step 220, in response to the detected removal of physical tape D, the virtual tape VD disappears from VTL 22 and appears in a storage slot of VTS 26. The virtual tape VD is still available to the disk appliance 18 when it is in the VTS 26.
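The move from the virtual library to the virtual shelf at step 220 might look like the sketch below. The function name and the choice of the first free shelf slot are assumptions made for illustration; the description does not state how a shelf slot is selected.

```python
def move_to_shelf(vtl_slots, shelf_slots, slot):
    """Move the virtual tape in `slot` of the VTL to a free shelf slot.

    Both arguments map a slot label to a tape label, or None when empty.
    Returns the shelf slot that now holds the tape.
    """
    tape = vtl_slots[slot]
    if tape is None:
        raise ValueError(f"virtual slot {slot} is empty")
    # Assumption: use the first free shelf slot in insertion order.
    for shelf_slot, occupant in shelf_slots.items():
        if occupant is None:
            shelf_slots[shelf_slot] = tape
            vtl_slots[slot] = None   # the tape disappears from the VTL
            return shelf_slot
    raise RuntimeError("no free slot on the virtual shelf")
```

Starting from a shelf with six empty slots labeled V1 through V6, moving VD out of virtual slot V4 places it in shelf slot V1 and leaves V4 empty.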
While the above detailed description has shown, described, and pointed out novel features of the invention as applied to various embodiments, it will be understood that various omissions, substitutions, and changes in the form and details of the device or process illustrated may be made by those skilled in the art without departing from the spirit of the invention. As will be recognized, the present invention may be embodied within a form that does not provide all of the features and benefits set forth herein, as some features may be used or practiced separately from others.