
Data set (IBM mainframe)


From Wikipedia, the free encyclopedia

Type of computer file existing on IBM mainframe operating systems

This article is about computer files. For data communications, see modem.

In the context of IBM mainframe computers in the IBM System/360 line and its successors, a data set (IBM preferred) or dataset is a computer file having a record organization. The term came into use with systems such as DOS/360 and OS/360 and is still used by their successors, including the current VSE and z/OS. Documentation for these systems historically preferred this term to file.

A data set is typically stored on a direct access storage device (DASD) or magnetic tape;^[1] however, unit record devices, such as punch card readers, card punches, line printers and page printers, can also provide input/output (I/O) for a data set (file).^[2]

Data sets are not unstructured streams of bytes, but rather are organized in various logical record^[3] and block structures determined by the DSORG (data set organization), RECFM (record format), and other parameters. These parameters are specified at the time of the data set allocation (creation), for example with Job Control Language DD statements. Within a running program they are stored in the Data Control Block (DCB) or Access Control Block (ACB), which are data structures used to access data sets using access methods.

Records in a data set may be fixed, variable, or “undefined” length.^[4]

Data set organization


For OS/360, the DCB's DSORG parameter specifies how the data set is organized. It may be^[5]

CQ: Queued Telecommunications Access Method (QTAM) in Message Control Program (MCP)
CX: Communications line group
DA: Basic Direct Access Method (BDAM)
GS: Graphics device for Graphics Access Method (GAM)
IS: Indexed Sequential Access Method (ISAM)
MQ: QTAM message queue in application
PO: Partitioned Organization
PS: Physical Sequential

among others. Data sets on tape may only be DSORG=PS. The choice of organization depends on how the data is to be accessed, and in particular, how it is to be updated.
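The codes listed above, together with the tape restriction, can be captured in a small lookup. The following Python sketch is illustrative only: the function name `check_dsorg` and the device labels `"tape"` and `"dasd"` are hypothetical, and the table covers only the codes listed here, not every value the system accepts.

```python
# DSORG codes as listed above for OS/360 (not an exhaustive table).
DSORG_CODES = {
    "CQ": "QTAM in Message Control Program (MCP)",
    "CX": "Communications line group",
    "DA": "Basic Direct Access Method (BDAM)",
    "GS": "Graphics device for Graphics Access Method (GAM)",
    "IS": "Indexed Sequential Access Method (ISAM)",
    "MQ": "QTAM message queue in application",
    "PO": "Partitioned Organization",
    "PS": "Physical Sequential",
}


def check_dsorg(dsorg, device):
    """Return the description of a DSORG code, enforcing the tape rule.

    `device` is a hypothetical label: "tape" or "dasd".
    """
    code = dsorg.upper()
    if code not in DSORG_CODES:
        raise ValueError(f"unknown DSORG code: {dsorg!r}")
    if device == "tape" and code != "PS":
        # Data sets on tape may only be DSORG=PS.
        raise ValueError(f"DSORG={code} is not allowed on tape; only PS is")
    return DSORG_CODES[code]
```

For example, `check_dsorg("PS", "tape")` returns the description of a physical sequential data set, while `check_dsorg("PO", "tape")` raises `ValueError`.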

Programmers use various access methods (such as QSAM or VSAM) in programs for reading and writing data sets. The access method that can be used depends on the organization of the data set.

Record format (RECFM)


Regardless of organization, the physical structure of each record is essentially the same, and is uniform throughout the data set. This is specified in the DCB RECFM parameter. RECFM=F means that the records are of fixed length, specified via the LRECL parameter. RECFM=V specifies a variable-length record. V records when stored on media are prefixed by a Record Descriptor Word (RDW) containing the integer length of the record in bytes and flag bits. With RECFM=FB and RECFM=VB, multiple logical records are grouped together into a single physical block on tape or DASD. FB and VB are fixed-blocked and variable-blocked, respectively. RECFM=U (undefined) is also variable length, but the length of the record is determined by the length of the block rather than by a control field.

The BLKSIZE parameter specifies the maximum length of the block. RECFM=FBS^[6] can also be specified, meaning fixed-blocked standard: all the blocks except the last one are required to be of the full BLKSIZE length. RECFM=VBS, or variable-blocked spanned, means a logical record can be spanned across two or more blocks, with flags in the RDW indicating whether a record segment is continued into the next block and/or was continued from the previous one.
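The relationship between LRECL, BLKSIZE and a fixed-blocked physical block can be sketched in a few lines of Python. This is an illustration under stated assumptions, not system code: the function name `deblock_fixed` is hypothetical, and the check that BLKSIZE is a multiple of LRECL reflects the usual rule for RECFM=FB rather than anything stated above.

```python
def deblock_fixed(block, lrecl, blksize):
    """Split one RECFM=FB physical block into its logical records.

    block   -- bytes of one physical block as read from tape or DASD
    lrecl   -- logical record length (LRECL)
    blksize -- maximum block length (BLKSIZE)
    """
    if blksize % lrecl != 0:
        # Assumption: for FB, BLKSIZE is a whole multiple of LRECL.
        raise ValueError("BLKSIZE must be a multiple of LRECL for RECFM=FB")
    if len(block) > blksize:
        raise ValueError("block is longer than BLKSIZE")
    if len(block) % lrecl != 0:
        raise ValueError("block does not hold a whole number of records")
    # Every record has the same length, so slicing is enough.
    return [block[i:i + lrecl] for i in range(0, len(block), lrecl)]
```

A block holding two 80-byte card images, read with `lrecl=80` and `blksize=800`, yields two 80-byte records. A short block is accepted here, as it would be for the last block of an FBS data set.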

This mechanism eliminates the need for any "delimiter" byte value to separate records. Thus data can be of any type, including binary integers, floating-point numbers, or characters, without introducing a false end-of-record condition. The data set is an abstraction of a collection of records, in contrast to files as unstructured streams of bytes.
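The effect of length-prefixed records can be shown with a short Python sketch. It assumes a 4-byte RDW consisting of a 2-byte big-endian length that counts the RDW itself, followed by 2 bytes of flags set to zero; this layout is an assumption of the sketch and is not spelled out in the text above. The function names are hypothetical, and blocking and spanning are ignored.

```python
import struct

RDW_SIZE = 4  # assumed: 2-byte length (including the RDW) + 2 flag bytes


def pack_v_records(records):
    """Prefix each record with an RDW and concatenate the results."""
    out = bytearray()
    for rec in records:
        # ">HH": big-endian 2-byte length, then 2 bytes of flags (zero here).
        out += struct.pack(">HH", len(rec) + RDW_SIZE, 0)
        out += rec
    return bytes(out)


def unpack_v_records(data):
    """Walk the buffer RDW by RDW and return the record payloads."""
    records = []
    pos = 0
    while pos < len(data):
        if pos + RDW_SIZE > len(data):
            raise ValueError(f"truncated RDW at offset {pos}")
        length, _flags = struct.unpack_from(">HH", data, pos)
        if length < RDW_SIZE or pos + length > len(data):
            raise ValueError(f"bad RDW at offset {pos}")
        records.append(data[pos + RDW_SIZE:pos + length])
        pos += length
    return records
```

Because record boundaries come from the length field, a payload such as `b"\x00\n\xff"` round-trips unchanged even though it contains a byte that a delimiter-based format would treat as end of line.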

Partitioned data set


Not to be confused with Passive data structure.

A partitioned data set (PDS)^[7] is a data set containing multiple members, each of which holds a separate sub-data set, similar to a directory in other types of file systems. This type of data set is often used to hold load modules (old format bound executable programs), source program libraries (especially Assembler macro definitions), ISPF screen definitions, and Job Control Language. A PDS may be compared to a Zip file or COM Structured Storage.

A partitioned data set can only be allocated on a single volume and has a maximum size of 65,535 tracks.

Besides members, a PDS also contains a directory. Each member is accessed indirectly via the directory structure. Once a member is located, the data stored in that member are handled in the same manner as a PS (sequential) data set.

Whenever a member is deleted, the space it occupied is unusable for storing other data. Likewise, if a member is re-written, it is stored in a new spot at the back of the PDS and leaves wasted “dead” space in the middle. The only way to recover “dead” space is to perform file compression.^[8] Compression, which is done using the IEBCOPY utility,^[9] moves all members to the front of the data space and leaves free usable space at the back. (Note that in modern parlance, this kind of operation might be called defragmentation or garbage collection; data compression nowadays refers to a different, more complicated concept.) PDS files can only reside on DASD, not on magnetic tape, in order to use the directory structure to access individual members. Partitioned data sets are most often used for storing multiple job control language files, utility control statements, and executable modules.
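The way dead space accumulates and is reclaimed can be illustrated with a toy model in Python. This is a deliberately simplified sketch, not a description of the on-disk format: the class `ToyPDS` and its methods are hypothetical, the data area is a growable byte array rather than a fixed allocation in tracks, and the directory is a plain dictionary.

```python
class ToyPDS:
    """Toy model of a PDS: a directory plus one append-only data area."""

    def __init__(self):
        self.space = bytearray()  # member data, in order of writing
        self.directory = {}       # member name -> (offset, length)

    def write(self, name, data):
        # New and rewritten members always go at the back; an older copy
        # stays where it was and becomes unreachable "dead" space.
        self.directory[name] = (len(self.space), len(data))
        self.space += data

    def read(self, name):
        # Members are located indirectly, through the directory.
        offset, length = self.directory[name]
        return bytes(self.space[offset:offset + length])

    def delete(self, name):
        # Only the directory entry is removed; the bytes remain.
        del self.directory[name]

    def dead_space(self):
        live = sum(length for _, length in self.directory.values())
        return len(self.space) - live

    def compress(self):
        # Move live members to the front, keeping their relative order.
        packed = bytearray()
        by_offset = sorted(self.directory.items(), key=lambda kv: kv[1][0])
        for name, (offset, length) in by_offset:
            self.directory[name] = (len(packed), length)
            packed += self.space[offset:offset + length]
        self.space = packed
```

Writing member `A`, writing member `B`, rewriting `A` and then deleting `B` leaves the old copy of `A` and all of `B` as dead space; `compress()` brings `dead_space()` back to zero while `read("A")` still returns the latest contents.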

An improvement of this scheme is a Partitioned Data Set Extended (PDSE or PDS/E, sometimes just libraries) introduced with DFSMSdfp for MVS/XA and MVS/ESA systems. A PDS/E library can store program objects or other types of members, but not both. BPAM cannot process a PDS/E containing program objects.

The PDS/E structure is similar to that of a PDS and is used to store the same types of data. However, PDS/E files have a better directory structure, which does not require pre-allocation of directory blocks when the PDS/E is defined (and therefore cannot run out of directory blocks because too few were specified). Also, PDS/E automatically stores members in such a way that a compression operation is not needed to reclaim "dead" space.^[8] PDS/E files can only reside on DASD in order to use the directory structure to access individual members.

Generation Data Group


A Generation Data Group^[10] (GDG)^[11] is a group of non-VSAM data sets^[12] that are successive generations of historically-related data^[13] stored on an IBM mainframe (running OS/360 and its successors or DOS/360 and its successors).^[14]

A GDG is usually cataloged.^[13]

An individual member of the GDG collection is called a "Generation Data Set."^[13]^[15] The latter may be identified by an absolute number, ACCTG.OURGDG(1234), or a relative number: (-1) for the previous generation, (0) for the current one, and (+1) for the next generation.^[16]

A GDG specifies how many generations of a data set are to be kept and at what age a generation will be deleted. Whenever a new generation is created, the system checks whether one or more obsolete generations are to be deleted.

The purpose of GDGs is to automate archival. In JCL, the data set name given is generic: the DSN parameter names the GDG followed by a generation number in parentheses, where

(0) is the most recent generation

(-1), (-2), ... are previous generations

(+1) is a new generation (see DD)

Another use of GDGs is to address all generations simultaneously within a JCL job without having to know the number of currently available generations. To do this, the parentheses and the generation number are omitted when specifying the data set in the JCL.
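The relative numbering and the rolling off of old generations can be modelled with a short Python sketch. It is a toy model under stated assumptions: the class `ToyGDG` and its methods are hypothetical, age-based retention is ignored, and the `GnnnnV00` suffix used for absolute generation names follows the z/OS naming convention, which is not described in the text above.

```python
class ToyGDG:
    """Toy model of a generation data group with a generation limit."""

    def __init__(self, base, limit):
        self.base = base        # e.g. "ACCTG.OURGDG"
        self.limit = limit      # maximum number of generations kept
        self.generations = []   # absolute numbers, oldest first
        self.last = 0           # highest generation number used so far

    def name(self, number):
        # Assumed naming convention: base.GnnnnV00
        return f"{self.base}.G{number:04d}V00"

    def new_generation(self):
        """Create generation (+1); return its name and any rolled-off names."""
        self.last += 1
        self.generations.append(self.last)
        rolled_off = []
        while len(self.generations) > self.limit:
            rolled_off.append(self.name(self.generations.pop(0)))
        return self.name(self.last), rolled_off

    def resolve(self, relative):
        """Map a relative number (0, -1, -2, ...) to a data set name."""
        if relative > 0:
            raise ValueError("use new_generation() for (+1)")
        index = len(self.generations) - 1 + relative
        if index < 0:
            raise LookupError(f"no generation ({relative})")
        return self.name(self.generations[index])

    def all_generations(self):
        # Corresponds to naming the GDG without a generation number.
        # The order (oldest first) is a choice of this sketch only.
        return [self.name(n) for n in self.generations]
```

With `ToyGDG("ACCTG.OURGDG", 2)`, the third call to `new_generation()` returns `ACCTG.OURGDG.G0003V00` and reports `ACCTG.OURGDG.G0001V00` as rolled off; afterwards `resolve(0)` and `resolve(-1)` name generations 3 and 2.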

GDG JCL & features


Generation Data Groups are defined using either the BLDG statement^[17] of the IEHPROGM utility or the DEFINE GENERATIONGROUP statement^[18] of the newer IDCAMS utility,^[19] which allows setting various parameters.

LIMIT(10) would limit the number of generations to 10.
SCRATCH FOR(91) would retain each member, up to the limit on the number of generations, for at least 91 days.

IDCAMS can also delete (and optionally uncatalog) a GDG.^[20]

Example


Creation of a standard GDG that keeps five backup generations, each for at least 35 days:

//STEP1    EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  DEFINE GDG (NAME('DB2.FULLCOPY.DSNDB04.TSTEST') -
              LIMIT(5) SCRATCH FOR(35))
/*

Delete a standard GDG:

//STEP3    EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  DELETE DB2.FULLCOPY.DSNDB04.TSTEST GDG FORCE
/*

References


1. "What is a catalog?". IBM. Cataloging of data sets on magnetic tape ...
2. "IBM Knowledge Center - Home of IBM product documentation". publib.boulder.ibm.com.
3. "What is a data set?". IBM. data set .. a file that contains one or more records.
4. "Data set record formats". IBM. Records are either fixed length or variable length in a given data set.
5. "Section IV: The DD Statement -- DCB Parameter" (PDF). IBM System/360 Operating System: Job Control Language Reference - OS Release 21.7 (PDF). IBM Systems Reference Library. IBM. pp. 138–139. GC28-6704-4.
6. "Example: Record format VBS". IBM. Variable-length, blocked, spanned (VBS)
7. "Structure of a PDS", z/OS DFSMS Using Data Sets Version 2 Release 3 (PDF), October 2, 2018, SC23-6855-30
8. Stephens, David (Oct 2008). What On Earth is a Mainframe?. Lulu.com. p. 52. ISBN 978-1-4092-2535-5. Retrieved May 11, 2018.
9. "Compressing a Partitioned Data Set", z/OS DFSMSdfp Utilities Version 2 Release 3 (PDF), IBM Corporation, July 17, 2017, SC23-6864-30, A partitioned data set will contain unused areas (sometimes called gas) where a deleted member or the old version of an updated member once resided. This unused space is only reclaimed when a partitioned data set is copied to a new data set, or after a compress-in-place operation successfully completes. It has no meaning for a PDSE and is ignored if requested.
10. "Generation Data Groups (GDG's), an Introduction with Examples". create and process a Generation Data Group or GDG on ...
11. "JCL TUTORIAL REFERENCE - Generation Data Groups". Generation Data Groups (GDG)
12. "What is a generation data group?". IBM.com. ... non-VSAM ...
13. "Generation data sets". IBM. successive, historically related,
14. "VSE/VSAM Commands" (PDF). Archived from the original (PDF) on 2022-01-31. Retrieved 2021-10-11.
15. "A generation data set is one of ...
16. "What is a GDG?".
17. "BLDG (Build Generation Index) Statement" (PDF). OS Utilities (PDF). IBM Systems Reference Library (Sixteenth ed.). IBM. April 1973. p. 269. GC28-6586-15. Retrieved May 19, 2022.
18. "Defining a Generation Data Group" (PDF). OS/VS Access Method Services (PDF). Systems (Second ed.). IBM. May 1974. pp. 107–110. GC26-3836-1. Retrieved May 19, 2022.
19. "IBM How to create and use Generation Data Groups (GDG)". IBM. 2 March 2012. Create a GDG... IDCAMS will do it
20. "IDCAMS – Create and delete GDG base using JCL". Use this code. 16 May 2011.

Introduction to the New Mainframe: z/OS Basics. Archived 2019-04-25 at the Wayback Machine. Ch. 5, "Working with data sets", March 29, 2011. ISBN 0738435341


Retrieved from "https://en.wikipedia.org/w/index.php?title=Data_set_(IBM_mainframe)&oldid=1323401199"
