In computing, channel I/O is a high-performance input/output (I/O) architecture that is implemented in various forms on a number of computer architectures, especially on mainframe computers. In the past, channels were generally[a] implemented with custom devices, variously named channel, I/O processor, I/O controller, I/O synchronizer, or DMA controller.
Many I/O tasks can be complex and require logic to be applied to the data to convert formats and other similar duties. In these situations, the simplest solution is to ask the CPU to handle the logic, but because I/O devices are relatively slow, a CPU could waste time waiting for the data from the device. This situation is called 'I/O bound'.
Channel architecture avoids this problem by processing some or all of the I/O task without the aid of the CPU by offloading the work to dedicated logic. Channels are logically[a] self-contained, with sufficient logic and working storage to handle I/O tasks. Some are powerful or flexible enough to be used as a computer on their own and can be construed as a form of coprocessor, for example, the 7909 Data Channel on an IBM 7090 or IBM 7094; however, most are not. On some systems the channels use memory or registers addressable by the central processor as their working storage, while on other systems it is present in the channel hardware. Typically, there are standard interfaces[b] between channels and external peripheral devices, and multiple channels can operate concurrently.
A CPU typically designates a block of storage as, or sends, a relatively small channel program to the channel in order to handle I/O tasks, which the channel and controller can, in many cases, complete without further intervention from the CPU (exception: those channel programs which utilize 'program controlled interrupts', PCIs, to facilitate program loading, demand paging and other essential system tasks).
When I/O transfer is complete or an error is detected, the controller typically communicates with the CPU through the channel using an interrupt. Since the channel normally has direct access to the main memory, it is also often referred to as a direct memory access (DMA) controller.
In the most recent implementations, the channel program is initiated and the channel processor performs all required processing until either an ending condition or a program-controlled interruption (PCI) occurs. This eliminates much of the CPU-channel interaction and greatly improves overall system performance. The channel may report several different types of ending conditions, which may be unambiguously normal, may unambiguously indicate an error, or may have a meaning that depends on the context and the results of a subsequent sense operation. In some systems an I/O controller can request an automatic retry of some operations without CPU intervention. In earlier implementations, any error, no matter how small, required CPU intervention, and the overhead was, consequently, much higher. PCIs are still used by certain legacy operations, but the trend is to move away from them, except where unavoidable.
The first use of channel I/O was with the IBM 709[2] vacuum tube mainframe in 1957, whose Model 766 Data Synchronizer was the first channel controller. The 709's transistorized successor, the IBM 7090,[3] had two to eight 6-bit channels (the 7607) and a channel multiplexor (the 7606) which could control up to eight channels. The 7090 and 7094 could also have up to eight 8-bit channels with the 7909.
While IBM used data channel commands on some of its computers, and allowed command chaining on, e.g., the 7090, most other vendors used channels that dealt with single records. However, some systems, e.g., the GE-600 series, had more sophisticated I/O architectures.
Later, the IBM System/360 and System/370 families of computers offered channel I/O on all models. For the lower-end System/360 Models 50 and below and System/370 Model 158 and below, channels were implemented in microcode on the CPU, and the CPU itself operated in one of two modes, either "CPU Mode" or "Channel Mode", with the channel mode 'stealing' cycles from the CPU mode. For larger IBM System/360 and System/370 computers the channels were still bulky and expensive separate components, such as the IBM 2860 Selector channel (one to three selector channels in a single box), the IBM 2870 Byte multiplexor channel (one multiplexer channel, and, optionally, one to four selector subchannels in a single box), and the IBM 2880 Block multiplexor channel (one or two block multiplexor channels in a single box). On the 303x processor complexes, the channels were implemented in independent channel directors in the same cabinet as the CPU, with each channel director implementing a group of channels.[4]
Much later, the channels were implemented as an on-board processor residing in the same box as the CPU, generally referred to as a "channel processor", which was usually a RISC processor, but which could be a System/390 microprocessor with special microcode, as in IBM's CMOS mainframes.
Amdahl Corporation's hardware implementation of System/370 compatible channels was quite different. A single internal unit, called the "C-Unit", supported up to sixteen channels using the very same hardware for all supported channels. Two internal "C-Units" were possible, supporting up to 32 total channels. Each "C-Unit" independently performed a process generally called a "shifting channel state processor" (a type of barrel processor), which implemented a specialized finite-state machine (FSM). Each CPU cycle, every 32 nanoseconds in the 470V/6 and /5 and every 26 nanoseconds in the 470V/7 and /8, the "C-Unit" read the complete status of the next channel in priority sequence and its I/O Channel in-tags. The necessary actions defined by that channel's last state and its in-tags were performed: data was read from or written to main storage, the operating system program was interrupted if such interruption was specified by the channel program's Program Control Interrupt flag, and the "C-Unit" finally stored that channel's next state and set its I/O Channel out-tags, and then went on to the next lower priority channel. Preemption was possible in some instances. Sufficient FIFO storage was provided within the "C-Unit" for all channels which were emulated by this FSM. Channels could be easily reconfigured to the customer's choice of selector, byte multiplexor, or block multiplexor channel, without any significant restrictions, by using maintenance console commands. "Two-byte interface" was also supported, as were "Data-In/Data-Out" and other high-performance IBM channel options. Built-in channel-to-channel adapters were also offered, called CCAs in Amdahl terminology, but called CTCs or CTCAs in IBM terminology. This design forced IBM to redesign its mainframes to provide similar channel capability and flexibility. IBM's initial response was to include stripped-down Model 158s, operating only in "Channel Mode", as the Model 303x channel units.
In the Amdahl "C-unit" any channel could be any type, selector, byte multiplexor, or block multiplexor, without reserving channels 0 and 4 for the byte multiplexers, as on some IBM models.
Some of the earliest commercial non-IBM channel systems were on the UNIVAC 490, CDC 1604, Burroughs B5000, UNIVAC 1107 and GE 635. Since then, channel controllers have been a standard part of most mainframe designs and a primary advantage mainframes have over smaller, faster, personal computers and network computing.
The 1965 CDC 6600 supercomputer utilized 10 logically independent computers called peripheral processors (PPs) and 12 simple I/O channels for this role. PPs were a modified version of CDC's first personal computers, the 12-bit CDC 160 and 160A. The operating system initially resided and executed in PP0. The channels had no direct access to memory and could not cause interrupts; software on a PP used synchronous instructions[c] to transfer data between the channel and either the A register or PP memory.
SCSI, introduced in 1981 as a low-cost channel equivalent to the IBM Block Multiplexer Channel,[5] is now ubiquitous in the form of the Fibre Channel Protocol and Serial Attached SCSI.
Modern computers may have channels in the form of bus mastering peripheral devices, such as PCI direct memory access (DMA) devices. The rationale for these devices is the same as for the original channel controllers, namely off-loading transfer, interrupts, and context switching from the main CPU.
Channel controllers have been made as small as single-chip designs with multiple channels on them, used in the NeXT computers for instance.
The reference implementation of channel I/O is that of the IBM System/360 family of mainframes and its successors, but similar implementations have been adopted by IBM on other lines, e.g., 1410 and 7010, 7030, and by other mainframe vendors, such as Control Data, Bull (General Electric/Honeywell) and Unisys.
Computer systems that use channel I/O have special hardware components that handle all input/output operations in their entirety independently of the systems' CPU(s). The CPU of a system that uses channel I/O typically has only one machine instruction in its repertoire for input and output; this instruction is used to pass input/output commands to the specialized I/O hardware in the form of channel programs. I/O thereafter proceeds without intervention from the CPU until an event requiring notification of the operating system occurs, at which point the I/O hardware signals an interrupt to the CPU.
A channel is an independent hardware component that coordinates all I/O to a set of controllers or devices. It is not merely a medium of communication, despite the name; it is a programmable device that handles all details of I/O after being given a list of I/O operations to carry out (the channel program).
Each channel may support one or more controllers and/or devices, but each channel program may only be directed at one of those connected devices. A channel program contains lists of commands to the channel itself and to the controller and device to which it is directed. Once the operating system has prepared a complete list of channel commands, it executes a single I/O machine instruction to initiate the channel program; the channel thereafter assumes control of the I/O operations until they are completed.
It is possible to develop very complex channel programs, including testing of data and conditional branching within that channel program. This flexibility frees the CPU from the overhead of starting, monitoring, and managing individual I/O operations. The specialized channel hardware, in turn, is dedicated to I/O and can carry it out more efficiently than the CPU (and entirely in parallel with the CPU). Channel I/O is not unlike the Direct Memory Access (DMA) of microcomputers, only more complex and advanced.
On large mainframe computer systems, CPUs are only one of several powerful hardware components that work in parallel. Special input/output controllers (the exact names of which vary from one manufacturer to another) handle I/O exclusively, and these, in turn, are connected to hardware channels that also are dedicated to input and output. There may be several CPUs and several I/O processors. The overall architecture optimizes input/output performance without degrading pure CPU performance. Since most real-world applications of mainframe systems are heavily I/O-intensive business applications, this architecture helps provide the very high levels of throughput that distinguish mainframes from other types of computers.
In IBM ESA/390 terminology, a channel is a parallel data connection inside the tree-like or hierarchically organized I/O subsystem. In System/390 I/O cages, channels either directly connect to devices which are installed inside the cage (communication adapters such as ESCON, FICON, Open Systems Adapter) or they run outside of the cage, below the raised floor as cables of the thickness of a thumb, and directly connect to channel interfaces on bigger devices like tape subsystems, direct access storage devices (DASDs), terminal concentrators and other ESA/390 systems.
Channels differ in the number and type of concurrent I/O operations they support. In IBM terminology, a multiplexer channel supports a number of concurrent interleaved slow-speed operations, each transferring one byte from a device at a time. A selector channel supports one high-speed operation, transferring a block of data at a time. A block multiplexer supports a number of logically concurrent channel programs, but only one high-speed data transfer at a time.
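The difference between byte interleaving and block-at-a-time transfer can be sketched in a few lines of Python. This is only a toy model: the function names, the device names and the simple round-robin service order are assumptions made for illustration, and a real multiplexer channel services devices as they request attention rather than in a fixed rotation.

```python
def byte_multiplex(devices):
    """Interleave transfers: take one byte from each active device in turn.

    `devices` maps a (hypothetical) device name to the bytes it wants to send.
    """
    pending = {name: iter(data) for name, data in devices.items()}
    order = []
    while pending:
        for name in list(pending):
            try:
                order.append((name, next(pending[name])))
            except StopIteration:
                del pending[name]  # this device has finished its operation
    return order


def selector(devices):
    """One operation at a time: each device transfers its whole block."""
    return [(name, bytes(data)) for name, data in devices.items()]
```

With two devices, `byte_multiplex` alternates between them one byte at a time until each runs out of data, while `selector` finishes one device's block before starting the next.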
Channels may also differ in how they associate peripheral devices with storage buffers. In UNIVAC terminology, a channel may either be internally specified index (ISI), with a single buffer and device active at a time, or externally specified index (ESI), with the device selecting which buffer to use.
In the IBM System/360 and subsequent architectures, a channel program is a sequence of one or more channel command words (CCWs) that are executed by the I/O channel subsystem. The operating system signals the I/O channel subsystem to begin executing the channel program with an SSCH (start sub-channel) instruction. The central processor is then free to proceed with non-I/O instructions until interrupted. When the channel operations are complete, the channel interrupts the central processor with an I/O interruption. In earlier models of the IBM mainframe line, the channel unit was an identifiable component, one for each channel. In modern mainframes, the channels are implemented using an independent RISC processor, the channel processor, one for all channels. IBM System/370 Extended Architecture[6] and its successors replaced the earlier SIO (start I/O) and SIOF (start I/O fast release) machine instructions (System/360 and early System/370) with the SSCH (start sub-channel) instruction (ESA/370 and successors).
Channel I/O provides considerable economies in input/output. For example, on IBM's Linux on IBM Z, the formatting of an entire track of a DASD requires only one channel program (and thus only one I/O instruction), but multiple channel command words (one per block). The program is executed by the dedicated I/O processor, while the application processor (the CPU) is free for other work.
A channel command word (CCW) is an instruction to a specialized I/O channel processor which is, in fact, an FSM. It is used to initiate an I/O operation, such as "read", "write" or "sense", on a channel-attached device. On system architectures that implement channel I/O, typically all devices are connected by channels, and so all I/O requires the use of CCWs.
CCWs are organized into channel programs by the operating system, an I/O subroutine, a utility program, or by standalone software (such as test and diagnostic programs). A limited "branching" capability, hence a dynamically programmable capability, is available within such channel programs, by use of the "status modifier" channel flag and the "transfer-in-channel" CCW.
IBM CCWs are chained to form the channel program. Bits in the CCW indicate that the following location in storage contains a CCW that is part of the same channel program. The channel program normally executes sequential CCWs until an exception occurs, a Transfer-in-Channel (TIC) CCW is executed, or a CCW is executed without chaining indicated. Command chaining tells the channel that the next CCW contains a new command. Data chaining indicates that the next CCW contains the address of additional data for the same command, allowing, for example, portions of one record to be written from or read to multiple data areas in storage (gather-writing and scatter-reading).[7]
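As a rough illustration of chaining, the sketch below packs 8-byte CCWs and builds a data-chained (gather-write) channel program. It assumes the System/360 format-0 CCW layout (command code, 24-bit data address, flag byte, unused byte, 16-bit count) and the flag values shown. The helper names `pack_ccw`, `unpack_ccw` and `gather_write`, and the use of X'01' as a write command code in the usage note, are illustrative choices rather than anything taken from the text above.

```python
import struct

# Flag bits in byte 4 of a format-0 CCW (assumed layout, see the lead-in).
CD = 0x80   # chain data: next CCW supplies another data area for the same command
CC = 0x40   # chain command: next CCW starts a new command
SLI = 0x20  # suppress incorrect-length indication
PCI = 0x08  # program-controlled interruption


def pack_ccw(command, address, count, flags=0):
    """Pack one 8-byte CCW: command, 24-bit address, flags, zero byte, count."""
    if not 0 <= address < 1 << 24:
        raise ValueError("format-0 CCW addresses are 24 bits")
    if not 0 <= count < 1 << 16:
        raise ValueError("count must fit in 16 bits")
    return struct.pack(">B3sBBH", command, address.to_bytes(3, "big"), flags, 0, count)


def unpack_ccw(raw):
    """Return (command, address, flags, count) from an 8-byte CCW."""
    command, addr, flags, _, count = struct.unpack(">B3sBBH", raw)
    return command, int.from_bytes(addr, "big"), flags, count


def gather_write(command, areas):
    """Build a data-chained program writing one record from several data areas.

    Every CCW except the last has CD set, so the channel keeps fetching
    data areas for the same command until chaining ends.
    """
    ccws = []
    for i, (address, count) in enumerate(areas):
        last = i == len(areas) - 1
        ccws.append(pack_ccw(command, address, count, 0 if last else CD))
    return b"".join(ccws)
```

For example, `gather_write(0x01, [(0x1000, 80), (0x2000, 40)])` yields 16 bytes: a first CCW with the chain-data flag set and a second CCW with no chaining, which ends the channel program.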
Channel programs can modify their own operation during execution based on data read. For example, self modification is used extensively in OS/360 ISAM.[8]
The following example[9] reads a disk record identified by a recorded key. The track containing the record and the desired value of the key are known. The device control unit will search the track to find the requested record. In this example <> indicate that the channel program contains the storage address of the specified field.
  SEEK             <cylinder/head number>
  SEARCH KEY EQUAL <key value>
  TIC              *-8 Back to search if not equal
  READ DATA        <buffer>
The TIC (transfer in channel) will cause the channel program to branch back to the SEARCH command until a record with a matching key (or the end of the track) is encountered. When a record with a matching key is found, the DASD controller will include Status Modifier in the channel status, causing the channel to skip the TIC CCW; thus the channel program will not branch and the channel will execute the READ command.
The above example is correct for unblocked records (one record per block). For blocked records (more than one record per block), the recorded key must be the same as the highest key within that block (and the records must be in key sequence), and the following channel program would be utilized:
  SEEK                     <cylinder/head number>
  SEARCH KEY HIGH OR EQUAL <key value>
  TIC                      *-8 Back to search if not high or equal
  READ DATA                <buffer>
If the dataset is allocated in tracks, and the end of the track is reached without the requested record being found, the channel program terminates and returns a "no record found" status indication. Similarly, if the dataset is allocated in cylinders, and the end of the cylinder is reached without the requested record being found, the channel program terminates and returns a "no record found" status indication. In some cases, the system software has the option of updating the track or cylinder number and redriving the I/O operation without interrupting the application program.
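The search loop in the two channel programs above can be imitated with a small interpreter. This is a simplified sketch, not a model of any real control unit: a track is assumed to be a list of (key, data) pairs, CCWs are represented as (operation, operand) tuples, the TIC operand is a CCW index rather than a storage address, and the names `run_channel_program` and `keyed_read` are hypothetical. Cylinder-level searching and redriving are not modeled.

```python
def run_channel_program(program, track):
    """Interpret a tiny channel program against one track of (key, data) records."""
    pc = 0           # index of the current CCW
    pos = 0          # next record that will pass under the head
    current = None   # record most recently examined by a SEARCH
    while pc < len(program):
        op, operand = program[pc]
        if op == "SEEK":
            pos = 0  # single-track model: reposition to the start of the track
            pc += 1
        elif op in ("SEARCH KEY EQUAL", "SEARCH KEY HIGH OR EQUAL"):
            if pos >= len(track):
                return ("no record found", None)  # end of track reached
            current = track[pos]
            pos += 1
            if op == "SEARCH KEY EQUAL":
                hit = current[0] == operand
            else:
                hit = current[0] >= operand
            # Status Modifier on a hit makes the channel skip the next CCW (the TIC).
            pc += 2 if hit else 1
        elif op == "TIC":
            pc = operand  # branch within the channel program
        elif op == "READ DATA":
            return ("ok", current[1])
        else:
            raise ValueError(f"unknown operation: {op}")
    return ("ended without read", None)


def keyed_read(key, search="SEARCH KEY EQUAL"):
    """Build the four-CCW program from the text; the TIC targets CCW 1 (the SEARCH)."""
    return [("SEEK", (0, 0)), (search, key), ("TIC", 1), ("READ DATA", None)]
```

Running `keyed_read(b"B")` against a track whose records have keys A, B and C loops through SEARCH and TIC once before the match on B skips the TIC and falls into READ DATA; a key that is not on the track ends with the "no record found" result.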
On most systems channels operate using real (or physical) addresses, while the channel programs are built using virtual addresses.[10] The operating system is responsible for translating these channel programs before executing them, and for this particular purpose the Input/Output Supervisor (IOS) has a special fast fix function which was designed into the OS Supervisor just for those "fixes" which are of relatively short duration (i.e., significantly shorter than "wall-clock time"). Pages containing data to be used by the I/O operation are locked into real memory, or page fixed. The channel program is copied and all virtual addresses are replaced by real addresses before the I/O operation is started. After the operation completes, the pages are unfixed.
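The fix, copy and translate steps can be sketched as follows. Everything here is an illustrative assumption rather than a description of how IOS does it: CCWs are (command, virtual address, count, flags) tuples, the page table is a plain dictionary from virtual page number to real frame number, pages are 4096 bytes, and a buffer that crosses a page boundary is split into data-chained pieces because consecutive virtual pages need not be adjacent in real storage. Real systems may handle that last case with other mechanisms.

```python
PAGE = 4096
CD = 0x80  # chain-data flag (assumed format-0 CCW flag value)
CC = 0x40  # chain-command flag (assumed format-0 CCW flag value)


def translate(program, page_table, fixed):
    """Return a copy of `program` with real addresses; record fixed pages in `fixed`."""
    real_program = []
    for command, vaddr, count, flags in program:
        while count > 0:
            vpage, offset = divmod(vaddr, PAGE)
            fixed.add(vpage)                   # page-fix: must stay resident during I/O
            chunk = min(count, PAGE - offset)  # bytes that fit in this page
            raddr = page_table[vpage] * PAGE + offset
            if chunk == count:
                piece_flags = flags                # final piece keeps the original chaining
            else:
                piece_flags = (flags & ~CC) | CD   # more data for the same command follows
            real_program.append((command, raddr, chunk, piece_flags))
            vaddr += chunk
            count -= chunk
    return real_program
```

After the I/O completes, the caller would unfix the pages collected in `fixed` (for example with `fixed.clear()` in this toy model).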
As page fixing and unfixing is a CPU-expensive process, long-term page fixing is sometimes used to reduce the CPU cost. Here the virtual memory is page-fixed for the life of the application, rather than fixing and freeing around each I/O operation. An example of a program that can use long-term page fixing is Db2.
An alternative to long-term page fixing is moving the entire application, including all its data buffers, to a preferred area of main storage. This is accomplished by a special SYSEVENT in MVS/370 through z/OS operating systems, wherein the application is, first, swapped-out from wherever it may be, presumably from a non-preferred area, to swap and page external storage, and is, second, swapped-in to a preferred area (SYSEVENT TRANSWAP). Thereafter, the application may be marked non-swappable by another special SYSEVENT (SYSEVENT DONTSWAP). Whenever such an application terminates, whether normally or abnormally, the operating system implicitly issues yet another special SYSEVENT on the application's behalf if it has not already done so (SYSEVENT OKSWAP).
Even bootstrapping of the system, or Initial Program Load (IPL) in IBM nomenclature, is carried out by channels, although the process is partially simulated by the CPU through an implied Start I/O (SIO) instruction, an implied Channel Address Word (CAW) at location 0 and an implied channel command word (CCW) with an opcode of Read IPL, also at location 0. Command chaining is assumed, so the implied CCW at location 0 falls through to the continuation of the channel program at locations 8 and 16, and possibly elsewhere should one of those CCWs be a transfer-in-channel (TIC).[11]
To load a system, the implied Read IPL CCW reads the first block of the selected IPL device into the 24-byte data area at location 0. The channel then continues with the second and third double words, which are CCWs, and this channel program loads the first portion of the system loading software elsewhere in main storage. The first double word contains a PSW which, when fetched at the conclusion of the IPL, causes the CPU to execute the IPL Text (bootstrap loader) read in by the CCW at location 8. The IPL Text then locates, loads and transfers control to the operating system's Nucleus. The Nucleus performs or initiates any necessary initialization and then commences normal OS operations.
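The 24-byte layout described above (a PSW followed by two CCWs) can be picked apart with a short sketch. It assumes the 8-byte format-0 CCW layout (command code, 24-bit address, flag byte, unused byte, 16-bit count). The function name `parse_ipl_record` and the dictionary keys are hypothetical, and the PSW is left as raw bytes because its internal format is not covered here.

```python
import struct


def parse_ipl_record(record):
    """Split the 24-byte IPL record into its PSW and the CCWs at offsets 8 and 16."""
    if len(record) != 24:
        raise ValueError("the IPL record is exactly 24 bytes")
    psw = record[0:8]  # first double word: fetched by the CPU when the IPL ends
    ccws = []
    for offset in (8, 16):  # second and third double words continue the channel program
        command, addr, flags, _, count = struct.unpack(">B3sBBH", record[offset:offset + 8])
        ccws.append({
            "offset": offset,
            "command": command,
            "address": int.from_bytes(addr, "big"),
            "flags": flags,
            "count": count,
        })
    return psw, ccws
```

Given a record whose second double word is a Read Data (X'06') CCW, the first returned CCW describes where in main storage the IPL Text will be loaded and how many bytes are read.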
This IPL concept is device-independent. It is capable of IPL-ing from a card deck, from a magnetic tape, or from a direct access storage device (DASD), e.g., disk or drum. The Read IPL (X'02') command, which is simulated by the CPU, is a Read EBCDIC Select Stacker 1 read command on the card reader and a Read command on tape media (which are inherently sequential access in nature), but a special Read-IPL command on DASD.
DASD controllers accept the X'02' command, seek to cylinder X'0000' head X'0000', skip to the index point (i.e., just past the track descriptor record (R0)) and then treat the Read IPL command as if it were a Read Data (X'06') command. Without this special DASD controller behavior, device-independent IPL would not be possible. On a DASD, the IPL Text is contained on cylinder X'0000', track X'0000', and record X'01' (24 bytes), and cylinder X'0000', track X'0000', and record X'02' (fairly large, certainly somewhat more than 3,000 bytes). The volume label is always contained on cylinder X'0000', track X'0000', and block X'03' (80 bytes). The volume label always points to the VTOC, with a pointer of the form HHHH (that is, the VTOC must reside within the first 65,536 tracks). The VTOC's Format 4 DSCB defines the extent (size) of the VTOC, so the volume label only needs a pointer to the first track in the VTOC's extent, and as the Format 4 DSCB, which describes the VTOC, is always the very first DSCB in the VTOC, HHHH also points to the Format 4 DSCB.
If an attempt is made to IPL from a device that was not initialized with IPL Text, the system simply enters a wait state. If the device is designated for data only, not for IPL, the DASD initialization program, IBCDASDI, or the DASD initialization application, ICKDSF, places a wait state PSW and a dummy CCW string in the 24 bytes, after which these programs format the VTOC and perform other hard drive initialization functions.
Cycle-stealing is a form of interrupt in which the component needing access to memory or to the processor takes control for an entire machine cycle.