Linux for S/390 and zSeries

Common Device Support (CDS)
Device Driver I/O Support Routines

Authors:
  • Ingo Adlung
  • Cornelia Huck

Copyright, IBM Corp. 1999-2002

Introduction

This document describes the common device support routines for Linux/390. Unlike other hardware architectures, ESA/390 has defined a unified I/O access method. This relieves device drivers of having to deal with different bus types, polling versus interrupt processing, shared versus non-shared interrupt processing, DMA versus port I/O (PIO), and other hardware features. However, this implies that either every single device driver needs to implement the hardware I/O attachment functionality itself, or the operating system provides a unified method to access the hardware, providing all the functionality that every single device driver would otherwise have to provide itself.

The document does not intend to explain the ESA/390 hardware architecture in every detail. This information can be obtained from the ESA/390 Principles of Operation manual (IBM Form. No. SA22-7201).

In order to build common device support for ESA/390 I/O interfaces, a functional layer was introduced that provides generic I/O access methods to the hardware.

The common device support layer comprises the I/O support routines defined below. Some of them implement common Linux device driver interfaces, while some of them are ESA/390 platform specific.

Note:
In order to write a driver for S/390, you also need to look into the interface described in Documentation/s390/driver-model.rst.

Note for porting drivers from 2.4:

The major changes are:

  • The functions use a ccw_device instead of an irq (subchannel).
  • All drivers must define a ccw_driver (see Documentation/s390/driver-model.rst) and the associated functions.
  • request_irq() and free_irq() are no longer done by the driver.
  • The oper_handler is (kind of) replaced by the probe() and set_online() functions of the ccw_driver.
  • The not_oper_handler is (kind of) replaced by the remove() and set_offline() functions of the ccw_driver.
  • The channel device layer is gone.
  • The interrupt handlers must be adapted to use a ccw_device as argument. Moreover, they don’t return a devstat, but an irb.
  • Before initiating an I/O, the options must be set via ccw_device_set_options().
  • Instead of calling read_dev_chars()/read_conf_data(), the driver issues the channel program and handles the interrupt itself.

ccw_device_get_ciw()
    get commands from extended sense data.

ccw_device_start(), ccw_device_start_timeout(), ccw_device_start_key(), ccw_device_start_key_timeout()
    initiate an I/O request.

ccw_device_resume()
    resume channel program execution.

ccw_device_halt()
    terminate the current I/O request processed on the device.

do_IRQ()
    generic interrupt routine. This function is called by the interrupt entry routine whenever an I/O interrupt is presented to the system. The do_IRQ() routine determines the interrupt status and calls the device specific interrupt handler according to the rules (flags) defined during I/O request initiation with ccw_device_start().

The next chapters describe the functions other than do_IRQ() in more detail. The do_IRQ() interface is not described, as it is called from the Linux/390 first level interrupt handler only and does not comprise a device driver callable interface. Instead, the functional description of ccw_device_start() also describes the input to the device specific interrupt handler.

Note:
All explanations apply also to the 64 bit architecture s390x.

Common Device Support (CDS) for Linux/390 Device Drivers

General Information

The following chapters describe the I/O related interface routines the Linux/390 common device support (CDS) provides to allow for device specific driver implementations on the IBM ESA/390 hardware platform. Those interfaces intend to provide the functionality required by every device driver implementation to drive a specific hardware device on the ESA/390 platform. Some of the interface routines are specific to Linux/390 and some of them can be found on other Linux platform implementations too. Miscellaneous function prototypes, data declarations, and macro definitions can be found in the architecture specific C header file linux/arch/s390/include/asm/irq.h.

Overview of CDS interface concepts

Unlike other hardware platforms, the ESA/390 architecture doesn’t define interrupt lines managed by a specific interrupt controller and bus systems that may or may not allow for shared interrupts, DMA processing, etc. Instead, the ESA/390 architecture has implemented a so-called channel subsystem that provides a unified view of the devices physically attached to the system. Though the ESA/390 hardware platform knows about a huge variety of different peripheral attachments like disk devices (aka DASDs), tapes, communication controllers, etc., they can all be accessed by a well-defined access method and they all present I/O completion in a unified way: I/O interruptions. Every single device is uniquely identified to the system by a so-called subchannel, and the ESA/390 architecture allows for 64k devices to be attached.

Linux, however, was first built on the Intel PC architecture, with its two cascaded 8259 programmable interrupt controllers (PICs), that allow for a maximum of 15 different interrupt lines. All devices attached to such a system share those 15 interrupt levels. Devices attached to the ISA bus system must not share interrupt levels (aka IRQs), as the ISA bus is based on edge triggered interrupts. MCA, EISA, PCI and other bus systems are based on level triggered interrupts, and therefore allow for shared IRQs. However, if multiple devices present their hardware status by the same (shared) IRQ, the operating system has to call every single device driver registered on this IRQ in order to determine the device driver owning the device that raised the interrupt.

Up to kernel 2.4, Linux/390 used to provide interfaces via the IRQ (subchannel). For internal use of the common I/O layer, these are still there. However, device drivers should use the new calling interface via the ccw_device only.

During its startup the Linux/390 system checks for peripheral devices. Each of those devices is uniquely defined by a so-called subchannel by the ESA/390 channel subsystem. While the subchannel numbers are system generated, each subchannel also takes a user defined attribute, the so-called device number. Both subchannel number and device number cannot exceed 65535. During sysfs initialisation, the information about control unit type and device types that imply specific I/O commands (channel command words - CCWs) in order to operate the device is gathered. Device drivers can retrieve this set of hardware information during their initialization step to recognize the devices they support using the information saved in the struct ccw_device given to them. This method implies that Linux/390 doesn’t need to probe for free (not armed) interrupt request lines (IRQs) to drive its devices with. Where applicable, the device drivers can issue the READ DEVICE CHARACTERISTICS ccw to retrieve device characteristics in their online routine.

In order to allow for easy I/O initiation the CDS layer provides a ccw_device_start() interface that takes a device specific channel program (one or more CCWs) as input, sets up the required architecture specific control blocks and initiates an I/O request on behalf of the device driver. The ccw_device_start() routine allows the driver to specify whether it expects the CDS layer to notify the device driver for every interrupt it observes, or with final status only. See ccw_device_start() for more details. A device driver must never issue ESA/390 I/O commands itself, but must use the Linux/390 CDS interfaces instead.

To cancel long running I/O requests, the CDS layer provides the ccw_device_halt() function. Some devices require an initial HALT SUBCHANNEL (HSCH) command to be issued without any pending I/O requests. This function is also covered by ccw_device_halt().

get_ciw() - get command information word

This call enables a device driver to get information about supported commands from the extended SenseID data.

struct ciw *ccw_device_get_ciw(struct ccw_device *cdev, __u32 cmd);

cdev  The ccw_device for which the command is to be retrieved.
cmd   The command type to be retrieved.

ccw_device_get_ciw() returns:

NULL   No extended data available, invalid device or command not found.
!NULL  The command requested.
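
The lookup described above can be pictured with a small userspace model. This is only a sketch of the semantics (search the device's extended SenseID data for an entry of the requested command type, return NULL otherwise). The types model_ciw and model_senseid, the function model_get_ciw() and the table size are hypothetical stand-ins, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_CIWS 8  /* assumed table size for this model only */

/* Simplified stand-in for a command information word (no bitfields). */
struct model_ciw {
    uint8_t  ct;     /* command type */
    uint8_t  cmd;    /* channel command code to use for that type */
    uint16_t count;  /* byte count the command transfers */
};

/* Stand-in for the extended SenseID data kept for one device. */
struct model_senseid {
    int              extended;  /* nonzero if extended data is available */
    struct model_ciw ciw[MODEL_MAX_CIWS];
    size_t           nr_ciws;   /* number of valid entries in ciw[] */
};

/* Return the entry for command type ct, or NULL if there is none. */
const struct model_ciw *model_get_ciw(const struct model_senseid *sid,
                                      uint8_t ct)
{
    if (sid == NULL || !sid->extended)
        return NULL;  /* invalid device or no extended data */
    assert(sid->nr_ciws <= MODEL_MAX_CIWS);
    for (size_t i = 0; i < sid->nr_ciws; i++)
        if (sid->ciw[i].ct == ct)
            return &sid->ciw[i];
    return NULL;      /* command type not found */
}
```

A caller would check the result for NULL before using the returned command code and count to build a CCW.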

ccw_device_start() - Initiate I/O Request

The ccw_device_start() routine is the I/O request front-end processor. All device driver I/O requests must be issued using this routine. A device driver must not issue ESA/390 I/O commands itself. Instead the ccw_device_start() routine provides all interfaces required to drive arbitrary devices.

This description also covers the status information passed to the device driver’s interrupt handler as this is related to the rules (flags) defined with the associated I/O request when calling ccw_device_start().

int ccw_device_start(struct ccw_device *cdev,
                     struct ccw1 *cpa,
                     unsigned long intparm,
                     __u8 lpm,
                     unsigned long flags);
int ccw_device_start_timeout(struct ccw_device *cdev,
                             struct ccw1 *cpa,
                             unsigned long intparm,
                             __u8 lpm,
                             unsigned long flags,
                             int expires);
int ccw_device_start_key(struct ccw_device *cdev,
                         struct ccw1 *cpa,
                         unsigned long intparm,
                         __u8 lpm,
                         __u8 key,
                         unsigned long flags);
int ccw_device_start_key_timeout(struct ccw_device *cdev,
                                 struct ccw1 *cpa,
                                 unsigned long intparm,
                                 __u8 lpm,
                                 __u8 key,
                                 unsigned long flags,
                                 int expires);

cdev     ccw_device the I/O is destined for
cpa      logical start address of channel program
intparm  user specific interrupt information; will be presented back to the device driver’s interrupt handler. Allows a device driver to associate the interrupt with a particular I/O request.
lpm      defines the channel path to be used for a specific I/O request. A value of 0 will make cio use the opm.
key      the storage key to use for the I/O (useful for operating on a storage with a storage key != default key)
flags    defines the action to be performed for I/O processing
expires  timeout value in jiffies. The common I/O layer will terminate the running program after this and call the interrupt handler with ERR_PTR(-ETIMEDOUT) as irb.

Possible flag values are:

DOIO_ALLOW_SUSPEND   channel program may become suspended
DOIO_DENY_PREFETCH   don’t allow for CCW prefetch; usually this implies the channel program might become modified
DOIO_SUPPRESS_INTER  don’t call the handler on intermediate status

The cpa parameter points to the first format 1 CCW of a channel program:

struct ccw1 {
      __u8  cmd_code; /* command code */
      __u8  flags;    /* flags, like IDA addressing, etc. */
      __u16 count;    /* byte count */
      __u32 cda;      /* data address */
} __attribute__ ((packed,aligned(8)));

with the following CCW flags values defined:

CCW_FLAG_DC       data chaining
CCW_FLAG_CC       command chaining
CCW_FLAG_SLI      suppress incorrect length
CCW_FLAG_SKIP     skip
CCW_FLAG_PCI      PCI
CCW_FLAG_IDA      indirect addressing
CCW_FLAG_SUSPEND  suspend
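
As an illustration of how the CCW layout and flags fit together, the following userspace sketch fills two CCWs and links them by command chaining. The struct mirrors the layout shown above with fixed-width C types. The numeric flag values are given as they are commonly defined in asm/cio.h and should be verified against the kernel tree in use. The helpers fill_ccw() and chain_ccws() are hypothetical, and the command codes and addresses in the usage are placeholders, since real ones are device specific.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flag values as commonly found in asm/cio.h; verify against your tree. */
#define CCW_FLAG_CC   0x40  /* command chaining */
#define CCW_FLAG_SLI  0x20  /* suppress incorrect length */

/* Userspace mirror of the format 1 CCW layout shown above. */
struct ccw1 {
    uint8_t  cmd_code;  /* command code */
    uint8_t  flags;     /* CCW_FLAG_* bits */
    uint16_t count;     /* byte count */
    uint32_t cda;       /* data address (31 bit) */
} __attribute__((packed, aligned(8)));

/* Hypothetical helper: fill in a single CCW. */
void fill_ccw(struct ccw1 *ccw, uint8_t cmd, uint8_t flags,
              uint16_t count, uint32_t cda)
{
    assert(ccw != NULL);
    assert((cda & 0x80000000u) == 0);  /* format 1 addresses are 31 bit */
    ccw->cmd_code = cmd;
    ccw->flags = flags;
    ccw->count = count;
    ccw->cda = cda;
}

/* Hypothetical helper: link cp[0..n-1] into one channel program by
 * setting command chaining on every CCW except the last one. */
void chain_ccws(struct ccw1 *cp, size_t n)
{
    assert(cp != NULL && n > 0);
    for (size_t i = 0; i + 1 < n; i++)
        cp[i].flags |= CCW_FLAG_CC;
    cp[n - 1].flags &= (uint8_t)~CCW_FLAG_CC;  /* last CCW ends the chain */
}
```

A driver-like caller would fill each CCW with fill_ccw(), call chain_ccws() once, and then hand the address of the first CCW to the I/O start routine as cpa.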

Via ccw_device_set_options(), the device driver may specify the following options for the device:

DOIO_EARLY_NOTIFICATION  allow for early interrupt notification
DOIO_REPORT_ALL          report all interrupt conditions

The ccw_device_start() function returns:

0        successful completion or request successfully initiated
-EBUSY   The device is currently processing a previous I/O request, or there is a status pending at the device.
-ENODEV  cdev is invalid, the device is not operational or the ccw_device is not online.

When the I/O request completes, the CDS first level interrupt handler will accumulate the status in a struct irb and then call the device interrupt handler. The intparm field will contain the value the device driver has associated with a particular I/O request. If a pending device status was recognized, intparm will be set to 0 (zero). This may happen during I/O initiation or delayed by an alert status notification. In any case this status is not related to the current (last) I/O request. In case of a delayed status notification no special interrupt will be presented to indicate I/O completion as the I/O request was never started, even though ccw_device_start() returned with successful completion.
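
One common way to use intparm is to pass the address of the driver's own request structure and turn it back into a pointer in the interrupt handler. The sketch below models only that round trip in userspace. The names my_request, request_to_intparm(), intparm_to_request() and model_handler() are hypothetical, and the sketch assumes that unsigned long is wide enough to hold a pointer, as it is on Linux.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-request bookkeeping kept by a driver. */
struct my_request {
    int id;
    int completed;
};

/* Value the driver would pass as intparm when starting the request. */
unsigned long request_to_intparm(struct my_request *req)
{
    assert(req != NULL);
    return (unsigned long)req;
}

/* Recover the request in the handler. An intparm of 0 means the status
 * is not related to one of the driver's own requests. */
struct my_request *intparm_to_request(unsigned long intparm)
{
    if (intparm == 0)
        return NULL;
    return (struct my_request *)intparm;
}

/* Model of a handler body: returns 1 if a request was completed,
 * 0 if the interrupt carried status unrelated to any request. */
int model_handler(unsigned long intparm)
{
    struct my_request *req = intparm_to_request(intparm);

    if (req == NULL)
        return 0;
    req->completed = 1;
    return 1;
}
```

Because 0 is reserved for status that is unrelated to a request, a driver following this scheme should never use 0 as the intparm of a real request.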

The irb may contain an error value, and the device driver should check for this first:

-ETIMEDOUT  the common I/O layer terminated the request after the specified timeout value
-EIO        the common I/O layer terminated the request due to an error state
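
The error values above are encoded in the irb pointer itself, so the check has to happen before the pointer is dereferenced. The following userspace sketch models that convention. The helpers model_err_ptr(), model_is_err() and model_ptr_err() imitate the kernel's ERR_PTR(), IS_ERR() and PTR_ERR() and are stand-ins written for this example, as is the reduced struct model_irb.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_ERRNO 4095  /* same convention as the kernel's err.h */

/* Reduced stand-in for the interruption response block. */
struct model_irb {
    unsigned char dstat;
    unsigned char cstat;
};

/* Userspace imitations of ERR_PTR(), IS_ERR() and PTR_ERR(). */
struct model_irb *model_err_ptr(long err)
{
    return (struct model_irb *)(intptr_t)err;
}

int model_is_err(const struct model_irb *p)
{
    return (uintptr_t)p >= (uintptr_t)(-MODEL_MAX_ERRNO);
}

long model_ptr_err(const struct model_irb *p)
{
    return (long)(intptr_t)p;
}

/* First step of a handler: returns 0 if irb carries real status, or the
 * negative error code (-ETIMEDOUT, -EIO, ...) encoded in the pointer. */
int check_irb(const struct model_irb *irb)
{
    assert(irb != NULL);
    if (model_is_err(irb))
        return (int)model_ptr_err(irb);  /* must not dereference irb */
    return 0;
}
```

Only when check_irb() returns 0 would the handler go on to look at the status fields.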

If the concurrent sense flag in the extended status word (esw) in the irb is set, the field erw.scnt in the esw describes the number of device specific sense bytes available in the extended control word irb->ecw[]. No device sensing by the device driver itself is required.

The device interrupt handler can use the following definitions to investigate the primary unit check source coded in sense byte 0:

SNS0_CMD_REJECT        0x80
SNS0_INTERVENTION_REQ  0x40
SNS0_BUS_OUT_CHECK     0x20
SNS0_EQUIPMENT_CHECK   0x10
SNS0_DATA_CHECK        0x08
SNS0_OVERRUN           0x04
SNS0_INCOMPL_DOMAIN    0x01

Depending on the device status, several of those values may be set together. Please refer to the device specific documentation for details.
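
Since several bits can be set at once, sense byte 0 is best treated as a bit mask. The sketch below uses the values from the table above. The function sns0_decode() and the descriptive strings are hypothetical and only serve to show the masking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values taken from the table above. */
#define SNS0_CMD_REJECT        0x80
#define SNS0_INTERVENTION_REQ  0x40
#define SNS0_BUS_OUT_CHECK     0x20
#define SNS0_EQUIPMENT_CHECK   0x10
#define SNS0_DATA_CHECK        0x08
#define SNS0_OVERRUN           0x04
#define SNS0_INCOMPL_DOMAIN    0x01

struct sns0_name {
    uint8_t     bit;
    const char *name;
};

static const struct sns0_name sns0_names[] = {
    { SNS0_CMD_REJECT,       "command reject" },
    { SNS0_INTERVENTION_REQ, "intervention required" },
    { SNS0_BUS_OUT_CHECK,    "bus-out check" },
    { SNS0_EQUIPMENT_CHECK,  "equipment check" },
    { SNS0_DATA_CHECK,       "data check" },
    { SNS0_OVERRUN,          "overrun" },
    { SNS0_INCOMPL_DOMAIN,   "incomplete domain" },
};

/* Store the names of all conditions set in sense0 into out[] (at most
 * max entries) and return how many were stored. */
size_t sns0_decode(uint8_t sense0, const char **out, size_t max)
{
    size_t n = 0;

    assert(out != NULL);
    for (size_t i = 0; i < sizeof(sns0_names) / sizeof(sns0_names[0]); i++)
        if ((sense0 & sns0_names[i].bit) && n < max)
            out[n++] = sns0_names[i].name;
    return n;
}
```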

The irb->scsw.cstat field provides the (accumulated) subchannel status:

SCHN_STAT_PCI            program controlled interrupt
SCHN_STAT_INCORR_LEN     incorrect length
SCHN_STAT_PROG_CHECK     program check
SCHN_STAT_PROT_CHECK     protection check
SCHN_STAT_CHN_DATA_CHK   channel data check
SCHN_STAT_CHN_CTRL_CHK   channel control check
SCHN_STAT_INTF_CTRL_CHK  interface control check
SCHN_STAT_CHAIN_CHECK    chaining check

The irb->scsw.dstat field provides the (accumulated) device status:

DEV_STAT_ATTENTION   attention
DEV_STAT_STAT_MOD    status modifier
DEV_STAT_CU_END      control unit end
DEV_STAT_BUSY        busy
DEV_STAT_CHN_END     channel end
DEV_STAT_DEV_END     device end
DEV_STAT_UNIT_CHECK  unit check
DEV_STAT_UNIT_EXCEP  unit exception

Please see the ESA/390 Principles of Operation manual for details on the individual flag meanings.
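
A handler typically combines both status bytes to decide whether a request has finished. The sketch below shows one possible policy in userspace: any subchannel status other than PCI, and any unit check or unit exception, is treated as an error, and channel end together with device end is treated as completion. The policy, the enum and the reduced struct model_scsw are assumptions made for this example. The bit values are given as commonly defined in asm/cio.h and should be verified against the kernel tree in use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit values as commonly found in asm/cio.h; verify against your tree. */
#define DEV_STAT_CHN_END     0x08
#define DEV_STAT_DEV_END     0x04
#define DEV_STAT_UNIT_CHECK  0x02
#define DEV_STAT_UNIT_EXCEP  0x01
#define SCHN_STAT_PCI        0x80

/* Reduced stand-in holding just the two accumulated status bytes. */
struct model_scsw {
    uint8_t dstat;  /* device status */
    uint8_t cstat;  /* subchannel status */
};

enum io_result { IO_PENDING, IO_DONE, IO_ERROR };

/* One possible classification policy (an assumption of this sketch). */
enum io_result classify_status(const struct model_scsw *s)
{
    const uint8_t final = DEV_STAT_CHN_END | DEV_STAT_DEV_END;

    assert(s != NULL);
    if (s->cstat & ~SCHN_STAT_PCI)
        return IO_ERROR;    /* some channel-side check is set */
    if (s->dstat & (DEV_STAT_UNIT_CHECK | DEV_STAT_UNIT_EXCEP))
        return IO_ERROR;    /* device reported a problem */
    if ((s->dstat & final) == final)
        return IO_DONE;     /* channel end and device end both seen */
    return IO_PENDING;      /* intermediate or primary status only */
}
```

Real drivers differ in what they tolerate, for example whether an incorrect length indication is acceptable, so the conditions above need to be adapted to the device.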

Usage Notes:

ccw_device_start() must be called with interrupts disabled and with the ccw device lock held.

The device driver is allowed to issue the next ccw_device_start() call from within its interrupt handler already. It is not required to schedule a bottom-half, unless a non deterministically long running error recovery procedure or similar needs to be scheduled. During I/O processing the Linux/390 generic I/O device driver support has already obtained the IRQ lock, i.e. the handler must not try to obtain it again when calling ccw_device_start() or we end in a deadlock situation!

If a device driver relies on an I/O request to be completed prior to starting the next, it can reduce I/O processing overhead by chaining a NoOp I/O command CCW_CMD_NOOP to the end of the submitted CCW chain. This will force Channel-End and Device-End status to be presented together, with a single interrupt. However, this should be used with care as it implies the channel will remain busy, not being able to process I/O requests for other devices on the same channel. Therefore e.g. read commands should never use this technique, as the result will be presented by a single interrupt anyway.
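
The NoOp technique amounts to setting command chaining on the former last CCW and appending one more CCW. A userspace sketch, assuming the format 1 CCW layout from this document: the helper append_noop() is hypothetical, the field values chosen for the NoOp CCW (SLI set, count and address zero) are an assumption of this sketch, and the numeric constants should be verified against asm/cio.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values as commonly found in asm/cio.h; verify against your tree. */
#define CCW_CMD_NOOP  0x03
#define CCW_FLAG_CC   0x40  /* command chaining */
#define CCW_FLAG_SLI  0x20  /* suppress incorrect length */

/* Userspace mirror of the format 1 CCW layout. */
struct ccw1 {
    uint8_t  cmd_code;
    uint8_t  flags;
    uint16_t count;
    uint32_t cda;
} __attribute__((packed, aligned(8)));

/* Append a NoOp to the chain cp[0..n-1], which lives in an array with
 * room for cap CCWs. Returns the new chain length, or 0 if there is no
 * room (in which case the chain is left untouched). */
size_t append_noop(struct ccw1 *cp, size_t n, size_t cap)
{
    assert(cp != NULL && n > 0);
    if (n >= cap)
        return 0;
    cp[n - 1].flags |= CCW_FLAG_CC;  /* chain old last CCW to the NoOp */
    cp[n].cmd_code = CCW_CMD_NOOP;
    cp[n].flags = CCW_FLAG_SLI;      /* NoOp transfers no data */
    cp[n].count = 0;
    cp[n].cda = 0;
    return n + 1;
}
```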

In order to minimize I/O overhead, a device driver should use DOIO_REPORT_ALL only if the device can report intermediate interrupt information prior to device-end that the device driver urgently relies on. In this case all I/O interruptions are presented to the device driver until final status is recognized.

If a device is able to recover from asynchronously presented I/O errors, it can perform overlapping I/O using the DOIO_EARLY_NOTIFICATION flag. While some devices always report channel-end and device-end together, with a single interrupt, others present primary status (channel-end) when the channel is ready for the next I/O request and secondary status (device-end) when the data transmission has been completed at the device.

The above flag allows drivers to exploit this feature, e.g. for communication devices that can handle lost data on the network to allow for enhanced I/O processing.

Unless the channel subsystem at any time presents a secondary status interrupt, exploiting this feature will cause only primary status interrupts to be presented to the device driver while overlapping I/O is performed. When a secondary status without error (alert status) is presented, this indicates successful completion for all overlapping ccw_device_start() requests that have been issued since the last secondary (final) status.

Channel programs that intend to set the suspend flag on a channel command word (CCW) must start the I/O operation with the DOIO_ALLOW_SUSPEND option or the suspend flag will cause a channel program check. At the time the channel program becomes suspended an intermediate interrupt will be generated by the channel subsystem.

ccw_device_resume() - Resume Channel Program Execution

If a device driver chooses to suspend the current channel program execution by setting the CCW suspend flag on a particular CCW, the channel program execution is suspended. In order to resume channel program execution the CIO layer provides the ccw_device_resume() routine.

int ccw_device_resume(struct ccw_device *cdev);

cdev  ccw_device the resume operation is requested for

The ccw_device_resume() function returns:

0          suspended channel program is resumed
-EBUSY     status pending
-ENODEV    cdev invalid or not-operational subchannel
-EINVAL    resume function not applicable
-ENOTCONN  there is no I/O request pending for completion

Usage Notes:

Please have a look at the ccw_device_start() usage notes for more details on suspended channel programs.

ccw_device_halt() - Halt I/O Request Processing

Sometimes a device driver might need to stop the processing of a long-running channel program, or the device might require an initial halt subchannel (HSCH) I/O command to be issued. For those purposes the ccw_device_halt() command is provided.

ccw_device_halt() must be called with interrupts disabled and with the ccw device lock held.

int ccw_device_halt(struct ccw_device *cdev,
                    unsigned long intparm);

cdev     ccw_device the halt operation is requested for
intparm  interruption parameter; value is only used if no I/O is outstanding, otherwise the intparm associated with the I/O request is returned

The ccw_device_halt() function returns:

0        request successfully initiated
-EBUSY   the device is currently busy, or status pending.
-ENODEV  cdev invalid.
-EINVAL  The device is not operational or the ccw device is not online.

Usage Notes:

A device driver may write a never-ending channel program by writing a channel program that at its end loops back to its beginning by means of a transfer in channel (TIC) command (CCW_CMD_TIC). Usually this is performed by network device drivers by setting the PCI CCW flag (CCW_FLAG_PCI). Once this CCW is executed a program controlled interrupt (PCI) is generated. The device driver can then perform an appropriate action. Prior to interrupting an outstanding read to a network device (with or without PCI flag), a ccw_device_halt() is required to end the pending operation.
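
The looping channel program described above can be sketched as follows, again as a userspace model of the format 1 CCW layout. The helper close_loop() is hypothetical. It assumes the last slot of the array is reserved for the TIC, that base is the 31 bit address at which the channel subsystem sees the first CCW, and that the PCI flag is wanted on the first CCW, which is a choice made for this example only. Constants should be verified against asm/cio.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values as commonly found in asm/cio.h; verify against your tree. */
#define CCW_CMD_TIC   0x08  /* transfer in channel */
#define CCW_FLAG_CC   0x40  /* command chaining */
#define CCW_FLAG_PCI  0x08  /* program controlled interrupt */

/* Userspace mirror of the format 1 CCW layout. */
struct ccw1 {
    uint8_t  cmd_code;
    uint8_t  flags;
    uint16_t count;
    uint32_t cda;
} __attribute__((packed, aligned(8)));

/* Turn cp[n-1] into a TIC that jumps back to cp[0], chain all CCWs in
 * front of it, and request a PCI on each pass through the first CCW. */
void close_loop(struct ccw1 *cp, size_t n, uint32_t base)
{
    assert(cp != NULL && n >= 2);
    assert((base & 0x80000007u) == 0);  /* 31 bit, doubleword aligned */
    for (size_t i = 0; i + 1 < n; i++)
        cp[i].flags |= CCW_FLAG_CC;     /* keep the chain going to the TIC */
    cp[0].flags |= CCW_FLAG_PCI;
    cp[n - 1].cmd_code = CCW_CMD_TIC;
    cp[n - 1].flags = 0;                /* a TIC carries no flags or count */
    cp[n - 1].count = 0;
    cp[n - 1].cda = base;               /* target: start of the program */
}
```

Such a program never reaches final status on its own, which is why the halt function is needed to end it.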

ccw_device_clear() - Terminate I/O Request Processing

In order to terminate all I/O processing at the subchannel, the clear subchannel (CSCH) command is used. It can be issued via ccw_device_clear().

ccw_device_clear() must be called with interrupts disabled and with the ccw device lock held.

int ccw_device_clear(struct ccw_device *cdev, unsigned long intparm);

cdev     ccw_device the clear operation is requested for
intparm  interruption parameter (see ccw_device_halt())

The ccw_device_clear() function returns:

0        request successfully initiated
-ENODEV  cdev invalid
-EINVAL  The device is not operational or the ccw device is not online.

Miscellaneous Support Routines

This chapter describes various routines to be used in a Linux/390 device driver programming environment.

get_ccwdev_lock()

Get the address of the device specific lock. This is then used in spin_lock() / spin_unlock() calls.
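
The pattern of taking the device lock around an I/O start, when called from outside the interrupt handler, might look like the sketch below. So that it compiles and runs outside the kernel, every kernel name used here (spinlock_t, spin_lock_irqsave(), spin_unlock_irqrestore(), get_ccwdev_lock(), ccw_device_start(), struct ccw_device, struct ccw1) is replaced by a trivial userspace stand-in with made-up behaviour. Only the function my_submit(), itself a hypothetical name, shows the pattern.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* ---- Userspace stand-ins, NOT the kernel definitions ---- */
typedef struct { int held; } spinlock_t;
struct ccw1 { unsigned char cmd_code, flags; unsigned short count; unsigned int cda; };
struct ccw_device { spinlock_t lock; int busy; };

#define spin_lock_irqsave(l, f)      do { (f) = 0; (l)->held = 1; } while (0)
#define spin_unlock_irqrestore(l, f) do { (void)(f); (l)->held = 0; } while (0)

spinlock_t *get_ccwdev_lock(struct ccw_device *cdev)
{
    return &cdev->lock;
}

/* Stand-in that only models "lock must be held" and the -EBUSY case. */
int ccw_device_start(struct ccw_device *cdev, struct ccw1 *cpa,
                     unsigned long intparm, unsigned char lpm,
                     unsigned long flags)
{
    (void)cpa; (void)intparm; (void)lpm; (void)flags;
    assert(cdev->lock.held);  /* rule from the usage notes */
    if (cdev->busy)
        return -EBUSY;
    cdev->busy = 1;
    return 0;
}

/* ---- The pattern itself ---- */
int my_submit(struct ccw_device *cdev, struct ccw1 *cp, unsigned long intparm)
{
    unsigned long flags;
    int rc;

    spin_lock_irqsave(get_ccwdev_lock(cdev), flags);
    rc = ccw_device_start(cdev, cp, intparm, 0 /* lpm 0: let cio choose */, 0);
    spin_unlock_irqrestore(get_ccwdev_lock(cdev), flags);
    return rc;
}
```

Inside the interrupt handler the lock is already held by the common I/O layer, so a handler would call the start routine directly instead of going through a wrapper like my_submit().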

__u8 ccw_device_get_path_mask(struct ccw_device *cdev);

Get the mask of the path currently available for cdev.