Bus-Independent Device Accesses

Author: Matthew Wilcox
Author: Alan Cox

Introduction

Linux provides an API which abstracts performing IO across all busses and devices, allowing device drivers to be written independently of bus type.

Memory Mapped IO

Getting Access to the Device

The most widely supported form of IO is memory mapped IO. That is, a part of the CPU’s address space is interpreted not as accesses to memory, but as accesses to a device. Some architectures define devices to be at a fixed address, but most have some method of discovering devices. The PCI bus walk is a good example of such a scheme. This document does not cover how to receive such an address, but assumes you are starting with one. Physical addresses are of type unsigned long.

This address should not be used directly. Instead, to get an address suitable for passing to the accessor functions described below, you should call ioremap(). An address suitable for accessing the device will be returned to you.

After you’ve finished using the device (say, in your module’s exit routine), call iounmap() in order to return the address space to the kernel. Most architectures allocate new address space each time you call ioremap(), and they can run out unless you call iounmap().
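The exhaustion problem can be pictured with a small userspace model. This is only a sketch under stated assumptions: toy_ioremap(), toy_iounmap() and the four-slot mapping area are hypothetical stand-ins, not kernel interfaces, and real architectures manage their mapping space very differently.

```c
#include <stdbool.h>

#define TOY_MAP_SLOTS 4   /* arbitrary size of the toy mapping area */

/* One flag per slot of the (hypothetical) finite mapping area. */
static bool toy_slot_used[TOY_MAP_SLOTS];

/* Claim a free slot; returns its index, or -1 when the area is exhausted. */
static inline int toy_ioremap(void)
{
    int i;

    for (i = 0; i < TOY_MAP_SLOTS; i++) {
        if (!toy_slot_used[i]) {
            toy_slot_used[i] = true;
            return i;
        }
    }
    return -1;
}

/* Release a slot so that a later toy_ioremap() can reuse it. */
static inline void toy_iounmap(int slot)
{
    if (slot >= 0 && slot < TOY_MAP_SLOTS)
        toy_slot_used[slot] = false;
}
```

In this model a driver that maps on every load but never unmaps eventually gets -1 back; releasing a slot makes the next mapping succeed again.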

Accessing the device

The part of the interface most used by drivers is reading and writing memory-mapped registers on the device. Linux provides interfaces to read and write 8-bit, 16-bit, 32-bit and 64-bit quantities. Due to a historical accident, these are named byte, word, long and quad accesses. Both read and write accesses are supported; there is no prefetch support at this time.

The functions are named readb(), readw(), readl(), readq(), readb_relaxed(), readw_relaxed(), readl_relaxed(), readq_relaxed(), writeb(), writew(), writel() and writeq().
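As a rough illustration of the naming scheme only, the following userspace sketch models a register window as ordinary memory and defines stand-ins for the four access widths. The sim_ prefix, the sim_window union and the offset-based helpers are hypothetical; real drivers call readb(), writel() and friends on an address returned by ioremap(), and this model ignores byte order, barriers and bus behaviour.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register window: plain memory standing in for MMIO. */
static union {
    uint8_t  b[64];
    uint16_t w[32];
    uint32_t l[16];
    uint64_t q[8];
} sim_window;

/* "byte" = 8 bits. off is a byte offset into the window. */
static inline uint8_t sim_readb(size_t off)
{
    return *(volatile uint8_t *)&sim_window.b[off];
}

static inline void sim_writeb(size_t off, uint8_t v)
{
    *(volatile uint8_t *)&sim_window.b[off] = v;
}

/* "word" = 16 bits. off must be a multiple of 2. */
static inline uint16_t sim_readw(size_t off)
{
    return *(volatile uint16_t *)&sim_window.w[off / 2];
}

static inline void sim_writew(size_t off, uint16_t v)
{
    *(volatile uint16_t *)&sim_window.w[off / 2] = v;
}

/* "long" = 32 bits. off must be a multiple of 4. */
static inline uint32_t sim_readl(size_t off)
{
    return *(volatile uint32_t *)&sim_window.l[off / 4];
}

static inline void sim_writel(size_t off, uint32_t v)
{
    *(volatile uint32_t *)&sim_window.l[off / 4] = v;
}

/* "quad" = 64 bits. off must be a multiple of 8. */
static inline uint64_t sim_readq(size_t off)
{
    return *(volatile uint64_t *)&sim_window.q[off / 8];
}

static inline void sim_writeq(size_t off, uint64_t v)
{
    *(volatile uint64_t *)&sim_window.q[off / 8] = v;
}
```

The suffix therefore fixes the access width regardless of what the C type `long` happens to be on the build machine: the "l" helpers always move 32 bits.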

Some devices (such as framebuffers) would like to use larger transfers than 8 bytes at a time. For these devices, the memcpy_toio(), memcpy_fromio() and memset_io() functions are provided. Do not use memset or memcpy on IO addresses; they are not guaranteed to copy data in order.
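To make the ordering point concrete, here is a userspace sketch of a copy loop that issues one volatile byte store per destination byte in ascending address order, which is the kind of behaviour an optimised memcpy() does not promise. It is only a model: sketch_copy_toio(), sketch_set_io() and sketch_dev_buf are hypothetical names, and the kernel's own helpers are implemented per architecture and may well use wider accesses.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a device buffer; a real one comes from ioremap(). */
static uint8_t sketch_dev_buf[32];

/*
 * Copy count bytes to dst one volatile byte store at a time, lowest
 * address first. The volatile qualifier stops the compiler from merging,
 * dropping or reordering the individual stores.
 */
static inline void sketch_copy_toio(volatile uint8_t *dst, const void *src,
                                    size_t count)
{
    const uint8_t *from = src;
    size_t i;

    for (i = 0; i < count; i++)
        dst[i] = from[i];
}

/* Fill count bytes of dst with value, again in ascending address order. */
static inline void sketch_set_io(volatile uint8_t *dst, uint8_t value,
                                 size_t count)
{
    size_t i;

    for (i = 0; i < count; i++)
        dst[i] = value;
}
```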

The read and write functions are defined to be ordered. That is, the compiler is not permitted to reorder the I/O sequence. When the ordering can be compiler optimised, you can use __readb() and friends to indicate the relaxed ordering. Use this with care.

While the basic functions are defined to be synchronous and ordered with respect to each other, the busses the devices sit on may themselves have asynchronicity. In particular, many authors are burned by the fact that PCI bus writes are posted asynchronously. A driver author must issue a read from the same device to ensure that writes have occurred in the specific cases the author cares about. This kind of property cannot be hidden from driver writers in the API. In some cases, the read used to flush the device may be expected to fail (if the card is resetting, for example). In that case, the read should be done from config space, which is guaranteed to soft-fail if the card doesn’t respond.

The following is an example of flushing a write to a device when the driver would like to ensure the write’s effects are visible prior to continuing execution:

    static inline void
    qla1280_disable_intrs(struct scsi_qla_host *ha)
    {
        struct device_reg *reg;

        reg = ha->iobase;
        /* disable risc and host interrupts */
        WRT_REG_WORD(&reg->ictrl, 0);
        /*
         * The following read will ensure that the above write
         * has been received by the device before we return from this
         * function.
         */
        RD_REG_WORD(&reg->ictrl);
        ha->flags.ints_enabled = 0;
    }
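Why the read-back in the example above works can be pictured with a toy userspace model of a posting bus. Everything here is an assumption made for illustration: struct toy_dev and the toy_ helpers are hypothetical, and a single pending slot stands in for whatever buffering a real bridge performs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model of a device behind a bus that posts (buffers) writes. */
struct toy_dev {
    uint16_t ictrl;          /* register value as the device sees it */
    uint16_t posted_value;   /* write still in flight on the bus */
    bool     write_pending;  /* true while the write has not landed */
};

/* A posted write returns immediately; the device has not seen it yet. */
static inline void toy_write_reg(struct toy_dev *dev, uint16_t value)
{
    dev->posted_value = value;
    dev->write_pending = true;
}

/*
 * In this model a read cannot complete until earlier writes to the
 * device have been delivered, so it drains the pending write first.
 */
static inline uint16_t toy_read_reg(struct toy_dev *dev)
{
    if (dev->write_pending) {
        dev->ictrl = dev->posted_value;
        dev->write_pending = false;
    }
    return dev->ictrl;
}

/* Same shape as the driver example: write, then read back to flush. */
static inline void toy_disable_intrs(struct toy_dev *dev)
{
    toy_write_reg(dev, 0);
    (void)toy_read_reg(dev);
}
```

After toy_write_reg() alone the device-side value is unchanged; only once the read has completed can the caller rely on the device having seen the write.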

PCI ordering rules also guarantee that PIO read responses arrive after any outstanding DMA writes from that bus, since for some devices the result of a readb() call may signal to the driver that a DMA transaction is complete. In many cases, however, the driver may want to indicate that the next readb() call has no relation to any previous DMA writes performed by the device. The driver can use readb_relaxed() for these cases, although only some platforms will honor the relaxed semantics. Using the relaxed read functions will provide significant performance benefits on platforms that support it. The qla2xxx driver provides examples of how to use readX_relaxed(). In many cases, a majority of the driver’s readX() calls can safely be converted to readX_relaxed() calls, since only a few will indicate or depend on DMA completion.

Port Space Accesses

Port Space Explained

Another form of IO commonly supported is Port Space. This is a range of addresses separate from the normal memory address space. Access to these addresses is generally not as fast as accesses to the memory mapped addresses, and it also has a potentially smaller address space.

Unlike memory mapped IO, no preparation is required to access port space.

Accessing Port Space

Accesses to this space are provided through a set of functions which allow 8-bit, 16-bit and 32-bit accesses; also known as byte, word and long. These functions are inb(), inw(), inl(), outb(), outw() and outl().

Some variants are provided for these functions. Some devices require that accesses to their ports are slowed down. This functionality is provided by appending a _p to the end of the function. There are also equivalents to memcpy. The ins() and outs() functions copy bytes, words or longs from and to the given port, respectively.
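The difference between a string port access and memcpy is that every element goes to, or comes from, the same port rather than successive addresses. A userspace sketch of that idea follows; struct toy_port_dev, its FIFO and the toy_ helpers are hypothetical and only mimic the shape of the byte-wide variants.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_FIFO_DEPTH 16

/* Toy device: every write to its single data port pushes into a FIFO. */
struct toy_port_dev {
    uint8_t fifo[TOY_FIFO_DEPTH];
    size_t  head;   /* next slot to read */
    size_t  tail;   /* next slot to write */
};

/* Single byte out to the port; silently dropped when the FIFO is full. */
static inline void toy_outb(struct toy_port_dev *dev, uint8_t value)
{
    if (dev->tail < TOY_FIFO_DEPTH)
        dev->fifo[dev->tail++] = value;
}

/* Single byte in from the port; 0xff when nothing is queued. */
static inline uint8_t toy_inb(struct toy_port_dev *dev)
{
    return dev->head < dev->tail ? dev->fifo[dev->head++] : 0xff;
}

/* String output: count bytes from memory, all written to the same port. */
static inline void toy_outsb(struct toy_port_dev *dev, const void *buf,
                             size_t count)
{
    const uint8_t *p = buf;

    while (count--)
        toy_outb(dev, *p++);
}

/* String input: count bytes read from the same port into memory. */
static inline void toy_insb(struct toy_port_dev *dev, void *buf, size_t count)
{
    uint8_t *p = buf;

    while (count--)
        *p++ = toy_inb(dev);
}
```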

Public Functions Provided

phys_addr_t virt_to_phys(volatile void *address)

map virtual addresses to physical

Parameters

volatile void *address

address to remap

The returned physical address is the physical (CPU) mapping for the memory address given. It is only valid to use this function on addresses directly mapped or allocated via kmalloc.

This function does not give bus mappings for DMA transfers. In almost all conceivable cases a device driver should not be using this function.

void *phys_to_virt(phys_addr_t address)

map physical address to virtual

Parameters

phys_addr_t address

address to remap

The returned virtual address is a current CPU mapping for the memory address given. It is only valid to use this function on addresses that have a kernel mapping.

This function does not handle bus mappings for DMA transfers. In almost all conceivable cases a device driver should not be using this function.

void __iomem *ioremap(resource_size_t offset, unsigned long size)

map bus memory into CPU space

Parameters

resource_size_t offset
bus address of the memory
unsigned long size
size of the resource to map

Description

ioremap performs a platform specific sequence of operations to make bus memory CPU accessible via the readb/readw/readl/writeb/writew/writel functions and the other mmio helpers. The returned address is not guaranteed to be usable directly as a virtual address.

If the area you are trying to map is a PCI BAR you should have a look at pci_iomap().

void iosubmit_cmds512(void __iomem *__dst, const void *src, size_t count)

copy data to single MMIO location, in 512-bit units

Parameters

void __iomem *__dst
destination, in MMIO space (must be 512-bit aligned)
const void *src
source
size_t count
number of 512-bit quantities to submit

Description

Submit data from kernel space to MMIO space, in units of 512 bits at a time. Order of access is not guaranteed, nor is a memory barrier performed afterwards.

Warning: Do not use this helper unless your driver has checked that the CPU instruction is supported on the platform.
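The "single MMIO location" aspect is easy to miss: every 512-bit unit goes to the same destination, and only the source pointer advances. A userspace sketch of that loop shape follows. struct sketch_portal and sketch_submit_cmds512() are hypothetical names, and memcpy() merely stands in for the CPU instruction the warning refers to, so none of the hardware guarantees are modelled.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_CMD_BYTES 64   /* 512 bits */

/* Toy device portal: one 64-byte location that consumes each submission. */
struct sketch_portal {
    uint8_t last_cmd[SKETCH_CMD_BYTES];  /* most recent command received */
    size_t  submitted;                   /* number of commands received */
};

/*
 * Submit count 512-bit commands from src to the same portal. Only the
 * source advances; the destination stays fixed for every unit.
 */
static inline void sketch_submit_cmds512(struct sketch_portal *dst,
                                         const void *src, size_t count)
{
    const uint8_t *from = src;
    const uint8_t *end = from + count * SKETCH_CMD_BYTES;

    while (from < end) {
        memcpy(dst->last_cmd, from, SKETCH_CMD_BYTES);
        dst->submitted++;
        from += SKETCH_CMD_BYTES;
    }
}
```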

void __iomem *pci_iomap_range(struct pci_dev * dev, int bar, unsigned long offset, unsigned long maxlen)

create a virtual mapping cookie for a PCI BAR

Parameters

struct pci_dev *dev
PCI device that owns the BAR
int bar
BAR number
unsigned long offset
map memory at the given offset in BAR
unsigned long maxlen
max length of the memory to map

Description

Using this function you will get a __iomem address to your device BAR. You can access it using ioread*() and iowrite*(). These functions hide the details of whether this is an MMIO or PIO address space and will just do what you expect from them in the correct way.

maxlen specifies the maximum length to map. If you want to get access to the complete BAR from offset to the end, pass 0 here.
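The interaction of offset and maxlen can be sketched in plain C. This models the rule as described above and is not the kernel's implementation: sketch_iomap_len() is a hypothetical helper, and returning 0 when offset lies at or beyond the end of the BAR is an assumption about how "nothing to map" is reported.

```c
#include <stdint.h>

/*
 * Return how many bytes would be mapped for a BAR of bar_len bytes,
 * starting at offset, with maxlen == 0 meaning "up to the end of the BAR".
 * Returns 0 when offset leaves nothing to map (assumed convention).
 */
static inline uint64_t sketch_iomap_len(uint64_t bar_len, uint64_t offset,
                                        uint64_t maxlen)
{
    uint64_t len;

    if (bar_len <= offset)
        return 0;

    len = bar_len - offset;        /* what remains after the offset */
    if (maxlen && len > maxlen)    /* a non-zero maxlen acts as a cap */
        len = maxlen;

    return len;
}
```

For a 4096-byte BAR, an offset of 1024 with maxlen 0 yields 3072 bytes, while the same offset with maxlen 256 yields 256 bytes.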

void __iomem *pci_iomap_wc_range(struct pci_dev * dev, int bar, unsigned long offset, unsigned long maxlen)

create a virtual WC mapping cookie for a PCI BAR

Parameters

struct pci_dev *dev
PCI device that owns the BAR
int bar
BAR number
unsigned long offset
map memory at the given offset in BAR
unsigned long maxlen
max length of the memory to map

Description

Using this function you will get a __iomem address to your device BAR. You can access it using ioread*() and iowrite*(). These functions hide the details of whether this is an MMIO or PIO address space and will just do what you expect from them in the correct way. When possible, write combining is used.

maxlen specifies the maximum length to map. If you want to get access to the complete BAR from offset to the end, pass 0 here.

void __iomem *pci_iomap(struct pci_dev * dev, int bar, unsigned long maxlen)

create a virtual mapping cookie for a PCI BAR

Parameters

struct pci_dev *dev
PCI device that owns the BAR
int bar
BAR number
unsigned long maxlen
length of the memory to map

Description

Using this function you will get a __iomem address to your device BAR. You can access it using ioread*() and iowrite*(). These functions hide the details of whether this is an MMIO or PIO address space and will just do what you expect from them in the correct way.

maxlen specifies the maximum length to map. If you want to get access to the complete BAR without checking for its length first, pass 0 here.

void __iomem *pci_iomap_wc(struct pci_dev * dev, int bar, unsigned long maxlen)

create a virtual WC mapping cookie for a PCI BAR

Parameters

struct pci_dev *dev
PCI device that owns the BAR
int bar
BAR number
unsigned long maxlen
length of the memory to map

Description

Using this function you will get a __iomem address to your device BAR. You can access it using ioread*() and iowrite*(). These functions hide the details of whether this is an MMIO or PIO address space and will just do what you expect from them in the correct way. When possible, write combining is used.

maxlen specifies the maximum length to map. If you want to get access to the complete BAR without checking for its length first, pass 0 here.