Userspace verbs access
The ib_uverbs module, built by enabling CONFIG_INFINIBAND_USER_VERBS, enables direct userspace access to IB hardware via “verbs,” as described in chapter 11 of the InfiniBand Architecture Specification.
To use the verbs, the libibverbs library, available from https://github.com/linux-rdma/rdma-core, is required. libibverbs contains a device-independent API for using the ib_uverbs interface. libibverbs also requires the appropriate device-dependent kernel and userspace drivers for your InfiniBand hardware. For example, to use a Mellanox HCA, you will need the ib_mthca kernel module and the libmthca userspace driver to be installed.
User-kernel communication
Userspace communicates with the kernel for slow path, resource management operations via the /dev/infiniband/uverbsN character devices. Fast path operations are typically performed by writing directly to hardware registers mmap()ed into userspace, with no system call or context switch into the kernel.
Commands are sent to the kernel via write()s on these device files. The ABI is defined in drivers/infiniband/include/ib_user_verbs.h. The structs for commands that require a response from the kernel contain a 64-bit field used to pass a pointer to an output buffer. Status is returned to userspace as the return value of the write() system call.
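The 64-bit response field can be illustrated with a small, self-contained sketch. Every name here (demo_cmd, demo_resp, demo_cmd_init and the helpers) is hypothetical and chosen for illustration only; the real command layouts are the ones in the ABI header named above. The sketch only shows how a userspace pointer can be carried in a fixed-width field so that the layout does not depend on the pointer size of the calling process.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical command layout, for illustration only. It is not the
 * real ib_uverbs ABI. */
struct demo_cmd {
    uint32_t command;   /* which operation is requested */
    uint16_t in_words;  /* size of this command, in 32-bit words */
    uint16_t out_words; /* size of the response buffer, in 32-bit words */
    uint64_t response;  /* userspace address of the response buffer */
};

/* Hypothetical response that the kernel would fill in. */
struct demo_resp {
    uint32_t handle;
    uint32_t reserved;
};

/* Store a pointer in a 64-bit field; works for 32- and 64-bit callers. */
static uint64_t ptr_to_u64(void *p)
{
    return (uint64_t)(uintptr_t)p;
}

/* Recover the pointer from the 64-bit field. */
static void *u64_to_ptr(uint64_t v)
{
    return (void *)(uintptr_t)v;
}

/* Prepare a command that asks for its result to be written to *resp. */
static void demo_cmd_init(struct demo_cmd *cmd, uint32_t op,
                          struct demo_resp *resp)
{
    memset(cmd, 0, sizeof(*cmd));
    cmd->command = op;
    cmd->in_words = sizeof(*cmd) / 4;
    cmd->out_words = sizeof(*resp) / 4;
    cmd->response = ptr_to_u64(resp);
}
```

In a real program the prepared command would then be passed to write() on the uverbs device file, and the return value of write() would carry the status.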
Resource management
Since creation and destruction of all IB resources is done by commands passed through a file descriptor, the kernel can keep track of which resources are attached to a given userspace context. The ib_uverbs module maintains idr tables that are used to translate between kernel pointers and opaque userspace handles, so that kernel pointers are never exposed to userspace and userspace cannot trick the kernel into following a bogus pointer.
This also allows the kernel to clean up when a process exits and prevent one process from touching another process’s resources.
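The handle translation described above can be sketched in plain C. This is not the kernel's idr code: a fixed-size array stands in for the idr table, and all names (demo_table, demo_alloc, demo_lookup, demo_release_owner) are hypothetical. The point is only that the caller holds a small integer, every lookup validates both the range and the owning context, and everything owned by a context can be released in one pass.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_HANDLES 64

struct demo_entry {
    void *obj;   /* internal object pointer; never handed to the caller */
    int   owner; /* context (for example, an open file) that created it */
};

struct demo_table {
    struct demo_entry slot[DEMO_MAX_HANDLES];
};

/* Register obj for owner and return an opaque handle, or -1 if full. */
static int demo_alloc(struct demo_table *t, int owner, void *obj)
{
    for (int i = 0; i < DEMO_MAX_HANDLES; i++) {
        if (t->slot[i].obj == NULL) {
            t->slot[i].obj = obj;
            t->slot[i].owner = owner;
            return i;
        }
    }
    return -1;
}

/* Translate a handle back to its object. Out-of-range handles, empty
 * slots, and handles owned by another context all yield NULL. */
static void *demo_lookup(const struct demo_table *t, int owner,
                         int64_t handle)
{
    if (handle < 0 || handle >= DEMO_MAX_HANDLES)
        return NULL;
    if (t->slot[handle].obj == NULL || t->slot[handle].owner != owner)
        return NULL;
    return t->slot[handle].obj;
}

/* Drop every entry created by owner, as would happen when its file
 * descriptor is closed. Returns the number of entries released. */
static int demo_release_owner(struct demo_table *t, int owner)
{
    int released = 0;

    for (int i = 0; i < DEMO_MAX_HANDLES; i++) {
        if (t->slot[i].obj != NULL && t->slot[i].owner == owner) {
            t->slot[i].obj = NULL;
            released++;
        }
    }
    return released;
}
```

Because the caller can only name a slot index, a bogus value fails the checks in demo_lookup() instead of being dereferenced.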
Memory pinning
Direct userspace I/O requires that memory regions that are potential I/O targets be kept resident at the same physical address. The ib_uverbs module manages pinning and unpinning memory regions via get_user_pages() and put_page() calls. It also accounts for the amount of memory pinned in the process’s pinned_vm, and checks that unprivileged processes do not exceed their RLIMIT_MEMLOCK limit.
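From userspace, the same limit can be inspected with getrlimit(). The following is a minimal sketch of the kind of check described above, not the kernel's implementation: demo_memlock_limit_pages() and demo_can_pin() are hypothetical helper names, and the comparison is done in whole pages as an assumption made for the example.

```c
#include <limits.h>
#include <sys/resource.h>
#include <unistd.h>

/* Read the soft RLIMIT_MEMLOCK limit and convert it to pages.
 * Returns 0 on success and -1 on error. An unlimited setting is
 * reported as ULONG_MAX. */
static int demo_memlock_limit_pages(unsigned long *limit_pages)
{
    struct rlimit rl;
    long page_size = sysconf(_SC_PAGESIZE);

    if (page_size <= 0 || getrlimit(RLIMIT_MEMLOCK, &rl) != 0)
        return -1;
    if (rl.rlim_cur == RLIM_INFINITY)
        *limit_pages = ULONG_MAX;
    else
        *limit_pages = (unsigned long)(rl.rlim_cur / (rlim_t)page_size);
    return 0;
}

/* Return 1 if pinning new_pages on top of pinned_pages stays within
 * limit_pages, and 0 otherwise (including on arithmetic overflow). */
static int demo_can_pin(unsigned long pinned_pages, unsigned long new_pages,
                        unsigned long limit_pages)
{
    unsigned long total = pinned_pages + new_pages;

    if (total < pinned_pages)
        return 0;
    return total <= limit_pages;
}
```

An application that registers large memory regions could call demo_memlock_limit_pages() at startup to report a too-small limit early, rather than discovering it when a registration fails.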
Pages that are pinned multiple times are counted each time they are pinned, so the value of pinned_vm may be an overestimate of the number of pages pinned by a process.
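A small worked example shows where the overestimate comes from. The sketch below is purely illustrative and uses hypothetical names (demo_region, demo_counted_pages, demo_distinct_pages): it compares a per-pin count, which is what the text above describes, with the number of distinct pages actually covered.

```c
#include <stddef.h>

/* A pinned range, expressed as a first page index and a page count. */
struct demo_region {
    unsigned long first_page;
    unsigned long npages;
};

/* Sum of the page counts of all regions: each pin is counted in full,
 * even if some of its pages are already pinned by another region. */
static unsigned long demo_counted_pages(const struct demo_region *r, size_t n)
{
    unsigned long total = 0;

    for (size_t i = 0; i < n; i++)
        total += r[i].npages;
    return total;
}

/* Number of distinct pages covered by the regions. A page is counted
 * only if no earlier region already contains it. Simple and slow,
 * which is acceptable for a demonstration. */
static unsigned long demo_distinct_pages(const struct demo_region *r, size_t n)
{
    unsigned long distinct = 0;

    for (size_t i = 0; i < n; i++) {
        unsigned long end = r[i].first_page + r[i].npages;

        for (unsigned long p = r[i].first_page; p < end; p++) {
            int seen = 0;

            for (size_t j = 0; j < i && !seen; j++) {
                if (p >= r[j].first_page &&
                    p < r[j].first_page + r[j].npages)
                    seen = 1;
            }
            if (!seen)
                distinct++;
        }
    }
    return distinct;
}
```

For two registrations covering pages 0 to 3 and pages 2 to 5, the per-pin count is 8 while only 6 distinct pages are pinned.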
/dev files
To create the appropriate character device files automatically with udev, a rule like:
    KERNEL=="uverbs*", NAME="infiniband/%k"
can be used. This will create device nodes named:
    /dev/infiniband/uverbs0
and so on. Since the InfiniBand userspace verbs should be safe for use by non-privileged processes, it may be useful to add an appropriate MODE or GROUP to the udev rule.
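As a sketch, a rule that also sets permissions might look like the following. The mode value and the group name rdma are assumptions chosen for illustration; pick values that match your site's policy and an existing group on your system.

```
KERNEL=="uverbs*", NAME="infiniband/%k", MODE="0660", GROUP="rdma"
```

With such a rule, members of the chosen group could open the uverbs device nodes without running as root.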