futex2

Author:

André Almeida <andrealmeid@collabora.com>

futex, or fast userspace mutex, is a set of syscalls that allows userspace to create performant synchronization mechanisms, such as mutexes, semaphores and condition variables. C standard libraries, like glibc, use it to implement higher-level interfaces like pthreads.

futex2 is a follow-up version of the initial futex syscall, designed to overcome limitations of the original interface.

User API

futex_waitv()

Wait on an array of futexes, wake on any:

    futex_waitv(struct futex_waitv *waiters, unsigned int nr_futexes,
                unsigned int flags, struct timespec *timeout, clockid_t clockid)

    struct futex_waitv {
          __u64 val;
          __u64 uaddr;
          __u32 flags;
          __u32 __reserved;
    };

Userspace sets up an array of struct futex_waitv (up to a maximum of 128 entries), using uaddr for the address to wait on, val for the expected value and flags to specify the type (e.g. private) and size of the futex. __reserved must be 0; it is kept for future extension. A pointer to the first item of the array is passed as waiters. An invalid address for waiters or for any uaddr returns -EFAULT.

If userspace has 32-bit pointers, it should do an explicit cast to make sure the upper bits are zeroed. Casting through uintptr_t does the trick and works for both 32-bit and 64-bit pointers.
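As a minimal sketch of the entry layout described above, the helper below fills one element of the array. The names struct waitv_entry and waitv_entry_init() are hypothetical and used only for illustration; the struct merely mirrors the field layout of struct futex_waitv, and real code would normally take the definition from <linux/futex.h>.

```c
#include <stdint.h>
#include <string.h>

/*
 * Local mirror of the struct futex_waitv layout (hypothetical name;
 * the kernel's own definition lives in <linux/futex.h>).
 */
struct waitv_entry {
	uint64_t val;		/* expected value at uaddr */
	uint64_t uaddr;		/* address of the futex word */
	uint32_t flags;		/* type and size of the futex */
	uint32_t reserved;	/* must be 0 */
};

/* Fill one entry: wait on *word for as long as it still holds `expected`. */
static void waitv_entry_init(struct waitv_entry *e, const uint32_t *word,
			     uint32_t expected, uint32_t flags)
{
	memset(e, 0, sizeof(*e));
	e->val = expected;
	/* Going through uintptr_t zero-extends a 32-bit pointer to 64 bits. */
	e->uaddr = (uint64_t)(uintptr_t)word;
	e->flags = flags;
	e->reserved = 0;
}
```

The cast through uintptr_t avoids sign extension, which a direct cast from a 32-bit pointer to a signed 64-bit integer type could introduce.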

nr_futexes specifies the size of the array. Values outside the interval [1, 128] make the syscall return -EINVAL.

The flags argument of the syscall must be 0; it is kept for future extension.

For each entry in the waiters array, the current value at uaddr is compared to val. If they differ, the syscall undoes all the work done so far and returns -EAGAIN. If all tests and verifications succeed, the syscall waits until one of the following happens:

  • The timeout expires, returning -ETIMEDOUT.

  • A signal is sent to the sleeping task, returning -ERESTARTSYS.

  • A futex in the list is woken, returning the index of one of the woken futexes.
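The possible outcomes listed above can be sorted out by the caller along these lines. This sketch assumes a libc-style calling convention in which the wrapper returns -1 and sets errno on failure; enum waitv_outcome and waitv_classify() are hypothetical names. It also assumes that an interrupted wait typically reaches userspace as EINTR, since ERESTARTSYS is a kernel-internal code.

```c
#include <errno.h>

/* Hypothetical classification of a futex_waitv() result. */
enum waitv_outcome {
	WAITV_WOKEN,		/* ret is the index of a woken futex */
	WAITV_VALUE_CHANGED,	/* some *uaddr != val, nothing was queued */
	WAITV_TIMED_OUT,	/* the absolute timeout expired */
	WAITV_INTERRUPTED,	/* a signal arrived while sleeping */
	WAITV_ERROR		/* any other failure (EINVAL, EFAULT, ...) */
};

static enum waitv_outcome waitv_classify(long ret, int err)
{
	if (ret >= 0)
		return WAITV_WOKEN;

	switch (err) {
	case EAGAIN:
		return WAITV_VALUE_CHANGED;
	case ETIMEDOUT:
		return WAITV_TIMED_OUT;
	case EINTR:
		return WAITV_INTERRUPTED;
	default:
		return WAITV_ERROR;
	}
}
```

A caller would usually re-read the futex words and retry on WAITV_VALUE_CHANGED or WAITV_INTERRUPTED, and use the returned index on WAITV_WOKEN only as a hint of which futex to inspect first.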

An example of how to use the interface can be found at tools/testing/selftests/futex/functional/futex_waitv.c.

Timeout

The struct timespec *timeout argument is optional and points to an absolute timeout. The type of clock being used must be specified in the clockid argument. CLOCK_MONOTONIC and CLOCK_REALTIME are supported. This syscall accepts only 64-bit timespec structs.

Types of futex

A futex can be either private or shared. A private futex is used by processes that share the same memory space, so the virtual address of the futex is the same for all of them. This allows for optimizations in the kernel. To use private futexes, it is necessary to specify FUTEX_PRIVATE_FLAG in the futex flags. Processes that do not share the same memory space can have different virtual addresses for the same futex (when using, for instance, file-backed shared memory), which requires different internal mechanisms to enqueue them properly. This shared mode is the default behavior, and it works with both private and shared futexes.

Futexes can be of different sizes: 8, 16, 32 or 64 bits. Currently, the only supported size is 32 bits, and it needs to be specified using the FUTEX_32 flag.
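Putting the type and size rules together, the per-entry flags value could be built as sketched below. waitv_entry_flags() is a hypothetical helper; the fallback definition of FUTEX_32 is an assumption for building against kernel headers that predate futex2, and uses the value from the kernel's uapi header.

```c
#include <stdbool.h>
#include <stdint.h>
#include <linux/futex.h>

#ifndef FUTEX_32
#define FUTEX_32 2	/* fallback for headers that predate futex2 */
#endif

/*
 * Build the flags field of one struct futex_waitv entry.
 * The size flag is always required; the private flag is an opt-in
 * optimization for futexes used within a single memory space.
 */
static uint32_t waitv_entry_flags(bool process_private)
{
	uint32_t flags = FUTEX_32;

	if (process_private)
		flags |= FUTEX_PRIVATE_FLAG;
	return flags;
}
```

When in doubt about whether a futex word may be reached through a shared mapping, leaving FUTEX_PRIVATE_FLAG unset is the safe choice, since the default mode works in both cases.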