NT synchronization primitive driver

This page documents the user-space API for the ntsync driver.

ntsync is a support driver for emulation of NT synchronization primitives by user-space NT emulators. It exists because implementation in user-space, using existing tools, cannot match Windows performance while offering accurate semantics. It is implemented entirely in software, and does not drive any hardware device.

This interface is meant as a compatibility tool only, and should not be used for general synchronization. Instead use generic, versatile interfaces such as futex(2) and poll(2).

Synchronization primitives

The ntsync driver exposes three types of synchronization primitives: semaphores, mutexes, and events.

A semaphore holds a single volatile 32-bit counter, and a static 32-bit integer denoting the maximum value. It is considered signaled (that is, can be acquired without contention, or will wake up a waiting thread) when the counter is nonzero. The counter is decremented by one when a wait is satisfied. Both the initial and maximum count are established when the semaphore is created.

A mutex holds a volatile 32-bit recursion count, and a volatile 32-bit identifier denoting its owner. A mutex is considered signaled when its owner is zero (indicating that it is not owned). The recursion count is incremented when a wait is satisfied, and ownership is set to the given identifier.

A mutex also holds an internal flag denoting whether its previous owner has died; such a mutex is said to be abandoned. Owner death is not tracked automatically based on thread death, but rather must be communicated using NTSYNC_IOC_MUTEX_KILL. An abandoned mutex is inherently considered unowned.

Except for the “unowned” semantics of zero, the actual value of the owner identifier is not interpreted by the ntsync driver at all. The intended use is to store a thread identifier; however, the ntsync driver does not actually validate that a calling thread provides consistent or unique identifiers.

An event is similar to a semaphore with a maximum count of one. It holds a volatile boolean state denoting whether it is signaled or not. There are two types of events, auto-reset and manual-reset. An auto-reset event is designaled when a wait is satisfied; a manual-reset event is not. The event type is specified when the event is created.

Unless specified otherwise, all operations on an object are atomic and totally ordered with respect to other operations on the same object.

Objects are represented by files. When all file descriptors to an object are closed, that object is deleted.

Char device

The ntsync driver creates a single char device /dev/ntsync. Each file description opened on the device represents a unique instance intended to back an individual NT virtual machine. Objects created by one ntsync instance may only be used with other objects created by the same instance.

ioctl reference

All operations on the device are done through ioctls. There are four structures used in ioctl calls:

struct ntsync_sem_args {
    __u32 count;
    __u32 max;
};

struct ntsync_mutex_args {
    __u32 owner;
    __u32 count;
};

struct ntsync_event_args {
    __u32 signaled;
    __u32 manual;
};

struct ntsync_wait_args {
    __u64 timeout;
    __u64 objs;
    __u32 count;
    __u32 owner;
    __u32 index;
    __u32 alert;
    __u32 flags;
    __u32 pad;
};

Depending on the ioctl, members of the structure may be used as input, output, or not at all.

The ioctls on the device file are as follows:

NTSYNC_IOC_CREATE_SEM

Create a semaphore object. Takes a pointer to struct ntsync_sem_args, which is used as follows:

count

Initial count of the semaphore.

max

Maximum count of the semaphore.

Fails with EINVAL if count is greater than max. On success, returns a file descriptor for the created semaphore.

NTSYNC_IOC_CREATE_MUTEX

Create a mutex object. Takes a pointer to struct ntsync_mutex_args, which is used as follows:

count

Initial recursion count of the mutex.

owner

Initial owner of the mutex.

If owner is nonzero and count is zero, or if owner is zero and count is nonzero, the function fails with EINVAL. On success, returns a file descriptor for the created mutex.
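The argument checks described for semaphore and mutex creation can be sketched in user space as follows. This is only an illustrative model, not the driver's code: the structure names, the helper names check_sem_args() and check_mutex_args(), and the use of <stdint.h> types in place of __u32 are assumptions made so that the sketch builds without kernel headers.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirrors of the argument structures (hypothetical names). */
struct sem_args   { uint32_t count; uint32_t max; };
struct mutex_args { uint32_t owner; uint32_t count; };

/* Hypothetical helper: 0 if the arguments would be accepted, -EINVAL if not. */
static int check_sem_args(const struct sem_args *a)
{
    assert(a != NULL);
    /* The initial count may not exceed the maximum count. */
    return a->count > a->max ? -EINVAL : 0;
}

/* Hypothetical helper: owner and count must be both zero or both nonzero. */
static int check_mutex_args(const struct mutex_args *a)
{
    assert(a != NULL);
    return (a->owner == 0) != (a->count == 0) ? -EINVAL : 0;
}
```

An emulator could run such checks before issuing the ioctl, for instance to report a more specific error to its own caller.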

NTSYNC_IOC_CREATE_EVENT

Create an event object. Takes a pointer to struct ntsync_event_args, which is used as follows:

signaled

If nonzero, the event is initially signaled, otherwise nonsignaled.

manual

If nonzero, the event is a manual-reset event, otherwise auto-reset.

On success, returns a file descriptor for the created event.

The ioctls on the individual objects are as follows:

NTSYNC_IOC_SEM_POST

Post to a semaphore object. Takes a pointer to a 32-bit integer, which on input holds the count to be added to the semaphore, and on output contains its previous count.

If adding to the semaphore’s current count would raise the latter past the semaphore’s maximum count, the ioctl fails with EOVERFLOW and the semaphore is not affected. If raising the semaphore’s count causes it to become signaled, eligible threads waiting on this semaphore will be woken and the semaphore’s count decremented appropriately.
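The overflow rule for posting can be illustrated with a small user-space model. It is a sketch under stated assumptions, not the driver's implementation: struct model_sem and model_sem_post() are hypothetical names, the model assumes count never exceeds max, and it does not model the waking of waiters.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a semaphore's state. */
struct model_sem {
    uint32_t count;
    uint32_t max;
};

/*
 * *val holds the amount to add on input and the previous count on output,
 * mirroring the single in/out integer described in the text.
 */
static int model_sem_post(struct model_sem *sem, uint32_t *val)
{
    uint32_t add;

    assert(sem != NULL && val != NULL);
    add = *val;

    /* Phrased as a subtraction so that count + add cannot wrap around. */
    if (add > sem->max - sem->count)
        return -EOVERFLOW;      /* the semaphore is left untouched */

    *val = sem->count;
    sem->count += add;
    return 0;
}
```

Comparing against max - count rather than computing count + add is a deliberate choice in this sketch: it keeps the check correct even when the sum would exceed the range of a 32-bit integer.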

NTSYNC_IOC_MUTEX_UNLOCK

Release a mutex object. Takes a pointer to struct ntsync_mutex_args, which is used as follows:

owner

Specifies the owner trying to release this mutex.

count

On output, contains the previous recursion count.

If owner is zero, the ioctl fails with EINVAL. If owner is not the current owner of the mutex, the ioctl fails with EPERM.

The mutex’s count will be decremented by one. If decrementing the mutex’s count causes it to become zero, the mutex is marked as unowned and signaled, and eligible threads waiting on it will be woken as appropriate.
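The unlock rules above can be summarized in a short user-space model. The names struct model_mutex and model_mutex_unlock() are hypothetical, and the sketch covers only the state changes and error codes described in the text, not the waking of waiters.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a mutex's state; owner 0 means "unowned". */
struct model_mutex {
    uint32_t owner;
    uint32_t count;   /* recursion count */
};

static int model_mutex_unlock(struct model_mutex *m, uint32_t owner,
                              uint32_t *prev_count)
{
    assert(m != NULL && prev_count != NULL);

    if (owner == 0)
        return -EINVAL;         /* zero is never a valid owner */
    if (m->owner != owner)
        return -EPERM;          /* only the current owner may release */

    *prev_count = m->count;
    if (--m->count == 0)
        m->owner = 0;           /* fully released: now unowned and signaled */
    return 0;
}
```

Note that a mutex acquired recursively stays owned until the recursion count drops to zero, so one unlock per successful acquisition is needed.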

NTSYNC_IOC_SET_EVENT

Signal an event object. Takes a pointer to a 32-bit integer, which on output contains the previous state of the event.

Eligible threads will be woken, and auto-reset events will be designaled appropriately.

NTSYNC_IOC_RESET_EVENT

Designal an event object. Takes a pointer to a 32-bit integer, which on output contains the previous state of the event.

NTSYNC_IOC_PULSE_EVENT

Wake threads waiting on an event object while leaving it in an unsignaled state. Takes a pointer to a 32-bit integer, which on output contains the previous state of the event.

A pulse operation can be thought of as a set followed by a reset, performed as a single atomic operation. If two threads are waiting on an auto-reset event which is pulsed, only one will be woken. If two threads are waiting on a manual-reset event which is pulsed, both will be woken. However, in both cases, the event will be unsignaled afterwards, and a simultaneous read operation will always report the event as unsignaled.
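The difference between setting and pulsing an event can be illustrated with a single-threaded model that tracks a waiter count. This is a sketch only: struct model_event and the model_event_*() functions are hypothetical names, and real atomicity with respect to concurrent readers is not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of an event and the threads blocked on it. */
struct model_event {
    uint32_t signaled;  /* 0 or 1 */
    uint32_t manual;    /* nonzero for a manual-reset event */
    uint32_t waiters;   /* number of threads currently waiting */
};

/* Wake eligible waiters of a signaled event; returns how many woke. */
static uint32_t wake_waiters(struct model_event *ev)
{
    uint32_t woken = 0;

    if (ev->manual) {
        /* A manual-reset event satisfies every waiter and stays signaled. */
        woken = ev->waiters;
        ev->waiters = 0;
    } else if (ev->waiters > 0) {
        /* An auto-reset event satisfies one waiter and is designaled. */
        woken = 1;
        ev->waiters--;
        ev->signaled = 0;
    }
    return woken;
}

/* Set: signal the event, report the previous state, wake waiters. */
static uint32_t model_event_set(struct model_event *ev, uint32_t *prev)
{
    assert(ev != NULL && prev != NULL);
    *prev = ev->signaled;
    ev->signaled = 1;
    return wake_waiters(ev);
}

/* Pulse: like set, but the event always ends up unsignaled. */
static uint32_t model_event_pulse(struct model_event *ev, uint32_t *prev)
{
    uint32_t woken = model_event_set(ev, prev);

    ev->signaled = 0;
    return woken;
}
```

In this model, pulsing an auto-reset event with two waiters wakes one of them, while pulsing a manual-reset event with two waiters wakes both; either way the event is left unsignaled.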

NTSYNC_IOC_READ_SEM

Read the current state of a semaphore object. Takes a pointer to struct ntsync_sem_args, which is used as follows:

count

On output, contains the current count of the semaphore.

max

On output, contains the maximum count of the semaphore.

NTSYNC_IOC_READ_MUTEX

Read the current state of a mutex object. Takes a pointer to struct ntsync_mutex_args, which is used as follows:

owner

On output, contains the current owner of the mutex, or zero if the mutex is not currently owned.

count

On output, contains the current recursion count of the mutex.

If the mutex is marked as abandoned, the function fails with EOWNERDEAD. In this case, count and owner are set to zero.

NTSYNC_IOC_READ_EVENT

Read the current state of an event object. Takes a pointer to struct ntsync_event_args, which is used as follows:

signaled

On output, contains the current state of the event.

manual

On output, contains 1 if the event is a manual-reset event, and 0 otherwise.

NTSYNC_IOC_KILL_OWNER

Mark a mutex as unowned and abandoned if it is owned by the given owner. Takes an input-only pointer to a 32-bit integer denoting the owner. If the owner is zero, the ioctl fails with EINVAL. If the owner does not own the mutex, the function fails with EPERM.

Eligible threads waiting on the mutex will be woken as appropriate (and such waits will fail with EOWNERDEAD, as described below).

NTSYNC_IOC_WAIT_ANY

Poll on any of a list of objects, atomically acquiring at most one. Takes a pointer to struct ntsync_wait_args, which is used as follows:

timeout

Absolute timeout in nanoseconds. If NTSYNC_WAIT_REALTIME is set, the timeout is measured against the REALTIME clock; otherwise it is measured against the MONOTONIC clock. If the timeout is equal to or earlier than the current time, the function returns immediately without sleeping. If timeout is U64_MAX, the function will sleep until an object is signaled, and will not fail with ETIMEDOUT.

objs

Pointer to an array of count file descriptors (specified as an integer so that the structure has the same size regardless of architecture). If any object is invalid, the function fails with EINVAL.

count

Number of objects specified in the objs array. If greater than NTSYNC_MAX_WAIT_COUNT, the function fails with EINVAL.

owner

Mutex owner identifier. If any object in objs is a mutex, the ioctl will attempt to acquire that mutex on behalf of owner. If owner is zero, the ioctl fails with EINVAL.

index

On success, contains the index (into objs) of the object which was signaled. If alert was signaled instead, this contains count.

alert

Optional event object file descriptor. If nonzero, this specifies an “alert” event object which, if signaled, will terminate the wait. If nonzero, the identifier must point to a valid event.

flags

Zero or more flags. Currently the only flag is NTSYNC_WAIT_REALTIME, which causes the timeout to be measured against the REALTIME clock instead of MONOTONIC.

pad

Unused, must be set to zero.
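The field descriptions above can be made concrete with a sketch of how a caller might prepare the wait arguments. Several things here are assumptions for illustration: struct wait_args mirrors struct ntsync_wait_args using <stdint.h> types so that the sketch builds without kernel headers, deadline_after_ns() and fill_wait_args() are hypothetical helpers, and the descriptor array is assumed to hold plain int values, which the text does not spell out.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Mirror of struct ntsync_wait_args using <stdint.h> types (assumption). */
struct wait_args {
    uint64_t timeout;
    uint64_t objs;
    uint32_t count;
    uint32_t owner;
    uint32_t index;
    uint32_t alert;
    uint32_t flags;
    uint32_t pad;
};

/* Absolute CLOCK_MONOTONIC deadline, rel_ns nanoseconds from now. */
static uint64_t deadline_after_ns(uint64_t rel_ns)
{
    struct timespec now;

    clock_gettime(CLOCK_MONOTONIC, &now);
    return (uint64_t)now.tv_sec * 1000000000ull
           + (uint64_t)now.tv_nsec + rel_ns;
}

/* Fill the structure for a wait with no alert event and no flags. */
static void fill_wait_args(struct wait_args *args, const int *fds,
                           uint32_t count, uint32_t owner, uint64_t timeout)
{
    assert(args != NULL && fds != NULL);

    memset(args, 0, sizeof(*args));         /* zeroes index, alert, flags, pad */
    args->timeout = timeout;                /* absolute, in nanoseconds */
    args->objs = (uint64_t)(uintptr_t)fds;  /* pointer carried as an integer */
    args->count = count;
    args->owner = owner;
}
```

A caller wanting a 5 ms timeout against the MONOTONIC clock would pass deadline_after_ns(5000000) as the timeout, while UINT64_MAX requests an unbounded wait. Zeroing the whole structure first is a simple way to satisfy the requirement that pad be zero.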

This function attempts to acquire one of the given objects. If unable to do so, it sleeps until an object becomes signaled, subsequently acquiring it, or the timeout expires. In the latter case the ioctl fails with ETIMEDOUT. The function only acquires one object, even if multiple objects are signaled.

A semaphore is considered to be signaled if its count is nonzero, and is acquired by decrementing its count by one. A mutex is considered to be signaled if it is unowned or if its owner matches the owner argument, and is acquired by incrementing its recursion count by one and setting its owner to the owner argument. An auto-reset event is acquired by designaling it; a manual-reset event is not affected by acquisition.

Acquisition is atomic and totally ordered with respect to other operations on the same object. If two wait operations (with different owner identifiers) are queued on the same mutex, only one is signaled. If two wait operations are queued on the same semaphore, and a value of one is posted to it, only one is signaled.

If an abandoned mutex is acquired, the ioctl fails with EOWNERDEAD. Although this is a failure return, the function may otherwise be considered successful. The mutex is marked as owned by the given owner (with a recursion count of 1) and as no longer abandoned, and index is still set to the index of the mutex.

The alert argument is an “extra” event which can terminate the wait, independently of all other objects.

It is valid to pass the same object more than once, including by passing the same event in the objs array and in alert. If a wakeup occurs due to that object being signaled, index is set to the lowest index corresponding to that object.

The function may fail with EINTR if a signal is received.
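The acquisition rules for an “any” wait can be modeled in user space as shown below. This is an illustrative, non-blocking sketch rather than the driver's algorithm: all names are hypothetical, the alert event and sleeping are not modeled (the case where nothing is signaled is reported as an already expired timeout), a nonzero owner is assumed, and scanning in array order is an assumption of the model.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

enum model_type { MODEL_SEM, MODEL_MUTEX, MODEL_EVENT };

/* Hypothetical model of one object of any of the three types. */
struct model_obj {
    enum model_type type;
    uint32_t count;     /* semaphore count, or mutex recursion count */
    uint32_t owner;     /* mutex only; 0 means unowned */
    int abandoned;      /* mutex only */
    int signaled;       /* event only */
    int manual;         /* event only */
};

static int is_signaled(const struct model_obj *o, uint32_t owner)
{
    switch (o->type) {
    case MODEL_SEM:   return o->count != 0;
    case MODEL_MUTEX: return o->owner == 0 || o->owner == owner;
    case MODEL_EVENT: return o->signaled;
    }
    return 0;
}

/* Acquire a signaled object; returns 0, or -EOWNERDEAD for an abandoned mutex. */
static int acquire(struct model_obj *o, uint32_t owner)
{
    int ret = 0;

    switch (o->type) {
    case MODEL_SEM:
        o->count--;
        break;
    case MODEL_MUTEX:
        if (o->abandoned) {
            o->abandoned = 0;
            ret = -EOWNERDEAD;  /* reported, but the mutex is still acquired */
        }
        o->count++;
        o->owner = owner;
        break;
    case MODEL_EVENT:
        if (!o->manual)
            o->signaled = 0;    /* only auto-reset events are designaled */
        break;
    }
    return ret;
}

/* Acquire at most one object; *index is set whenever one is acquired. */
static int model_wait_any(struct model_obj **objs, uint32_t n,
                          uint32_t owner, uint32_t *index)
{
    assert(objs != NULL && index != NULL && owner != 0);

    for (uint32_t i = 0; i < n; i++) {
        if (is_signaled(objs[i], owner)) {
            *index = i;
            return acquire(objs[i], owner);
        }
    }
    return -ETIMEDOUT;
}
```

A caller following this model would treat both 0 and -EOWNERDEAD as "an object was acquired, see index", and only then decide how to report the abandoned state.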

NTSYNC_IOC_WAIT_ALL

Poll on a list of objects, atomically acquiring all of them. Takes a pointer to struct ntsync_wait_args, which is used identically to NTSYNC_IOC_WAIT_ANY, except that index is always filled with zero on success if not woken via alert.

This function attempts to simultaneously acquire all of the given objects. If unable to do so, it sleeps until all objects become simultaneously signaled, subsequently acquiring them, or the timeout expires. In the latter case the ioctl fails with ETIMEDOUT and no objects are modified.

Objects may become signaled and subsequently designaled (through acquisition by other threads) while this thread is sleeping. Only once all objects are simultaneously signaled does the ioctl acquire them and return. The entire acquisition is atomic and totally ordered with respect to other operations on any of the given objects.

If an abandoned mutex is acquired, the ioctl fails with EOWNERDEAD. Similarly to NTSYNC_IOC_WAIT_ANY, all objects are nevertheless marked as acquired. Note that if multiple mutex objects are specified, there is no way to know which were marked as abandoned.

As with “any” waits, the alert argument is an “extra” event which can terminate the wait. Critically, however, an “all” wait will succeed if all members in objs are signaled, or if alert is signaled. In the latter case index will be set to count. As with “any” waits, if both conditions are met, the former takes priority, and objects in objs will be acquired.

Unlike NTSYNC_IOC_WAIT_ANY, it is not valid to pass the same object more than once, nor is it valid to pass the same object in objs and in alert. If this is attempted, the function fails with EINVAL.
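Two properties of “all” waits can be sketched in user space: the rejection of duplicate objects and the all-or-nothing acquisition. The sketch below is a simplified model under stated assumptions: check_wait_all_fds() and model_wait_all() are hypothetical names, duplicates are detected by comparing descriptor numbers (two different descriptors referring to the same object are not detected here), only semaphores are modeled, and the case where not every object is signaled is reported as an already expired timeout instead of sleeping.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a semaphore; only the count matters here. */
struct model_sem {
    uint32_t count;
};

/* -EINVAL if a descriptor repeats in fds or also appears as a nonzero alert. */
static int check_wait_all_fds(const int *fds, uint32_t n, int alert)
{
    assert(fds != NULL);

    for (uint32_t i = 0; i < n; i++) {
        if (alert != 0 && fds[i] == alert)
            return -EINVAL;
        for (uint32_t j = i + 1; j < n; j++) {
            if (fds[i] == fds[j])
                return -EINVAL;
        }
    }
    return 0;
}

/* Acquire every semaphore, or none of them. */
static int model_wait_all(struct model_sem **sems, uint32_t n)
{
    assert(sems != NULL);

    /* First pass: verify that every object is signaled; modify nothing. */
    for (uint32_t i = 0; i < n; i++) {
        if (sems[i]->count == 0)
            return -ETIMEDOUT;
    }

    /* Second pass: all are signaled, so acquire them together. */
    for (uint32_t i = 0; i < n; i++)
        sems[i]->count--;
    return 0;
}
```

The two-pass structure is what guarantees that a failed attempt leaves every object unmodified; the driver additionally makes the whole step atomic with respect to other operations, which a single-threaded model does not need to address.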