NT synchronization primitive driver
This page documents the user-space API for the ntsync driver.
ntsync is a support driver for emulation of NT synchronization primitives by user-space NT emulators. It exists because implementation in user-space, using existing tools, cannot match Windows performance while offering accurate semantics. It is implemented entirely in software, and does not drive any hardware device.
This interface is meant as a compatibility tool only, and should not be used for general synchronization. Instead use generic, versatile interfaces such as futex(2) and poll(2).
Synchronization primitives
The ntsync driver exposes three types of synchronization primitives: semaphores, mutexes, and events.
A semaphore holds a single volatile 32-bit counter, and a static 32-bit integer denoting the maximum value. It is considered signaled (that is, can be acquired without contention, or will wake up a waiting thread) when the counter is nonzero. The counter is decremented by one when a wait is satisfied. Both the initial and maximum count are established when the semaphore is created.
A mutex holds a volatile 32-bit recursion count, and a volatile 32-bit identifier denoting its owner. A mutex is considered signaled when its owner is zero (indicating that it is not owned). The recursion count is incremented when a wait is satisfied, and ownership is set to the given identifier.
A mutex also holds an internal flag denoting whether its previous owner has died; such a mutex is said to be abandoned. Owner death is not tracked automatically based on thread death, but rather must be communicated using NTSYNC_IOC_MUTEX_KILL. An abandoned mutex is inherently considered unowned.
Except for the “unowned” semantics of zero, the actual value of the owner identifier is not interpreted by the ntsync driver at all. The intended use is to store a thread identifier; however, the ntsync driver does not actually validate that a calling thread provides consistent or unique identifiers.
An event is similar to a semaphore with a maximum count of one. It holds a volatile boolean state denoting whether it is signaled or not. There are two types of events, auto-reset and manual-reset. An auto-reset event is designaled when a wait is satisfied; a manual-reset event is not. The event type is specified when the event is created.
Unless specified otherwise, all operations on an object are atomic and totally ordered with respect to other operations on the same object.
Objects are represented by files. When all file descriptors to an object are closed, that object is deleted.
Char device
The ntsync driver creates a single char device /dev/ntsync. Each file description opened on the device represents a unique instance intended to back an individual NT virtual machine. Objects created by one ntsync instance may only be used with other objects created by the same instance.
ioctl reference
All operations on the device are done through ioctls. There are four structures used in ioctl calls:

    struct ntsync_sem_args {
        __u32 count;
        __u32 max;
    };

    struct ntsync_mutex_args {
        __u32 owner;
        __u32 count;
    };

    struct ntsync_event_args {
        __u32 signaled;
        __u32 manual;
    };

    struct ntsync_wait_args {
        __u64 timeout;
        __u64 objs;
        __u32 count;
        __u32 owner;
        __u32 index;
        __u32 alert;
        __u32 flags;
        __u32 pad;
    };

Depending on the ioctl, members of the structure may be used as input, output, or not at all.
The ioctls on the device file are as follows:
- NTSYNC_IOC_CREATE_SEM
  Create a semaphore object. Takes a pointer to struct ntsync_sem_args, which is used as follows:
  - count: Initial count of the semaphore.
  - max: Maximum count of the semaphore.
  Fails with EINVAL if count is greater than max.
  On success, returns a file descriptor for the created semaphore.
- NTSYNC_IOC_CREATE_MUTEX
  Create a mutex object. Takes a pointer to struct ntsync_mutex_args, which is used as follows:
  - count: Initial recursion count of the mutex.
  - owner: Initial owner of the mutex.
  If owner is nonzero and count is zero, or if owner is zero and count is nonzero, the function fails with EINVAL.
  On success, returns a file descriptor for the created mutex.
- NTSYNC_IOC_CREATE_EVENT
  Create an event object. Takes a pointer to struct ntsync_event_args, which is used as follows:
  - signaled: If nonzero, the event is initially signaled, otherwise nonsignaled.
  - manual: If nonzero, the event is a manual-reset event, otherwise auto-reset.
  On success, returns a file descriptor for the created event.
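The creation ioctls might be used roughly as in the following sketch, which opens an instance and creates one semaphore. It assumes the structures and ioctl numbers are provided by a UAPI header installed as <linux/ntsync.h> (this page does not name the header), and the helper names ntsync_create_sem_sketch and ntsync_demo_sketch are hypothetical. Where the header is not installed, the helper simply reports ENOSYS.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Assumption: structures and ioctl numbers come from <linux/ntsync.h>.
 * If that header is not installed, the helper below returns -ENOSYS. */
#if defined(__has_include)
#  if __has_include(<linux/ntsync.h>)
#    include <linux/ntsync.h>
#    define HAVE_NTSYNC_HEADER 1
#  endif
#endif

/* Hypothetical helper: create a semaphore on an open /dev/ntsync
 * descriptor. Returns the new object's fd, or a negative errno value. */
static int ntsync_create_sem_sketch(int dev_fd, uint32_t count, uint32_t max)
{
    assert(dev_fd >= 0);
#ifdef HAVE_NTSYNC_HEADER
    struct ntsync_sem_args args = { .count = count, .max = max };
    int fd = ioctl(dev_fd, NTSYNC_IOC_CREATE_SEM, &args);
    return fd < 0 ? -errno : fd;
#else
    (void)count;
    (void)max;
    return -ENOSYS;
#endif
}

/* Hypothetical helper: open an instance, create one semaphore, clean up.
 * Returns 0 on success or a negative errno value. */
static int ntsync_demo_sketch(uint32_t count, uint32_t max)
{
    int dev_fd = open("/dev/ntsync", O_RDONLY);
    if (dev_fd < 0)
        return -errno;          /* driver not loaded, or no permission */

    int sem_fd = ntsync_create_sem_sketch(dev_fd, count, max);
    if (sem_fd >= 0)
        close(sem_fd);          /* closing the last descriptor deletes the object */
    close(dev_fd);
    return sem_fd < 0 ? sem_fd : 0;
}
```

Calling ntsync_demo_sketch(3, 2) is expected to fail in every environment: with EINVAL where the driver is available (count exceeds max), and with an open or ENOSYS error elsewhere.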
The ioctls on the individual objects are as follows:
- NTSYNC_IOC_SEM_POST
  Post to a semaphore object. Takes a pointer to a 32-bit integer, which on input holds the count to be added to the semaphore, and on output contains its previous count.
  If adding to the semaphore’s current count would raise the latter past the semaphore’s maximum count, the ioctl fails with EOVERFLOW and the semaphore is not affected. If raising the semaphore’s count causes it to become signaled, eligible threads waiting on this semaphore will be woken and the semaphore’s count decremented appropriately.
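The overflow rule for posting can be illustrated with a small user-space model. This is only a sketch of the documented behaviour, not the driver's code; struct sem_model and sem_model_post are hypothetical names, and waking of waiters is deliberately left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space model of a semaphore; not the driver's code. */
struct sem_model {
    uint32_t count;  /* current count; the semaphore is signaled while nonzero */
    uint32_t max;    /* maximum count, fixed at creation (count <= max) */
};

/* Model of the post operation: add `add` to the count, or fail with
 * -EOVERFLOW and leave the semaphore untouched. On success, *prev
 * receives the count before the addition. */
static int sem_model_post(struct sem_model *sem, uint32_t add, uint32_t *prev)
{
    assert(sem != NULL && prev != NULL);

    /* Compare against the remaining headroom so count + add cannot wrap. */
    if (add > sem->max - sem->count)
        return -EOVERFLOW;

    *prev = sem->count;
    sem->count += add;
    return 0;
}
```

Comparing against the remaining headroom, rather than computing the sum first, keeps the check correct even when the posted value is close to the 32-bit limit.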
- NTSYNC_IOC_MUTEX_UNLOCK
  Release a mutex object. Takes a pointer to struct ntsync_mutex_args, which is used as follows:
  - owner: Specifies the owner trying to release this mutex.
  - count: On output, contains the previous recursion count.
  If owner is zero, the ioctl fails with EINVAL. If owner is not the current owner of the mutex, the ioctl fails with EPERM.
  The mutex’s count will be decremented by one. If decrementing the mutex’s count causes it to become zero, the mutex is marked as unowned and signaled, and eligible threads waiting on it will be woken as appropriate.
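The recursion and ownership rules can be modelled in a few lines of user-space C. The sketch below is an illustration of the documented semantics only; struct mutex_model and both functions are hypothetical, and the -EAGAIN return stands in for the point where a real wait would sleep.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space model of a mutex; not the driver's code. */
struct mutex_model {
    uint32_t owner;  /* zero means unowned, i.e. signaled */
    uint32_t count;  /* recursion count */
};

/* Model of a satisfied wait: the mutex is acquirable when it is unowned
 * or already owned by `owner`. */
static int mutex_model_try_acquire(struct mutex_model *m, uint32_t owner)
{
    assert(m != NULL);
    if (owner == 0)
        return -EINVAL;
    if (m->owner != 0 && m->owner != owner)
        return -EAGAIN;  /* model only: a real wait would sleep here */
    m->owner = owner;
    m->count++;
    return 0;
}

/* Model of the unlock operation. On success, *prev_count receives the
 * recursion count before the decrement. */
static int mutex_model_unlock(struct mutex_model *m, uint32_t owner,
                              uint32_t *prev_count)
{
    assert(m != NULL && prev_count != NULL);
    if (owner == 0)
        return -EINVAL;
    if (m->owner != owner)
        return -EPERM;
    *prev_count = m->count;
    if (--m->count == 0)
        m->owner = 0;    /* fully released: unowned and signaled again */
    return 0;
}
```

An owner that acquired the mutex twice must unlock it twice before another owner can take it; the first unlock reports a previous count of 2 and leaves ownership unchanged.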
- NTSYNC_IOC_SET_EVENT
  Signal an event object. Takes a pointer to a 32-bit integer, which on output contains the previous state of the event.
  Eligible threads will be woken, and auto-reset events will be designaled appropriately.
- NTSYNC_IOC_RESET_EVENT
  Designal an event object. Takes a pointer to a 32-bit integer, which on output contains the previous state of the event.
- NTSYNC_IOC_PULSE_EVENT
  Wake threads waiting on an event object while leaving it in an unsignaled state. Takes a pointer to a 32-bit integer, which on output contains the previous state of the event.
  A pulse operation can be thought of as a set followed by a reset, performed as a single atomic operation. If two threads are waiting on an auto-reset event which is pulsed, only one will be woken. If two threads are waiting on a manual-reset event which is pulsed, both will be woken. However, in both cases, the event will be unsignaled afterwards, and a simultaneous read operation will always report the event as unsignaled.
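The difference between set and pulse, and between auto-reset and manual-reset events, can be sketched with a single-threaded model in which waiting threads are reduced to a counter. All names here (struct event_model and the event_model_* functions) are hypothetical, and the model ignores the atomicity that the driver provides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space model of an event; not the driver's code. */
struct event_model {
    bool signaled;
    bool manual;       /* true: manual-reset, false: auto-reset */
    uint32_t waiters;  /* number of threads currently blocked on the event */
};

/* Satisfy waits on a signaled event. A manual-reset event releases every
 * waiter; an auto-reset event releases one and is designaled by it.
 * Returns the number of waiters woken. */
static uint32_t event_model_wake(struct event_model *ev)
{
    uint32_t woken;

    if (!ev->signaled || ev->waiters == 0)
        return 0;
    if (ev->manual) {
        woken = ev->waiters;
    } else {
        woken = 1;
        ev->signaled = false;
    }
    ev->waiters -= woken;
    return woken;
}

/* Model of the set operation; *prev receives the previous state. */
static uint32_t event_model_set(struct event_model *ev, uint32_t *prev)
{
    assert(ev != NULL && prev != NULL);
    *prev = ev->signaled;
    ev->signaled = true;
    return event_model_wake(ev);
}

/* Model of the pulse operation: a set followed by a reset. */
static uint32_t event_model_pulse(struct event_model *ev, uint32_t *prev)
{
    uint32_t woken = event_model_set(ev, prev);

    ev->signaled = false;
    return woken;
}
```

With two waiters, pulsing an auto-reset event in this model wakes one of them, pulsing a manual-reset event wakes both, and in either case the event ends up unsignaled.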
- NTSYNC_IOC_READ_SEM
  Read the current state of a semaphore object. Takes a pointer to struct ntsync_sem_args, which is used as follows:
  - count: On output, contains the current count of the semaphore.
  - max: On output, contains the maximum count of the semaphore.
- NTSYNC_IOC_READ_MUTEX
  Read the current state of a mutex object. Takes a pointer to struct ntsync_mutex_args, which is used as follows:
  - owner: On output, contains the current owner of the mutex, or zero if the mutex is not currently owned.
  - count: On output, contains the current recursion count of the mutex.
  If the mutex is marked as abandoned, the function fails with EOWNERDEAD. In this case, count and owner are set to zero.
- NTSYNC_IOC_READ_EVENT
  Read the current state of an event object. Takes a pointer to struct ntsync_event_args, which is used as follows:
  - signaled: On output, contains the current state of the event.
  - manual: On output, contains 1 if the event is a manual-reset event, and 0 otherwise.
- NTSYNC_IOC_KILL_OWNER
  Mark a mutex as unowned and abandoned if it is owned by the given owner. Takes an input-only pointer to a 32-bit integer denoting the owner. If the owner is zero, the ioctl fails with EINVAL. If the owner does not own the mutex, the function fails with EPERM.
  Eligible threads waiting on the mutex will be woken as appropriate (and such waits will fail with EOWNERDEAD, as described below).
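How abandonment interacts with a later acquisition can be sketched as follows. This is a user-space model of the documented rules with hypothetical names (struct abandonable_mutex and its two functions); resetting the recursion count to zero on kill is an assumption inferred from the read operation reporting zero for an abandoned mutex.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space model of a mutex with an abandoned flag. */
struct abandonable_mutex {
    uint32_t owner;     /* zero means unowned */
    uint32_t count;     /* recursion count */
    bool abandoned;     /* previous owner was reported dead */
};

/* Model of the kill operation: only the current owner can be killed. */
static int abandonable_mutex_kill(struct abandonable_mutex *m, uint32_t owner)
{
    assert(m != NULL);
    if (owner == 0)
        return -EINVAL;
    if (m->owner != owner)
        return -EPERM;
    m->owner = 0;
    m->count = 0;       /* assumption: an abandoned mutex reads as count 0 */
    m->abandoned = true;
    return 0;
}

/* Model of a satisfied wait. Acquiring an abandoned mutex still takes
 * ownership, but reports -EOWNERDEAD and clears the abandoned flag. */
static int abandonable_mutex_acquire(struct abandonable_mutex *m, uint32_t owner)
{
    assert(m != NULL);
    if (owner == 0)
        return -EINVAL;
    if (m->owner != 0 && m->owner != owner)
        return -EAGAIN;  /* model only: a real wait would sleep here */
    m->owner = owner;
    m->count++;
    if (m->abandoned) {
        m->abandoned = false;
        return -EOWNERDEAD;
    }
    return 0;
}
```

A caller therefore has to treat EOWNERDEAD from a wait as "acquired, but the protected state may be inconsistent" rather than as an ordinary failure.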
- NTSYNC_IOC_WAIT_ANY
  Poll on any of a list of objects, atomically acquiring at most one. Takes a pointer to struct ntsync_wait_args, which is used as follows:
  - timeout: Absolute timeout in nanoseconds. If NTSYNC_WAIT_REALTIME is set, the timeout is measured against the REALTIME clock; otherwise it is measured against the MONOTONIC clock. If the timeout is equal to or earlier than the current time, the function returns immediately without sleeping. If timeout is U64_MAX, the function will sleep until an object is signaled, and will not fail with ETIMEDOUT.
  - objs: Pointer to an array of count file descriptors (specified as an integer so that the structure has the same size regardless of architecture). If any object is invalid, the function fails with EINVAL.
  - count: Number of objects specified in the objs array. If greater than NTSYNC_MAX_WAIT_COUNT, the function fails with EINVAL.
  - owner: Mutex owner identifier. If any object in objs is a mutex, the ioctl will attempt to acquire that mutex on behalf of owner. If owner is zero, the ioctl fails with EINVAL.
  - index: On success, contains the index (into objs) of the object which was signaled. If alert was signaled instead, this contains count.
  - alert: Optional event object file descriptor. If nonzero, this specifies an “alert” event object which, if signaled, will terminate the wait. If nonzero, the identifier must point to a valid event.
  - flags: Zero or more flags. Currently the only flag is NTSYNC_WAIT_REALTIME, which causes the timeout to be measured against the REALTIME clock instead of MONOTONIC.
  - pad: Unused, must be set to zero.
  This function attempts to acquire one of the given objects. If unable to do so, it sleeps until an object becomes signaled, subsequently acquiring it, or the timeout expires. In the latter case the ioctl fails with ETIMEDOUT. The function only acquires one object, even if multiple objects are signaled.
  A semaphore is considered to be signaled if its count is nonzero, and is acquired by decrementing its count by one. A mutex is considered to be signaled if it is unowned or if its owner matches the owner argument, and is acquired by incrementing its recursion count by one and setting its owner to the owner argument. An auto-reset event is acquired by designaling it; a manual-reset event is not affected by acquisition.
  Acquisition is atomic and totally ordered with respect to other operations on the same object. If two wait operations (with different owner identifiers) are queued on the same mutex, only one is signaled. If two wait operations are queued on the same semaphore, and a value of one is posted to it, only one is signaled.
  If an abandoned mutex is acquired, the ioctl fails with EOWNERDEAD. Although this is a failure return, the function may otherwise be considered successful. The mutex is marked as owned by the given owner (with a recursion count of 1) and as no longer abandoned, and index is still set to the index of the mutex.
  The alert argument is an “extra” event which can terminate the wait, independently of all other objects.
  It is valid to pass the same object more than once, including by passing the same event in the objs array and in alert. If a wakeup occurs due to that object being signaled, index is set to the lowest index corresponding to that object.
  The function may fail with EINTR if a signal is received.
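Because the timeout is absolute and objs is a pointer carried in a 64-bit integer, callers typically need two small conversions before issuing a wait. The helpers below are a sketch with hypothetical names; they use UINT64_MAX from <stdint.h> as the user-space counterpart of U64_MAX, and saturating just below that value is a design choice of this sketch, not something the driver requires.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define SKETCH_NSEC_PER_SEC UINT64_C(1000000000)

/* Hypothetical helper: turn a relative timeout into an absolute deadline
 * in nanoseconds on `clock` (CLOCK_MONOTONIC by default, CLOCK_REALTIME
 * when NTSYNC_WAIT_REALTIME is set). UINT64_MAX passes through unchanged
 * to request a wait without a timeout. */
static uint64_t abs_timeout_ns_sketch(clockid_t clock, uint64_t relative_ns)
{
    struct timespec now;
    uint64_t now_ns;
    int rc;

    if (relative_ns == UINT64_MAX)
        return UINT64_MAX;

    rc = clock_gettime(clock, &now);
    assert(rc == 0);
    (void)rc;
    now_ns = (uint64_t)now.tv_sec * SKETCH_NSEC_PER_SEC + (uint64_t)now.tv_nsec;

    /* Saturate just below UINT64_MAX so that a huge but finite timeout
     * is never mistaken for "no timeout". */
    if (relative_ns >= UINT64_MAX - now_ns)
        return UINT64_MAX - 1;
    return now_ns + relative_ns;
}

/* Hypothetical helper: encode a user-space pointer (for example the
 * array of file descriptors) as the 64-bit integer stored in objs. */
static uint64_t ptr_to_u64_sketch(const void *p)
{
    return (uint64_t)(uintptr_t)p;
}
```

A caller would store abs_timeout_ns_sketch(CLOCK_MONOTONIC, relative_ns) in timeout and ptr_to_u64_sketch(fds) in objs, leaving flags at zero for the MONOTONIC clock.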
- NTSYNC_IOC_WAIT_ALL
  Poll on a list of objects, atomically acquiring all of them. Takes a pointer to struct ntsync_wait_args, which is used identically to NTSYNC_IOC_WAIT_ANY, except that index is always filled with zero on success if not woken via alert.
  This function attempts to simultaneously acquire all of the given objects. If unable to do so, it sleeps until all objects become simultaneously signaled, subsequently acquiring them, or the timeout expires. In the latter case the ioctl fails with ETIMEDOUT and no objects are modified.
  Objects may become signaled and subsequently designaled (through acquisition by other threads) while this thread is sleeping. Only once all objects are simultaneously signaled does the ioctl acquire them and return. The entire acquisition is atomic and totally ordered with respect to other operations on any of the given objects.
  If an abandoned mutex is acquired, the ioctl fails with EOWNERDEAD. Similarly to NTSYNC_IOC_WAIT_ANY, all objects are nevertheless marked as acquired. Note that if multiple mutex objects are specified, there is no way to know which were marked as abandoned.
  As with “any” waits, the alert argument is an “extra” event which can terminate the wait. Critically, however, an “all” wait will succeed if all members in objs are signaled, or if alert is signaled. In the latter case index will be set to count. As with “any” waits, if both conditions are met, the former takes priority, and objects in objs will be acquired.
  Unlike NTSYNC_IOC_WAIT_ANY, it is not valid to pass the same object more than once, nor is it valid to pass the same object in objs and in alert. If this is attempted, the function fails with EINVAL.
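The all-or-nothing nature of an “all” wait can be sketched with a model in which every object is reduced to a semaphore-like counter. The names struct obj_model and wait_all_model_try are hypothetical, the function represents a single non-blocking attempt (a real wait would sleep instead of returning -EAGAIN), and the alert event is not modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: each object is a counter, signaled while nonzero. */
struct obj_model {
    uint32_t count;
};

/* One non-blocking attempt at an "all" wait over n objects.
 * Returns 0 if every object was acquired, -EAGAIN if at least one was
 * not signaled (nothing is modified in that case), or -EINVAL if the
 * same object appears more than once. */
static int wait_all_model_try(struct obj_model *const *objs, size_t n)
{
    size_t i, j;

    assert(objs != NULL || n == 0);

    /* Duplicates are rejected for "all" waits. */
    for (i = 0; i < n; i++)
        for (j = i + 1; j < n; j++)
            if (objs[i] == objs[j])
                return -EINVAL;

    /* First pass: check, without modifying anything. */
    for (i = 0; i < n; i++)
        if (objs[i]->count == 0)
            return -EAGAIN;

    /* Second pass: every object is signaled, so acquire them all. */
    for (i = 0; i < n; i++)
        objs[i]->count--;
    return 0;
}
```

Splitting the work into a check pass and an acquire pass is what guarantees that a failed attempt leaves every object untouched; in the driver the equivalent steps happen atomically with respect to other operations on the objects.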