If you need to ensure that a particular piece of data is only ever modified by one thread at once, you need a mutex. If you need more than one mutex, you need to be wary of deadlocks. But what if I told you that there’s a trick to avoid ever reaching a deadlock at all? Just acquire them in the right order!
First of all, let’s define a deadlock. With two mutexes, `foo` and `bar`, and two workers, `alice` and `bob`, we can have the following scenario:

- `alice` acquires `foo`
- `bob` acquires `bar`
- `alice` tries to acquire `bar` but can’t because `bob` already has it, so they must wait
- `bob` tries to acquire `foo` but can’t because `alice` already has it, so they must wait
Now both workers are waiting for each other and neither can complete. Let’s see now how to fix this. The key is that we want to always acquire all of our mutexes in a consistent order.[1] In this case, if, whenever we need both `foo` and `bar`, we make sure to always acquire `foo` before `bar`, we can never deadlock. Of course, as a code base and its number of mutexes grows, this becomes harder and we need to formalize and automate this.
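As a concrete (and entirely illustrative) example, here is a small Python sketch of two workers that both follow the `foo`-before-`bar` rule; if either worker swapped its two `with` blocks, the alice-and-bob deadlock from above could occur.

```python
import threading

foo = threading.Lock()
bar = threading.Lock()

def worker(name):
    # Both workers acquire the locks in the same agreed-upon order:
    # always foo first, then bar. Reversing the order in one of them
    # would reintroduce the deadlock scenario described above.
    with foo:
        with bar:
            print(f"{name} holds both foo and bar")

alice = threading.Thread(target=worker, args=("alice",))
bob = threading.Thread(target=worker, args=("bob",))
alice.start()
bob.start()
alice.join()
bob.join()
```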
We can model the acquisition of mutexes as a graph problem, where all mutexes are nodes and we add an edge between them when we acquire a mutex while holding another. The problem then becomes: is this graph free of cycles?
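As a sketch of what that bookkeeping could look like (my own illustration with hypothetical names, not the crate’s implementation), we can record an adjacency set per mutex and add edges from every currently held lock to the one being acquired:

```python
from collections import defaultdict

graph = defaultdict(set)  # graph[a] contains b if b was ever acquired while holding a
held = []                 # locks currently held by this worker, in acquisition order

def record_acquire(lock):
    for other in held:
        graph[other].add(lock)
    held.append(lock)

def record_release(lock):
    held.remove(lock)

# Acquiring bar while holding foo adds the edge foo -> bar.
record_acquire("foo")
record_acquire("bar")
record_release("bar")
record_release("foo")
print(dict(graph))  # {'foo': {'bar'}}
```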
Directed acyclic graphs
A directed acyclic graph (or DAG) is a graph, of nodes with edges between them, that is directed, meaning the edges can only be followed one way, and that does not contain cycles, meaning you cannot start walking from a certain node and end up where you started. It can look something like this:
You might think that this graph does have cycles, and it would if it were an undirected graph.However, due to the directed edges it’s only possible to move from the left to the right.
To check whether a directed graph is actually a DAG, you can try starting somewhere in the graph and recursively walk all edges until you find a place you’ve been before. When you run out of edges to walk recursively, start somewhere else that you haven’t visited and repeat until you’ve either found a cycle or walked all edges. In pseudocode[2]:
```python
visited = set()
visiting = set()

def visit(node):
    if node in visiting:
        raise CycleFound()
    elif node in visited:
        return
    visiting.add(node)
    for neighbour in edges(node):
        visit(neighbour)
    visiting.remove(node)
    visited.add(node)

for node in nodes():
    visit(node)
```
This simple algorithm is trivially $O(|E|)$, or linear in the number of edges in the graph, as each edge is taken at most once. That’s good enough if you just need to check a graph once, but we want to ensure it stays a DAG when we add new edges. This is called the…
Dynamic DAG problem
We know from the previous section that we can relatively cheaply check whether a directed graph is a DAG by finding cycles, but if we know more about the current state of the graph, we can do something more clever. In this section we’ll be looking at an algorithm by David J. Pearce and Paul H.J. Kelly. It uses the existing topological order of the graph to decide whether a new edge can even add a cycle. I’ve mentioned using a topological sort in the past but didn’t explain it, so let’s do that now.
Topological order
In a directed graph, you can compute a topological order. This is a function $T$ on a node that has one defining property: if $v$ can be reached by travelling from $u$, then $T(u) < T(v)$. You can prove from this definition that a graph has a topological order if and only if it is acyclic. If the graph contains a cycle, then there is some $u$ reachable from some $v$ and vice versa, which leads to $T(u) < T(v) < T(u)$, which is impossible.
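To make the defining property a bit more tangible, here is a tiny check (my own sketch, taking the order as a mapping and the graph as a plain list of edges) that an assignment is a valid topological order:

```python
def is_topological_order(T, edge_list):
    # Every edge must go from a lower index to a higher one; by transitivity
    # this gives T[u] < T[v] whenever v is reachable from u.
    return all(T[u] < T[v] for u, v in edge_list)

edge_list = [("a", "b"), ("b", "c"), ("a", "c")]
print(is_topological_order({"a": 0, "b": 1, "c": 2}, edge_list))  # True
print(is_topological_order({"a": 2, "b": 1, "c": 0}, edge_list))  # False
```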
To compute a topological order, we can adapt the Python code from before:
```python
visited = {}  # a dict used as an ordered set: Python dicts remember insertion order
visiting = set()

def visit(node):
    if node in visiting:
        raise CycleFound()
    elif node in visited:
        return
    visiting.add(node)
    for neighbour in edges(node):
        visit(neighbour)
    visiting.remove(node)
    visited[node] = None

for node in nodes():
    visit(node)

order = list(reversed(visited))  # Add just this line
```
If we remember that Python dicts maintain insertion order (which is why `visited` is a dict rather than a set this time), we can see how this algorithm provides us with a list of nodes in topological order. Child nodes are added before their parents, and new nodes are only added later if they cannot be reached from nodes seen before. If you must have a mapping function instead, you can map each node to the index at which it appears in the list. Revisiting our graph from before, we can get something like the following:
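Building that mapping from the `order` list above is a one-liner; something like:

```python
# Map every node to its position in the order, so that T[u] < T[v]
# whenever v is reachable from u.
T = {node: index for index, node in enumerate(order)}
```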
The order shown here is not unique, but it shows the properties of the topological sort. For example, a node that is closer to the source than another (layer 2 vs. layer 3) can still have a higher number, and this is fine because neither is reachable from the other. Now let’s put this new framework to use.
The algorithm
Now that we know what a topological sort is, we can start using its information to more quickly decide if a modification to a graph introduces a cycle. Our basis is that we have a graph that does not contain cycles and for which we know a topological sort. Luckily, the empty graph trivially satisfies both.
Edge deletions are trivial: if there is no cycle, removing an edge does not introduce any new cycles. The topological sort also remains valid if it was valid before, since the order depends only on what nodes are reachable, and the relative order between nodes that are not connected (any more) does not matter.
When we add edges, we may or may not need to do something, depending on the current order between the nodes under consideration. If the source node is already lower in the topological sort, we can safely add the new edge: either the destination node was already reachable from the source node or it wasn’t, but we can be certain that the source was not reachable from the destination. For example, if we add an edge in the graph below from a node with a lower topological index to one with a higher index, we know we can safely do so. This optimization saves us quite a bit of work in at least half the cases.
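As a sketch (using the hypothetical `graph` adjacency sets and the index map `T` from earlier), the fast path boils down to a single comparison:

```python
def add_edge_fast_path(graph, T, x, y):
    # If the source already comes before the destination in the topological
    # order, the new edge cannot create a cycle and the order stays valid.
    if T[x] < T[y]:
        graph[x].add(y)
        return True
    # Otherwise the affected region needs to be searched and possibly
    # reordered, as described in the rest of this section.
    return False
```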
When the nodes are in the reverse order, on the other hand, we do need to do some work. In that case, we must reorder the nodes such that we have a valid order again. If this is impossible, then the new edge must have introduced a cycle into our graph, and we should not add it.
Before we get to how this is done, let’s look at what part of the graph needs to be rearranged. So, when adding an edge from $x$ to $y$ with $T(y) < T(x)$, what area do we need to consider? This turns out to be two categories:
First we must find all nodes $v$ reachable from $y$ that have $T(v) < T(x)$. We can do this with a depth-first search, and if we find $x$ this way, we know that we’re introducing a cycle and we should stop. Nodes that have $T(v) > T(x)$ are already ordered behind $x$, so the new edge will not make a difference for them. We put all visited nodes into a set $\delta_F$. This set will contain $y$.
Then, we must similarly consider the nodes that can reach $x$ and have $T(v) > T(y)$. Once again, a depth-first search suffices. Nodes with $T(v) < T(y)$ are already ordered before $y$, so they need not be considered. Notably, this search will not find $y$; otherwise we would have found $x$ in the previous step. We put all visited nodes into a set $\delta_B$. This set will contain $x$. A sketch of both searches follows below.
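A sketch of the two bounded searches (again my own illustration; it assumes the forward adjacency sets `graph`, a reverse adjacency map `rgraph`, and the index map `T`) could look like this:

```python
class CycleFound(Exception):
    pass

def forward_search(graph, T, x, y):
    # All nodes reachable from y whose index is below T[x]: this is delta_F.
    delta_F = set()
    stack = [y]
    while stack:
        node = stack.pop()
        if node == x:
            # x is reachable from y, so the new edge x -> y closes a cycle.
            raise CycleFound()
        if node in delta_F or T[node] > T[x]:
            continue
        delta_F.add(node)
        stack.extend(graph[node])
    return delta_F

def backward_search(rgraph, T, x, y):
    # All nodes that can reach x and whose index is above T[y]: this is delta_B.
    delta_B = set()
    stack = [x]
    while stack:
        node = stack.pop()
        if node in delta_B or T[node] < T[y]:
            continue
        delta_B.add(node)
        stack.extend(rgraph[node])
    return delta_B
```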
In practice this looks like the diagram below. Here we are adding a new edge between the green nodes. The nodes found by searching forward from $y$ are coloured in red, the ones found by searching backwards from $x$ in blue.
Assuming we did not find a cycle, we can start rearranging the topological order with these two sets. From our definitions, we know that the final order of topological sort indices should be: $T(b) < T(f)$ for every $b \in \delta_B$ and every $f \in \delta_F$.
From this we can start building the new topological indices. It helps to first think about whatorder these sets should be in at the end:
- First we need to see the nodes that can reach $x$: $\delta_B$, which contains $x$.
- Then come all of $\delta_F$, which contains $y$.
Conveniently, we already know a correct relative order for $\delta_F$ and $\delta_B$, which is their current relative order. All we have to do is put all of $\delta_B$ before $\delta_F$.
To do this, we sort both $\delta_F$ and $\delta_B$ according to their current order. We then take the topological indices of all the nodes in both $\delta_F$ and $\delta_B$ and create a sorted list out of them. Finally, we assign the first $|\delta_B|$ entries of that sorted list to $\delta_B$ in order, and the rest to $\delta_F$.
In pseudocode[2], this looks like this:
```python
delta_F = sorted(delta_F, key=lambda v: T[v])
delta_B = sorted(delta_B, key=lambda v: T[v])
nodes = delta_B + delta_F
indices = sorted(T[node] for node in nodes)
for node, index in zip(nodes, indices):
    T[node] = index
```
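Putting the fast path, the two searches, and this reordering together, a complete insertion routine could look like the sketch below. This is my own condensed illustration of the approach described above (reusing the hypothetical `forward_search` and `backward_search` from earlier), not the actual implementation from the paper or the crate:

```python
def add_edge(graph, rgraph, T, x, y):
    """Insert the edge x -> y, raising CycleFound if it would create a cycle."""
    if T[x] >= T[y]:
        # The order needs fixing up: find the affected region...
        delta_F = forward_search(graph, T, x, y)  # may raise CycleFound
        delta_B = backward_search(rgraph, T, x, y)

        # ...and reassign its indices so that everything that can reach x
        # now comes before everything reachable from y.
        delta_F = sorted(delta_F, key=lambda v: T[v])
        delta_B = sorted(delta_B, key=lambda v: T[v])
        nodes = delta_B + delta_F
        indices = sorted(T[node] for node in nodes)
        for node, index in zip(nodes, indices):
            T[node] = index

    graph[x].add(y)
    rgraph[y].add(x)
```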
After that, if we did not find a cycle during our searches through the graph, we can be sure that there is no cycle and that we have updated our topological order to a valid one. Applying these steps to the graph above, we get this as our result:
Tracing your Mutexes
With the algorithm above in mind, I present the `tracing-mutex` crate. It provides a set of synchronization primitives that keep track of their own dependency graph. Every time you try to acquire a mutex, the dependency graph is updated based on the locks you’re already holding, and if that would introduce a cycle, it panics instead of acquiring the lock.
To keep track of what locks are currently held, a thread-local `Vec` is maintained that refers to every held lock. An implementation of the algorithm described above is used to maintain a DAG. The crate does not directly implement any mutex behaviour; instead it wraps the primitives exposed by `std::sync`, `parking_lot`,[3] and `lock_api`.
Some overhead is to be expected from this tracing: the algorithm, while reasonably fast, is not free, and in the case of uncontended locks you might see significant slowdowns. To combat this, the crate provides type aliases that only apply tracing while debug assertions are enabled. At runtime, they will simply resolve to whatever type the tracing version wraps.
There are some limitations to the current design. Similarly to how Rust’s borrow checker disallows patterns that could otherwise be data-race free, some access patterns that would not normally cause issues are prohibited. In addition, the thread-local mutex ownership tracking currently prevents mutex guards from being `Send`. This makes it impossible to properly implement support for the `spin` crate or async mutexes like the ones found in `tokio`. This limitation is shared with the `tracing`[4] crate, so I don’t expect a workaround any time soon.
Despite the shortcomings, this crate should suffice for general use. Try it out and see if you too can prevent runtime bugs before the stars align to cause them in production. And if nothing else, be glad the plural of mutex isn’t mutices.
[1] For any deadlock to occur, there must be a sequence of mutexes $m_1, \ldots, m_n$ where an owner is holding mutex $m_i$ while waiting for $m_{i+1}$ if $i < n$, and for $m_1$ in the case of the final $m_n$. Having a consistent order would make this impossible, as $m_{i+1}$ would only ever be acquired after $m_i$, so following the chain would require $m_1$ to come after itself, thus removing the deadlock.
[3] Due to problems instrumenting the internals, the `Condvar` primitive is not supported for `parking_lot`. This might change in the future.

[4] While there is support in `tracing` for asynchronous Rust, the `Span::enter` method relies on the same thread-local trickery we do to keep track of currently open spans. `tracing` uses specially instrumented futures to ensure the span is held only when it’s correct, but that method does not transfer well to our use case here.