If you need to ensure that a particular piece of data is only ever modified by one thread at once, you need a mutex. If you need more than one mutex, you need to be wary of deadlocks. But what if I told you that there’s a trick to avoid ever reaching a deadlock at all? Just acquire them in the right order!
First of all, let’s define a deadlock. With two mutexes, `foo` and `bar`, and two workers, `alice` and `bob`, we can have the following scenario:

- `alice` acquires `foo`
- `bob` acquires `bar`
- `alice` tries to acquire `bar` but can’t because `bob` already has it, so they must wait
- `bob` tries to acquire `foo` but can’t because `alice` already has it, so they must wait
Now both workers are waiting for each other and neither can complete. Let’s see now how to fix this. The key is that we want to always acquire all of our mutexes in a consistent order.[1] In this case, if, whenever we need both `foo` and `bar`, we make sure to always acquire `foo` before `bar`, we can never deadlock. Of course, as a code base and its number of mutexes grows, this becomes harder and we need to formalize and automate this.
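As a concrete (and entirely illustrative) example, here is a small Python sketch of two workers that both follow the `foo`-before-`bar` rule; if either worker swapped its two `with` blocks, the alice-and-bob deadlock from above could occur.

```python
import threading

foo = threading.Lock()
bar = threading.Lock()

def worker(name):
    # Both workers acquire the locks in the same agreed-upon order:
    # always foo first, then bar. Reversing the order in one of them
    # would reintroduce the deadlock scenario described above.
    with foo:
        with bar:
            print(f"{name} holds both foo and bar")

alice = threading.Thread(target=worker, args=("alice",))
bob = threading.Thread(target=worker, args=("bob",))
alice.start()
bob.start()
alice.join()
bob.join()
```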
We can model the acquisition of mutexes as a graph problem, where all mutexes are nodes and we add an edge between them when we acquire a mutex while holding another. The problem then becomes: is this graph free of cycles?
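As a sketch of what that bookkeeping could look like (my own illustration with hypothetical names, not the crate’s implementation), we can record an adjacency set per mutex and add edges from every currently held lock to the one being acquired:

```python
from collections import defaultdict

graph = defaultdict(set)  # graph[a] contains b if b was ever acquired while holding a
held = []                 # locks currently held by this worker, in acquisition order

def record_acquire(lock):
    for other in held:
        graph[other].add(lock)
    held.append(lock)

def record_release(lock):
    held.remove(lock)

# Acquiring bar while holding foo adds the edge foo -> bar.
record_acquire("foo")
record_acquire("bar")
record_release("bar")
record_release("foo")
print(dict(graph))  # {'foo': {'bar'}}
```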
Directed acyclic graphs
A directed acyclic graph (or DAG) is a graph, of nodes with edges between them, that is directed, meaning the edges can only be followed one way, and that does not contain cycles, meaning you cannot start walking from a certain node and end up where you started. It can look something like this:
You might think that this graph does have cycles, and it would if it were an undirected graph.However, due to the directed edges it’s only possible to move from the left to the right.
To check whether a directed graph is actually a DAG, you can try starting somewhere in the graph and recursively walk all edges until you find a place you’ve been before. When you run out of edges to walk recursively, start somewhere else that you haven’t visited and repeat until you’ve either found a cycle or walked all edges. In pseudocode[2]:
```python
visited = set()
visiting = set()

def visit(node):
    if node in visiting:
        raise CycleFound()
    elif node in visited:
        return
    visiting.add(node)
    for neighbour in edges(node):
        visit(neighbour)
    visiting.remove(node)
    visited.add(node)

for node in nodes():
    visit(node)
```
This simple algorithm is trivially $O(|E|)$, or linear in the number of edges in the graph, as each edge is taken at most once. That’s good enough if you just need to check a graph once, but we want to ensure it stays a DAG when we add new edges. This is called the…
Dynamic DAG problem
We know from the previous section that we can relatively cheaply check whether a directed graph is a DAG by finding cycles, but if we know more about the current state of the graph, we can do something more clever. In this section we’ll be looking at an algorithm by David J. Pearce and Paul H.J. Kelly. It uses the existing topological order of the graph to decide whether a new edge can even add a cycle. I’ve mentioned using a topological sort in the past but didn’t explain it, so let’s do that now.
Topological order
In a directed graph, you can compute a topological order. This is a function $T$ on a node that has one defining property: if $v$ can be reached by travelling from $u$, then $T(u) < T(v)$. You can prove from this definition that a graph has a topological order if and only if it is acyclic. If the graph contains a cycle, then there is some $u$ reachable from some $v$ and vice versa, which leads to $T(u) < T(v) < T(u)$, which is impossible.
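To make the defining property a bit more tangible, here is a tiny check (my own sketch, taking the order as a mapping and the graph as a plain list of edges) that an assignment is a valid topological order:

```python
def is_topological_order(T, edge_list):
    # Every edge must go from a lower index to a higher one; by transitivity
    # this gives T[u] < T[v] whenever v is reachable from u.
    return all(T[u] < T[v] for u, v in edge_list)

edge_list = [("a", "b"), ("b", "c"), ("a", "c")]
print(is_topological_order({"a": 0, "b": 1, "c": 2}, edge_list))  # True
print(is_topological_order({"a": 2, "b": 1, "c": 0}, edge_list))  # False
```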
To compute a topological order, we can adapt the Python code from before:
```python
visited = {}  # a dict used as an ordered set: Python dicts remember insertion order
visiting = set()

def visit(node):
    if node in visiting:
        raise CycleFound()
    elif node in visited:
        return
    visiting.add(node)
    for neighbour in edges(node):
        visit(neighbour)
    visiting.remove(node)
    visited[node] = None

for node in nodes():
    visit(node)

order = list(reversed(visited))  # Add just this line
```
If we remember that Python dicts maintain insertion order (which is why `visited` is a dict rather than a set this time), we can see how this algorithm provides us with a list of nodes in topological order. Child nodes are added before their parents, and new nodes are only added later if they cannot be reached from nodes seen before. If you must have a mapping function instead, you can map each node to the index at which it appears in the list. Revisiting our graph from before, we can get something like the following:
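Building that mapping from the `order` list above is a one-liner; something like:

```python
# Map every node to its position in the order, so that T[u] < T[v]
# whenever v is reachable from u.
T = {node: index for index, node in enumerate(order)}
```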
The order shown here is not unique, but it shows the properties of the topological sort. For example, a node that is closer to the source than another (layer 2 vs. layer 3) can still have a higher number, and this is fine because neither is reachable from the other. Now let’s put this new framework to use.
The algorithm
Now that we know what a topological sort is, we can start using its information to more quickly decide if a modification to a graph introduces a cycle. Our basis is that we have a graph that does not contain cycles and for which we know a topological sort. Luckily, the empty graph trivially satisfies both.
Edge deletions are trivial: if there is no cycle, removing an edge does not introduce any new cycles. The topological sort also remains valid if it was valid before, since the order depends only on what nodes are reachable, and the relative order between nodes that are not connected (any more) does not matter.
When we add edges, we may or may not need to do something, depending on the current order between the nodes under consideration. If the source node is already lower in the topological sort, we can safely add the new edge: either the destination node was already reachable from the source node or it wasn’t, but we can be certain that the source was not reachable from the destination. For example, if we add an edge in the graph below from a node with a lower topological index to one with a higher index, we know we can safely do so. This optimization saves us quite a bit of work in at least half the cases.
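As a sketch (using the hypothetical `graph` adjacency sets and the index map `T` from earlier), the fast path boils down to a single comparison:

```python
def add_edge_fast_path(graph, T, x, y):
    # If the source already comes before the destination in the topological
    # order, the new edge cannot create a cycle and the order stays valid.
    if T[x] < T[y]:
        graph[x].add(y)
        return True
    # Otherwise the affected region needs to be searched and possibly
    # reordered, as described in the rest of this section.
    return False
```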
When the nodes are in the reverse order, on the other hand, we do need to do some work. In that case, we must reorder the nodes such that we have a valid order again. If this is impossible, then the new edge must have introduced a cycle into our graph, and we should not add it.
Before we get to how this is done, let’s look at what part of the graph needs to be rearranged. So, when adding an edge from $x$ to $y$ with $T(y) < T(x)$, what area do we need to consider? This turns out to be two categories:
First we must find all nodes $v$ reachable from $y$ that have $T(v) < T(x)$. We can do this with a depth-first search, and if we find $x$ this way, we know that we’re introducing a cycle and we should stop. Nodes that have $T(v) > T(x)$ are already ordered behind $x$, so the new edge will not make a difference for them. We put all visited nodes into a set $\delta_F$. This set will contain $y$.
Then, we must similarly consider the nodes that can reach $x$ and have $T(v) > T(y)$. Once again, a depth-first search suffices. Nodes with $T(v) < T(y)$ are already ordered before $y$, so they need not be considered. Notably, this search will not find $y$; otherwise we would have found $x$ in the previous step. We put all visited nodes into a set $\delta_B$. This set will contain $x$. A sketch of both searches follows below.
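A sketch of the two bounded searches (again my own illustration; it assumes the forward adjacency sets `graph`, a reverse adjacency map `rgraph`, and the index map `T`) could look like this:

```python
class CycleFound(Exception):
    pass

def forward_search(graph, T, x, y):
    # All nodes reachable from y whose index is below T[x]: this is delta_F.
    delta_F = set()
    stack = [y]
    while stack:
        node = stack.pop()
        if node == x:
            # x is reachable from y, so the new edge x -> y closes a cycle.
            raise CycleFound()
        if node in delta_F or T[node] > T[x]:
            continue
        delta_F.add(node)
        stack.extend(graph[node])
    return delta_F

def backward_search(rgraph, T, x, y):
    # All nodes that can reach x and whose index is above T[y]: this is delta_B.
    delta_B = set()
    stack = [x]
    while stack:
        node = stack.pop()
        if node in delta_B or T[node] < T[y]:
            continue
        delta_B.add(node)
        stack.extend(rgraph[node])
    return delta_B
```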
In practice this looks like the diagram below. Here we are adding a new edge between the green nodes. The nodes found by searching forward from $y$ are coloured in red, the ones found by searching backwards from $x$ in blue.
Assuming we did not find a cycle, we can start rearranging the topological order with these two sets. From our definitions, we know that the final order of topological sort indices should be: $T(b) < T(f)$ for every $b \in \delta_B$ and every $f \in \delta_F$.
From this we can start building the new topological indices. It helps to first think about whatorder these sets should be in at the end:
- First we need to see the nodes that can reach $x$: $\delta_B$, which contains $x$.
- Then come all of $\delta_F$, which contains $y$.
Conveniently, we already know a correct relative order for $\delta_F$ and $\delta_B$, which is their current relative order. All we have to do is put all of $\delta_B$ before $\delta_F$.
To do this, we sort both $\delta_F$ and $\delta_B$ according to their current order. We then take the topological indices of all the nodes in both $\delta_F$ and $\delta_B$ and create a sorted list out of them. Finally, we assign the first $|\delta_B|$ entries of that sorted list to $\delta_B$ in order, and the rest to $\delta_F$.
In pseudocode[2], this looks like this:
```python
delta_F = sorted(delta_F, key=lambda v: T[v])
delta_B = sorted(delta_B, key=lambda v: T[v])
nodes = delta_B + delta_F
indices = sorted(T[node] for node in nodes)
for node, index in zip(nodes, indices):
    T[node] = index
```
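Putting the fast path, the two searches, and this reordering together, a complete insertion routine could look like the sketch below. This is my own condensed illustration of the approach described above (reusing the hypothetical `forward_search` and `backward_search` from earlier), not the actual implementation from the paper or the crate:

```python
def add_edge(graph, rgraph, T, x, y):
    """Insert the edge x -> y, raising CycleFound if it would create a cycle."""
    if T[x] >= T[y]:
        # The order needs fixing up: find the affected region...
        delta_F = forward_search(graph, T, x, y)  # may raise CycleFound
        delta_B = backward_search(rgraph, T, x, y)

        # ...and reassign its indices so that everything that can reach x
        # now comes before everything reachable from y.
        delta_F = sorted(delta_F, key=lambda v: T[v])
        delta_B = sorted(delta_B, key=lambda v: T[v])
        nodes = delta_B + delta_F
        indices = sorted(T[node] for node in nodes)
        for node, index in zip(nodes, indices):
            T[node] = index

    graph[x].add(y)
    rgraph[y].add(x)
```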
After that, if we did not find a cycle during our searches through the graph, we can be sure that there is no cycle and that we have updated our topological order to a valid one. Applying these steps to the graph above, we get this as our result:
Tracing your Mutexes
With the algorithm above in mind, I present the `tracing-mutex` crate. It provides a set of synchronization primitives that keep track of their own dependency graph. Every time you try to acquire a mutex, the dependency graph is updated based on the locks you’re already holding, and if that would introduce a cycle, it panics instead of acquiring the lock.
To keep track of what locks are currently held, a thread-local `Vec` is maintained that refers to every held lock. An implementation of the algorithm described above is used to maintain a DAG. The crate does not directly implement any mutex behaviour; instead it wraps the primitives exposed by `std::sync`, `parking_lot`,[3] and `lock_api`.
Some overhead is to be expected from this tracing: the algorithm, while reasonably fast, is not free, and in the case of uncontended locks you might see significant slowdowns. To combat this, the crate provides type aliases that only apply tracing while debug assertions are enabled. At runtime, they will simply resolve to whatever type the tracing version wraps.
There are some limitations to the current design. Similarly to how Rust’s borrow checker disallows patterns that could otherwise be data-race free, some access patterns that would not normally cause issues are prohibited. In addition, the thread-local mutex ownership tracking currently prevents mutex guards from being `Send`. This makes it impossible to properly implement support for the `spin` crate or async mutexes like the ones found in `tokio`. This limitation is shared with the `tracing`[4] crate, so I don’t expect a workaround any time soon.
Despite the shortcomings, this crate should suffice for general use. Try it out and see if you too can prevent runtime bugs before the stars align to cause them in production. And if nothing else, be glad the plural of mutex isn’t mutices.
[1] For any deadlock to occur, there must be a sequence of mutexes $m_1, \ldots, m_n$ where an owner is holding mutex $m_i$ while waiting for $m_{i+1}$ if $i < n$, and for $m_1$ in the case of the final $m_n$. Having a consistent order would make this impossible, as $m_{i+1}$ would only ever be acquired after $m_i$, so following the chain would require $m_1$ to come after itself, thus removing the deadlock.
[3] Due to problems instrumenting the internals, the `Condvar` primitive is not supported for `parking_lot`. This might change in the future.

[4] While there is support in `tracing` for asynchronous Rust, the `Span::enter` method relies on the same thread-local trickery we do to keep track of currently open spans. `tracing` uses specially instrumented futures to ensure the span is held only when it’s correct, but that method does not transfer well to our use case here.