Posted onJul 31, 2024

Total Madness #2: Async Locks

In thefirst episode of this series, we discussed why locks are important and the issues that come with them.Last time we unraveled the inners of the async/await concurrency model. Now, let us finally merge these two topics to exploreAsync Locks: a way to overcome some traditional locking issues using the async/await model. Finally, we'll explore the difference between Rust'sstd::sync::Mutex andtokio::sync::Mutex and how this relates to async locks.

Obs: I'll use the terms "Lock" and "Mutex" interchangeably (lock == mutex), as well as the verbs "Acquire" and "Lock" (lock == acquire). So to avoid saying "lock the lock" I'll say "acquire the lock" or "lock the mutex", just know it is all the same thing.

Episode 2: Async Locks

As a reminder, let's review our original example and see why we even need locks in the first place:

Two siblings want to use the toothbrush they share, but obviously only one of them can use it at a time. The childish, sibling-like thing to do is to say “I’m first!” before the other person. The rule is simple: whoever starts saying “I’m first” first, gets to use it.

A lock, orMutex, is just a way to allow tasks to do that. In this analogy, saying "I'm first!" is analogous tomutex.lock(), so if a task acquires a mutex, all other tasks trying to acquire it have to wait until the first one is done (i.e. when it callsmutex.unlock(), ormutex.release() or whatever name the developers decided to use).

As we saw onEpisode #0: Locks, this is all fun and games until we faceDeadlocks, which is when tasks get stuck waiting forever on each other. Several different situations can cause a deadlock, but a very common one is interrupts. Let's see an example of interrupts causing a deadlock from that episode:

fnmain(){letmuttb_lock=ToothbrushLock::new();tb_lock.lock();// gabriel brushes teeth// <--- INTERRUPT (Fefê)tb_lock.unlock();}fninterrupt(){iftb_lock.lock(){// brush cousins' teeth...}}

In the above example, themain() function is interrupted right after locking the toothbrush, but theinterrupt() function needs to locktb_lock as well, which causes the interrupt code to be deadlocked. This is a typical situation of: "your computer stopped working" because interrupts usually can't be themselves interrupted, so, since they can't be interrupted and are stuck, there's nothing to help the computer regain control of execution. We talked about a way to avoid this in the first post: instead of usinglock(), we usetry_lock(), which returns eithertrue if it acquired the lock, orfalse otherwise. Either way, the function returns. It doesn'tblock (this is what we call anon-blocking function).

Aside: why (most) interrupts can't be interrupted
Think about a keyboard buffer: when a key gets pressed on the keyboard, the processor generates an interrupt so that the kernel can get the key from the keyboard's memory, which could very well only hold a maximum of one key) and put it in a buffer (i.e. an array/vector). If the code that copies the key gets interrupted by another key press event, the key is lost. Avoiding interrupts is called Interrupt Masking. However, some less critical interrupts can in fact be interrupted.

An async lock is an alternative to thetry_lock() idea using, you guessed it,async/await. Here's more or less what we'll do to the originallock() code, can you tell why this would be better than the first approach?

asyncfnmain(){letmuttb_lock=ToothbrushLock::new();tb_lock.lock().await;// gabriel brushes teeth// <--- INTERRUPT (Fefê)tb_lock.unlock();}asyncfninterrupt(){tb_lock.lock().await;// brush cousins' teeth...}

Well, the first obvious advantage is that you can guarantee that a lock will be locked. I mean think about it, in this snippet of thetry_lock() example, I actually hid something very bad, can you figure out what it is?

fninterrupt(){iftb_lock.try_lock(){// brush cousins' teeth...}}

It's theelse statement! The big downside oftry_locking is that, if you can'tlock(), then you just... deal with it. There are no guarantees thattry_lock() will succeed, and the big advantage of async locks is that you can tell the computer: "Can I lock now? Not yet? That's alright, I'll just let someone else run, and try again later". Let's now see how an implementation of async locks would look.

Implementing an async lock

We're going to get somewhat Rust-specific, but the errors raised by the compiler will help us understand the issues and limitations of async locks. In this sense, getting a bit Rust-specific is a feature, not a bug!

First, let's use some boilerplate code. This is the code we wrote in the last post for our toothbrushing example:

modexecutor;modtask;useexecutor::Executor;usefutures::pending;usetask::Task;asyncfnbrush_teeth(times:usize)->String{foriin0..times{// TODO: lockprintln!("Brushing teeth {}",i);pending!();// new!}return"Done".to_string();}fnmain(){letmutexecutor=Executor::new();lettask_gabe=Task::new("gabe".to_string(),brush_teeth(20));lettask_nat=Task::new("nat".to_string(),brush_teeth(30));lettask_fefe=Task::new("fefe".to_string(),brush_teeth(100));executor.tasks.push(task_gabe);executor.tasks.push(task_nat);executor.tasks.push(task_fefe);whileexecutor.tasks.len()!=0{fortaskinexecutor.tasks.iter_mut(){print!("{}: ",task.name);task.poll();}println!("--- Went through all tasks once");// clean up tasks that are doneexecutor.tasks.retain(|task|!task.is_done());}}

Now, what we want to do is change it to have aToothbrushLock. For this, let's start with the code we wrote in thefirst episode:

structToothbrushLock{locked:bool,}implToothbrushLock{pubfnnew()->Self{Self{locked:false}}pubfnlock(&mutself){whileself.locked==true{}self.locked=true;}pubfntry_lock(&mutself)->bool{ifself.locked==true{returnfalse;}self.locked=true;returntrue;}pubfnunlock(&mutself){self.locked=false;}}

And finally, let's make thebrush_teeth function use our lock:

asyncfnbrush_teeth(brush_lock:&mutToothbrushLock,times:usize)->String{foriin0..times{brush_lock.lock();// new!println!("Brushing teeth {}",i);pending!();brush_lock.unlock();// new!}return"Done".to_string();}fnmain(){letmutexecutor=Executor::new();letmutbrush_lock=ToothbrushLock::new();// new! (literally)lettask_gabe=Task::new("gabe".to_string(),brush_teeth(&mutbrush_lock,20));lettask_nat=Task::new("nat".to_string(),brush_teeth(&mutbrush_lock,30));lettask_fefe=Task::new("fefe".to_string(),brush_teeth(&mutbrush_lock,100));executor.tasks.push(task_gabe);executor.tasks.push(task_nat);executor.tasks.push(task_fefe);// ... (rest of the code)

There's nothing async in our lock for now. Not to worry though, we'll get there soon enough. First, let's get this code to compile because as of now it doesn't. Let's see why:

error[E0597]:`brush_lock` does not live long enough--> src/main.rs:27:63   |25 |letmut brush_lock= ToothbrushLock::new(); // new!(literally)   |-------------- binding`brush_lock` declared here26 |27 |lettask_gabe= Task::new("gabe".to_string(), brush_teeth(&mut brush_lock, 20));   |------------------------------------------^^^^^^^^^^^^^^^------   |                     |                                         |   |                     |                                         borrowed value does not live long enough   |                     argument requires that`brush_lock` is borrowedfor`'static`...45 | }   | - `brush_lock` dropped here while still borrowed

This error happens three times, one for each call ofTask::new(), but I've redacted the other two since they're the same. The problem here is that, sincebrush_teeth() is aFuture, it doesn't run right away. Actually, it can run whenever! The compiler doesn't know when it'll run, much less when it'll finish. As a result, it can't guarantee that it will finish running beforemain() finishes (which is whenbrush_lock will be freed).

Scopes: Scopes are how the Rust compiler knows which variables get discarded/freed/dropped. A good rule of thumb is: if a variable has been declared inside brackets{}, it belongs to this "block". For example, if a variable is declared inside of a function (e.g.brush_lock inside ofmain()), it will belong to the function's scope, and therefore will be dropped when the scope of the function ends (a.k.a.: in line 45). Another example would be afor{...} loop. Variables declared inside loops belong to the scope of that single loop iteration.

One way around this is to useReference Counting (RC), or, even betterAtomic Reference Counting (ARC), which is the same thing but with atomic operations.Arc allows an object to have several owners, and, when the scopes of all its owners end, only then theArc is dropped. Here's how to use it:

asyncfnbrush_teeth(brush_lock:Arc<ToothbrushLock>,times:usize)->String{foriin0..times{brush_lock.lock();// new!println!("Brushing teeth {}",i);pending!();brush_lock.unlock();// new!}return"Done".to_string();}fnmain(){letmutexecutor=Executor::new();letbrush_lock=Arc::new(ToothbrushLock::new());// new! (literally)lettask_gabe=Task::new("gabe".to_string(),brush_teeth(brush_lock.clone(),20));lettask_nat=Task::new("nat".to_string(),brush_teeth(brush_lock.clone(),30));lettask_fefe=Task::new("fefe".to_string(),brush_teeth(brush_lock.clone(),100));executor.tasks.push(task_gabe);executor.tasks.push(task_nat);executor.tasks.push(task_fefe);

We enclose our thing (ToothbrushLock in this case) withArc and it should do its magic! Unfortunately for us, this is not the only issue we have to deal with. Here's what we get when we try to compile now:

error[E0596]: cannot borrow datainan`Arc` as mutable--> src/main.rs:14:9   |14 |         brush_lock.lock(); // new!   |         ^^^^^^^^^^ cannot borrow as mutable   |=help: trait`DerefMut` is required to modify through a dereference, but it is not implementedfor`Arc<ToothbrushLock>`error[E0596]: cannot borrow datainan`Arc` as mutable--> src/main.rs:17:9   |17 |         brush_lock.unlock(); // new!   |         ^^^^^^^^^^ cannot borrow as mutable   |=help: trait`DerefMut` is required to modify through a dereference, but it is not implementedfor`Arc<ToothbrushLock>`For more information about this error, try`rustc--explain E0596`.error: could not compile`episode-2-async-locks`(bin"episode-2-async-locks") due to 2 previous errors

The problem now is that, whileArc allows a variable to be owned by multiple scopes, it doesn't allow them tomutate the data. This is one of the mechanisms that Rust has to ensure memorysafety: having two or more tasks altering the same memory region can cause data races for the region. As a reminder, take a look at this example, where bothtaskA andtaskB want to flip the value ofvar1: bool (totrue if it isfalse and tofalse if it istrue):

taskA readsvar1: value istrue
taskB readsvar1: value istrue
taskA writes tovar1: value isfalse
taskB writes tovar1: value isfalse

At the end of this examplevar1 == false, while it should betrue since we flip it twise. Allowing only one mutable reference to the variable solves this issue, but also restricts the code quite a bit. Another solution that allows us to have multiple mutable references is precisely what we've been doing: locks! So now we need a way to ignore this rule, and for that, we'll useUnsafeCells:

usestd::cell::UnsafeCell;// new!pubstructToothbrushLock{locked:UnsafeCell<bool>,// new!}implToothbrushLock{pubfnnew()->Self{Self{locked:UnsafeCell::new(false),}}pubfnlock(&self){unsafe{// new! unsafewhile*self.locked.get()==true{}*self.locked.get()=true;}}pubfnunlock(&self){unsafe{// new! unsafe*self.locked.get()=false;}}}

Obs: I've removed thetry_lock() method since our async lock is trying to get rid of it anyway.

UnsafeCells allow us to have multiple mutable references to a variable, but, as the name suggests, it isunsafe, meaning that Rust's borrow checker won't enforce its memory safety rules to it. In fact, we have to enclose all calls toself.locked.get() inunsafe {..} blocks.unsafe {...} blocks basically tell the compiler to ignore checking for memory safety on these regions, it's our way of saying "don't worry, I know what I'm doing".

And with this change, our code compiles!!! Hooray! But wait, what is this output?

gabe: Brushing teeth 0

The code hangs after printing this first line, try to figure out what is wrong with the code. Here's the solution:

asyncfnbrush_teeth(brush_lock:Arc<ToothbrushLock>,times:usize)->String{foriin0..times{brush_lock.lock();println!("Brushing teeth {}",i);brush_lock.unlock();pending!();}return"Done".to_string();}// rest of code...

The change is minimal but effective: putting theunlock() call beforepending!(). Of course, we need to release the mutex before we return execution flow to the scheduler/executor because if we don't, the next task (in this casenat orfefe) will be stuck trying tolock(). But what we wanted, in the beginning, was precisely a way to return execution flow to the scheduler until we're able tolock(). In other words, this is what we want to be able to do:

asyncfnbrush_teeth(brush_lock:Arc<ToothbrushLock>,times:usize)->String{foriin0..times{brush_lock.lock().await;// try to lock the mutex, continue running other tasks otherwiseprintln!("Brushing teeth {}",i);// no need for unlock(), mutex is unlocked magically}return"Done".to_string();}// rest of code...

We're close, but not quite there yet. As a next step, let's keep theunlock() method, and focus on makinglock() async so we can callawait on it:

usefutures::pending;usestd::cell::UnsafeCell;pubstructToothbrushLock{locked:UnsafeCell<bool>,}implToothbrushLock{pubfnnew()->Self{Self{locked:UnsafeCell::new(false),}}pubasyncfnlock(&self){// new! async fnunsafe{while*self.locked.get()==true{pending!();// new!}*self.locked.get()=true;}}pubfnunlock(&self){unsafe{*self.locked.get()=false;}}}

With those two small adjustments, we can call the lock like so:

asyncfnbrush_teeth(brush_lock:Arc<ToothbrushLock>,times:usize)->String{foriin0..times{brush_lock.lock().await;println!("Brushing teeth {}",i);brush_lock.unlock();}return"Done".to_string();}// rest of code...

And our new output is:

gabe: Brushing teeth 0Brushing teeth 1...Brushing teeth 18Brushing teeth 19nat: Brushing teeth 0Brushing teeth 1Brushing teeth 2Brushing teeth 3...Brushing teeth 28Brushing teeth 29fefe: Brushing teeth 0Brushing teeth 1Brushing teeth 2Brushing teeth 3Brushing teeth 4...Brushing teeth 98Brushing teeth 99--- Went through all tasks once

As you can see, we're not circling through tasks. This happens because we only yield execution on thelock() method. This means that a task that owns the lock will do:

printBrushing teeth 1
unlock the lock
lock/acquire the lock
yield execution

Then all the tasks will try to lock, but not have any success and yield execution back. When the scheduler finally goes back to the task that owns the lock, it'll repeat the sequence above. To fix that we can makeunlock() also yield execution back:

usefutures::pending;usestd::cell::UnsafeCell;pubstructToothbrushLock{locked:UnsafeCell<bool>,}implToothbrushLock{pubfnnew()->Self{Self{locked:UnsafeCell::new(false),}}pubasyncfnlock(&self){unsafe{while*self.locked.get()==true{pending!();}*self.locked.get()=true;}}pubasyncfnunlock(&self){unsafe{*self.locked.get()=false;pending!();// new!}}}

And our call becomes:

asyncfnbrush_teeth(brush_lock:Arc<ToothbrushLock>,times:usize)->String{foriin0..times{brush_lock.lock().await;println!("Brushing teeth {}",i);brush_lock.unlock().await;}return"Done".to_string();}// rest of code...

Yes! Now our execution is back to being:

gabe: Brushing teeth 0nat: Brushing teeth 0fefe: Brushing teeth 0--- Went through all tasks oncegabe: Brushing teeth 1nat: Brushing teeth 1fefe: Brushing teeth 1--- Went through all tasks oncegabe: Brushing teeth 2nat: Brushing teeth 2fefe: Brushing teeth 2--- Went through all tasks oncegabe: Brushing teeth 3nat: Brushing teeth 3fefe: Brushing teeth 3--- Went through all tasks oncegabe: Brushing teeth 4nat: Brushing teeth 4fefe: Brushing teeth 4--- Went through all tasks oncegabe: Brushing teeth 5nat: Brushing teeth 5fefe: Brushing teeth 5...

Yes! Now our async lock works. The only last detail we need to take care of is the fact that we need to callunlock(). You see, we can use the magic of Rust to make the compiler callunlock() for us whenever is appropriate to release the lock (i.e. when the lock goes out of scope). In our example, the scope of the lock is one iteration of thefor loop, so the unlock would happen exactly where we put it, just before the end of a loop. To do that, we need two things:

Have aToothbrushGuard type that is returned bylock, which will be in thefor scope (i.e. a variable that is declared inside thefor{..} loop and thus belongs to the scope of that iteration).
Implement theDrop trait forToothbrushGuard,Drop tells the compiler what to do when an object goes out of scope. In our case, we want to make it unlock whenToothbrushGuard goes out of scope.

pubstructToothbrushGuard<'guard_life>{mutex:&'guard_lifeToothbrushLock,}impl<'guard_life>ToothbrushGuard<'guard_life>{pubfnnew(mutex:&'guard_lifeToothbrushLock)->Self{ToothbrushGuard{mutex}}}implDropforToothbrushGuard<'_>{// this function is called by the compiler when the ToothbrushGuard goes out of scopefndrop(&mutself){self.mutex.unlock();}}

And havelock() return a guard:

implToothbrushLock{pubfnnew()->Self{Self{locked:UnsafeCell::new(false),}}pubasyncfnlock(&self)->ToothbrushGuard{unsafe{while*self.locked.get()==true{pending!();}*self.locked.get()=true;ToothbrushGuard::new(self)// new!}}fnunlock(&self){unsafe{*self.locked.get()=false;// no longer able to call pending!() here}}}

With these things in place, we can finally call our functions the way we wanted:

asyncfnbrush_teeth(brush_lock:Arc<ToothbrushLock>,times:usize)->String{foriin0..times{brush_lock.lock().await;println!("Brushing teeth {}",i);}return"Done".to_string();}

Amazing! There's only one last thing to do though. You see, because thedrop() function cannot beasync, andunlock() is called indrop(), this means thatunlock() can no longer beasync (aka it can no longer callpending!() to yield execution back. As a result, the previous example does compile and works, but we're once again not circling through the tasks. A quick hack fixes that (try to do that yourself without peeking first!):

implToothbrushLock{pubfnnew()->Self{Self{locked:UnsafeCell::new(false),}}pubasyncfnlock(&self)->ToothbrushGuard{unsafe{pending!();// new: return pending before!while*self.locked.get()==true{pending!();}*self.locked.get()=true;ToothbrushGuard::new(self)}}fnunlock(&self){unsafe{*self.locked.get()=false;}}}

Returningpending!() before thewhile{...} loop inlock() ensures that execution will be yielded back even if the lock was previously acquired by the task.

That's it! Async locks.

Appendix I: tokio::sync::Mutex vs. std::sync::Mutex

This is somewhat Rust-specific, but very interesting nontheless. If you're a Rust developer you might have encouteredtokio::sync::Mutex and wondered how it is different fromstd::sync::Mutex (I know I have).

Obs: For the curious non-Rust people,tokio is an async runtime for Rust, in other words, it's like ourExecutor, only much more powerful and well-written.

Let's look at Tokio's documentation for itsMutex:

This type acts similarly to std::sync::Mutex, with two major differences: lock is an async method so does not block, and the lock guard is designed to be held across .await points.
Source

Interesting! It's just like ours then. Let's think about this. Why would a Mutex not be safe to be held across.awaits? Well, one obvious answer is: if it causes deadlocks. Remember when we first tried implementing our Mutex and we got a deadlock? Allow me to refresh your memory, Our lock was:

usestd::cell::UnsafeCell;// new!pubstructToothbrushLock{locked:UnsafeCell<bool>,// new!}implToothbrushLock{pubfnnew()->Self{Self{locked:UnsafeCell::new(false),}}pubfnlock(&self){unsafe{// new! unsafewhile*self.locked.get()==true{}*self.locked.get()=true;}}pubfnunlock(&self){unsafe{// new! unsafe*self.locked.get()=false;}}}

But the code got stuck after printing:

gabe: Brushing teeth 0

Because our lock wasblocking, so when thenat task tried to acquire the lock (inwhile *self.locked.get() == true {}) it got stuck on that loop, and it neverawaited, so no other task could run. This, my friends, is a deadlock.

By changing thelock() function to the following, we allowed other tasks to run, at the cost of having more context switches (because you have to circle through all tasks until you go back to the one that has the lock):

pubasyncfnlock(&self)->ToothbrushGuard{unsafe{pending!();while*self.locked.get()==true{pending!();}*self.locked.get()=true;ToothbrushGuard::new(self)}}

Suck it Tokio.

Obs: I'm kidding of course, you absolutely should not be writing your own locks if can you have the super smart Tokio people do that for you. Go do something else, i don't know, go fish or whatever.

Appendix II: Atomic Async Locks

What we did was all fun and games, but if an interrupt occurs in the middle of the code we wrote, it all goes to shit. Solving this requires atomic operations that guarantee that they won't be interrupted. We did the same thing inEpisode #0: Locks:

usefutures::pending;usestd::cell::UnsafeCell;usestd::sync::atomic::{AtomicBool,Ordering};pubstructToothbrushLock{locked:UnsafeCell<AtomicBool>,}implToothbrushLock{pubfnnew()->Self{Self{locked:UnsafeCell::new(AtomicBool::new(false)),}}pubasyncfnlock(&self)->ToothbrushGuard{unsafe{pending!();while(&*self.locked.get()).load(Ordering::SeqCst)==true{pending!();}(&*self.locked.get()).store(true,Ordering::SeqCst);ToothbrushGuard::new(self)}}fnunlock(&self){unsafe{(&*self.locked.get()).store(false,Ordering::SeqCst);}}}pubstructToothbrushGuard<'guard_life>{mutex:&'guard_lifeToothbrushLock,}impl<'guard_life>ToothbrushGuard<'guard_life>{pubfnnew(mutex:&'guard_lifeToothbrushLock)->Self{ToothbrushGuard{mutex}}}implDropforToothbrushGuard<'_>{fndrop(&mutself){self.mutex.unlock();}}