Movatterモバイル変換

Posted at 2013-08-17 02:19 |RSS feed (Full text feed) |Blog Index
Next article:Friday Q&A 2013-08-30: Model Serialization With Property Lists
Previous article:Friday Q&A 2013-08-02: Type-Safe Scalars with Single-Field Structs
Tags:fridayqna gcd letsbuild

Friday Q&A 2013-08-16: Let's Build Dispatch Groups

byMike Ash

Dispatch groups are a handy facility for synchronizing multiple tasks, and an anonymous reader suggested them for the subject of today's Let's Build.

Overview
Dispatch groups provide four basic operations:

Enter, to indicate that a task has begun.
Exit, to indicate that a task has completed.
Notify, which invokes a block when every enter gets a corresponding exit.
Wait, which is like notify, but synchronous.

You can use this to spin off a bunch of parallel operations and wait for them to complete:

dispatch_group_tgroup=dispatch_group_create();for(inti=0;i<100;i++){dispatch_group_enter(group);DoAsyncWorkWithCompletionBlock(^{dispatch_group_leave(group);});}dispatch_group_wait(group,DISPATCH_TIME_FOREVER);

You can also use it to asynchronously invoke a block when they all complete:

dispatch_group_tgroup=dispatch_group_create();for(inti=0;i<100;i++){dispatch_group_enter(group);DoAsyncWorkWithCompletionBlock(^{dispatch_group_leave(group);});}dispatch_group_notify(group,dispatch_get_main_queue(),^{UpdateUI();});

Given that it's part of GCD and we're talking about asynchronous operations, it should go without saying that a major feature of dispatch groups is that all operations are thread safe.

Code
As usual, I have posted the full code for my reimplementation on GitHub here:

https://github.com/mikeash/MADispatchGroup

Interface
The API forma_dispatch_group closely mirrors that ofdispatch_group:

typedefstructma_dispatch_group_internal*ma_dispatch_group_t;ma_dispatch_group_tma_dispatch_group_create(void);voidma_dispatch_group_destroy(ma_dispatch_group_tgroup);voidma_dispatch_group_enter(ma_dispatch_group_tgroup);voidma_dispatch_group_leave(ma_dispatch_group_tgroup);voidma_dispatch_group_notify(ma_dispatch_group_tgroup,void(^block)(void));voidma_dispatch_group_wait(ma_dispatch_group_tgroup);

There are a few differences from thedispatch_group interface:

Ama_dispatch_group_t is not a dispatch object, so it doesn't use retain/release semantics, instead using a singledestroy function to clean one up.
Thedispatch_group_async functions are missing. These are just simple wrappers aroundenter,leave, anddispatch_async, so not all that important.
Thenotify function doesn't take a dispatch queue, and instead invokes the block immediately in the context of the last piece of code to callleave. It's trivial to wrap the block in adispatch_async, so this is no major change.
Thewait function doesn't take a timeout. This simplifies the code considerably while still illustrating the overall concept.

Fields
Thestruct ma_dispatch_group_internal contains two fields, a counter and an action block:

structma_dispatch_group_internal{uint32_tcounter;void(^action)(void);};

The counter keeps track of how many timesenter has been called without a correspondingexit. The action block is the action set by thenotify function.

Creation and Deletion
Creating a new group is extremely simple. Just allocate a chunk of memory, usingcalloc to ensure it's zeroed:

ma_dispatch_group_tma_dispatch_group_create(void){ma_dispatch_group_tgroup=calloc(1,sizeof*group);returngroup;}

Destroying a group is likewise simple. I assume that any action set withnotify always fires before a group is destroyed, and that the action block is destroyed as part of that. As such, there's no cleanup required indestroy beyond simply callingfree:

voidma_dispatch_group_destroy(ma_dispatch_group_tgroup){free(group);}

Enter
The implementation ofma_dispatch_group_enter is extremely simple. It's just an atomic increment, using an atomic compiler builtin:

voidma_dispatch_group_enter(ma_dispatch_group_tgroup){__sync_fetch_and_add(&group->counter,1);}

Using the atomic builtin ensures that this is thread safe. The implementation ofma_dispatch_group_leave is a bit more complex. It first performs an atomic decrement:

voidma_dispatch_group_leave(ma_dispatch_group_tgroup){uint32_tnewCounterValue=__sync_sub_and_fetch(&group->counter,1);

The__sync_sub_and_fetch builtin first performs an atomic decrement, then returns the new value of the counter. If this was the lastleave call, thennewCounterValue will contain0, and it's time to execute the group's notification action.

if(newCounterValue==0){

The action may not exist yet, for example if allenter calls were balanced with aleave before the call tonotify, so check for that:

if(group->action){

If an action has been set, execute it:

group->action();

Once it returns, destroy the block and set the action toNULL:

Block_release(group->action);group->action=NULL;}}}

Notify
The implementation ofma_dispatch_group_notify is interesting, but ultimately extremely simple. Conceptually, there are two completely separate cases to consider:

There are still pendingenter calls that have not been balanced with aleave. In that case, set the group'saction block.
Allenter calls have been balanced with aleave. In that case, execute the block immediately.

Seems straightforward enough. However, a simple implementation of the first case creates a race condition. Consider the following sequence of events:

Thenotify function checkscount and sees that it is non-zero.
The pending actions callleave and decrease the count to zero.
The action that decreases the count to zero checks the action. No action is set, so it does nothing.
Thenotify function sets the group's action.
The action never runs, because no code is left to run it.

There's an elegant solution that both fixes this race condition and consolidates both separate cases into a single code path. The solution is to wrap the assignment of the action in anenter/leave pair. This effectively eliminates the second case, since there's always at least one unbalancedenter when assigning the action. This also solves the potential race condition, since the assignment occurs before at least one of the pendingleave calls. Here's what the function looks like:

voidma_dispatch_group_notify(ma_dispatch_group_tgroup,void(^block)(void)){ma_dispatch_group_enter(group);group->action=Block_copy(block);ma_dispatch_group_leave(group);}

Wait
The implementation ofma_dispatch_group_wait is simple in concept, although a bit complex in code. It usesma_dispatch_group_notify for most of the work. The idea is simply to callnotify with a block that notes when it gets run, then wait for that block to run. The trick is how to efficiently wait.

It's possible to not bother with efficiency and just poll. For example, this is a valid, albeit dumb, implementation:

voidma_dispatch_group_wait(ma_dispatch_group_tgroup){__blockvolatileintdone=0;ma_dispatch_group_notify(group,^{done=1;});while(!done)/* nothing */;}

However, spinning the CPU at 100% for no reason is a bad idea, so let's try to do better.

There are many different ways to implement this. I chose to usepthread condition variables. A condition variable pairs with a mutex and allows one thread to block and wait for another thread to signal it. In the signaling thread, you do:

pthread_mutex_lock(&lock);// make your changepthread_cond_broadcast(&cond);// or _signalpthread_mutex_unlock(&lock);

The lock ensures that the change is atomic with respect to the waiting thread. Thecond_broadcast call then informs any waiting threads that it's time to wake up.

In the waiting thread, you do:

pthread_mutex_lock(&lock);while(!condition)pthread_cond_wait(&cond);pthread_mutex_unlock(&lock);

The lock insures that the check for the condition is atomic with respect to the signaling thread. Thewhile loop serves two purposes. First, it's possible that the condition was already set beforehand. In this case, thewhile avoids callingpthread_cond_wait, which would end up waiting forever since the signal had already occurred previously. Second, it's possible forpthread_cond_wait to return even when nothing has signaled the condition variable. This is known asspurious wakeup and is a consequence of how condition variables are implemented internally. In the event of spurious wakeup, the while loop ensures that the waiting thread doesn't exit prematurely.

Thema_dispatch_group_wait function first declares and initializes a mutex and a condition variable:

voidma_dispatch_group_wait(ma_dispatch_group_tgroup){pthread_mutex_tmutex;pthread_cond_tcond;pthread_mutex_init(&mutex,NULL);pthread_cond_init(&cond,NULL);

Next, it grabs pointers to these variables:

pthread_mutex_t*mutexPtr=&mutex;pthread_cond_t*condPtr=&cond;

This is done to work around an unfortunate collision with blocks. If the block passed tonotify were to capturemutex andcond directly, they would be copied. These data types don't tolerate being copied. Specifically,pthread_mutex_t has some internal alignment checks, at least on some implementations. Rather than figure out how to force the compiler to meet the library's alignment needs,pthread_mutex_t is set up with some extra storage internally, and then the proper alignment for the internal storage is figured out when it's initialized. In essence, there's an internal field that has (at least) two possible positions, and the position is decided byinit. When the variable is copied, that alignment may no longer be correct, and this can lead to a crash. By instead capturing pointers to these variables, the copy and potential crash are avoided.

Adone variable is also declared to track when the block actually executes:

__blockintdone=0;

Nownotify is called with a block that acquires the lock, setsdone, notifies the waiting thread, and releases the lock:

ma_dispatch_group_notify(group,^{pthread_mutex_lock(mutexPtr);done=1;pthread_cond_broadcast(condPtr);pthread_mutex_unlock(mutexPtr);});

With that in place, the function waits fordone to be set:

pthread_mutex_lock(mutexPtr);while(!done)pthread_cond_wait(condPtr,mutexPtr);pthread_mutex_unlock(mutexPtr);}

That's everything!

Conclusion
Dispatch groups are an extremely useful API that makes it easy to coordinate multiple asynchronous actions and execute followup code once they all complete. The API is bare, but highly functional. Such a useful facility is relatively simple to implement. With the right idea, a little code can go a long way.

That's it for today. Come back next time for more wacky adventures. Friday Q&A is driven by reader suggestions so, in the meantime, pleasesend in your ideas for topics to cover.

Did you enjoy this article? I'm selling whole books full of them! Volumes II and III are now out! They're available as ePub, PDF, print, and on iBooks and Kindle.Click here for more information.

Comments:

Daniel Steffenat2013-08-17 19:00:27:

It may be instructive to compare with the implementation in libdispatch:
http://libdispatch.macosforge.org/trac/browser/trunk/src/semaphore.c

Note thatdispatch_group_notify can be called multiple times for a single group (in addition to one or moredispatch_group_waits), it maintains a queue of notifiers that is traversed when the group becomes empty.

(ma_dispatch_group_notify also leaks the previous block if it is called more than once. That could be fixed with__sync_swap even if you didn't want to support more than a single notifier.)

Paul Fennemaat2013-10-29 08:42:41:

Hi Mike,

What is the advantage to use your implementation (MADispatchGroup) instead of using the standard dispatch_group calls as in your excellent example?

Comments RSS feed for this page

Add your thoughts, post a comment:

Spam and off-topic posts will be deleted without notice. Culprits may be publicly humiliated at my sole discretion.

Code syntax highlighting thanks toPygments.

Name:
The Answer to the Ultimate Question of Life, the Universe, and Everything?
Comment:
	Formatting:`<i> <b> <blockquote> <code>`.
	NOTE: Due to an increase in spam, URLs are forbidden! Please provide search terms or fragment your URLs so they don't look like URLs.