Posted at 2010-02-26 18:19
Tags: blocks fridayqna futures

Friday Q&A 2010-02-26: Futures

by Mike Ash

Welcome back to another shiny edition of Friday Q&A. Guy English suggested taking a look at implementing futures in Objective-C using blocks, and for this week's post I'm going to talk about the futures implementation that I built.

Futures
A future is, in short, an object which hides a calculation. When a future is resolved, the resolution blocks until the calculation has finished. My code involves implicit futures: a future is a proxy for the calculated result and, when messaged, transparently resolves the future and then passes the message on to the result. To code which uses it, a future is essentially indistinguishable from the object that it represents.

In Objective-C, it's natural to represent the calculation using a block. Futures are created by simply calling a function which takes a block representing the calculation. They return a proxy object which captures messages sent to that object and resolves the future as needed. My code has two kinds of futures.

Background futures begin the calculation immediately on a background thread as soon as the future is created. When the future is resolved, if the calculation is not yet complete, the call blocks until it's done. If the calculation finishes first, then resolution completes immediately. Background futures provide a way to write parallel code without needing to worry about details of synchronization. For example, if you're going to pass an NSData to another object that won't actually use that NSData for a while, you can use a background future to allow other code to run concurrently with the disk access with very little effort:

    NSString *filename = ...;
    NSData *future = MABackgroundFuture(^{
        return [NSData dataWithContentsOfFile:filename];
    });
    [object doSomethingLaterWithData:future];

Lazy futures do not begin the calculation until the future is resolved. If the future is never resolved, then the calculation is never performed. Lazy futures make it possible to provide an object immediately to an API which may or may not actually make use of it, and not pay the cost of creating that object until and unless it's actually requested. For example, you could use a lazy future to defer the reading of a file until and unless it's needed:

    NSString *filename = ...;
    NSData *future = MALazyFuture(^{
        return [NSData dataWithContentsOfFile:filename];
    });
    [object doSomethingOrNotWithData:future];

Getting the Code
As usual, I'm just going to cover the highlights of the code here, but the library is available from my subversion repository:

    svn co http://mikeash.com/svn/MAFuture/

Or just click on the hyperlink above to browse it.

A Custom Proxy Class
When building object proxies, Cocoa provides a convenient root proxy class to subclass in the form of NSProxy. The idea is that NSProxy provides a minimal implementation, which allows almost all messages to be captured and proxied.

Unfortunately, NSProxy doesn't quite live up to its promise. It implements a whole bunch of unnecessary methods, including important ones like -hash and -isEqual:. This is a problem because I don't want NSProxy's implementations of these, I want the implementations in the proxied object. To make that happen, I'd have to either manually override these methods, or play runtime tricks to make them hit the forwarding path. Neither alternative is particularly appealing.

Instead, I chose a third route: implement my own proxy class. MAProxy is a true minimalistic proxy class. It implements memory management methods and -isProxy. It also implements +initialize, which the runtime requires to exist in every class. The header is really basic and the implementation nearly as simple. The only tricky stuff at work is the inline reference count using atomic functions for thread safety.
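As a rough illustration of what such a root class can look like, here is a minimal sketch. It is not the MAProxy source: the name SketchProxy, the _extraRefs ivar, and the choice of the __sync atomic builtins are all assumptions made for this example, and it assumes manual reference counting (compiled with -fno-objc-arc).

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

// SketchProxy is a hypothetical stand-in, NOT the real MAProxy source.
// Assumes manual reference counting (compile with -fno-objc-arc).
__attribute__((objc_root_class))
@interface SketchProxy
{
    Class isa;                    // a root class has to provide isa itself
    volatile int32_t _extraRefs;  // retain count minus one, zeroed at alloc
}
+ (id)alloc;
- (id)retain;
- (oneway void)release;
- (id)autorelease;
- (void)dealloc;
- (BOOL)isProxy;
@end

@implementation SketchProxy

// The runtime sends +initialize to every class, so a root class must answer it.
+ (void)initialize {}

+ (id)alloc
{
    return class_createInstance(self, 0);
}

- (id)retain
{
    // Atomic increment keeps the count consistent across threads.
    __sync_add_and_fetch(&_extraRefs, 1);
    return self;
}

- (oneway void)release
{
    // Dropping below zero means the last reference just went away.
    if(__sync_sub_and_fetch(&_extraRefs, 1) < 0)
        [self dealloc];
}

- (id)autorelease
{
    [NSAutoreleasePool addObject:self];
    return self;
}

- (void)dealloc
{
    object_dispose(self);
}

- (BOOL)isProxy
{
    return YES;
}

@end
```

Any message not listed here has no implementation at all, so a subclass that adds forwarding methods gets to see -hash, -isEqual:, -description and the rest.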

Future Basics
To make implementing futures easier, I created a base class called MABaseFuture which provides common facilities. A basic future needs to be able to store a value, to store whether the future has been resolved yet or not, and a condition variable to make it all thread safe:

    @interface MABaseFuture : MAProxy
    {
        id _value;
        NSCondition *_lock;
        BOOL _resolved;
    }

For code, the class obviously needs creation/destruction methods:

    - (id)init
    {
        _lock = [[NSCondition alloc] init];
        return self;
    }

    - (void)dealloc
    {
        [_value release];
        [_lock release];
        [super dealloc];
    }

Then, accessors for the future's value. There are both locked and unlocked setters because a subclass may want to set the future's value after already acquiring the lock:

    - (void)setFutureValue:(id)value
    {
        [_lock lock];
        [self setFutureValueUnlocked:value];
        [_lock unlock];
    }

    - (id)futureValue
    {
        // skip the usual retain/autorelease dance here
        // because the setter is never called more than
        // once, thus value lifetime is same as future
        // lifetime
        [_lock lock];
        id value = _value;
        [_lock unlock];
        return value;
    }

    - (void)setFutureValueUnlocked:(id)value
    {
        [value retain];
        [_value release];
        _value = value;
        _resolved = YES;
        [_lock broadcast];
    }

A quick getter to see if the future has been resolved (which relies on the subclass to manually acquire the lock first):

    - (BOOL)futureHasResolved
    {
        return _resolved;
    }

Then the one part that's a bit interesting, a method to wait for the future to resolve, handy for implementing background futures:

    - (id)waitForFutureResolution
    {
        [_lock lock];
        while(!_resolved)
            [_lock wait];
        [_lock unlock];
        return _value;
    }

This class also has a -resolveFuture method which is abstract. Subclasses must override it and do whatever they need to do:

    - (id)resolveFuture
    {
        NSLog(@"-[MABaseFuture resolveFuture] called, this should never happen! Did you forget to implement -[%@ resolveFuture]?", NSStringFromClass(isa));
        NSParameterAssert(0);
        return nil;
    }

Actually there are two interesting parts to this class, and the second one is here. It's an implementation of -class. Normally this implementation wouldn't be necessary, as the proxy mechanism will proxy that method just fine. The problem arises in the implementation of -[NSCFString isEqual:], part of the CFString toll-free bridging, and with other bridged classes. That code checks the class of the other object, and if it's an NSCFString as well, hits a fast path that depends on internal implementation details of NSCFString. If -class returns NSCFString when the object is really a proxy, that code fails and the two strings will never compare as equal, even when they are.

The fix is simple, if bizarre. Get the real class, check to see if it starts with NSCF, and return the superclass if it does. If the real class is NSCFString, this will return NSString, the code goes through the general equality path, and all is well. This is the implementation of -class:

    - (Class)class
    {
        Class c = [[self resolveFuture] class];
        if([NSStringFromClass(c) hasPrefix:@"NSCF"])
            return [c superclass];
        else
            return c;
    }

And that's all there is to MABaseFuture.

Deepening the Hierarchy
Building on MABaseFuture, I want to then create a tree of subclasses. First, _MASimpleFuture will contain some more common facilities for "simple" futures (futures which immediately resolve when accessed), then I'll create two subclasses of that for background and lazy futures.

_MASimpleFuture is very, well, simple. It implements -forwardingTargetForSelector: to resolve the future and return the object that resulted:

    - (id)forwardingTargetForSelector:(SEL)sel
    {
        LOG(@"%p forwardingTargetForSelector: %@, resolving future", self, NSStringFromSelector(sel));
        return [self resolveFuture];
    }

Using this class, subclasses just need to provide an initializer method and override -resolveFuture, and they get forwarding for free.

Forwarding to nil
There's a bad corner case here, which happens if the future returns nil. Messaging nil is no problem, but forwardingTargetForSelector: takes a nil return as meaning that there is no forwarding target, and the runtime should start on the slow forwarding path instead.

And here there's a major problem, because the slow forwarding path requires a method signature, but it's impossible to get one from nil. I've partially solved this in an extremely brute-force fashion by writing a class called MAMethodSignatureCache, which will check every class registered with the runtime for a selector and return whatever method signature it can dig up. (I didn't write it solely for this, there's more handy stuff to do with it in another post.) I can use this class to implement the slow forwarding path of _MASimpleFuture to return zero:

    - (NSMethodSignature *)methodSignatureForSelector:(SEL)sel
    {
        return [[MAMethodSignatureCache sharedCache] cachedMethodSignatureForSelector:sel];
    }

    - (void)forwardInvocation:(NSInvocation *)inv
    {
        // this gets hit if the future resolves to nil
        // zero-fill the return value
        char returnValue[[[inv methodSignature] methodReturnLength]];
        bzero(returnValue, sizeof(returnValue));
        [inv setReturnValue:returnValue];
    }

The problem comes when there are multiple method signatures for a given selector, which can easily happen if two unrelated classes implement methods with the same name. In that case, there's no way to know which one is meant, and this whole approach falls apart. Unfortunately, with the way the runtime is currently written, there's no generalized way to "forward to nil".
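To make the ambiguity concrete, here is a small sketch with two unrelated classes that both implement -value but disagree on the return type. The class names HypotheticalCounter and HypotheticalGauge and the helper function are invented for this example; only the NSMethodSignature calls are real Foundation API.

```objc
#import <Foundation/Foundation.h>
#include <string.h>

// Two unrelated, made-up classes sharing a selector name.
@interface HypotheticalCounter : NSObject
- (int)value;
@end

@implementation HypotheticalCounter
- (int)value { return 1; }
@end

@interface HypotheticalGauge : NSObject
- (double)value;
@end

@implementation HypotheticalGauge
- (double)value { return 1.5; }
@end

// Returns YES when the two classes disagree about what -value returns.
static BOOL ValueSelectorIsAmbiguous(void)
{
    NSMethodSignature *a = [HypotheticalCounter instanceMethodSignatureForSelector:@selector(value)];
    NSMethodSignature *b = [HypotheticalGauge instanceMethodSignatureForSelector:@selector(value)];

    // Different type encodings and different sizes: a zero-filled return
    // buffer sized for one signature is the wrong size for the other.
    return strcmp([a methodReturnType], [b methodReturnType]) != 0
        || [a methodReturnLength] != [b methodReturnLength];
}
```

A lookup keyed only on the selector has no way to tell which of these two signatures the caller had in mind.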

If that's not enough, there's another problem with futures that return nil. This problem is quite simple: although the futured value may be nil, the future object itself is not nil. Any code which checks the object pointer for nil before using it will fail in weird ways. Imagine this code using an NSData:

    NSData *data = ...;
    if(data)
        [self doSomethingWithBytes:[data bytes]];

If data is a future that resolves to nil, then the if check will pass, but [data bytes] will return NULL, causing a crash.

Because of these two problems, you should avoid futuring any computation which might return nil.

Background Futures
To implement background futures, I created a class called _MABackgroundBlockFuture. Since computation is supposed to begin immediately in the background, I create an initializer which takes the block to compute, and uses Grand Central Dispatch to execute it in the background. Once the computation is finished, it simply calls -setFutureValue: to set the computed value and mark the future as resolved:

    - (id)initWithBlock:(id (^)(void))block
    {
        if((self = [self init]))
        {
            dispatch_async(dispatch_get_global_queue(0, 0), ^{
                [self setFutureValue:block()];
            });
        }
        return self;
    }

The implementation of -resolveFuture is then extremely simple. Since the future is already being computed, it just waits for it to finish, then returns the result:

    - (id)resolveFuture
    {
        return [self waitForFutureResolution];
    }

Lazy Futures
I created _MALazyBlockFuture to implement lazy futures. A lazy future doesn't begin computation right away, so it just needs to store a copy of the block when initialized, and release it when deallocating:

    - (id)initWithBlock:(id (^)(void))block
    {
        if((self = [self init]))
        {
            _block = [block copy];
        }
        return self;
    }

    - (void)dealloc
    {
        [_block release];
        [super dealloc];
    }

Resolution is straightforward as well. Acquire the lock. If the future hasn't been resolved yet, then call the block and set the future's value from its result:

    - (id)resolveFuture
    {
        [_lock lock];
        if(![self futureHasResolved])
        {
            [self setFutureValueUnlocked:_block()];
            [_block release];
            _block = nil;
        }
        [_lock unlock];
        return _value;
    }

Wrappers
This code now has all the functionality that's needed, but I want a couple of wrappers to make it nicer to use:

    id MABackgroundFuture(id (^block)(void))
    {
        return [[[_MABackgroundBlockFuture alloc] initWithBlock:block] autorelease];
    }

    id MALazyFuture(id (^block)(void))
    {
        return [[[_MALazyBlockFuture alloc] initWithBlock:block] autorelease];
    }

Because these functions return id, the compiler won't be able to catch mistakes like:

    NSArray *array = MALazyFuture(^{ return [self somethingThatReturnsNSString]; });

Gcc will also reject this because the block types don't match exactly (returning NSString * instead of id) even though they're completely compatible.

I worked around both of these problems by using two really scary-looking macros:

    #define MABackgroundFuture(...) ((__typeof((__VA_ARGS__)()))MABackgroundFuture((id (^)(void))(__VA_ARGS__)))
    #define MALazyFuture(...) ((__typeof((__VA_ARGS__)()))MALazyFuture((id (^)(void))(__VA_ARGS__)))

Let's unpack these a bit.

First, they take variable arguments, because block syntax doesn't play completely nice with the preprocessor. If you write a block which contains a comma that isn't inside parentheses (which can be written completely legally), the preprocessor won't realize that it's inside a block, and will use that as an argument separator. The preprocessor will therefore see two (or more) arguments, and a single-argument macro will fail. By making the macros take ..., that problem is avoided. Thus, the block parameter is represented in the macro by __VA_ARGS__.
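The comma problem is easy to reproduce in isolation. In this sketch the macro names CALL_ONE_ARG and CALL_VARARGS and the helper function are made up for illustration; they are not part of the library.

```objc
#import <Foundation/Foundation.h>

// Hypothetical macros, for illustration only. Each one calls the block it is given.
#define CALL_ONE_ARG(block) ((block)())
#define CALL_VARARGS(...)   ((__VA_ARGS__)())

static int SumViaVarargsMacro(void)
{
    // The comma in "a = 1, b = 2" sits inside braces, not parentheses, so the
    // preprocessor splits the block into two macro arguments. __VA_ARGS__
    // pastes them back together with the comma intact.
    return CALL_VARARGS(^{ int a = 1, b = 2; return a + b; });
}

// The single-argument form is rejected by the preprocessor, because it is
// handed two arguments where it expects one:
// int broken = CALL_ONE_ARG(^{ int a = 1, b = 2; return a + b; });
```

Wrapping the whole block in an extra pair of parentheses at the call site would also work, but the variadic form saves every caller from having to remember that.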

__typeof is a gcc language extension which pretty much does what it says. You give it an expression, and it gives you the type of the expression.

The argument to __typeof is (__VA_ARGS__)(). Remember that __VA_ARGS__ is the block being passed to the macro. This expression calls the block. But since it's an argument being passed to __typeof, which is a compile-time construct, it doesn't really call the block. Put the two together, and you get the return type of the block.
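A quick way to convince yourself that nothing runs is to give the block a visible side effect. The counter and function below are hypothetical names used only for this sketch.

```objc
#import <Foundation/Foundation.h>

static int gCallCount = 0;   // hypothetical counter, bumped whenever the block runs

static int TypeofLeavesBlockUncalled(void)
{
    int (^counting)(void) = ^{ gCallCount++; return 42; };

    // Declares an int, because that is what calling the block would return.
    // The call expression is only inspected for its type, never executed.
    __typeof(counting()) result = 7;

    // gCallCount is still 0 here, so this returns the value of result.
    return gCallCount + result;
}
```

The same holds when the operand is a block literal rather than a variable, which is the form the macros use.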

Next, the whole __typeof combination is wrapped in another set of parentheses and put right before the call through to the real function, which casts the return value of the function.

Finally, the function argument is cast to id (^)(void), because gcc is too stupid to understand that a block which returns NSString * is compatible with a block type that returns id.

Potential Uses
Background futures are useful any time you have computation which can happen asynchronously. In essence, you can think of the future as a synchronization mechanism, like a lock, which ensures that the job is completed before the result is used. Because these futures automatically forward requests, you can pass the future into code that doesn't know what it is, and it will be resolved automatically.

For example, let's say you're building a composite image by loading one image from disk, shrinking another image that already exists, and then putting the result into an image view:

    NSImage *image1 = [[NSImage alloc] initWithContentsOfFile:...];
    NSImage *image2 = [self shrinkImage:existingImage];
    NSImage *composite = [[NSImage alloc] initWithSize:...];
    [composite lockFocus];
    [image1 drawAtPoint:NSZeroPoint fromRect:NSZeroRect operation:NSCompositeSourceOver fraction:1.0];
    [image2 drawAtPoint:NSZeroPoint fromRect:NSZeroRect operation:NSCompositeSourceOver fraction:1.0];
    [composite unlockFocus];
    [imageView setImage:composite];

By futuring all of the images, you allow the work for image1 and image2 to run in parallel, and there's at least the possibility that some of the work for composite could run in parallel with main thread work too:

    NSImage *image1 = MABackgroundFuture(^{
        return [[[NSImage alloc] initWithContentsOfFile:...] autorelease];
    });
    NSImage *image2 = MABackgroundFuture(^{
        return [self shrinkImage:existingImage];
    });
    NSImage *composite = MABackgroundFuture(^{
        NSImage *img = [[NSImage alloc] initWithSize:...];
        [img lockFocus];
        [image1 drawAtPoint:NSZeroPoint fromRect:NSZeroRect operation:NSCompositeSourceOver fraction:1.0];
        [image2 drawAtPoint:NSZeroPoint fromRect:NSZeroRect operation:NSCompositeSourceOver fraction:1.0];
        [img unlockFocus];
        return [img autorelease];
    });
    [imageView setImage:composite];

Quick and easy parallel code, and imageView never has to know that it's getting a proxy instead of the real thing.

Lazy futures are useful any time you have objects that may never be needed, or simply may not be needed for a long time. Even if the object is used, deferring computation can spread out the load and improve responsiveness and startup times.

As an example, imagine some code which sets up a bunch of data file contents to be accessed through a dictionary:

    gGlobalDictionary = [[NSDictionary alloc] initWithObjectsAndKeys:
        [NSData dataWithContentsOfFile:...], @"dataFile",
        [NSData dataWithContentsOfFile:...], @"anotherDataFile",
        [NSData dataWithContentsOfFile:...], @"moreDataFile",
        [NSData dataWithContentsOfFile:...], @"fourthDataFile",
        nil];

You could load these files lazily by splitting them out, providing an accessor for each one, etc., but using lazy futures requires minimal changes and provides most of the same benefits:

    gGlobalDictionary = [[NSDictionary alloc] initWithObjectsAndKeys:
        MALazyFuture(^{ return [NSData dataWithContentsOfFile:...]; }), @"dataFile",
        MALazyFuture(^{ return [NSData dataWithContentsOfFile:...]; }), @"anotherDataFile",
        MALazyFuture(^{ return [NSData dataWithContentsOfFile:...]; }), @"moreDataFile",
        MALazyFuture(^{ return [NSData dataWithContentsOfFile:...]; }), @"fourthDataFile",
        nil];

Many other possibilities abound for parallel and lazy computation through the use of futures.

Conclusion
Futures are an interesting technique which can make it much easier to use lazy evaluation and parallel computation. Futures make it easy to use lazy evaluation even when you have no control over (or no desire to change) the code that will eventually use the value in question. They also make for a handy synchronization mechanism for performing heterogeneous parallel computations. The dynamic nature of Objective-C makes it possible to mostly hide the existence of the future from code that isn't involved in creating it.

That's it for this week's edition. Come back in a week for the next one. As always, Friday Q&A is driven by user suggestions, so if you have a topic that you'd like to see covered here, send it in!


Comments:

John McLaughlin at 2010-02-26 20:26:45:

Thanks Mike,

As usual a very interesting article that I've already begun to think of how to use Futures in code I'm working on today!

-john

Remy "Psy" Demarest at 2010-02-26 20:34:56:

Hi, I would like to know why in your implementations of retain and release you never use Foundation NSIncrementExtraRefCount() and NSDecrementExtraRefCountWasZero() functions for reference counting ? Cocoa uses them and don't seem to have much problem with threading as far as I can see.

Also, for the problem of nil resolved values, wouldn't it be possible to use some kind of default value ? You add a parameter to your method that allows the implementor to provide a value (for example if an NSString is required you put a @"" as default value) so in case the resolved value is nil it will return that object instead.

On a side note, when you use an instance of NSString class itself (the abstract class, which is illegal), the abstract methods throw an exception like that if that interests you :D :
@throw [NSException exceptionWithName:NSInvalidArgumentException reason:[NSString stringWithFormat:@"*** -%s cannot be sent to an abstract object of class %@: Create a concrete instance!", sel_getName(_cmd), [self class]] userInfo:nil];

Charles Parnot at 2010-02-26 20:46:53:

Thanks Mike, this is an awesome post: you designed a very powerful tool with, when you look at it, a very small amount of code.

In other posts with other interesting constructs you made, you have sometimes warned against not using them in "production code". What would be your advice with your futures implementation?

It seems the main concern is the 'nil' value. Remy's suggestion solves both issues of method signature and messaging to nil. But it can often only push the problem further away: the code that relies on checking for nil (which may not even be your code) may not crash or raise an exception, but will still be handling an unexpected result without realizing it, and it is likely to result in other problems down the road. Failing early might be a better option, and this can be done within MABaseFuture.

mikeash at 2010-02-26 21:10:07:

Remy: The main reason I didn't use those functions is simply because I didn't know about them. I still like the inline reference count because it's (presumably) faster, and won't have the spinlock contention problems of the Cocoa implementation, but it's really not important either way.

Regarding the default object, I don't see much point in making that part of the API, because it's so easy to just write your block to handle it:


    id foo = MALazyFuture(^{
        id obj = [self mayReturnNil];
        if(!obj)
            obj = placeholderValue;
        return obj;
    });

More useful, I think, would be extending the API to take a placeholder class that could be used to look up method signatures, e.g.:


    id foo = MALazyFuture(SomeClass, ^{ return [self mayReturnNil]; });

However, this still suffers from the problem of nil checks.

Charles: I always expect my readers to use their own judgement, of course, but if I don't make the warning then I generally consider it to be safe. The nil thing is an annoying problem, but only if you're writing code which might actually return nil. If you know it can't return nil (and most code has tons of places where an unexpected nil would be fatal anyway) then everything should be safe.

There is the potential for trouble if you pass futures to code that is written in such a way as to not deal properly with proxies (like the one my -class override hack works around) but I expect that sort of thing to be extremely rare, and could be considered a bug in Cocoa, depending on the exact circumstances.

Remy "Psy" Demarest at 2010-02-26 21:37:58:

What does NSProxy instances return when you send them a -class message ?
The documentation for -isProxy says this method is required because -isMemberOfClass: and -isKindOfClass: methods do type-check on their proxied object. I would suppose then that -class would return the class of the proxy itself.

So you could do the same thing in your code, supposing people always do actual type-check using -isMemberOfClass: and -isKindOfClass: and not [obj class] == [MyClass class].

mikeash at 2010-02-26 22:43:15:

-[NSProxy class] returns the proxy class.

Which behavior is correct is definitely debatable. Code which checks class and then starts accessing instance variables directly if it matches will fail hard if you proxy that message. On the other hand, code which checks class as a first check for equality will fail if you don't proxy it.

Ultimately I lean towards making the proxy look as much like the original object as possible, which means proxying class.

Remy "Psy" Demarest at 2010-02-27 02:13:58:

Okay, fair enough, though the -isProxy method is specifically here to tell whether you should trust -class as being the object you're talking or not. IMHO It's a little weird to only be able to know the class of an object through runtime functions, even for proxy object. But of course that's a choice to make, too.

Peter at 2010-02-28 18:43:41:

There is one problem I see with this approach:
In order to maximize parallelism, the Future has to be created as early as possible and used as late as possible. That encourages developers to restructure code for performance instead of legibility. This could be done by the compiler, but obviously the compiler doesn't know about the internals.
It's a bit of a pity, because it's a very nice approach.

mikeash at 2010-02-28 19:49:59:

Are there any performance techniques which don't encourage developers to restructure code for performance? If your goal is faster code, there are going to be compromises, that's just how it is. If you prefer legibility over performance, that's perfectly reasonable, but that does mean that the performance techniques available to you will be extremely limited!

Steven at 2012-03-21 10:19:20:

You can solve the block signature mismatch by making the return type explicit:

MALazyFuture((id) ^ () { return title; });
