In case you have been on Mars, in a cave, with your eyes shut and your fingers in your ears, Swift has been open sourced. This makes it convenient to explore one of the more interesting features of Swift's implementation: how weak references work.
Weak References
In a garbage collected or reference counted language, a strong reference is one which keeps the target object alive. A weak reference is one which doesn't. An object can't be destroyed while there are strong references to it, but it can be destroyed while there are weak references to it.
When we say "weak reference," we usually mean a zeroing weak reference. That is, when the target of the weak reference is destroyed, the weak reference becomes nil. It's also possible to have non-zeroing weak references, which trap, crash, or invoke nasal demons. This is what you get when you use __unsafe_unretained in Objective-C, or unowned in Swift. (Note that Objective-C gives us the nasal-demons version, while Swift takes care to crash reliably.)
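To make the zeroing behavior concrete, here's a minimal sketch in current Swift syntax (not the Swift 2 syntax used in the rest of this article). The names Target, Holder, and observeZeroing are mine, made up for illustration. It only shows what's observable from Swift code, not how the runtime does it:

```swift
final class Target {}

final class Holder {
    // `weak` makes this a zeroing weak reference: it never keeps Target alive.
    weak var target: Target?
}

/// Returns (reference was non-nil while the target lived, reference was nil afterwards).
func observeZeroing() -> (Bool, Bool) {
    let holder = Holder()
    var sawTargetWhileAlive = false
    do {
        let target = Target()
        holder.target = target
        // Keep the only strong reference alive across the check.
        withExtendedLifetime(target) {
            sawTargetWhileAlive = holder.target != nil
        }
    }
    // The last strong reference is gone, so the weak reference now reads as nil.
    return (sawTargetWhileAlive, holder.target == nil)
}

let (aliveInside, zeroedAfter) = observeZeroing()
print(aliveInside, zeroedAfter)  // true true
```

The withExtendedLifetime call is there because Swift is allowed to release a local after its last use rather than at the end of its scope.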
Zeroing weak references are handy to have around, and they're extremely useful in reference counted languages. They allow circular references to exist without creating retain cycles, and without having to manually break back references. They're so useful that I implemented my own version of weak references back before Apple introduced ARC and made language-level weak references available outside of garbage collected code.
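Here's a small sketch of the circular-reference case, again in current Swift syntax. Parent, Child, EventLog, and buildAndDropFamily are hypothetical names for illustration. The parent holds its children strongly, each child points back weakly, and both objects are destroyed without anyone breaking the back reference by hand:

```swift
final class EventLog {
    var events: [String] = []
}

final class Parent {
    let log: EventLog
    var children: [Child] = []   // strong: a parent owns its children
    init(log: EventLog) { self.log = log }
    deinit { log.events.append("Parent deinit") }
}

final class Child {
    let log: EventLog
    weak var parent: Parent?     // weak back reference: no retain cycle
    init(log: EventLog) { self.log = log }
    deinit { log.events.append("Child deinit") }
}

func buildAndDropFamily(log: EventLog) {
    let parent = Parent(log: log)
    let child = Child(log: log)
    child.parent = parent
    parent.children.append(child)
    // Both locals go out of scope here. Nothing needs to be unlinked by hand.
}

let familyLog = EventLog()
buildAndDropFamily(log: familyLog)
print(familyLog.events.sorted())  // ["Child deinit", "Parent deinit"]
```

If the parent property were a strong reference instead, neither deinit would run and the log would stay empty.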
How Does It Work?
The typical implementation for zeroing weak references is to keep a list of all the weak references to each object. When a weak reference is created to an object, that reference is added to the list. When that reference is reassigned or goes out of scope, it's removed from the list. When an object is destroyed, all of the references in the list are zeroed. In a multithreaded environment (i.e. all of them these days), the implementation must synchronize obtaining a weak reference and destroying an object to avoid race conditions when one thread releases the last strong reference to an object at the same time another thread tries to load a weak reference to it.
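As a rough illustration of that typical design, here's a toy registry in current Swift. It's a simulation under loud assumptions: objects are represented by integer IDs rather than real memory, WeakSlot and WeakRegistry are names I made up, and one NSLock stands in for whatever synchronization a real runtime would use. It's not how any particular runtime is implemented:

```swift
import Foundation  // for NSLock

/// A slot that holds a weak reference; `targetID` is nil once zeroed.
final class WeakSlot {
    fileprivate var targetID: Int?
}

/// Toy registry: tracks every weak slot pointing at each live object ID.
final class WeakRegistry {
    private let lock = NSLock()
    private var slotsByTarget: [Int: [WeakSlot]] = [:]
    private var liveObjects: Set<Int> = []

    func createObject(id: Int) {
        lock.lock(); defer { lock.unlock() }
        liveObjects.insert(id)
    }

    /// Store: unregister the slot from its old target, then register it with the new one.
    func store(_ id: Int?, into slot: WeakSlot) {
        lock.lock(); defer { lock.unlock() }
        if let old = slot.targetID {
            slotsByTarget[old]?.removeAll { $0 === slot }
        }
        // A nil or already-destroyed target is stored as nil.
        guard let id = id, liveObjects.contains(id) else {
            slot.targetID = nil
            return
        }
        slot.targetID = id
        slotsByTarget[id, default: []].append(slot)
    }

    /// Load: takes the same lock as destruction, so the two can't race.
    func load(_ slot: WeakSlot) -> Int? {
        lock.lock(); defer { lock.unlock() }
        return slot.targetID
    }

    /// Destroy: zero every registered slot, then forget the object.
    func destroyObject(id: Int) {
        lock.lock(); defer { lock.unlock() }
        for slot in slotsByTarget.removeValue(forKey: id) ?? [] {
            slot.targetID = nil
        }
        liveObjects.remove(id)
    }
}

let registry = WeakRegistry()
registry.createObject(id: 1)
let slotA = WeakSlot()
let slotB = WeakSlot()
registry.store(1, into: slotA)
registry.store(1, into: slotB)
let beforeDestroy = registry.load(slotA)                          // 1
registry.destroyObject(id: 1)
let afterDestroy = (registry.load(slotA), registry.load(slotB))   // (nil, nil)
```

The cost of this design is visible even in the toy: every store touches a shared table, and destruction has to walk the list of slots.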
In my implementation, each weak reference is a full-fledged object. The list of weak references is just a set of weak reference objects. This adds some inefficiency because of the extra indirection and memory use, but it's convenient to have the references be full objects.
In Apple's Objective-C implementation, each weak reference is a plain pointer to the target object. Rather than reading and writing the pointers directly, the compiler uses helper functions. When storing to a weak pointer, the store function registers the pointer location as a weak reference to the target. When reading from a weak pointer, the read function integrates with the reference counting system to ensure that it never returns a pointer to an object that's being deallocated.
Zeroing in Action
Let's build a bit of code so we can watch this stuff happen.
We want to be able to dump the contents of an object's memory. This function takes a region of memory, breaks it into pointer-sized chunks, and turns the whole thing into a convenient hex string:
    func contents(ptr: UnsafePointer<Void>, _ length: Int) -> String {
        let wordPtr = UnsafePointer<UInt>(ptr)
        let words = length / sizeof(UInt.self)
        let wordChars = sizeof(UInt.self) * 2

        let buffer = UnsafeBufferPointer<UInt>(start: wordPtr, count: words)
        let wordStrings = buffer.map({ word -> String in
            var wordString = String(word, radix: 16)
            while wordString.characters.count < wordChars {
                wordString = "0" + wordString
            }
            return wordString
        })
        return wordStrings.joinWithSeparator(" ")
    }
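The function above uses the Swift 2 syntax that was current when this was written. For readers on a newer toolchain, here's a sketch of the same idea in current Swift. The name hexWords is my own, and I've fixed the word size at 64 bits to keep the example independent of the platform:

```swift
/// Formats `length` bytes starting at `ptr` as zero-padded 64-bit hex words.
/// `ptr` must be 8-byte aligned, as object and array storage normally is.
func hexWords(_ ptr: UnsafeRawPointer, _ length: Int) -> String {
    let wordSize = MemoryLayout<UInt64>.size
    return (0..<(length / wordSize)).map { index -> String in
        let word = ptr.load(fromByteOffset: index * wordSize, as: UInt64.self)
        let hex = String(word, radix: 16)
        // Left-pad with zeros so every word is 16 hex digits wide.
        return String(repeating: "0", count: wordSize * 2 - hex.count) + hex
    }.joined(separator: " ")
}

let sampleWords: [UInt64] = [0x1234321012343210, 0, 0xabcdefabcdefabcd]
let sampleDump = sampleWords.withUnsafeBytes { hexWords($0.baseAddress!, $0.count) }
print(sampleDump)
// 1234321012343210 0000000000000000 abcdefabcdefabcd
```

This version dumps an array rather than an object, since peeking at live object memory depends on runtime details that may differ from the ones explored here.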
The next function creates a dumper function for an object. Call it once with an object, and it returns a function that will dump the content of this object. Internally, it saves an UnsafePointer to the object, rather than using a normal reference. This ensures that it doesn't interact with the language's reference counting system. It also allows us to dump the memory of an object after it has been destroyed, which will come in handy later.
    func dumperFunc(obj: AnyObject) -> (Void -> String) {
        let objString = String(obj)
        let ptr = unsafeBitCast(obj, UnsafePointer<Void>.self)
        let length = class_getInstanceSize(obj.dynamicType)
        return {
            let bytes = contents(ptr, length)
            return "\(objString) \(ptr): \(bytes)"
        }
    }
Here's a class that exists to hold a weak reference so we can inspect it. I added dummy variables on either side to make it clear where the weak reference lives in the memory dump:
    class WeakReferer {
        var dummy1 = 0x1234321012343210
        weak var target: WeakTarget?
        var dummy2: UInt = 0xabcdefabcdefabcd
    }
Let's give it a try! We'll start by creating a referer and dumping it:
    let referer = WeakReferer()
    let refererDump = dumperFunc(referer)
    print(refererDump())
This prints:
WeakReferer 0x00007f8a3861b920: 0000000107ab24a0 0000000200000004 1234321012343210 0000000000000000 abcdefabcdefabcd
We can see the isa at the beginning, followed by some other internal fields. dummy1 occupies the 3rd chunk, and dummy2 occupies the 5th. We can see that the weak reference in between them is zero, as expected.
Let's point it at an object now, and see what it looks like. I'll do this inside a do block so we can control when the target goes out of scope and is destroyed:
    do {
        let target = NSObject()
        referer.target = target
        print(target)
        print(refererDump())
    }
This prints:
    <NSObject: 0x7fda6a21c6a0>
    WeakReferer 0x00007fda6a000ad0: 00000001050a44a0 0000000200000004 1234321012343210 00007fda6a21c6a0 abcdefabcdefabcd
As expected, the pointer to the target is stored directly in the weak reference. Let's dump it again after the target is destroyed at the end of the do block:
print(refererDump())
WeakReferer 0x00007ffe32300060: 000000010cfb44a0 0000000200000004 1234321012343210 0000000000000000 abcdefabcdefabcd
It gets zeroed out. Perfect!
Just for fun, let's repeat the experiment with a pure Swift object as the target. It's not nice to bring Objective-C into the picture when it's not necessary. Here's a pure Swift target:
classWeakTarget{}
Let's try it out:
    let referer = WeakReferer()
    let refererDump = dumperFunc(referer)
    print(refererDump())
    do {
        class WeakTarget {}
        let target = WeakTarget()
        referer.target = target
        print(refererDump())
    }
    print(refererDump())
The target starts out zeroed as expected, then gets assigned:
    WeakReferer 0x00007fbe95000270: 00000001071d24a0 0000000200000004 1234321012343210 0000000000000000 abcdefabcdefabcd
    WeakReferer 0x00007fbe95000270: 00000001071d24a0 0000000200000004 1234321012343210 00007fbe95121ce0 abcdefabcdefabcd
Then when the target goes away, the reference should be zeroed:
WeakReferer 0x00007fbe95000270: 00000001071d24a0 0000000200000004 1234321012343210 00007fbe95121ce0 abcdefabcdefabcd
Oh dear. It didn't get zeroed. Maybe the target didn't get destroyed. Something must be keeping it alive! Let's double-check:
    class WeakTarget {
        deinit { print("WeakTarget deinit") }
    }
Running the code again, we get:
    WeakReferer 0x00007fd29a61fa10: 0000000107ae44a0 0000000200000004 1234321012343210 0000000000000000 abcdefabcdefabcd
    WeakReferer 0x00007fd29a61fa10: 0000000107ae44a0 0000000200000004 1234321012343210 00007fd29a42a920 abcdefabcdefabcd
    WeakTarget deinit
    WeakReferer 0x00007fd29a61fa10: 0000000107ae44a0 0000000200000004 1234321012343210 00007fd29a42a920 abcdefabcdefabcd
So it is going away, but the weak reference isn't being zeroed out. How about that, we found a bug in Swift! It's pretty amazing that it hasn't been fixed after all this time. You'd think somebody would have noticed before now. Let's go ahead and generate a nice crash by accessing the reference, then we can file a bug with the Swift project:
    let referer = WeakReferer()
    let refererDump = dumperFunc(referer)
    print(refererDump())
    do {
        class WeakTarget {
            deinit { print("WeakTarget deinit") }
        }
        let target = WeakTarget()
        referer.target = target
        print(refererDump())
    }
    print(refererDump())
    print(referer.target)
Here comes the crash:
    WeakReferer 0x00007ff7aa20d060: 00000001047a04a0 0000000200000004 1234321012343210 0000000000000000 abcdefabcdefabcd
    WeakReferer 0x00007ff7aa20d060: 00000001047a04a0 0000000200000004 1234321012343210 00007ff7aa2157f0 abcdefabcdefabcd
    WeakTarget deinit
    WeakReferer 0x00007ff7aa20d060: 00000001047a04a0 0000000200000004 1234321012343210 00007ff7aa2157f0 abcdefabcdefabcd
    nil
Oh dear squared! Where's the kaboom? There was supposed to be an Earth-shattering kaboom! The output says everything is working after all, but we can see clearly from the dump that it isn't working at all.
Let's inspect everything really carefully. Here's a revised version of WeakTarget with a dummy variable to make it nicer to dump its contents as well:
    class WeakTarget {
        var dummy = 0x0123456789abcdef

        deinit {
            print("Weak target deinit")
        }
    }
Here's some new code that runs through the same procedure and dumps both objects at every step:
    let referer = WeakReferer()
    let refererDump = dumperFunc(referer)
    print(refererDump())
    let targetDump: Void -> String
    do {
        let target = WeakTarget()
        targetDump = dumperFunc(target)
        print(targetDump())

        referer.target = target

        print(refererDump())
        print(targetDump())
    }
    print(refererDump())
    print(targetDump())
    print(referer.target)
    print(refererDump())
    print(targetDump())
Let's walk through the output. The referer starts out life as before, with a zeroed-out target field:
WeakReferer 0x00007fe174802520: 000000010faa64a0 0000000200000004 1234321012343210 0000000000000000 abcdefabcdefabcd
The target starts out life as a normal object, with various header fields followed by our dummy field:
WeakTarget 0x00007fe17341d270: 000000010faa63e0 0000000200000004 0123456789abcdef
Upon assigning to the target field, we can see the pointer value get filled in:
WeakReferer 0x00007fe174802520: 000000010faa64a0 0000000200000004 1234321012343210 00007fe17341d270 abcdefabcdefabcd
The target is much as before, but one of the header fields went up by 2:
WeakTarget 0x00007fe17341d270: 000000010faa63e0 0000000400000004 0123456789abcdef
The target gets destroyed as expected:
Weak target deinit
We see the referer object still has a pointer to the target:
WeakReferer 0x00007fe174802520: 000000010faa64a0 0000000200000004 1234321012343210 00007fe17341d270 abcdefabcdefabcd
And the target itself still looks very much alive, although a different header field went down by 2 compared to the last time we saw it:
WeakTarget 0x00007fe17341d270: 000000010faa63e0 0000000200000002 0123456789abcdef
Accessing the target field produces nil even though it wasn't zeroed out:
    nil
Dumping the referer again shows that the mere act of accessing the target field has altered it. Now it's zeroed out:
WeakReferer 0x00007fe174802520: 000000010faa64a0 0000000200000004 1234321012343210 0000000000000000 abcdefabcdefabcd
The target is now totally obliterated:
WeakTarget 0x00007fe17341d270: 200007fe17342a04 300007fe17342811 ffffffffffff0002
More and more interesting. We saw header fields incrementing and decrementing a bit. Let's see if we can make that happen more:
    let target = WeakTarget()
    let targetDump = dumperFunc(target)
    do {
        print(targetDump())
        weak var a = target
        print(targetDump())
        weak var b = target
        print(targetDump())
        weak var c = target
        print(targetDump())
        weak var d = target
        print(targetDump())
        weak var e = target
        print(targetDump())

        var f = target
        print(targetDump())
        var g = target
        print(targetDump())
        var h = target
        print(targetDump())
        var i = target
        print(targetDump())
        var j = target
        print(targetDump())
        var k = target
        print(targetDump())
    }
    print(targetDump())
This prints:
    WeakTarget 0x00007fd883205df0: 00000001093a4840 0000000200000004 0123456789abcdef
    WeakTarget 0x00007fd883205df0: 00000001093a4840 0000000400000004 0123456789abcdef
    WeakTarget 0x00007fd883205df0: 00000001093a4840 0000000600000004 0123456789abcdef
    WeakTarget 0x00007fd883205df0: 00000001093a4840 0000000800000004 0123456789abcdef
    WeakTarget 0x00007fd883205df0: 00000001093a4840 0000000a00000004 0123456789abcdef
    WeakTarget 0x00007fd883205df0: 00000001093a4840 0000000c00000004 0123456789abcdef
    WeakTarget 0x00007fd883205df0: 00000001093a4840 0000000c00000008 0123456789abcdef
    WeakTarget 0x00007fd883205df0: 00000001093a4840 0000000c0000000c 0123456789abcdef
    WeakTarget 0x00007fd883205df0: 00000001093a4840 0000000c00000010 0123456789abcdef
    WeakTarget 0x00007fd883205df0: 00000001093a4840 0000000c00000014 0123456789abcdef
    WeakTarget 0x00007fd883205df0: 00000001093a4840 0000000c00000018 0123456789abcdef
    WeakTarget 0x00007fd883205df0: 00000001093a4840 0000000c0000001c 0123456789abcdef
    WeakTarget 0x00007fd883205df0: 00000001093a4840 0000000200000004 0123456789abcdef
We can see that the first number in this header field goes up by 2 with every new weak reference. The second number goes up by 4 with every new strong reference.
To recap, here's what we've seen so far:
- When a weak target's deinit runs, the target is not deallocated, and the weak pointer is not zeroed.
- When the weak pointer is accessed after the target's deinit runs, it is zeroed on access and the weak target is deallocated.
Swift Code
Now that Swift is open source, we can actually go relate this observed behavior to the source code.
The Swift standard library represents objects allocated on the heap with a HeapObject type located in stdlib/public/SwiftShims/HeapObject.h. It looks like:
    struct HeapObject {
      /// This is always a valid pointer to a metadata object.
      struct HeapMetadata const *metadata;

      SWIFT_HEAPOBJECT_NON_OBJC_MEMBERS;
      // FIXME: allocate two words of metadata on 32-bit platforms

    #ifdef __cplusplus
      HeapObject() = default;

      // Initialize a HeapObject header as appropriate for a newly-allocated object.
      constexpr HeapObject(HeapMetadata const *newMetadata)
        : metadata(newMetadata)
        , refCount(StrongRefCount::Initialized)
        , weakRefCount(WeakRefCount::Initialized)
      { }
    #endif
    };
The metadata field is the Swift equivalent of the isa field in Objective-C, and in fact it's compatible. Then there are these NON_OBJC_MEMBERS defined in a macro:
    #define SWIFT_HEAPOBJECT_NON_OBJC_MEMBERS \
      StrongRefCount refCount;                \
      WeakRefCount weakRefCount
Well, look at that! There are our two reference counts.
(Bonus question: why is the strong count first here, while in the dumps above the weak count was first?)
The reference counts are managed by a bunch of functions located in stdlib/public/runtime/HeapObject.cpp. For example, here's swift_retain:
    void swift::swift_retain(HeapObject *object) {
      SWIFT_RETAIN();
      _swift_retain(object);
    }
    static void _swift_retain_(HeapObject *object) {
      _swift_retain_inlined(object);
    }
    auto swift::_swift_retain = _swift_retain_;
There's a bunch of indirection, but it eventually calls through to this inline function in the header:
    static inline void _swift_retain_inlined(HeapObject *object) {
      if (object) {
        object->refCount.increment();
      }
    }
As you'd expect, it increments the reference count. Here's the implementation of increment:
    void increment() {
      __atomic_fetch_add(&refCount, RC_ONE, __ATOMIC_RELAXED);
    }
RC_ONE comes from an enum:
    enum : uint32_t {
      RC_PINNED_FLAG = 0x1,
      RC_DEALLOCATING_FLAG = 0x2,

      RC_FLAGS_COUNT = 2,
      RC_FLAGS_MASK = 3,
      RC_COUNT_MASK = ~RC_FLAGS_MASK,

      RC_ONE = RC_FLAGS_MASK + 1
    };
We can see why the count went up by 4 with each new strong reference. The low two bits of the field are used for flags. Looking back at the dumps, we can see those flags in action. Here's a weak target before and after the last strong reference went away:
    WeakTarget 0x00007fe17341d270: 000000010faa63e0 0000000400000004 0123456789abcdef
    Weak target deinit
    WeakTarget 0x00007fe17341d270: 000000010faa63e0 0000000200000002 0123456789abcdef
The field went from 4, denoting a reference count of 1 and no flags, to 2, denoting a reference count of zero and RC_DEALLOCATING_FLAG set. This post-deinit object is placed in some sort of DEALLOCATING limbo.
(Incidentally, what is RC_PINNED_FLAG for? I poked through the code base and couldn't figure out anything beyond that it indicates a "pinned object," which is already pretty obvious from the name. If you figure it out or have an informed guess, please post a comment.)
Let's check out the weak reference count's implementation, while we're here. It has the same sort of enum:
    enum : uint32_t {
      // There isn't really a flag here.
      // Making weak RC_ONE == strong RC_ONE saves an
      // instruction in allocation on arm64.
      RC_UNUSED_FLAG = 1,

      RC_FLAGS_COUNT = 1,
      RC_FLAGS_MASK = 1,
      RC_COUNT_MASK = ~RC_FLAGS_MASK,

      RC_ONE = RC_FLAGS_MASK + 1
    };
That's where the 2 comes from: there's space reserved for one flag, which is currently unused. Oddly, the comment in this code appears to be incorrect, as RC_ONE here is equal to 2, whereas the strong RC_ONE is equal to 4. I'd guess they were once equal, and then it was changed and the comment wasn't updated. Just goes to show that comments are useless and you shouldn't ever write them.
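Putting the two enums together, we can decode the combined header word from the dumps by hand. Here's a sketch in current Swift that does the arithmetic. DecodedHeader and decodeRefCounts are names I made up. The shifts and masks come from the enum values above, and treating the low 32 bits of the dumped word as the strong field is my reading of the dumps rather than anything the runtime documents:

```swift
struct DecodedHeader: Equatable {
    var strongCount: UInt32
    var pinned: Bool
    var deallocating: Bool
    var weakCount: UInt32
}

/// Decodes one 64-bit word as printed in the dumps, e.g. 0x0000000c0000001c.
func decodeRefCounts(_ word: UInt64) -> DecodedHeader {
    // Assumption: low half is the strong field, high half is the weak field.
    let strongField = UInt32(truncatingIfNeeded: word)
    let weakField = UInt32(truncatingIfNeeded: word >> 32)
    return DecodedHeader(
        strongCount: strongField >> 2,          // strong RC_FLAGS_COUNT = 2
        pinned: strongField & 0x1 != 0,         // RC_PINNED_FLAG
        deallocating: strongField & 0x2 != 0,   // RC_DEALLOCATING_FLAG
        weakCount: weakField >> 1)              // weak RC_FLAGS_COUNT = 1
}

let fresh = decodeRefCounts(0x0000000200000004)
let busy = decodeRefCounts(0x0000000c0000001c)
let limbo = decodeRefCounts(0x0000000200000002)
print(fresh.strongCount, fresh.weakCount)                     // 1 1
print(busy.strongCount, busy.weakCount)                       // 7 6
print(limbo.strongCount, limbo.weakCount, limbo.deallocating) // 0 1 true
```

The middle value is the dump taken with five weak and six extra strong references outstanding: each count is one higher than the number of references we created, because a fresh object starts with both counts at 1.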
How does all of this tie in to loading weak references? That's handled by a function called swift_weakLoadStrong:
    HeapObject *swift::swift_weakLoadStrong(WeakReference *ref) {
      auto object = ref->Value;
      if (object == nullptr) return nullptr;
      if (object->refCount.isDeallocating()) {
        swift_weakRelease(object);
        ref->Value = nullptr;
        return nullptr;
      }
      return swift_tryRetain(object);
    }
From this, it's clear how the lazy zeroing works. When loading a weak reference, if the target is deallocating, zero out the reference. Otherwise, try to retain the target, and return it. Digging a bit further, we can see how swift_weakRelease deallocates the object's memory if it's the last reference:
    void swift::swift_weakRelease(HeapObject *object) {
      if (!object) return;

      if (object->weakRefCount.decrementShouldDeallocate()) {
        // Only class objects can be weak-retained and weak-released.
        auto metadata = object->metadata;
        assert(metadata->isClassObject());
        auto classMetadata = static_cast<const ClassMetadata*>(metadata);
        assert(classMetadata->isTypeMetadata());
        swift_slowDealloc(object, classMetadata->getInstanceSize(),
                          classMetadata->getInstanceAlignMask());
      }
    }
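To see the whole sequence in one place, here's a toy model of the two-count scheme in current Swift. It's a simulation, not the runtime: ToyObject, ToyWeakRef, and the toy functions are names I made up, the counts are plain integers with no atomics, and a regular Swift reference stands in for the raw pointer. The initial weak count of 1, dropped when deinit runs, is an assumption inferred from the dumps above:

```swift
final class ToyObject {
    var strongCount = 1
    var weakCount = 1          // assumption: one weak count held on behalf of all strong references
    var isDeallocating = false
    var events: [String] = []
}

final class ToyWeakRef {
    // A plain Swift reference stands in for the raw pointer the runtime stores.
    var value: ToyObject?
}

func toyWeakAssign(_ ref: ToyWeakRef, _ object: ToyObject) {
    object.weakCount += 1
    ref.value = object
}

func toyWeakRelease(_ object: ToyObject) {
    object.weakCount -= 1
    if object.weakCount == 0 { object.events.append("memory freed") }
}

func toyStrongRelease(_ object: ToyObject) {
    object.strongCount -= 1
    if object.strongCount == 0 {
        // Deinitialize now, but only free the memory once the weak count hits zero.
        object.isDeallocating = true
        object.events.append("deinit")
        toyWeakRelease(object)
    }
}

/// Mirrors the shape of swift_weakLoadStrong: zero lazily, on load.
func toyWeakLoadStrong(_ ref: ToyWeakRef) -> ToyObject? {
    guard let object = ref.value else { return nil }
    if object.isDeallocating {
        toyWeakRelease(object)
        ref.value = nil
        return nil
    }
    object.strongCount += 1
    return object
}

let toyTarget = ToyObject()
let toyRef = ToyWeakRef()
toyWeakAssign(toyRef, toyTarget)
toyStrongRelease(toyTarget)
let eventsAfterRelease = toyTarget.events        // ["deinit"]
let loaded = toyWeakLoadStrong(toyRef)           // nil
let eventsAfterLoad = toyTarget.events           // ["deinit", "memory freed"]
print(eventsAfterRelease, loaded == nil, eventsAfterLoad)
```

The model reproduces the order we saw in the dumps: deinit first, the reference still pointing at the target, and the memory only going away once the weak reference is loaded.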
(Note: if you're looking at the code in the repository, the naming has changed to use "unowned" instead of "weak" for most cases. The naming above is current as of the latest snapshot at the time of this writing, but development moves on. You can view the repository as of the 2.2 snapshot to see it as I have it here, or grab the latest but be aware of the naming changes, and possibly implementation changes.)
Putting it All Together
We've seen it all from top to bottom now. What's the high-level view on how Swift weak references actually work?
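- Weak references are stored as plain pointers to the target object.
- Each object carries a weak reference count alongside its strong reference count.
- When the strong count reaches zero, the object's deinit runs and it is marked as deallocating, but its memory stays allocated while weak references remain.
- When a weak reference is loaded, the runtime checks the target. If the target is deallocating, it zeroes the weak reference, decrements the weak count (freeing the memory if that was the last one), and returns nil.
This design has some interesting consequences compared to Objective-C's approach: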
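- Weak references to an object will cause that object's memory to remain allocated even after there are no strong references to it, until all weak references are either loaded or discarded. This temporarily increases memory usage. Note that the effect is small, because while the target object's memory remains allocated, it's only the memory for the instance itself. All external resources (including storage for Array or Dictionary properties) are freed when the last strong reference goes away. A weak reference can cause a single instance to stay allocated, but not a whole tree of objects.
- [...] isa by using a non-pointer isa, but I'm not sure how important that is or how it's going to shake out in the long term. For 32-bit, it looks like the weak count increases object sizes by four bytes. The importance of 32-bit is diminishing by the day, however.
- [...] unowned. Under the hood, unowned works exactly like weak, except that it fails loudly if the target went away rather than returning nil. In Objective-C, __unsafe_unretained is implemented as a raw pointer with undefined behavior if you access it late because it's supposed to be fast, and loading a weak pointer is somewhat slow.
Conclusion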
Swift's weak pointers use an interesting approach that provides correctness, speed, and low memory overhead. By tracking a weak reference count for each object and decoupling object deinitialization from object deallocation, weak references can be resolved both safely and quickly. The availability of the source code for the standard library lets us see exactly what's going on at the source level, instead of groveling through disassemblies and memory dumps as we often do. Of course, as you can see above, it's hard to break that habit fully.
That's it for today. Come back next time for more goodies. That might be a few weeks, as the holidays intervene, but I'm going to shoot for one shortish article before that happens. In any case, keep your suggestions for topics coming in. Friday Q&A is driven by reader ideas, so if you have one you'd like to see covered, let me know!