Posted at 2017-07-14 13:57
Tags: fridayqna serialization swift

Friday Q&A 2017-07-14: Swift.Codable

by Mike Ash

One of the interesting additions to Swift 4 is the Codable protocol and the machinery around it. This is a subject near and dear to my heart, and I want to discuss what it is and how it works today.

Serialization
Serializing values to data that can be stored on disk or transmitted over a network is a common need. It's especially common in this age of always-connected mobile apps.

So far, the options for serialization in Apple's ecosystem were limited:

NSCoding provides intelligent serialization of complex object graphs and works with your own types, but it uses a poorly documented serialization format not suitable for cross-platform work, and requires writing code to manually encode and decode your types.
NSPropertyListSerialization and NSJSONSerialization can convert between standard Cocoa types like NSDictionary/NSString and property lists or JSON. JSON in particular is used all over the place for server communication. Since these APIs provide low-level values, you have to write a bunch of code to extract meaning from those values. That code is often ad-hoc and handles bad data poorly.
NSXMLParser and NSXMLDocument are the choice of masochists or people stuck working with systems that use XML. Converting between the basic parsed data and more meaningful model objects is once again up to the programmer.
Finally, there's always the option to build your own from scratch. This is fun, but a lot of work, and error-prone.

These approaches tend to result in a lot of boilerplate code, where you declare a property called foo of type String which is encoded by storing the String stored in foo under the key "foo" and is decoded by retrieving the value for the key "foo", attempting to cast it to a String, storing it into foo on success, or throwing an error on failure. Then you declare a property called bar of type String which....

Naturally, programmers dislike these repetitive tasks. Repetition is what computers are for. We want to be able to just write this:

    struct Whatever {
        var foo: String
        var bar: String
    }

And have it be serializable. It ought to be possible: all the necessary information is already present.

Reflection is a common way to accomplish this. A lot of Objective-C programmers have written code to automatically read and write Objective-C objects to and from JSON objects. The Objective-C runtime provides all of the information you need to do this automatically. For Swift, we can use the Objective-C runtime, or make do with Swift's Mirror and use wacky workarounds to compensate for its inability to mutate properties.
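To make the reflection approach concrete, here is a minimal sketch of the read-only half using Swift's Mirror. The helper name reflectedDictionary(of:) is made up for illustration. It only covers the "encode" direction, because Mirror can't write values back into properties.

```swift
struct Whatever {
    var foo: String
    var bar: String
}

// Hypothetical helper: walk the stored properties with Mirror and
// collect them into a dictionary keyed by property name.
// Mirror is read-only, so there is no matching "decode" direction here.
func reflectedDictionary(of value: Any) -> [String: Any] {
    var result: [String: Any] = [:]
    for child in Mirror(reflecting: value).children {
        // Children without a label (tuple elements, for example) are skipped.
        guard let label = child.label else { continue }
        result[label] = child.value
    }
    return result
}

let reflected = reflectedDictionary(of: Whatever(foo: "hello", bar: "world"))
```

Everything comes back as Any, so each value has to be cast at runtime. That's the loss of static typing that the reflection approach brings with it.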

Outside of Apple's ecosystem, this is a common approach in many languages. This has led to various hilarious security bugs over the years.

Reflection is not a particularly good solution to this problem. It's easy to get it wrong and create security bugs. It's less able to use static typing, so more errors happen at runtime rather than compile time. And it tends to be pretty slow, since the code has to be completely general and does lots of string lookups with type metadata.

Swift has taken the approach of compile-time code generation rather than runtime reflection. This means that some of the knowledge has to be built in to the compiler, but the result is fast and takes advantage of static typing, while still remaining easy to use.

Overview
There are a few fundamental protocols that Swift's new encoding system is built around.

The Encodable protocol is used for types which can be encoded. If you conform to this protocol and all stored properties in your type are themselves Encodable, then the compiler will generate an implementation for you. If you don't meet the requirements, or you need special handling, you can implement it yourself.

The Decodable protocol is the companion to the Encodable protocol and denotes types which can be decoded. Like Encodable, the compiler will generate an implementation for you if your stored properties are all Decodable.

Because Encodable and Decodable usually go together, there's another protocol called Codable which is just the two protocols glued together:

    typealias Codable = Decodable & Encodable
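As a quick illustration of how little code this takes, here's a sketch that declares conformance and round-trips a value through Foundation's JSONEncoder and JSONDecoder. The roundTrip helper is made up for this example.

```swift
import Foundation

// Declaring conformance is all that's needed: both stored properties
// are Strings, which are already Codable.
struct Whatever: Codable {
    var foo: String
    var bar: String
}

// Hypothetical helper: encode to JSON data, then decode a fresh copy.
func roundTrip(_ value: Whatever) throws -> Whatever {
    let data = try JSONEncoder().encode(value)
    return try JSONDecoder().decode(Whatever.self, from: data)
}

let copy = try? roundTrip(Whatever(foo: "hello", bar: "world"))
```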

These two protocols are really simple. Each one contains just one requirement:

    protocol Encodable {
        func encode(to encoder: Encoder) throws
    }

    protocol Decodable {
        init(from decoder: Decoder) throws
    }

The Encoder and Decoder protocols specify how objects can actually encode and decode themselves. You don't have to worry about these for basic use, since the default implementation of Codable handles all the details for you, but you need to use them if you write your own Codable implementation. These are complex and we'll look at them later.

Finally, there's a CodingKey protocol which is used to denote keys used for encoding and decoding. This adds an extra layer of static type checking to the process compared to using plain strings everywhere. It provides a String, and optionally an Int for positional keys:

    protocol CodingKey {
        var stringValue: String { get }
        init?(stringValue: String)

        var intValue: Int? { get }
        init?(intValue: Int)
    }
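In practice, keys are usually enums, and the protocol requirements get filled in from the enum's cases and raw values. Here's a small sketch with two made-up key types, one String-backed and one Int-backed, showing what each requirement returns.

```swift
// Hypothetical key type backed by String raw values.
enum PlanetKey: String, CodingKey {
    case name
    case moonCount = "moon_count"
}

let moonString = PlanetKey.moonCount.stringValue    // the raw value, "moon_count"
let moonInt = PlanetKey.moonCount.intValue          // nil: no positional meaning
let parsedName = PlanetKey(stringValue: "name")     // .name
let parsedRings = PlanetKey(stringValue: "rings")   // nil: no such key

// Hypothetical key type backed by Int raw values, for positional formats.
enum ColumnKey: Int, CodingKey {
    case id = 0
    case title = 1
}

let titleInt = ColumnKey.title.intValue             // the raw value, 1
let titleString = ColumnKey.title.stringValue       // the case name, "title"
let firstColumn = ColumnKey(intValue: 0)            // .id
```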

Encoders and Decoders
The basic concept of Encoder and Decoder is similar to NSCoder. Objects receive a coder and then call its methods to encode or decode themselves.

The API of NSCoder is straightforward. NSCoder has a bunch of methods like encodeObject:forKey: and encodeInteger:forKey: which objects call to perform their coding. Objects can also use unkeyed methods like encodeObject: and encodeInteger: to do things positionally instead of by key.

Swift's API is more indirect. Encoder doesn't have any methods of its own for encoding values. Instead, it provides containers, and those containers then have methods for encoding values. There's one container for keyed encoding, one for unkeyed encoding, and one for encoding a single value.

This helps make things more explicit and fits better with portable serialization formats. NSCoder only has to work with Apple's encoding format so it just needs to put the same thing out that it got in. Encoder has to work with things like JSON. If an object encodes values with keys, that should produce a JSON dictionary. If it uses unkeyed encoding then that should produce a JSON array. What if the object is empty and encodes no values? With the NSCoder approach, it would have no idea what to output. With Encoder, the object will still request a keyed or unkeyed container and the encoder can figure it out from that.

Decoder works the same way. You don't decode values from it directly, but rather ask for a container, and then decode values from the container. Like Encoder, Decoder provides keyed, unkeyed, and single value containers.

Because of this container design, the Encoder and Decoder protocols themselves are small. They contain a bit of bookkeeping info, and methods for obtaining containers:
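Here's a sketch of how the container choice shows up in JSONEncoder's output. All the type names and the jsonString helper are made up for this example. The same two values come out as a dictionary or an array depending on which container the type asks for, and a type that encodes nothing still tells the encoder which shape it wants.

```swift
import Foundation

// Compiler-generated conformance uses a keyed container.
struct KeyedPoint: Encodable {
    var x: Int
    var y: Int
}

// Hand-written conformance that asks for an unkeyed container instead.
struct UnkeyedPoint: Encodable {
    var x: Int
    var y: Int

    func encode(to encoder: Encoder) throws {
        var container = encoder.unkeyedContainer()
        try container.encode(x)
        try container.encode(y)
    }
}

// Encodes no values, but still requests a keyed container.
struct EmptyKeyed: Encodable {
    private enum CodingKeys: String, CodingKey { case unused }

    func encode(to encoder: Encoder) throws {
        _ = encoder.container(keyedBy: CodingKeys.self)
    }
}

// Encodes no values, but still requests an unkeyed container.
struct EmptyUnkeyed: Encodable {
    func encode(to encoder: Encoder) throws {
        _ = encoder.unkeyedContainer()
    }
}

// Hypothetical helper: encode with JSONEncoder and return the text.
func jsonString<T: Encodable>(_ value: T) -> String? {
    guard let data = try? JSONEncoder().encode(value) else { return nil }
    return String(data: data, encoding: .utf8)
}

let keyed = jsonString(KeyedPoint(x: 1, y: 2))      // a JSON dictionary with keys x and y
let unkeyed = jsonString(UnkeyedPoint(x: 1, y: 2))  // [1,2]
let emptyKeyed = jsonString(EmptyKeyed())           // {}
let emptyUnkeyed = jsonString(EmptyUnkeyed())       // []
```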

    protocol Encoder {
        var codingPath: [CodingKey] { get }
        var userInfo: [CodingUserInfoKey: Any] { get }

        func container<Key>(keyedBy type: Key.Type)
            -> KeyedEncodingContainer<Key> where Key: CodingKey
        func unkeyedContainer() -> UnkeyedEncodingContainer
        func singleValueContainer() -> SingleValueEncodingContainer
    }

    protocol Decoder {
        var codingPath: [CodingKey] { get }
        var userInfo: [CodingUserInfoKey: Any] { get }

        func container<Key>(keyedBy type: Key.Type) throws
            -> KeyedDecodingContainer<Key> where Key: CodingKey
        func unkeyedContainer() throws -> UnkeyedDecodingContainer
        func singleValueContainer() throws -> SingleValueDecodingContainer
    }

The complexity is in the container types. You can get pretty far by recursively walking through properties of Codable types, but at some point you need to get down to some raw encodable types which can be directly encoded and decoded. For Codable, those types include the various integer types, Float, Double, Bool, and String. That makes for a whole bunch of really similar encode/decode methods. Unkeyed containers also directly support encoding sequences of the raw encodable types.

Beyond those basic methods, there are a bunch of methods that support exotic use cases. KeyedDecodingContainer has methods called decodeIfPresent which return an optional and return nil for missing keys instead of throwing. The encoding containers have methods for weak encoding, which encodes an object only if something else encodes it too (useful for parent references in a complex graph). There are methods for getting nested containers, which allows you to encode hierarchies. Finally, there are methods for getting a "super" encoder or decoder, which is intended to allow subclasses and superclasses to coexist peacefully when encoding and decoding. The subclass can encode itself directly, and then ask the superclass to encode itself with a "super" encoder, which ensures keys don't conflict.

Implementing Codable
Implementing Codable is easy: declare conformance and let the compiler generate it for you.

It's useful to know just what it's doing, though. Let's take a look at what it ends up generating and how you would do it yourself. We'll start with an example Codable type:
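Two of those extras are easy to show on the decoding side. This sketch uses a made-up Knight type and JSON layout: decodeIfPresent handles an optional key, and nestedContainer reaches into a nested dictionary without needing a separate type for it.

```swift
import Foundation

// Hypothetical model whose JSON keeps favoriteColor one level down,
// inside a "preferences" dictionary.
struct Knight: Decodable {
    var name: String
    var nickname: String?
    var favoriteColor: String

    private enum CodingKeys: String, CodingKey {
        case name, nickname, preferences
    }

    private enum PreferenceKeys: String, CodingKey {
        case favoriteColor = "favorite_color"
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        name = try container.decode(String.self, forKey: .name)

        // Returns nil for a missing key instead of throwing.
        nickname = try container.decodeIfPresent(String.self, forKey: .nickname)

        // Ask for a keyed container for the nested dictionary.
        let preferences = try container.nestedContainer(
            keyedBy: PreferenceKeys.self, forKey: .preferences)
        favoriteColor = try preferences.decode(String.self, forKey: .favoriteColor)
    }
}

let knightJSON = """
{"name": "Lancelot", "preferences": {"favorite_color": "blue"}}
"""
let knight = try? JSONDecoder().decode(Knight.self, from: Data(knightJSON.utf8))
```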

    struct Person: Codable {
        var name: String
        var age: Int
        var quest: String
    }

The compiler generates a CodingKeys type nested inside Person. If we did it ourselves, that nested type would look like this:

    private enum CodingKeys: CodingKey {
        case name
        case age
        case quest
    }

The case names match Person's property names. Compiler magic gives each CodingKeys case a string value which matches its case name, which means that the property names are also the keys used for encoding them.

If we need different names, we can easily accomplish this by providing our own CodingKeys with custom raw values. For example, we might write this:

    private enum CodingKeys: String, CodingKey {
        case name = "person_name"
        case age
        case quest
    }

This will cause the name property to be encoded and decoded under person_name. And this is all we have to do. The compiler happily accepts our custom CodingKeys type while still providing a default implementation for the rest of Codable, and that default implementation uses our custom type. You can mix and match customizations with the compiler-provided code.
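Here's a sketch of that customization in action, using Foundation's JSON coders with made-up sample data. Only the CodingKeys enum is written by hand. The compiler still generates encode(to:) and init(from:).

```swift
import Foundation

struct Person: Codable {
    var name: String
    var age: Int
    var quest: String

    // Only the key for name is customized. The other two keep their case names.
    private enum CodingKeys: String, CodingKey {
        case name = "person_name"
        case age
        case quest
    }
}

let personJSON = Data("""
{"person_name": "Arthur", "age": 40, "quest": "To seek the Holy Grail"}
""".utf8)

// Decoding reads the name from "person_name"...
let person = try? JSONDecoder().decode(Person.self, from: personJSON)

// ...and encoding writes it back out under the same key.
let reencoded = person
    .flatMap { try? JSONEncoder().encode($0) }
    .flatMap { String(data: $0, encoding: .utf8) }
```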

The compiler also generates an implementation for encode(to:) and init(from:). The implementation of encode(to:) gets a keyed container and then encodes each property in turn:

    func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: CodingKeys.self)
        try container.encode(name, forKey: .name)
        try container.encode(age, forKey: .age)
        try container.encode(quest, forKey: .quest)
    }

The compiler generates an implementation of init(from:) which mirrors this:

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        name = try container.decode(String.self, forKey: .name)
        age = try container.decode(Int.self, forKey: .age)
        quest = try container.decode(String.self, forKey: .quest)
    }

That's all there is to it. Just like with CodingKeys, if you need custom behavior here you can implement your own version of one of these methods while letting the compiler generate the rest. Unfortunately, there's no way to specify custom behavior for an individual property, so you have to write out the whole thing even if you want the default behavior for the rest. This is not particularly terrible, though.

If you were to do it all by hand, the full implementation of Codable for Person would look like this:

    extension Person {
        private enum CodingKeys: CodingKey {
            case name
            case age
            case quest
        }

        func encode(to encoder: Encoder) throws {
            var container = encoder.container(keyedBy: CodingKeys.self)
            try container.encode(name, forKey: .name)
            try container.encode(age, forKey: .age)
            try container.encode(quest, forKey: .quest)
        }

        init(from decoder: Decoder) throws {
            let container = try decoder.container(keyedBy: CodingKeys.self)
            name = try container.decode(String.self, forKey: .name)
            age = try container.decode(Int.self, forKey: .age)
            quest = try container.decode(String.self, forKey: .quest)
        }
    }
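As a sketch of what that looks like, suppose the server sometimes sends age as a string (an assumption invented for this example). We write init(from:) by hand, including the two properties that need nothing special, and let the compiler generate encode(to:). CodingKeys is spelled out here so that the hand-written initializer can refer to it.

```swift
import Foundation

struct Person: Codable {
    var name: String
    var age: Int
    var quest: String

    private enum CodingKeys: String, CodingKey {
        case name, age, quest
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)

        // Default behavior, but it still has to be written out.
        name = try container.decode(String.self, forKey: .name)
        quest = try container.decode(String.self, forKey: .quest)

        // Custom behavior: accept either a number or a numeric string.
        if let number = try? container.decode(Int.self, forKey: .age) {
            age = number
        } else {
            let text = try container.decode(String.self, forKey: .age)
            guard let parsed = Int(text) else {
                throw DecodingError.dataCorruptedError(
                    forKey: .age, in: container,
                    debugDescription: "age is not a number: \(text)")
            }
            age = parsed
        }
    }

    // No encode(to:) here: the compiler generates it.
}

let stringAgeJSON = Data("""
{"name": "Galahad", "age": "25", "quest": "To seek the Holy Grail"}
""".utf8)
let galahad = try? JSONDecoder().decode(Person.self, from: stringAgeJSON)
let galahadJSON = galahad
    .flatMap { try? JSONEncoder().encode($0) }
    .flatMap { String(data: $0, encoding: .utf8) }
```

The generated encode(to:) writes age back out as a plain number, so the custom handling only applies in the decoding direction.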

Implementing Encoder and Decoder
You may never need to implement your own Encoder or Decoder. Swift provides implementations for JSON and property lists, which take care of the common use cases.
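The property list coders work just like the JSON ones. Here's a sketch that round-trips the Person type through Foundation's PropertyListEncoder and PropertyListDecoder, using the XML format so that the result is readable. The helper function is made up for this example.

```swift
import Foundation

struct Person: Codable {
    var name: String
    var age: Int
    var quest: String
}

// Hypothetical helper: encode to an XML property list, decode it again,
// and return both the XML text and the decoded copy.
func roundTripThroughPlist(_ person: Person) throws -> (xml: String, decoded: Person) {
    let encoder = PropertyListEncoder()
    encoder.outputFormat = .xml

    let data = try encoder.encode(person)
    let decoded = try PropertyListDecoder().decode(Person.self, from: data)
    return (String(decoding: data, as: UTF8.self), decoded)
}

let plistResult = try? roundTripThroughPlist(
    Person(name: "Arthur", age: 40, quest: "To seek the Holy Grail"))
```

The model type doesn't change at all between formats. Only the encoder and decoder are swapped out.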

You can implement your own in order to support a custom format. The size of the container protocols means this will take some effort. Fortunately, it's mostly a matter of size, not complexity.

To implement a custom Encoder, you'll need something that implements the Encoder protocol plus implementations of the container protocols. Implementing the three container protocols involves a lot of repetitive code to implement encoding or decoding methods for all of the various directly encodable types.

How they work is up to you. The Encoder will probably need to store the data being encoded, and the containers will inform the Encoder of the various things they're encoding.

Implementing a custom Decoder is similar. You'll need to implement that protocol plus the container protocols. The decoder will hold the serialized data and the containers will communicate with it to provide the requested values.

I've been experimenting with a custom binary encoder and decoder as a way to learn the protocols, and I hope to present that in a future article as an example of how to do it.

Conclusion
Swift 4's Codable API looks great and ought to simplify a lot of common code. For typical JSON tasks, it's sufficient to declare conformance to Codable in your model types and let the compiler do the rest. When needed, you can implement parts of the protocol yourself in order to handle things differently, and you can implement it all if needed.

The companion Encoder and Decoder protocols are more complex, but justifiably so. Supporting a custom format by implementing your own Encoder and Decoder takes some work, but is mostly a matter of filling in a lot of similar blanks.

That's it for today! Come back again for more exciting serialization-related material, and perhaps even things not related to serialization. Until then, Friday Q&A is driven by reader ideas, so if you have a topic you'd like to see covered here, please send it in!

Comments:

nolen at 2017-07-14 19:36:29:

Thanks for the informative and timely post. I'm actually in the process of implementing some custom Codable types.

Andrey Mishanin at 2017-07-15 18:51:46:

Hey Mike, any idea why we have to explicitly specify the types of properties when decoding instead of letting the compiler infer them from the return type of decode(forKey:)? Thanks!

David Sinclair at 2017-07-16 04:31:57:

One thing I've been wondering about: is there any mechanism to skip some properties? I sometimes have some properties that are only needed at runtime, so don't need to be serialized.

Andrey Tarantsov at 2017-07-16 05:41:06:

David, to skip some keys, you just omit those keys from your custom CodingKeys enum.

Itai Ferber at 2017-07-17 01:10:17:

Andrey, it's because overloading on return type and letting the compiler infer the overload can very easily lead to ambiguity in surprising situations. The Swift API Design Guidelines (https://swift.org/documentation/api-design-guidelines/) recommend avoiding this, and we preferred a more conservative but consistent API over something potentially confusing and ambiguous.

(That being said, it's always possible to write an extension yourself that performs this overloading.)

QQ at 2017-07-19 21:33:05:

I was disappointed to find that this post, like everything else I've read about Codable, doesn't discuss archiving. There is some information about using Codable within NSCoding in SE-0167, but nothing addresses the vital point, which is the semantics of reference equivalence. JSON data is value based because tree structured. Archives are reference based because that's the only way to properly archive object graphs. When a Swift instance is added to an archive using encodeCodable, is reference equivalence (on decoding) preserved for instances that are reference types? For (nested) references internal to the instance's properties? This seems kinda important.

mikeash at 2017-07-19 21:50:25:

Reference equivalence and other such things are entirely up to the encoder implementation. The JSON and property list encoders in the standard library implement everything as values. If the same instance is encoded twice, the result is two separate entities in the resulting data, and decoding that data will produce two instances. These encoders are intended to translate to/from typical JSON and plist structures, not offer full object archiving that happens to use JSON/plist as its low-level serialization format.

If and when NSCoding works with Codable, it should work the same way it does with ObjC, i.e. preserving reference equivalence. SE-0167 mentions adding methods that use Codable to NSKeyedArchiver and NSKeyedUnarchiver, but it looks like this hasn't been done yet. Otherwise you'd have to make your own.

QQ at 2017-07-20 20:12:32:

FWIW, (a) I wasn't criticizing the JSON decoder, just noting the semantic difference between that and coding/decoding in the NSCoding sense; (b) the end of SE-0167 says that "encodeCodable" is hidden behind plain "encode" in a Swift wrapper function, which I'd missed previously, so it might be implemented after all; (c) for archiving via NSKeyedArchiver, the containers must also be reference-aware, not just the encoder; (d) there are code examples aplenty for JSON, but none at all for archiving.

My point was that there's something to talk about here, and there isn't any documentation that lays it out clearly. Is it feasible to create documents containing a purely Swift-Encodable archive? Or does there have to be at least a top-level NSObject? Is the implementation fully NSSecureCoding compatible/equivalent? Does the implementation embrace the new-style Obj-C failable decoding mechanism?

Maybe the answer is that it all works great, but we just don't know until someone actually says something about it, somewhere.

mikeash at 2017-07-21 14:28:16:

No problem, I just wanted to clarify why Codable has all these fancy facilities that the provided JSON and plist encoders don't actually use.

My guess is that you'll be able to create an NSKeyedArchiver, ask it to encode your Codable object, and it'll go off and do its thing. I don't think NSSecureCoding is relevant, because the decoded types are determined entirely by the static types in your program, with no runtime lookup by name. But we'll have to wait for an implementation, or at least more discussion, to know for sure.

nolen at 2017-07-25 01:54:31:

fun fact: JSONDecoder does not conform to protocol Decoder. It uses a private internal implementation that does conform.

DanielT1263 at 2017-07-30 11:59:24:

I'm happy to see this new feature. I have asked several times on swift-evolution for a similar treatment for Equatable but still no luck. Maybe they will consider it now.

Itai Ferber at 2017-08-08 01:04:22:

@QQ (I don't know if you'll ever see this but...) References are a really important part of this feature that we didn't have time to fully finish, so will come in an update to Swift. They will be supported with both JSONEncoder and PropertyListEncoder (and the decoders of course), and in the meantime, NSKeyedArchiver and NSKeyedUnarchiver do support Codable instances via encodeEncodable and decodeDecodable (and will continue to improve support in an update as well). The documentation for these should be updated in an upcoming beta and should be stable and available to use.

Jason R Tibbetts at 2017-08-14 01:59:57:

The CodingKeys enum and @David Sinclair's question make me wish that Swift had a mechanism, like Java's, for specifying whether properties are transient and/or to be encoded with a different key. This would be so much easier than having to implement your own CodingKeys enum.

matt neuburg at 2017-08-27 19:40:39:

@Itai Ferber I don't know if you'll ever see this, but thanks for pointing out encodeEncodable and decodeDecodable! These were not present in the earlier betas, and are not mentioned in any release notes as far as I can tell, so I would never have noticed them. It's a pity they are still currently undocumented.
