It’s clear from community adoption and feedback that Codable has had a lot of success in the years since it was added in Swift 4, but that it doesn’t satisfy some important needs. One of the foremost of those needs is performance more in line with programming environments that compete with Swift. As such, the main goal for this effort is to unlock higher levels of performance during both serialization and deserialization without sacrificing the ease of use that Codable provides.

[…]
Even with all of its strengths, the existing API’s design has some unavoidable performance penalties. For instance, its use of existentials implies additional runtime and memory costs as existential values are boxed, unboxed, retained, released, and dynamic dispatch is performed.
Also, because a client can decode dictionary values in arbitrary orders, a KeyedDecodingContainer is effectively required to proactively parse the payload into some kind of intermediate representation, necessitating allocations for internal temporary dictionaries, and String values.

[…]
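To make that order-independence point concrete, here is a minimal sketch (my example, not code from the pitch) of why a keyed container must have already parsed and buffered the whole payload before the first decode call:

```swift
struct Point: Codable {
    var x: Double
    var y: Double

    private enum CodingKeys: String, CodingKey { case x, y }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        // Even if the payload is {"x": 1, "y": 2}, nothing stops us from
        // asking for "y" first, so the decoder must already have parsed and
        // stored both values somewhere it can look them up by key.
        y = try container.decode(Double.self, forKey: .y)
        x = try container.decode(Double.self, forKey: .x)
    }
}
```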
In Swift, when a client needs to do more than just alter the default CodingKey representations, developers are often faced with a large cliff where they’re forced to manually replicate the whole Codable implementation just to do so.

[…]
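As a concrete illustration of that cliff (my example, not the pitch’s): changing how a single property is handled today means hand-writing the key enum, the initializer, and the encode method for every property of the type.

```swift
struct User: Codable {
    var id: Int
    var name: String
    var email: String

    private enum CodingKeys: String, CodingKey { case id, name, email }

    init(from decoder: Decoder) throws {
        let c = try decoder.container(keyedBy: CodingKeys.self)
        // The one property we actually wanted to customize: also accept a
        // string-encoded integer for `id`.
        if let raw = try? c.decode(String.self, forKey: .id), let parsed = Int(raw) {
            id = parsed
        } else {
            id = try c.decode(Int.self, forKey: .id)
        }
        // ...and everything else must now be spelled out by hand as well.
        name = try c.decode(String.self, forKey: .name)
        email = try c.decode(String.self, forKey: .email)
    }

    func encode(to encoder: Encoder) throws {
        var c = encoder.container(keyedBy: CodingKeys.self)
        try c.encode(id, forKey: .id)
        try c.encode(name, forKey: .name)
        try c.encode(email, forKey: .email)
    }
}
```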
In this new design I aim to leverage Swift’s macro features to meet or exceed Serde’s level of support for customization of synthesized conformances. Moving code synthesis from the compiler to a macro will enable us to use attribute-like macros as targeted customization mechanisms, which was not something we could easily accomplish with the compiler-based Codable synthesis.

[…]
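The pitch doesn’t spell out the attribute names, but the general shape of Serde-style, attribute-driven customization would look something like the purely hypothetical sketch below; every attribute here is invented for illustration.

```swift
// Hypothetical attribute macros, not API from the pitch.
@Codable
struct User {
    @CodingKey("user_id")      // hypothetical: rename a single key
    var id: Int

    @CodingDefault("guest")    // hypothetical: default for missing values
    var role: String
}
```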
There is no encode(_: Date) function present in the Encoder interface, which means PropertyListEncoder has to attempt to dynamically cast every some Encodable type it receives to Date in order to handle these natively. This helps keep the Encodable type format-agnostic, but it has a negative impact on performance, even if you never actually encode any Dates.

I believe that fully and formally embracing format-specialization where appropriate is the best solution to this problem. Specifically, we should encourage each serialization format that has native support for data types that aren’t represented in the format-agnostic interface to produce its own protocol variant that includes explicit support for these types, e.g. JSONCodable or PropertyListCodable.
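A rough sketch of the dynamic-cast pattern being described (my illustration, not Foundation’s actual code; encodeLeaf is a made-up helper):

```swift
import Foundation

// Because Encoder has no encode(_: Date) requirement, a format-specific
// encoder must cast every value it sees at runtime to handle Date natively,
// paying for that check even when no Date is ever encoded.
func encodeLeaf(_ value: some Encodable) throws -> Any {
    if let date = value as? Date {
        // Native, format-specific handling of dates.
        return date.timeIntervalSinceReferenceDate
    }
    // Otherwise fall back to the generic, format-agnostic path
    // (JSON-encoding here purely for illustration).
    return try JSONEncoder().encode(value)
}
```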
One of the big flaws of Codable is that it was built on the wrong abstraction. 99.9% of the time, developers who are interested in serializing a struct to data and back are doing so to a single, well-known format. However, the Codable API was built so that the abstraction point is the encoder itself, under the assumption that you would want to serialize a type to multiple formats. This is not the case.

That design flaw has been the #1 source of Codable’s woes. It makes properly implementing custom coders almost impossible; no one implements superEncoder properly, since most people don’t deal with inheritance of reference types, and some formats are fundamentally incompatible with the Encoder/Decoder APIs. (XML and CSV are two that spring to mind off the top of my head.)

[…]
IMO we should be encouraging packages that provide format-specific coders (JSONCodable, PlistCodable, CSVCodable, XMLCodable, etc.) so that each encoder and decoder can provide format-specific functionality. Then we should provide a system-level API to ask types to encode into an opaque format (i.e. “please turn yourself into a Data and back again”).

[…]
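A hypothetical sketch of that “opaque format” idea, with protocol and method names invented for illustration: the system-level protocol promises only a round trip through Data, and the type picks whatever concrete format it likes behind it.

```swift
import Foundation

// Invented protocol name; only the Data round trip is format-agnostic.
protocol DataSerializable {
    init(serializedData: Data) throws
    func serializedData() throws -> Data
}

struct Settings: Codable {
    var theme: String
    var fontSize: Int
}

// The concrete format (JSON here) is an implementation detail of the type.
extension Settings: DataSerializable {
    init(serializedData: Data) throws {
        self = try JSONDecoder().decode(Settings.self, from: serializedData)
    }

    func serializedData() throws -> Data {
        try JSONEncoder().encode(self)
    }
}
```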
Foundation should provide an updated replacement for NSCoding and leave the type-specific encoders to type-specific packages to implement. JSONCodable, PlistCodable, etc. should have full freedom to craft their interface around each format’s individual needs and specialities.

At one stage, the “format specialized” protocols were the entirety of the design. However, while looking at adoption scenarios, I realized that this design presented a problem with “currency” types that are owned by frameworks/libraries, but used by application-level serializable types.
[…]
Hence the introduction of the format-agnostic protocols in parallel with the format-specialized ones. Range and CGRect can, in similar fashion to Codable, describe their serializable members abstractly, allowing a specific encoder/decoder to interpret those instructions. The difference from Codable is that we avoid all the OTHER downsides of Codable the OP describes.
That’s why I’m suggesting that we split the API to support the cases separately. We have one API that can be very general and support the whole “A type can be serialized to an opaque format” use-case, and then packages to support particular formats and all of their respective idiosyncrasies. I think we’d be repeating past mistakes to try and make those two use cases be the same API again.
I think there’s one common use case which is not covered by the current Codable design: heterogeneous/dynamic decoding/encoding.

Many times in my development, I’ve wanted to decode part of a JSON payload into an intermediate representation, and later further decode that thing into a specific type.
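A common workaround today (my sketch, not part of any proposal) is a hand-rolled untyped JSON value that captures the dynamic part of a payload so a concrete type can be decoded from it later; the annoyance is that every project rebuilds something like this, and getting from it back to a typed value usually means another encode/decode round trip.

```swift
// An untyped JSON representation for the dynamic part of a payload.
enum JSONValue: Codable {
    case null
    case bool(Bool)
    case number(Double)
    case string(String)
    case array([JSONValue])
    case object([String: JSONValue])

    init(from decoder: Decoder) throws {
        let single = try decoder.singleValueContainer()
        if single.decodeNil() { self = .null }
        else if let b = try? single.decode(Bool.self) { self = .bool(b) }
        else if let n = try? single.decode(Double.self) { self = .number(n) }
        else if let s = try? single.decode(String.self) { self = .string(s) }
        else if let a = try? single.decode([JSONValue].self) { self = .array(a) }
        else { self = .object(try single.decode([String: JSONValue].self)) }
    }

    func encode(to encoder: Encoder) throws {
        var single = encoder.singleValueContainer()
        switch self {
        case .null: try single.encodeNil()
        case .bool(let b): try single.encode(b)
        case .number(let n): try single.encode(n)
        case .string(let s): try single.encode(s)
        case .array(let a): try single.encode(a)
        case .object(let o): try single.encode(o)
        }
    }
}

// The statically known part decodes normally; `payload` is decoded again
// later, once `kind` has been inspected.
struct Envelope: Codable {
    var kind: String
    var payload: JSONValue
}
```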
The problem with Codable – and what I think you’re getting at when you suggest we need JSONCodable/PlistCodable – is there’s no sane custom implementation of init(from:) and encode(to:) without being archive-specific. These functions are generally a mashup of two different ideas:
- migration and versioning
- archive-specific choices like which fields to include and what order
But moreover, while you might make archive-specific choices, you don’t always have archive-specific knowledge.
[…]
We have no lookahead. We can’t peek to see if the next char is a double-quote, a digit, or a bracket. Without overloading the Decoder to emit lookahead metadata as decodable types, you simply need to try each possibility, in turn, incurring the overhead and disruption of thrown errors.
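This is the familiar try-each-shape pattern (my example of what the quote describes): decoding a value that might be either an Int or a String means attempting one, catching the error, and attempting the next.

```swift
enum IntOrString: Decodable {
    case int(Int)
    case string(String)

    init(from decoder: Decoder) throws {
        let container = try decoder.singleValueContainer()
        do {
            self = .int(try container.decode(Int.self))
        } catch {
            // The failed attempt above already paid the cost of constructing
            // and throwing a DecodingError before we try the next shape.
            self = .string(try container.decode(String.self))
        }
    }
}
```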
This design does not include support for encoding and decoding cyclical object graphs. Relatedly, there’s still no intention to include encoding of runtime type information in serialization formats for any purpose; all concrete types must be specified by the client doing the encoding or decoding.
I was really disappointed to see this, because these are probably my two major pain points with Codable.

If we are going to the trouble of making a brand new, backwards-incompatible replacement for Codable, then it should try to correct all the major deficiencies of the existing design, not just performance. NSCoding (for all its faults) supports both heterogeneous data and cyclical references. If this new system doesn’t support those, then we are saying from the outset that it still isn’t going to be capable of dealing with a lot of real-world use cases.

[…]
Also (related) some kind of built-in support for schema updates and migrations (similar to CoreData/SwiftData) would be a great feature, as this is another pain point in Codable.
Even just a way to specify a default value for new non-optional properties would reduce a lot of the need for adding manual decoder implementations to apps in post-1.0 releases.
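The boilerplate being described looks something like this (my sketch): when a non-optional property ships after 1.0, a hand-written init(from:) is currently the only way to give old archives a default.

```swift
struct Profile: Codable {
    var name: String
    var accentColor: String   // added in a later release

    private enum CodingKeys: String, CodingKey { case name, accentColor }

    init(from decoder: Decoder) throws {
        let c = try decoder.container(keyedBy: CodingKeys.self)
        name = try c.decode(String.self, forKey: .name)
        // Older payloads have no accentColor key, so supply a default.
        accentColor = try c.decodeIfPresent(String.self, forKey: .accentColor) ?? "blue"
    }
}
```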
NSCoder/NSArchiver was actually pretty good for what it was intended for, archiving object graphs. How can I do that today? SwiftData?
Another issue I’ve run into with Codable is that a given object may have more than one serialized representation in a given application.
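The usual workaround (my sketch, with invented names): because a type can conform to Codable only once, each additional representation ends up as a separate wrapper type.

```swift
struct User {
    var id: Int
    var name: String
    var passwordHash: String
}

// Full representation, e.g. for local persistence.
struct StoredUserRecord: Codable {
    var id: Int
    var name: String
    var passwordHash: String

    init(_ user: User) {
        id = user.id
        name = user.name
        passwordHash = user.passwordHash
    }
}

// Redacted representation, e.g. for a public API response.
struct PublicUserRecord: Codable {
    var id: Int
    var name: String

    init(_ user: User) {
        id = user.id
        name = user.name
    }
}
```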
I’d like to put in a request to please consider error handling. A common source of grief for beginners is difficulty in reading the error messages thrown by Codable. Some information is missing, and it’s formatted such that you really have to do some digging to understand it.
It seems the new macro-based approach will solve some major performance problems. But it doesn’t seem to address what makes serialisation actually hard: different formats, mappings, versioning, and preservation. It still seems to be bonusware, w/ “now it does the demos fast”, not something addressing actual serialisation issues. Think Protobuf, which does.
Previously: