

Get [@@deriving]-style generation of type-directed values without writing a ppx


Ppx_type_directed_value is a ppx that does [@@deriving]-style generation of type-directed values based on user-provided modules. The user-provided modules tell ppx_type_directed_value how to compose type-directed values (for example, combine type-directed values of the fields of a record to form a type-directed value for the record itself).

This allows a wide variety of PPXs such as ppx_sexp_conv, ppx_compare, ppx_enumerate, etc. to be implemented with ppx_type_directed_value, but with some runtime cost.

This PPX currently supports deriving type-directed values for records, ordinary & polymorphic variants and tuples. It also supports custom user-defined attributes on record and variant fields.

Published: 21 Mar 2022



ppx_type_directed_value
=======================

Introduction
================

`Ppx_type_directed_value` is a ppx that does `[@@deriving]`-style
generation of type-directed values based on user-provided modules. The
user-provided modules tell `ppx_type_directed_value` how to compose
type-directed values (for example, combine type-directed values of the
fields of a record to form a type-directed value for the record
itself).

This allows a wide variety of PPXs such as `ppx_sexp_conv`,
`ppx_compare`, `ppx_enumerate`, etc. to be implemented with
`ppx_type_directed_value`, but with some runtime cost.

This PPX currently supports deriving type-directed values for records, ordinary
& polymorphic variants and tuples. It also supports custom user-defined attributes
on record and variant fields.

Motivation
==============

Many deriving PPXs have a similar quasi-recursive nature where the resulting
value derived by the PPX for a type is the composition of relevant values
assumed to be defined for each constituent of the type, where these values
are either "base cases" or are derived from the same PPX.

Using `ppx_sexp_conv` as an example, `sexp_of_t` where `t` is a record
will call the corresponding `sexp_of_[type]` on each field of the
record, place it in a `Sexp.List` with the field name, and place all
the `Sexp.t` for each field in another `Sexp.List`. Here,
`sexp_of_[type]` is often produced by `[type]` itself having a
`[@@deriving sexp]` annotation, although it's also sometimes manually
defined.

`Ppx_type_directed_value` allows users to define new `deriving`
annotations following this pattern without having to write any ppx
code, thus avoiding the boilerplate of registering your deriver,
traversing the AST etc.

Quickstart
===============

The easiest way to get started with `ppx_type_directed_value` is to
use an applicative.  This might not give you all the features you
want, but it'll get you off the ground.

For example, suppose you want to turn the `Command.Param` applicative into a ppx.
Then make a module (called, say, `Type_directed.Command`) with these contents:

<!-- $MDX file=examples/,part=of_applicative -->
```ocaml
open Ppx_type_directed_value_runtime
include Converters.Of_applicative (Core.Command.Param)
```

Then, add to your jbuild:

```
(preprocess (pps (ppx_type_directed_value -module -Type_directed.Command)))
```

Then you'll be able to do things like this:

<!-- $MDX file=test/inline/,part=applicative_demo -->
```ocaml
module Host_and_port = struct
  type t = Host_and_port.t =
    { host : string
             [@command.custom
               let open Command.Param in
               flag "host" (required string) ~doc:"host"]
    ; port : int
             [@command.custom
               let open Command.Param in
               flag "port" (required int) ~doc:"port"]
    }
  [@@deriving command]
end

type t =
  { host_and_port : Host_and_port.t
  ; name : string option
           [@command.custom
             let open Command.Param in
             flag "name" (optional string) ~doc:"name"]
  }
[@@deriving command]
```

This approach does have limitations.  For example, in this case
there's some unnecessary repetition between the field names and the
flag names, variants won't be supported, and you won't be able to use
attributes besides the custom one (`command.custom` in this case).
But it may be enough for you.

If you want to learn how to lift any of those restrictions, either
read on or poke around the examples/ directory.

Raw Interface
=====================

**Warning:** This section defines the "raw interface" that will give
you the most control over how the ppx works.  However, it's fairly
involved.  For a first read, you might just skim this section and
instead read the "Converters" section below.
That section describes
some utility functors which make it so that you don't have to write
the raw interface by hand.

The PPX expects to take in a module of the following type to guide the
generation of arbitrary type-directed values.

<!-- $MDX file=ppx_runtime/type_directed.mli,part=interface -->
```ocaml
(** Signature that a user-provided module should implement to guide the
    code generation of type-directed values *)
module type S = sig
  (** Type-directed value of interest *)
  module T : Type_directed_value

  (** Given transformations between two isomorphic types 'a, 'b,
      turns a 'a type-directed value to a 'b type-directed value  *)
  val apply_iso  : 'a T.t -> ('a -> 'b) -> ('b -> 'a) -> 'b T.t
  val of_tuple   : ('a, 'length) Tuple(T).t -> 'a T.t
  val of_record  : ('a, 'length) Record(T).t -> 'a T.t
  val of_variant : ('a, 'length) Variant(T).t -> 'a T.t
end
```

Type-directed value
------------------------

<!-- $MDX file=ppx_runtime/type_directed.mli,part=tdv_sig -->
```ocaml
module type Type_directed_value = sig
  type 'a t

  (** ['a attribute] allows supporting attributes such as

      {[
        type t =
          { foo : int [@my_module attr]
          ; ...
          } [@@deriving my_module]
      ]}

      where in the above example [attr] would have type [int My_module.attribute].

      If you don't want to use this feature, you can define [type 'a attribute =
      Nothing.t]. *)
  type 'a attribute
end
```

A "type-directed value" is a value associated to a given type which can be
derived from the type definition in some way.
In this context, we expect that
the type-directed value for a record can be derived from the values for the fields
and similarly for variants and tuples.

For example, if we were to implement `ppx_sexp_conv`, the type of the type-directed value
would be

<!-- $MDX file=test/inline/,part=sexp_implementation -->
```ocaml
module _ = struct
  type 'a t =
    { sexp_of_t : 'a -> Sexp.t
    ; t_of_sexp : Sexp.t -> 'a
    }

  (* Don't implement any attributes for now *)
  type 'a attribute = Nothing.t
end
```

and the type of the type-directed value of a type `t` would be `t T.t`.

of_{tuple, record, variant}
----------------------------------

At a high level, these functions are the underlying implementations for composing
type-directed values from the constituent types of a tuple/record/variant.

They each take in a data structure, which contains the type-directed
values and other information (such as field names/constructor names/attributes)
that represents a tuple/record/variant type, and should return a
type-directed value for the tuple/record/variant type. We discuss the specifics below.

We begin with the simplest case - the tuple type. In order to support
tuples with arbitrary elements, we define a GADT to package the type-directed value
of each tuple element.

<!-- $MDX file=test/inline/,part=tuple -->
```ocaml
module _ (T : Type_directed.Type_directed_value) : sig
  type ('a, 'length) seq =
    | [] : (unit, zero) seq
    | ( :: ) : 'a T.t * ('b, 'l) seq -> ('a * 'b, 'l succ) seq

  type ('a, 'length) t = ('a, 'length succ succ) seq
end
```

This enforces that the length of the list is at least 2.

The first index of the GADT, `'a`, is a nested pair that represents the
type of each element of the tuple packaged. The second index of the GADT,
`'length`, tracks the length of the tuple at the type level.
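The two indices can be tried out in isolation. The following is a minimal, self-contained sketch that mirrors the signature above using only the standard library; `zero`, `succ`, `tdv`, `tuple`, `pair` and `length` are illustrative names introduced here, not the runtime library's definitions, and the type-directed value is assumed to be an equality function.

```ocaml
(* Type-level naturals standing in for the [zero] and [succ] used above.
   They are given constructors so the typechecker can tell them apart. *)
type zero = Zero
type 'n succ = Succ of 'n

(* Hypothetical type-directed value: an equality function. *)
type 'a tdv = 'a -> 'a -> bool

(* A list of type-directed values indexed by a nested pair of the element
   types and by its length. *)
type ('a, 'length) seq =
  | [] : (unit, zero) seq
  | ( :: ) : 'a tdv * ('b, 'l) seq -> ('a * 'b, 'l succ) seq

(* Tuples have at least two elements. *)
type ('a, 'length) tuple = ('a, 'length succ succ) seq

(* Two elements: the [seq] has length [zero succ succ], so the [tuple]
   index is [zero]. *)
let pair : (int * (string * unit), zero) tuple = [ Int.equal; String.equal ]

(* A one-element list is rejected at compile time:
   let bad : (int * unit, _) tuple = [ Int.equal ] *)

(* Counting elements needs polymorphic recursion because each tail has a
   different type. *)
let rec length : type a len. (a, len) seq -> int = function
  | [] -> 0
  | _ :: tl -> 1 + length tl
```

Here `length pair` evaluates to `2`, while the element types and the minimum length are checked statically.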
This data
structure can be thought of as a normal list where each element is the
corresponding type-directed value of each element in the tuple, and which
also tracks the type of each element as well as the length.

For example, given the tuple type `type t = int * string`, the data structure
given to `of_tuple` is `[{module}_int; {module}_string]` with type
`(int * (string * unit), zero) Tuple(T).t`, whose underlying `seq` has
length `zero succ succ`.

Note that the type-directed value `of_tuple` is expected to
return has type `(int * (string * unit)) T.t`, which is not the same as
`t T.t`. This is where `apply_iso` comes in, as discussed below.

The following is an example implementation of `of_tuple` for a type-directed value
that is the `equals` function.

<!-- $MDX file=test/inline/,part=equal_implementation -->
```ocaml
module T = struct
  type 'a t = 'a -> 'a -> bool
  type 'a attribute = Nothing.t
end
```

<!-- $MDX file=examples/,part=equals_of_tuple -->
```ocaml
let rec of_tuple : type a len. (a, len) Type_directed.Tuple(T).t -> a T.t =
  fun t ->
  match t with
  | ([ v1; v2 ]                : _ Type_directed.Tuple(T).t) ->
    fun (fst1, (snd1, ())) (fst2, (snd2, ())) -> v1 fst1 fst2 && v2 snd1 snd2
  | (v1 :: (_ :: _ :: _ as tl) : _ Type_directed.Tuple(T).t) ->
    fun (hd1, tl1) (hd2, tl2) -> v1 hd1 hd2 && (of_tuple tl) tl1 tl2
;;
```

The data structure can be pattern-matched
like a normal list, with a twist. Since OCaml tuples must have at least two
elements, the data structure also guarantees that there are at least two
elements at the type level. Thus, the base case is a "list" with
two elements, while the pattern of the inductive case requires the
tail `tl` to have at least two elements to help the typechecker
verify that `of_tuple` can be recursively called on it.

`of_record` and `of_variant` work similarly but with inputs `('a, 'length) Record(T).t`
and `('a, 'length) Variant(T).t` instead.
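To experiment with the tuple case without the PPX or the runtime library, the following self-contained sketch re-creates the `of_tuple` shown above against a local copy of the list GADT and then applies an isomorphism by hand. The names `eq`, `tuple` and `equal_int_string` are illustrative assumptions, and the `apply_iso` here is only one plausible definition for an equality function.

```ocaml
type zero = Zero
type 'n succ = Succ of 'n

(* Assumed type-directed value: equality. *)
type 'a eq = 'a -> 'a -> bool

type ('a, 'length) seq =
  | [] : (unit, zero) seq
  | ( :: ) : 'a eq * ('b, 'l) seq -> ('a * 'b, 'l succ) seq

type ('a, 'length) tuple = ('a, 'length succ succ) seq

(* Base case: exactly two elements. Inductive case: the tail must itself
   have at least two elements so that the recursive call typechecks. *)
let rec of_tuple : type a len. (a, len) tuple -> a eq = function
  | [ v1; v2 ] -> fun (a1, (b1, ())) (a2, (b2, ())) -> v1 a1 a2 && v2 b1 b2
  | v1 :: (_ :: _ :: _ as tl) ->
    fun (h1, t1) (h2, t2) -> v1 h1 h2 && of_tuple tl t1 t2

(* Equality only consumes values, so only the ['b -> 'a] direction is used. *)
let apply_iso (eq : 'a eq) (_to : 'a -> 'b) (from : 'b -> 'a) : 'b eq =
  fun x y -> eq (from x) (from y)

(* Turn equality on [int * (string * unit)] into equality on [int * string]. *)
let equal_int_string : (int * string) eq =
  apply_iso
    (of_tuple [ Int.equal; String.equal ])
    (fun (i, (s, ())) -> (i, s))
    (fun (i, s) -> (i, (s, ())))
```

With these definitions, `equal_int_string (1, "a") (1, "a")` is `true` and `equal_int_string (1, "a") (2, "a")` is `false`.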
Both record and variant data structures
make use of the `Key` module which contains the name (field name for records,
constructor names for variants), a type-directed value and the attribute if present.

<!-- $MDX file=ppx_runtime/type_directed.mli,part=key_sig -->
```ocaml
module Key : sig
  type ('a, 'attribute) t =
    { name      : string
    ; value     : 'a
    ; attribute : 'attribute option
    }
end
```

The record data structure is likewise a GADT resembling a list, where
each element is a `('a T.t, 'a T.attribute) Key.t` as shown below.

<!-- $MDX file=test/inline/,part=record -->
```ocaml
module _ (T : Type_directed.Type_directed_value) : sig
  type ('a, 'length) seq =
    | [] : (unit, zero) seq
    | ( :: ) :
        ('a T.t, 'a T.attribute) Type_directed.Key.t * ('b, 'l) seq
        -> ('a * 'b, 'l succ) seq

  type ('a, 'length) t = ('a, 'length succ) seq
end
```

This enforces that the length of the list is at least 1.

For example, we have

<!-- $MDX file=test/inline/,part=record_t -->
```ocaml
type t1 =
  { f1 : int
  ; f2 : string
  }

(* Generated value passed in to [of_record] *)
let t_t1 : (int * (string * unit), zero succ) Type_directed.Record(T).t =
  [ { name = "f1"; value = t_int; attribute = None }
  ; { name = "f2"; value = t_string; attribute = None }
  ]
;;
```

Finally, the variant data structure is also a GADT where each element
represents one constructor (ordinary or inlined record) as shown below.

<!-- $MDX file=test/inline/,part=variant -->
```ocaml
module _ (T : Type_directed.Type_directed_value) : sig
  type 'a variant =
    | Unlabelled : ('a, 'length) Type_directed.Variant_constructor(T).t -> 'a variant
    | Labelled : ('a, 'length) Type_directed.Record(T).t -> 'a variant

  type ('a, 'length) t =
    | [] : (Nothing.t, zero) t
    | ( :: ) :
        ('a variant, 'a T.attribute) Type_directed.Key.t * ('b, 'l) t
        -> (('a, 'b) Either.t, 'l succ) t
end
```

Note that the definition of `Variant_constructor` is the same as `Tuple`
but without the
guarantee that the list has length >= 2. The index of the GADT is now
a nested `Either.t` instead of pairs, to represent arbitrary length sum types.

For example, we have

<!-- $MDX file=test/inline/,part=variant_t -->
```ocaml
type t2 =
  | A
  | B of int * string
  | C of { f : int }

(* Generated value passed in to [of_variant] *)
let t_t2
  : ( (unit, (int * (string * unit), (int * unit, Nothing.t) Either.t) Either.t) Either.t
    , zero succ succ succ ) Type_directed.Variant(T).t
  =
  [ { name = "A"; value = Unlabelled []; attribute = None }
  ; { name = "B"; value = Unlabelled [ t_int; t_string ]; attribute = None }
  ; { name = "C"
    ; value = Labelled [ { name = "f"; value = t_int; attribute = None } ]
    ; attribute = None
    }
  ]
;;
```

The implementations of `of_record` and `of_variant` for a type-directed value
that is the `equals` function are similar to `of_tuple` but, as expected,
more verbose, so they are omitted here. You can find them in `test/examples/`.

apply_iso
-----------

As mentioned above, the type of the GADT index is not exactly the type
of what we wish to derive.  But it happens to be isomorphic, so the
PPX calls `apply_iso` to turn the generated type-directed value from
`of_{tuple,record,variant}` into the desired type.

The implementation of `apply_iso` is generally simple. If the type-directed
value is the `equals` function with type `'a -> 'a -> bool`, we have

<!-- $MDX file=test/inline/,part=apply_iso -->
```ocaml
let apply_iso instance _f f' x y = instance (f' x) (f' y)
```

Invoking the PPX
======================

The PPX takes the name of the module satisfying the interface above
as a command-line argument.
For example, if the fully-qualified
module name is `Type_directed.Equals`, you should add

```
(preprocess (pps (ppx_type_directed_value -module -Type_directed.Equals)))
```

to the jbuild, which will register the deriver `[@@deriving equals]`.
Multiple modules can be registered by passing multiple
arguments of the form `-module -{Module_name}`.

Note that the module name should currently be prefixed with "-" in order
to get jenga to parse it as a command line argument.

Naming Conventions
========================

This PPX assumes (and generates) standard naming conventions for generated type-directed values.
Namely,

```
type t = ... [@@deriving equals]
(* generates *)
let equals : t Equals.T.t = ...

type custom_type = ... [@@deriving equals]
(* generates *)
let equals_custom_type : custom_type Equals.T.t = ...

type 'a poly = ... [@@deriving equals]
(* generates *)
let equals_poly : 'a Equals.T.t -> 'a poly Equals.T.t = ...
```

and so on.

Attributes
==============

This PPX has support for user-defined attributes on record fields and
variant constructors.
Given a module (say)
`Ppx_type_directed_value_examples.Validate`, the attribute `validate`
is registered as shown below.

<!-- $MDX file=test/inline/,part=attribute-user -->
```ocaml
let f2_attrib : int Ppx_type_directed_value_examples.Validate.attribute = Name "new-name"

type attrib_record =
  { f1 : int
  ; f2 : int [@validate f2_attrib]
  }
[@@deriving validate]

let a_attrib : unit Ppx_type_directed_value_examples.Validate.attribute =
  Name "new-name-1"
;;

let b_attrib : (int * (string * unit)) Ppx_type_directed_value_examples.Validate.attribute
  =
  Name "new-name-2"
;;

type attrib_variant_simple =
  | A [@validate a_attrib]
  | B of
      { f1 : int
      ; f2 : string
      } [@validate b_attrib]
[@@deriving validate]
```

Note that the polymorphic attribute type is instantiated with the type of
the field/constructor, and is passed to the `attribute` field in `Key.t`.

This PPX also registers the attribute `{module}.custom` to replace the default
type-directed value the PPX uses for a field/constructor. For instance,

<!-- $MDX file=test/,part=attribute-custom -->
```ocaml
let always_equal : int -> int -> bool = fun _ _ -> true

type t = { f1 : int [@type_directed_equal.custom always_equal] }
[@@deriving type_directed_equal]
```

the PPX will populate the `value` field of the `Key.t` record with
`always_equal` instead of the default type-directed value, `type_directed_equal_int`.

Converters
=============

There are instances when the decision of how to compose type-directed values is
local, that is, you consider one field/constructor at a time and specify how to
combine the field/constructor with the recursively constructed type-directed value
of the rest of the record/variant. We provide a set of converters (`ppx_runtime/`)
that take in comparatively simpler interfaces and produce modules that satisfy
the interface that the PPX expects.
We present them in increasing order of complexity.

Of_applicative
------------------

An applicative module with type `Applicative.S` has sufficient
information to build a `Type_directed.S` that supports records and
tuples, but not variants. This can be done by using the
`Of_applicative` functor on the applicative module.

Semantically, the type-directed value is constructed as follows.

<!-- $MDX file=test/inline/,part=of_applicative -->
```ocaml
type t =
  { f1 : int
  ; f2 : string
  }
[@@deriving command]

let command : (int * (string * unit)) Ppx_type_directed_value_examples.Command.T.t =
  let open Command.Param in
  both command_int (both command_string (return ()))
;;
```

For example, a PPX deriver for `Core.Command.Param` can be instantiated by

<!-- $MDX file=examples/,part=of_applicative -->
```ocaml
open Ppx_type_directed_value_runtime
include Converters.Of_applicative (Core.Command.Param)
```

Note that attempting to derive variants with a `Type_directed.S` constructed in this manner
will result in a *runtime* error.

Of_simple
------------

If support for both records and variants is desired, but field names/constructor names
are irrelevant, the `Of_simple` functor can be used to build a `Type_directed.S`.
The input interface that is expected is

<!-- $MDX file=ppx_runtime/,part=simple_sig -->
```ocaml
module type Simple = sig
  type 'a t

  val apply_iso : 'a t -> ('a -> 'b) -> ('b -> 'a) -> 'b t
  val both      : 'a t -> 'b t -> ('a * 'b) t
  val unit      : unit t
  val either    : 'a t -> 'b t -> ('a, 'b) Either.t t
  val nothing   : Nothing.t t
end
```

* `both` specifies how to add an additional type-directed value, used for
  processing an additional record field or variant constructor argument.
* `either` specifies how to add a case for an additional type-directed value, used
  for processing an additional variant constructor.
* `unit` is the default/base case for product types - for the `equals` function it
  is `fun () () -> true`
* `nothing` is the default/base case for variant types - for the `all` function
  in `ppx_enumerate` it is `[]`
* `apply_iso` is the same as in `Type_directed.S`

An example using `Of_simple` can be found in `examples/`

Of_simple_with_key
-----------------------

If field names/constructor names are relevant, the `Of_simple_with_key` functor
can be used to build a `Type_directed.S`.
The input interface that is expected is

<!-- $MDX file=ppx_runtime/,part=simple_with_key_sig -->
```ocaml
module type Simple_with_key = sig
  type 'a t
  type 'a attribute

  val apply_iso  : 'a t -> ('a -> 'b) -> ('b -> 'a) -> 'b t
  val both       : 'a t -> 'b t -> ('a * 'b) t
  val unit       : unit t
  val nothing    : Nothing.t t
  val both_key   : ('a t, 'a attribute) Key.t -> 'b t -> ('a * 'b) t
  val either_key : ('a t, 'a attribute) Key.t -> 'b t -> ('a, 'b) Either.t t
end
```

`Simple_with_key` differs from `Simple` by requiring an additional
`both_key` function and an `either_key` in place of `either`.
Both `both_key` and `either_key` have similar semantics to `both`
and `either` from above, except they have access to field names and
attributes in addition to the type-directed value.

An example using `Of_simple_with_key` can be found in `examples/`

Runtime Considerations
==============================

The usage of this PPX does incur a runtime cost proportional to the size of the type
(e.g. number of fields/constructors/elements in a record/variant/tuple).
In particular, if the type-directed value is a function type,
there will be a runtime cost on *every* invocation of the function.

A micro-benchmark was performed on the `equals` function from `[@@deriving equal]`
and from `[@@deriving type_directed_equal]` implemented with this PPX, using
`Int.equal` and `String.equal` for field comparisons.
(see `bench/`)

| # of record fields | Time/Run `[@@deriving equal]` | Time/Run `[@@deriving type_directed_equal]` |
|-------------------:|-------------------------------|---------------------------------------------|
|                  6 | 13.29ns                       | 46.91ns                                     |
|                 16 | 30.83ns                       | 141.29ns                                    |
|                 30 | 58.37ns                       | 264.35ns                                    |
|                 60 | 121.50ns                      | 500.42ns                                    |
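One way to see where a per-call cost can come from is to build an equality function from `Simple`-style combinators by hand and compare it with a direct implementation. The sketch below uses only the standard library. The names `eq`, `equal_composed` and `equal_direct` are illustrative, and the exact shape of the code the PPX generates is an assumption, not something taken from its sources.

```ocaml
type 'a eq = 'a -> 'a -> bool

(* Combinators in the style of the [Simple] interface, specialised to
   equality functions. *)
let unit : unit eq = fun () () -> true

let both (a : 'a eq) (b : 'b eq) : ('a * 'b) eq =
  fun (a1, b1) (a2, b2) -> a a1 a2 && b b1 b2

let apply_iso (eq : 'a eq) (_to : 'a -> 'b) (from : 'b -> 'a) : 'b eq =
  fun x y -> eq (from x) (from y)

type t =
  { id : int
  ; name : string
  }

(* Composed equality: every call first converts both records into nested
   pairs, then walks the pairs one field at a time. *)
let equal_composed : t eq =
  apply_iso
    (both Int.equal (both String.equal unit))
    (fun (id, (name, ())) -> { id; name })
    (fun { id; name } -> (id, (name, ())))

(* Hand-written equality: compares the fields directly, with no
   intermediate structure. *)
let equal_direct (x : t) (y : t) =
  Int.equal x.id y.id && String.equal x.name y.name
```

Both functions return the same results, but the composed version does work proportional to the number of fields on each call just to reshape its arguments, which matches the direction of the benchmark figures above.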

