Generating OCaml From Michelson Types

Statically typed interaction between smart contracts and application code:
🖝 code-generator: Michelson-type → OCaml-modules


Most of the code of a DApp should be off-chain. I've been working with SmartPy for contracts and some of the tests, but I still want a decent programming language, i.e. OCaml, for writing infrastructure around the contracts.

So, as part of some confined-weekend-hacking, I've started writing a michelson-type → ocaml code generator.

It takes a type, or the parameter and storage from a full contract, and generates a big (hopefully) well-typed OCaml module with (de)serializers and such. It rebuilds high-level variants and records out of annotations while remembering the pairing/or-ing layouts.

It's WIP, but it's there, and so far it's useful (see merge-request tezos/flextesa!22). It is part of the Flextesa repository because it comes as one of the improvements of the already existing lib_michokit, which for instance provides the flextesa transform-michelson command (to strip error messages, replace annotations, etc.).

The new flextesa ocaml-of-michelson command to generate code already has a few interesting options (see --help):

  • --deriving-*: add custom [@@deriving ..] annotations on generated types,
  • --integer-types: can be just int or can add big_int and/or Zarith.t,
  • --output-dune: generate a dune file for the resulting module/library.

The generated code can be made js_of_ocaml-friendly, and so far does not depend on tezos-* libraries.


The example files, in full, are available in this gist.

We start from a basic, meaningless, but didactic piece of Michelson:

parameter
  (or
    (address %ep1)
    (or
      (signature %sign_stuff)
      (unit %default)));
storage
  (pair
    (pair
      (nat %some_counter)
      (string %some_name))
    (or %a_variant_thing (bytes %some_data) (key %the_key)));
code {FAILWITH};

The code { } section is irrelevant for this example, hence we just use FAILWITH.

One generates the corresponding OCaml code with:

flextesa ocaml-of-michelson example.tz example.ml
ocamlformat -i --enable-outside-detected-project example.ml

ocamlformat is icing on the cake for code-generators, which no longer need to care about pretty-printing.

In the case of a whole contract (likely the most common), we generate one Parameter and one Storage sub-module, following all their dependencies down to representations of Michelson primitive types (called M_*).

Here we see the corresponding parameter type:

  type t =
    | Default of M_unit.t
    | Ep1 of M_address.t
    | Sign_stuff of M_signature.t
  [@@deriving show, eq]

→ the code generator has reconstructed an OCaml variant from the annotations of multiple ors of the parameter-type (a.k.a. “entry-points” 😉, cf. example.tz l. 2). It has also assigned a couple of ppx_deriving AST-attributes.
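To make the idea concrete, here is a minimal sketch of how nested ors can be flattened into named constructors while keeping each one's Left/Right path. The `mtype` tree, `path_step`, and `constructors` are hypothetical names made up for this illustration, not lib_michokit's actual representation, and the alphabetical sort is only a guess based on the order seen in the generated type above.

```ocaml
(* Hypothetical, simplified Michelson type tree (illustration only). *)
type mtype =
  | Prim of string * string (* primitive type name, field annotation *)
  | Or of mtype * mtype

type path_step = Left | Right

(* Walk the nested ors and return one (constructor, type, path) triple per
   annotated leaf. The path records how to reach the leaf from the root. *)
let constructors (ty : mtype) : (string * string * path_step list) list =
  let rec go path = function
    | Prim (prim, annot) ->
        [ (String.capitalize_ascii annot, prim, List.rev path) ]
    | Or (l, r) -> go (Left :: path) l @ go (Right :: path) r
  in
  (* Assumption: sort by constructor name, as the generated type seems to. *)
  List.sort compare (go [] ty)

(* The parameter type of the example contract. *)
let example_parameter =
  Or
    ( Prim ("address", "ep1"),
      Or (Prim ("signature", "sign_stuff"), Prim ("unit", "default")) )
```

With this, `constructors example_parameter` yields the three constructors of the variant above, and the path attached to `Sign_stuff` is `[Right; Left]`, which is what a serializer needs to wrap the value correctly.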

Similarly records are reconstructed from “pairs of pairs” for the storage type, and an intermediary A_variant_thing module has been created for the more complex type of the field %a_variant_thing:

module Storage = struct
  open! Result_extras

  type t = {
    a_variant_thing : A_variant_thing.t;
    some_counter : M_nat.t;
    some_name : M_string.t;
  }
  [@@deriving show, eq, make]

  let layout () : Pairing_layout.t = `P (`P (`V, `V), `V)

We also see the value layout which remembers how the arrangement of pairs is in the original Michelson.
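Here is a small sketch of why such a layout value is useful: given the already-serialized fields in their Michelson order, it is enough to rebuild the right nesting of Pairs. The `layout` type below mimics the shape of the value shown above, but `render` and `to_pairs` are hypothetical helpers written for this post, not functions of the generated code.

```ocaml
(* Same shape as the layout value above: `V is a leaf, `P is a pair. *)
type layout = [ `V | `P of layout * layout ]

(* Consume serialized fields left to right, following the layout, and
   return the rendered expression with the fields not yet consumed. *)
let rec render (layout : layout) (fields : string list) : string * string list =
  match (layout, fields) with
  | `V, v :: rest -> (v, rest)
  | `V, [] -> invalid_arg "render: not enough fields"
  | `P (l, r), _ ->
      let left, rest = render l fields in
      let right, rest = render r rest in
      (Printf.sprintf "(Pair %s %s)" left right, rest)

(* Render a whole value; fail if the layout and field count disagree. *)
let to_pairs layout fields =
  match render layout fields with
  | s, [] -> s
  | _, _ :: _ -> invalid_arg "to_pairs: too many fields"

(* Storage of the example: some_counter, some_name, a_variant_thing. *)
let example =
  to_pairs (`P (`P (`V, `V), `V)) [ "42"; "\"hello\""; "(Left 0x00)" ]
```

Note that the fields are given in the order of the Michelson type, which is not the (alphabetical) order of the record fields above.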

For now, the code-generator creates to_concrete and of_json functions, which are helpful to generate tezos-client commands and to parse the results of RPCs respectively. In the case of variants, there is also a special to_concrete_entry_point function, which returns an entry-point name and its parameter (hence without the Left/Right “path”).
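The difference between the two functions is easier to see on a hand-written imitation. This is only a sketch: the payloads are plain strings standing in for the M_* modules, and the exact concrete syntax produced by the real generated code may differ.

```ocaml
(* Stand-in for the generated Parameter.t, with string payloads
   instead of M_address.t and M_signature.t (an assumption). *)
type t = Default | Ep1 of string | Sign_stuff of string

(* Michelson literal for the payload alone. *)
let payload = function
  | Default -> "Unit"
  | Ep1 s | Sign_stuff s -> Printf.sprintf "%S" s

(* Full value, including the Left/Right path through the nested ors. *)
let to_concrete v =
  match v with
  | Ep1 _ -> Printf.sprintf "(Left %s)" (payload v)
  | Sign_stuff _ -> Printf.sprintf "(Right (Left %s))" (payload v)
  | Default -> Printf.sprintf "(Right (Right %s))" (payload v)

(* Entry-point name and bare parameter: no path needed. *)
let to_concrete_entry_point v : [ `Name of string ] * [ `Literal of string ] =
  let name =
    match v with
    | Default -> "default"
    | Ep1 _ -> "ep1"
    | Sign_stuff _ -> "sign_stuff"
  in
  (`Name name, `Literal (payload v))

(* Fragment of a tezos-client transfer command line. *)
let client_args v =
  let `Name name, `Literal lit = to_concrete_entry_point v in
  Printf.sprintf "--entrypoint %s --arg %s" name (Filename.quote lit)
```

The entry-point form is usually the nicer one to put on a command line, since the caller does not have to know where the entry-point sits in the tree of ors.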

One can check the result of ocamlc -i example.ml, or better, of ocamlfind ocamlc -i example.ml -package ppx_deriving.std, to get an overview. See the module-type for Parameter:

module Parameter :
  sig
    type t =
        Default of M_unit.t
      | Ep1 of M_address.t
      | Sign_stuff of M_signature.t
    val pp :
      Ppx_deriving_runtime.Format.formatter -> t -> Ppx_deriving_runtime.unit
    val show : t -> Ppx_deriving_runtime.string
    val equal : t -> t -> Ppx_deriving_runtime.bool
    val layout : unit -> Pairing_layout.t
    val to_concrete : t -> string
    val to_concrete_entry_point :
      t -> [ `Name of string ] * [ `Literal of string ]
    val of_json : Json_value.t -> (t, [> Json_value.parse_error ]) result
  end

“Real World” Usage

This has been in use in smondet/fa2-smartpy to make stronger tests, a testing-wallet app, and benchmarks of various “builds” of the FA2 implementation (which will be the subject of further blog posts). It has allowed me to follow along the many changes in the FA2 specification with the comfort of OCaml types. The testing-wallet is also in use in the new FA2-SmartPy tutorial.

Stuff To Do

Among the many improvements one could think of, the most urgent seem to be:

  • Fix all the cases of sanitization of the Michelson annotations to be used properly as OCaml identifiers (right now it should be easy to generate wrong OCaml).
  • Add an option to use the tezos-micheline library instead of custom parsers and printers (and hence gain the of_concrete and to_json functions!).
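For the first item, the kind of sanitization needed could look like the sketch below. It is not what the generator currently does: `sanitize`, the `v_` prefix, and the trailing underscore for keywords are all arbitrary choices made for this illustration, and the keyword list is deliberately partial.

```ocaml
(* Partial list of OCaml keywords that cannot be used as identifiers. *)
let keywords =
  [ "and"; "as"; "begin"; "class"; "do"; "done"; "else"; "end"; "false";
    "for"; "fun"; "function"; "if"; "in"; "let"; "match"; "method";
    "module"; "mutable"; "new"; "object"; "of"; "open"; "or"; "rec";
    "then"; "to"; "true"; "try"; "type"; "val"; "when"; "while"; "with" ]

let is_digit c = c >= '0' && c <= '9'

(* Turn a Michelson annotation into a valid lowercase OCaml identifier. *)
let sanitize (annot : string) : string =
  (* Drop the leading sigil (%field, :type or @variable), if any. *)
  let s =
    if annot <> "" && String.contains "%:@" annot.[0] then
      String.sub annot 1 (String.length annot - 1)
    else annot
  in
  (* Replace every character that OCaml does not accept with '_'. *)
  let s =
    String.map
      (function
        | ('a' .. 'z' | 'A' .. 'Z' | '0' .. '9' | '_') as c -> c
        | _ -> '_')
      s
  in
  (* Record fields and values must start with a lowercase letter. *)
  let s = String.uncapitalize_ascii s in
  let s = if s = "" || s = "_" || is_digit s.[0] then "v_" ^ s else s in
  if List.mem s keywords then s ^ "_" else s
```

For constructor and module names, one would apply `String.capitalize_ascii` to the result. Two different annotations can still collide after sanitization (e.g. `%a.b` and `%a_b`), so a real fix also needs some deduplication.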

After 8 years of blograstination, this is post #2 of my attempt at not falling too far behind on the #100DaysToOffload “challenge” … Let's see where this goes.